Joe Jordan

the geek of hearts

One syntax to rule them all

Ladies and gentlemen, will someone please go and inform the people flamewarring heroically on the PHP threads, interrupt the Pythonistas in their academic seminars and the Rubyists in their investor meetings, and let the C++ purists out of their caves (and let them breathe real air again): We Have Found It. The Winner. The Best Language Ever – they can stop arguing and go home. Who knew? Lisp was the answer all along.

Apparently syntax has a meaning when looking at natural language. This is of course lost on computer programmers like me, who know syntax as a tangled web of which symbols are allowed before which other ones – e.g. an identifier can contain any letter, digit or underscore but must not start with a digit, or that f(x) = g(x) is not allowed, because f(x) isn’t an lvalue. This arcane set of rules, which is enormous and different for every programming language, gradually becomes a source of pride – an Arcanum to which only we the initiated know the secret password*.

This attitude is no doubt why when I first tried to learn Lisp I didn’t notice something important – it has only one syntax rule^. Sure, a + b becoming (+ a b) was all kooky and esoteric - exactly the sort of over-complication I expected from the language used exclusively (it seemed) for hyper-theoretical computer science tomfoolery. I buried myself in learning how to use (loop ...) and so on in order to bash through the first few Project Euler problems, and then got a headache from thinking backwards and forgot about it.

Skip forward to last week: I picked up the October Hacker Monthly from my mat# and, after excitedly consuming The Tesla Gun, saw the interview with Peter Seibel, author of Practical Common Lisp (which you can and should read online).

It took me a few hints, but once I understood what an “S-expression” was, I was agog. How had I not discovered this idea – that of representing code as a fundamental data type of the language itself – before? The jargon word for it (supplied by a good friend who actually studied computer science) is Homoiconicity, which I’m still having fun trying to guess how to pronounce.

Now obviously I’d heard the phrase ‘code as data’ before, and I understood that that’s vaguely how assembly language had to work. But this was something new - a compiled language, with readable source code, that can manipulate itself. Now I see why the US government invested so much money in the 80s into AI research in Lisp!

Common Lisp takes things a little bit too far, I think, with special operators such as (if ...) bending the rules a little. You basically only need one syntax to rule them all:

; execute function with the tokens as the arguments:
(function token token token)

and one caveat to, in the darkness, bind them:

; don't execute this, just return the expression as data:
'(token token token) 
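
To make those two rules concrete, here is a minimal sketch of a reader and evaluator, written in Python rather than Lisp so it can be run without a Lisp installed. Everything in it (`tokenize`, `parse`, `evaluate`, `run`, the tiny `ENV` table) is a hypothetical name invented for illustration, and it handles only integers, symbols, lists and the quote mark; it is not how any real Lisp is implemented.

```python
import operator
from functools import reduce


def tokenize(src):
    # Pad parentheses and the quote mark with spaces so split() isolates them.
    for ch in "()'":
        src = src.replace(ch, f" {ch} ")
    return src.split()


def parse(tokens):
    """Consume one expression from the front of the token list."""
    tok = tokens.pop(0)
    if tok == "'":
        # 'x is shorthand for (quote x)
        return ["quote", parse(tokens)]
    if tok == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # drop the closing ")"
        return items
    try:
        return int(tok)
    except ValueError:
        return tok  # anything else is a symbol, kept as a plain string


# Assumed built-ins for the demo; a real Lisp has far more.
ENV = {
    "+": lambda *args: sum(args),
    "*": lambda *args: reduce(operator.mul, args, 1),
}


def evaluate(expr):
    if isinstance(expr, int):
        return expr                 # numbers evaluate to themselves
    if isinstance(expr, str):
        return ENV[expr]            # symbols are looked up
    if expr[0] == "quote":
        return expr[1]              # rule two: hand the expression back as data
    fn, *args = [evaluate(e) for e in expr]
    return fn(*args)                # rule one: first item is the function


def run(src):
    return evaluate(parse(tokenize(src)))


print(run("(+ 1 2 (* 3 4))"))  # 15
print(run("'(+ 1 2)"))         # ['+', 1, 2]
```

The second call is the interesting one: the quoted expression comes back as an ordinary list that the program could inspect, rewrite, or hand to `evaluate` later.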

Note that this obsoletes most of the terrifying rules that we programmers have loathed, memorised and gradually come to wear as a sort of cloak of obscurity - Lisp tokens can contain almost anything which isn’t a (, ), a space or ' (a handful of other characters, such as " and ;, are reserved too). It is common to write globals, for example, as *global*, and + is a legal function name, not a hack$.

A quick aside on Turing Completeness – obviously a language with no way to branch is not Turing Complete, so something like if has to exist. I just think that Lisp syntax should be (if (test) '(then) '(else)), or something more complex to allow else ifs, rather than a special magic exception so that it looks cleaner without the 's.
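
A rough sketch of why the quotes matter if `if` is to be an ordinary function, again in Python with S-expressions written as nested lists. The names `evaluate`, `note`, `log` and `env` are made up for this example. Because an ordinary function has all its arguments evaluated before it is called, unquoted branches would both run; quoted branches arrive as data, and `if` evaluates only the one it picks.

```python
def evaluate(expr, env):
    if isinstance(expr, str):
        return env[expr]            # symbol lookup
    if not isinstance(expr, list):
        return expr                 # numbers and booleans are self-evaluating
    if expr[0] == "quote":
        return expr[1]              # quoted data is returned untouched
    # Ordinary call: every argument is evaluated before the function runs.
    fn, *args = [evaluate(e, env) for e in expr]
    return fn(*args)


log = []  # records which branches actually ran


def note(n):
    log.append(n)
    return n


env = {"<": lambda a, b: a < b, "note": note}
# "if" as a plain function: it receives both branches and evaluates one.
env["if"] = lambda test, then, other: evaluate(then if test else other, env)

# Quoted branches: only the chosen branch runs.
quoted = ["if", ["<", 1, 2], ["quote", ["note", 10]], ["quote", ["note", 20]]]
print(evaluate(quoted, env), log)    # 10 [10]

# Unquoted branches: both run before "if" is even called.
log.clear()
unquoted = ["if", ["<", 1, 2], ["note", 10], ["note", 20]]
print(evaluate(unquoted, env), log)  # 10 [10, 20]
```

One practical wrinkle the sketch glosses over: in Common Lisp, `eval` does not see local (lexical) variables, so quoted branches evaluated this way could not refer to them. That is part of why `if` ends up as a special operator there.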

Now, perhaps you need some built-in functions like (eval ...), (+ ...) and (defun ...) to be implemented by the compiler, not forgetting (if ...) too, but ultimately every program can be expressed, concisely and as “clean code”&, in the form of S-expressions and escaped S-expressions. If we also allow (defmacro ...) as a compile-time token manipulator then we can allow embedded / domain-specific languages within our code, which means we can express concepts even more tersely.
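
The "compile-time token manipulator" idea can be sketched in a few lines of Python, under the assumption that a macro is simply a function from unevaluated S-expressions to a new S-expression, applied before anything is evaluated. The `infix` macro, the `MACROS` and `FUNCTIONS` tables, and `macroexpand` as written here are hypothetical and only illustrate the shape of the idea; quoting is left out to keep it short.

```python
import operator

# A macro receives its arguments as raw, unevaluated data and returns new code.
MACROS = {
    # (infix a op b)  ->  (op a b)
    "infix": lambda a, op, b: [op, a, b],
}


def macroexpand(expr):
    """Rewrite every macro call in expr, before any evaluation happens."""
    if not isinstance(expr, list) or not expr:
        return expr
    head = expr[0]
    if isinstance(head, str) and head in MACROS:
        # Expand, then expand the result again in case it contains more macros.
        return macroexpand(MACROS[head](*expr[1:]))
    return [macroexpand(e) for e in expr]


FUNCTIONS = {"+": operator.add, "*": operator.mul}


def evaluate(expr):
    if isinstance(expr, str):
        return FUNCTIONS[expr]
    if not isinstance(expr, list):
        return expr
    fn, *args = [evaluate(e) for e in expr]
    return fn(*args)


# A tiny embedded "infix arithmetic" language: (infix (infix 1 + 2) * 4)
program = ["infix", ["infix", 1, "+", 2], "*", 4]
expanded = macroexpand(program)
print(expanded)            # ['*', ['+', 1, 2], 4]
print(evaluate(expanded))  # 12
```

The evaluator never learns that `infix` exists; by the time it runs, the program has already been rewritten into plain prefix calls.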

And the awesome thing about the programs we write in Lisp? We can compile them to machine code. While it also has an interactive session, this isn’t an alternative to Python – it’s an alternative to C. And it’s almost as old as Fortran, and there are a dozen implementations ready for download, including open source ones. Now, we can finally understand what this comic means, deep down inside.

Despite my facetious opening, the flame wars will continue. Other languages will still be used productively to do wonderful things; people using them will make website customisation accessible to even more people, discover new scientific results, make obscene amounts of money or continue to practise the dark art of bit-wizardry. Some of their code might even be elegant. Their successes are not any less successful because they didn’t use Lisp. But languages come and go, and complicated combinations of syntax will always require experts to interpret and modify. Gradually, like Fortran and Cobol before them, the non-Lisps will fade, a few dependable libraries buried deep in the implementations of shinier, newer programming toys. In contrast, when Lisp is celebrating its 100th birthday (it’s currently 56), it will still be as relevant as Turing Completeness itself (assuming humanity hasn’t drowned itself in carbon dioxide and rubbish dumps).

Now, if you’ll excuse me, I have to learn enough about the ridiculous compromises Common Lisp makes on perfection to be able to rewrite everything I’ve ever coded in it.

* by the way, the secret password to programming is “google”.

^ arguably two, but I’ll get to that in a minute.

# It takes a week for HP to ship MagCloud stuff from the States to Europe - I’m not sure why they don’t just buy a printer and an office on this continent and save the transatlantic shipping fees…

$ When writing some physics code in C I had to switch to a C++ compiler to be allowed to use + on vector objects (and thus have readable code.) Luckily, I didn’t fall into the trap of using any of C++’s other syntax, as that way, we all know, lies madness.

& Clean Code is a book by Uncle Bob (Robert C. Martin). All its examples are in Java (eurgh?!), but clean code is one of those concepts that transcends language.