On Beautiful Code

Hang around in tech long enough and you might notice that some programmers have taken to declaring certain pieces of code beautiful. Once, during a product demo involving some live-coding, I heard someone in the audience declare a slick one-liner “a beauty”; in college, a professor paused mid-lecture to admire the elegance of a particularly clean abstraction; and when a senior engineer at my first job—let’s call him Chris, because that’s his name—spotted a bottleneck in Scala and surgically replaced the sluggish part with some virtuously written C++, a wide-eyed junior dev blurted out, “Damn, that’s beautiful, Chris.” (Fine, the last one is me.)

I admit that seeing well-composed lines of code can be gratifying, but is “beauty” the right word? Few philosophical forays have been as inconclusive as the attempt to define beauty. (Philosopher Nelson Goodman wryly observed that theorists attempting to specify aesthetic experience are looking for “aesthetic phlogiston.”) For Kant, it was “disinterested pleasure.” Santayana called it “objectified pleasure.” A certain “formedness” for Plato and Plotinus and the “sensuous appearing of the Idea” for Hegel. Iris Murdoch saw it as “an occasion for ‘unselfing’”; Elaine Scarry wrote that it “brings copies of itself into being.” Alexander Nehamas believes “your life will be better if that is a part of it,” while Crispin Sartwell calls it “the object of longing.” For Stendhal, it is simply “a promise of happiness.” What gives?

Because little discussion exists around what makes code beautiful, it helps to look at a neighboring field that has a longer history of discourse on the subject: mathematics. Mathematicians, normally a precise bunch, have a way of retreating into that squirrelly word, beauty, when speaking of the discipline’s highest virtue. Survey discussions of mathematical beauty, however, and a fair amount of schmaltz and abstraction seems to creep in. Normally paragons of rigor, some mathematicians suddenly become romantics. Bertrand Russell once described mathematics, in oddly lascivious language, as “a beauty cold and austere, like that of sculpture, without appeal to any part of our weaker nature, without the gorgeous trappings of painting or music, yet sublimely pure.” Some become mystics. Arthur Cayley, a 19th-century British mathematician, said, “As for everything else, so for a mathematical theory: beauty can be perceived but not explained.”

Purple prose abounds. Euler’s identity (e^(iπ) + 1 = 0) is, Stanford mathematician Keith Devlin writes, “a Shakespearean sonnet that captures the very essence of love,” which “reaches down into the very depths of existence.” Even Paul Erdős, when asked why numbers are beautiful, failed to articulate an answer: “It’s like asking why Beethoven’s Ninth Symphony is beautiful. If you don’t see why, someone can’t tell you. I know numbers are beautiful. If they aren’t beautiful, nothing is.” Unless he’s employing a proof technique I don’t recognize, this sounds an awful lot like the hackneyed you-know-when-you-see-it obscenity defense.

A more concrete place to start is British mathematician G. H. Hardy’s 1940 essay A Mathematician’s Apology. It’s something of a sacred text enshrined in the hearts of aspiring mathematicians—with all that entails—much like Surely You’re Joking, Mr. Feynman! is for young physicists (or think Patti Smith for the mathematically inclined).

Hardy presents two classic theorems as exemplars of mathematical beauty: Euclid’s proof that there are infinitely many prime numbers, and Pythagoras’s proof that √2 is irrational. Both theorems, Hardy writes, possess “a very high degree of unexpectedness, combined with inevitability and economy.” He goes on to enumerate six criteria in total (economy, generality, depth, significance, unexpectedness, and inevitability), though he ultimately acknowledges the inherent ambiguity in what qualifies under them and doesn’t provide many concrete examples. Thus a good companion text is MIT mathematician and philosopher Gian-Carlo Rota’s The Phenomenology of Mathematical Beauty, an equally brilliant but less well-known exploration, perhaps more rigorous and sober-toned, that both grounds and challenges Hardy’s points.

For instance, Rota discusses the theorem that there are only five Platonic solids: tetrahedron, cube, octahedron, dodecahedron, and icosahedron. As he notes, it’s not merely the theorem’s generality that makes it remarkable but also its unexpectedness. In other words, if you examine every molecule and object in the entire universe—from subatomic particles to entire galaxies, all the way to the very edge of this expanding cosmos—not a single physical structure that violates this mathematical truth will ever appear. That’s quite something.

Or take Euler’s identity, an almost occultic formula where nature’s fundamental constants are arranged so compactly with seeming inevitability—like a neatly folded piece of origami. Some have even called it proof of God. I don’t know about that, but despite its tendency to put mathematicians in a melodramatic mood, it may hint at a coherent structure underlying the deep fabric of nature: a certain depth and inevitability.

Rota goes on to note that all mathematicians agree Picard’s theorem, with its astonishingly concise five-line proof, is beautiful—a case of economy at its finest. The theorem states that “an entire function of a complex variable takes all values with at most one exception.” If that sounds abstract, I’ll attempt an analogy: imagine a standard dartboard, like the ones you see in dive bars, with different sections corresponding to various scores. Now, picture a world where dartboards are stretched, twisted, and warped into wildly contorted shapes. Picard’s theorem guarantees that in this world, no matter where you throw, your darts will still pierce through nearly every possible scoring region, missing at most one.

Yet another form of beauty—call it “interconnectedness”—emerges when seemingly unrelated areas of mathematics suddenly link together, much like when a writer blends two distinct styles to create something new—take J.M. Coetzee’s Elizabeth Costello, which fuses the novel of ideas with the academic lecture, or Nabokov’s Pale Fire. Andrew Wiles’s proof of Fermat’s Last Theorem did this with elliptic curves and modular forms. More recently, Fields Medalist June Huh solved longstanding combinatorics problems by connecting them to algebraic geometry.

These mathematical schemas give us a good place to start unpacking the idea of beauty in programming. Let’s begin with economy, as I suspect that when programmers think of beautiful code, the first quality that comes to mind is its conciseness—succinct, tightly written code.

A concept in theoretical computer science that best relates to this notion of economy is “Kolmogorov complexity,” which measures the length of the shortest possible program that can reproduce a given string. For example, the string “AAAAAAAAAA” can be described by a simple program like Print ‘A’ ten times. More generally, a string consisting of N repeated ’A’s can be represented as Print ‘A’ N times, which results in low Kolmogorov complexity (and low randomness). But a random-seeming string like “uXyKcdjmc@jrFdDBh2ruEoddHBx3Te,” which has no shorter way to be described than by stating it outright, has high Kolmogorov complexity. As an analogy, think about how the most intricate and original literature is irreducible to a summary; it can only be fully understood by reading it in its entirety.¹
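Kolmogorov complexity itself cannot be computed in general, but a compressor gives a rough upper bound on it, which is enough to see the contrast between the two kinds of string. The sketch below is only a stand-in under stated assumptions: it uses Python's zlib as the "shortest program" proxy, the helper name compressed_size is hypothetical, and the strings are made 1,000 characters long because on very short inputs the compressor's fixed overhead swamps the difference.

```python
import random
import zlib


def compressed_size(text: str) -> int:
    """Bytes zlib needs for `text`: a crude upper-bound proxy, not true Kolmogorov complexity."""
    return len(zlib.compress(text.encode("utf-8"), 9))


# A highly regular string: in effect "print 'A' 1000 times".
repetitive = "A" * 1000

# A random-seeming string of the same length (fixed seed so runs are repeatable).
rng = random.Random(0)
alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789@,"
noisy = "".join(rng.choice(alphabet) for _ in range(1000))

print("repetitive:", compressed_size(repetitive), "bytes")
print("noisy:     ", compressed_size(noisy), "bytes")
```

The repetitive string should shrink to a few dozen bytes at most, while the noisy one stays within sight of its original length, since there is little pattern for the compressor to exploit.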

If two programs achieve the same result, the shorter one is often considered more economical. But I’d say there are two kinds of economy—the deep kind and the cosmetic kind. Cosmetic economy, while not mutually exclusive with the deep kind, is more common in languages like Haskell or Lisp, where syntax allows for concise expressions that would be much more verbose in other languages. For example, a sorting function that spans multiple lines in some languages can be expressed in Haskell as:

But of course, conciseness can slide into obfuscation, like code that pushes minimalism to the point of absurdity. Printing the list of all powers of 2, a simple task, can devolve into cryptic snippets like this:

This is false economy; hence false beauty.

Or take the REPL, a common interactive tool—you can see it in action simply by opening your Terminal application and running a few commands:

Python, a language known for its readability, handles this with minimal effort:
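A minimal sketch of such a loop, assuming nothing beyond the standard library. The function name repl, its read and write parameters, and the "quit" command are my own choices for illustration; note too that eval only handles expressions and should never be fed untrusted input.

```python
def repl(read=input, write=print, prompt=">>> "):
    """Read a line, evaluate it, print the result, and loop until EOF or 'quit'."""
    while True:
        try:
            line = read(prompt)          # Read
        except EOFError:
            break
        if line.strip() == "quit":
            break
        try:
            write(eval(line))            # Eval, then Print
        except Exception as error:       # keep the loop alive on bad input
            write(f"error: {error}")
```

Calling repl() with no arguments starts an interactive session; passing your own read and write functions lets the same loop be driven from a script.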

In C, the same functionality becomes more verbose:

And if you want to write it in—god forbid—Java:

Meanwhile, in Lisp:

That’s it. The usual boilerplate of other languages is stripped away, leaving the logic in its purest form.

That’s economy at the file or line level. But line-level economy is often trivial. What’s truly compelling—what holds real aesthetic value—is economy at the level of the entire codebase.

Take AWK, a language developed at Bell Labs in the 1970s by Alfred Aho, Peter Weinberger, and Brian Kernighan (hence AWK, from the first letters of their names) and maintained over the years by Kernighan himself, a key member of the original UNIX team and the person behind the “Hello, World” convention.

The language is just a few thousand lines of code (full source code is available on GitHub), but over 48 years it has evolved while keeping the overall codebase lean and tightly structured, carefully refactored to follow modern conventions and expanded with new features (e.g., Unicode support). The process is not unlike revising a long prose poem over decades: Walt Whitman continually revised “Song of Myself” across multiple versions, and if anyone in computing has a comparable status to Whitman, it is Kernighan. The same could be said of the Linux kernel, a kind of computational tourbillon tended by a guild of dedicated horologists, whom we call Linux maintainers.

Next, generality. Turing Machines, the very embodiment of generality in computing, are inseparable from any discussion about programming languages. What Turing did was formalize a question so intuitive yet elusive—What does it mean to compute something? Computing is something humans have done for millennia, but what does it actually mean to compute 4 + 5 = 9? The Turing Machine, put in a simplified way, provides one way to “define” computation and shows that any computation, no matter how complex, can ultimately be performed by a Turing Machine.
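To make the 4 + 5 example concrete, here is a toy simulator, offered as a sketch rather than anything drawn from Turing's paper. It assumes unary notation (4 is written 1111), and the rule table, the state names, and the function run are all invented for illustration. The machine turns the plus sign into a 1, walks to the end of the tape, and erases one surplus 1.

```python
from collections import defaultdict

BLANK = "_"

# (state, symbol read) -> (symbol to write, head movement, next state)
RULES = {
    ("scan", "1"): ("1", +1, "scan"),         # skip over the first number
    ("scan", "+"): ("1", +1, "to_end"),       # join the two numbers with a 1
    ("to_end", "1"): ("1", +1, "to_end"),     # skip over the second number
    ("to_end", BLANK): (BLANK, -1, "erase"),  # ran off the end: step back
    ("erase", "1"): (BLANK, -1, "halt"),      # remove the one surplus 1
}


def run(tape_input, rules=RULES, start="scan", max_steps=10_000):
    """Run the machine on `tape_input` and return the non-blank tape contents."""
    tape = defaultdict(lambda: BLANK, enumerate(tape_input))
    head, state, steps = 0, start, 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step limit reached without halting")
        write, move, state = rules[(state, tape[head])]
        tape[head] = write
        head += move
        steps += 1
    cells = "".join(tape[i] for i in range(min(tape), max(tape) + 1))
    return cells.strip(BLANK)


result = run("1111+11111")   # 4 + 5 in unary
print(result, "=", len(result))
```

Everything the machine does is a lookup in a five-row table plus a step left or right, which is the point: even arithmetic reduces to this kind of rote tape shuffling.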

Notably, in The Phenomenology of Mathematical Beauty, Rota differentiates between beautiful theorems and beautiful proofs. (Consider books with a profound thesis written in turgid prose versus those with both intellectual depth and elegance in execution.) Rota also notes that elegant proofs and beautiful proofs aren’t the same: elegance is about presentation, while beauty is about truth.

Rota’s distinction is relevant here because while the Turing Machine is a beautiful theorem, its proof, stated in the landmark paper On Computable Numbers, with an Application to the Entscheidungsproblem, is not exactly an apogee of elegance. Its descriptions of the tape, head movements, and state transitions are somewhat mechanical and verbose.

But Turing wasn’t the only one to formulate a theory of computation. Alternative formulations were developed by Turing’s advisor, Alonzo Church, with lambda calculus, and by Stephen Kleene with what are called recursive functions. Without going into the details of the proofs, the important point is that what they “independently” tried to formulate was, in fact, equivalent. (This is the Church-Turing thesis.) And among these results, Church’s lambda calculus proofs may be the most elegant and concise. Kleene’s recursive function proofs, perhaps the most technical of the three, could be considered less elegant than Church’s and less intuitive than the Turing Machine’s.
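For a taste of why Church's formulation can feel more economical, the same sum, 4 + 5, can be written using nothing but functions. This is a sketch in Python's lambda syntax rather than in lambda calculus proper; the encoding is the standard Church numerals, while the helpers church and to_int are conveniences I have added to move between ordinary integers and the encoded form.

```python
# Church numerals: the number n is "apply f to x, n times".
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))


def church(k):
    """Build the Church numeral for a non-negative Python integer k."""
    n = zero
    for _ in range(k):
        n = succ(n)
    return n


def to_int(n):
    """Decode a Church numeral by counting how many times it applies f."""
    return n(lambda i: i + 1)(0)


four, five = church(4), church(5)
print(to_int(add(four)(five)))  # prints 9
```

There is no tape, no head, and no state table here, only function application, yet the answer is the same.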

Some forms of beauty in programming appear to go beyond the criteria often discussed by mathematicians. One of the more well-known remarks on code aesthetics came at an unlikely venue: the 2016 TED conference, where a perpetually irritated Linus Torvalds (creator of Linux and Git) was being interviewed by Chris Anderson.

Speaking about “good taste” and “bad taste” in code, Torvalds presented two code snippets that performed the same task—removing an item from a data structure called a linked list. Both were functionally identical, but one was structured in a way that eliminated entire classes of potential bugs—without unnecessary complexity.
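Torvalds's snippets were written in C and leaned on pointer manipulation that Python does not offer, so the sketch below swaps in a different but related technique, a sentinel node, to show the same idea: restructure the code so that the awkward special case disappears. The class Node and both function names are hypothetical, and both versions assume the target node really is in the list.

```python
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


def remove_with_special_case(head, target):
    """Remove `target`; the head of the list needs its own branch."""
    if head is target:
        return head.next
    prev = head
    while prev.next is not target:
        prev = prev.next
    prev.next = target.next
    return head


def remove_with_sentinel(head, target):
    """Same job; a throwaway node in front means every node has a predecessor."""
    sentinel = Node(None, head)
    prev = sentinel
    while prev.next is not target:
        prev = prev.next
    prev.next = target.next
    return sentinel.next


def values(head):
    """Collect the list's values into a Python list, for inspection."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out
```

The second version has one code path instead of two, so there is no head-of-list branch to forget or get wrong.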

This kind of beauty, I think, resembles the elegance of good industrial design. Good code, like well-engineered machinery, eliminates certain types of failures by design. Think of a dead man’s switch on a lawn mower, which stops it from turning into a runaway buzzsaw on wheels if the operator releases their grip. Similarly, well-structured software prevents entire categories of errors simply through the way it is written. Call it clarity or even safety.

If there is a quality truly unique to programming, I’d say that it’s hackiness: not “hack” in the sense of malicious exploits, but in the sense of ingenious, gratifying solutions.

A famous example—famous enough to have its own Wikipedia entry and be familiar to a non-gamer like me—is the fast inverse square root algorithm, a rogue code snippet buried in the Quake III engine. Calculating an inverse square root (e.g., for x = 9, the inverse square root, 1/√x, is 1/3) isn’t usually the most intricate mathematical operation, but finding a way to compute it repeatedly and efficiently is a different matter. In the 1990s, real-time 3D graphics relied heavily on computing inverse square roots for lighting and shading calculations. Traditional methods—based on division and floating-point operations—were too slow for the demands of fast-paced rendering, and thus of high-speed gameplay.

Then came this code—one that an entire generation of ’90s gamers was unknowingly indebted to—which cleverly traded a bit of accuracy for a significant boost in speed. Here’s the code, with its original comments intact:

“0x5f3759df” is where the magic happens: the algorithm treats the bits of a floating-point number as an integer, shifts them right by one, and subtracts the result from that mysterious constant, which yields a good enough approximation while avoiding the expensive math.
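To see the trick in motion, here is a Python transliteration of those steps. It is a sketch, not the Quake III listing: the function name is mine, the struct module stands in for C's pointer casts, and the single refinement step at the end follows the widely circulated version of the routine. It also assumes a positive, ordinary (non-zero, finite) input.

```python
import struct


def fast_inverse_sqrt(number: float) -> float:
    """Approximate 1/sqrt(number) using the bit-level trick plus one Newton step."""
    half = number * 0.5
    # Reinterpret the 32-bit float's bit pattern as an unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", number))[0]
    # Shift right by one and subtract from the magic constant.
    bits = 0x5F3759DF - (bits >> 1)
    # Reinterpret the resulting integer as a float again: a rough first guess.
    guess = struct.unpack("<f", struct.pack("<I", bits))[0]
    # One Newton-Raphson iteration sharpens the guess.
    return guess * (1.5 - half * guess * guess)


for x in (4.0, 9.0, 100.0):
    print(x, fast_inverse_sqrt(x), x ** -0.5)
```

In Python this is, of course, slower than simply writing x ** -0.5; the point is only to show how little arithmetic the approximation needs and how close it lands.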

Yet none of this answers whether beautiful code—by dint of aesthetics—is necessarily good code, or whether beauty is a quality programmers should prioritize over other considerations. A few years ago, I started to think that every action or endeavor involves a mix of—and tradeoffs between—three values: beauty, utility, and morality. Something can be highly beautiful but low in utility and more or less morally neutral (painting landscapes). Another may be beautiful and highly useful but morally troubling (designing sleek fighter jets). Yet another may offer no beauty but be supremely useful and moral (inventing a water filtration system). What I value most—and how I rank these qualities—changes over time. The only invariant is that utility is never first and morality is never last.

Different domains give weight to different values. It’s a mistake to demand utility from poetry. And investigative journalism, even when not a single sentence shines, may still push toward a more just world. Speaking for myself, when there’s a tradeoff—and there always is—I don’t think programmers should be too insistent on beauty. After all, having laid out what can be thought of as programmatic beauty, I wonder if “beauty” is too generous a word. Even the most elegant codebase does not give me the same soul-piercing jolt as reading, say, Nabokov or Rachel Cusk. In other words, good code can only be so beautiful.

The romanticization of beauty is often presented as a virtue when, in truth, it can be a telling sign—whether knowingly or unknowingly—of the neglect of other virtues. Hardy valued pure mathematics for its supposed “uselessness”—the idea that, detached from real-world applications, it could not be harmful. His disdain for applied mathematics was partly shaped by witnessing World War I, where mathematics was harnessed for practical ends that were often destructive. “I have never done anything ‘useful.’ No discovery of mine has made, or is likely to make, directly or indirectly, for good or ill, the least difference to the amenity of the world,” wrote Hardy.

Hardy’s stance, however, is problematic on two counts. First, the examples he upheld as “pure” because they appeared to have no “warlike purpose”—number theory and the theory of relativity—were, of course, later put to just such uses. Number theory became the foundation of modern cryptography, as seen in the breaking of the Enigma code. Meanwhile, relativity formed a key link in the chain that led to the atomic bomb.

Second, Hardy’s perspective smacks of a kind of class-blind snobbery—a belief that utility is somehow impure, that labor done to scrape by is beneath him—akin to a comment one might hear from a second-generation art gallerist who understands little beyond his inherited wall. Scientists or mathematicians who believe merely avoiding direct involvement in harmful applications is enough are like those who take comfort in not being Wernher von Braun—the Nazi rocket engineer—as if that alone were an accomplishment. The dismissal of usefulness and the patrician attitude toward “purity” are more about inflating one’s ego than making any meaningful statement about scholarly integrity; beauty becomes a decoy for evading moral responsibility.

What Hardy’s aristocratic obliviousness fails to acknowledge is that the pursuit of beauty—when divorced from moral considerations—is not as neutral as it seems. An amoral stance doesn’t remain neutral without a sustained ethical counterbalance, because the ground an ethical individual stands on is always slanted; before long, one inevitably finds oneself slipping down the slope of moral decay.

To put it more cynically, some mathematicians, like Hardy, want mathematics both ways—not just as a technically rigorous discipline that showcases raw intelligence, but also as an elevated and aesthetically profound enterprise. One not only needs to be seen as a genius but as an artist—better yet, an aesthete. Yet it’s revealing that Hardy frequently condescended to other disciplines, claiming that ideas in paintings are usually “commonplace and unimportant” and that, in poetry, the importance of ideas is “habitually exaggerated.” Lacking the eye for other kinds of beauty, Hardy—if he was ever an aesthete at all—proves to be a parochial one, not a universal kind.

That programs are not a primary site of aesthetic experience is not a slight against programming but rather an acknowledgment that programmers do not need to justify their work by its beauty. Instead, they should lean into what programming does best: utility, for once, as a guiding principle. Utility is a value often viewed with contempt—for understandable reasons, given the industry’s long-standing impulse for utility maximization—and with suspicion, rightly so, since utility itself knows no morality. But when we practice programming so that beauty serves utility, and utility, in turn, serves morality, then useful programs may not always embody beauty, but if they are ever so good, they can uphold another and much needed virtue: morality.◼

¹ Kolmogorov complexity underlies the mechanisms behind aphorisms: if you’re François de La Rochefoucauld, you write, “No one deserves to be praised for kindness if he does not have the strength to be bad”; a lesser writer might produce an overblown novel of excess.