Superficial Simplicity

For the last decade I’ve chased and wrestled with the ideal of “simple” software, but I’ve come to see it as a false summit and want to spend some ink on why in the hope that it can lead to a better understanding of simplicity and more intelligent conversations about complexity.

Those of you who’ve orbited around Clojure will recognize the scare-quoted word from Hickey’s “Simple Made Easy” (transcript).

To summarize, Hickey differentiates between things which are “simple” in that they do one thing, things which are “complex” (he uses “complected”) in that they do more than one thing or “too many” things, and things which are “easy” in that they are convenient for a task. The core thesis of the talk is that software which is “simple” is intrinsically better, easier to build and of higher quality than software which is “complex”, and that we can build up “simple” solutions by “composing” solutions to “simple” subproblems.

Like grugbrain.dev, Hickey offers an ad-hoc, intuitive appeal to what “simplicity” and “complexity” are. Many of the things which Hickey explicitly presents as complex are, or have, nonlocal effects and nonlocal data dependencies, but some are just large tools. In subsequent talks and years Hickey has leaned on the idea of “decomplecting” as having much the same meaning as approaching problems by decomposition and trying to build tools which do one thing.

I think this is broadly useful commentary, and while imprecise it played a critical role in growing my thinking about software.

Okay, so we’ve got a sense of what constitutes simplicity – doing one thing or being more focused – let’s consider what happens when we do that.

Into the T-shirt tarpits

There’s this brainworm in programming language developer circles of making “kernel” languages. A “kernel” is a minimal language or environment which can implement itself. They orbit the Turing Tarpit, and have a lot of “Give me a place to stand and with a lever I will move the whole world” energy. Aspirations to better future computing grounded on simplicity aren’t uncommon.

Kernel languages are a neat hat trick for a language author. They’re self-satisfying because they concretely demonstrate the power and utility of the language; after all, just look at it, it’s self-hosting!

Best of all, these tools are simple in the sense that they have few parts and do little. Few of these languages feature generic types and fewer still feature inference, as these features require lots of supporting machinery. More often we see interpreted languages with a small ‘kernel’ of special forms which the implementation must provide. Norvig’s “One Page Lisp” is an example of this particular school of thought, as are the examples Michael Fogus discusses in what he terms The German School of Lisp.
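
To make the ‘kernel’ idea concrete, here is a toy sketch (my own illustration, not Norvig’s code) of an evaluator whose entire kernel is quote, if, fn and function application; everything else would have to be built up as definitions on top of it.

(defn keval [expr env]
  (cond
    (symbol? expr)    (get env expr)  ; variable lookup
    (not (seq? expr)) expr            ; literals evaluate to themselves
    :else
    (let [[op & args] expr]
      (case op
        quote (first args)                        ; (quote x) => x
        if    (let [[test then else] args]
                (if (keval test env)
                  (keval then env)
                  (keval else env)))
        fn    (let [[params body] args]           ; (fn (x) body)
                (fn [& actuals]
                  (keval body (merge env (zipmap params actuals)))))
        ;; otherwise: evaluate the operator and operands, then apply
        (apply (keval op env) (map #(keval % env) args))))))

(keval '((fn (x) (if x (quote yes) (quote no))) true) {})
;; => yes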

I don’t mean to dismiss kernel lisps as being merely intellectually pleasing toys for language developers. Simple languages have arguable benefits.

Having a simple core model for the system lets users learn the entire model. A challenge users face with large systems or systems with large specifications is that they can become difficult to learn: as a student one must find an entry point, a thread to begin pulling on. Niklaus Wirth’s work on Pascal, Modula and ultimately Oberon is one of the purest expressions of simple languages, and is perhaps best understood in this context as a set of tools for pedagogy – not as tools for software development. “The School of Niklaus Wirth” relates an infamous story of how Wirth eliminated a hash table a student had contributed to the compiler because it added complexity, which makes the most sense in this light. Simplicity in the literal school where Wirth teaches to this day doesn’t just serve an aesthetic purpose, it serves the students.

Brodie’s “Thinking Forth” contains an incredible graphic (fig. 4.7) which he uses to emphasize his claim that Forth is approachable because it is simple in the sense that the reader/compiler/interpreter does only a few things.

“The Cuneiform Tablets of 2015” considers one vision of computing Kay has put forward which tries to be “simpler” in the sense of having a smaller specification. In particular, it pursues the same idea of a kernel that kernel languages pursue; one which Kay has repeatedly phrased as ‘the T-shirt computer’, with the challenge “what are Maxwell’s laws of computing?”

To “The Cuneiform Tablets of 2015”, the virtue of simplicity is as an aid to communication and re-implementation. Similarly to Shen, Kay et al. suggest that a “T-shirt computer” could be used as a vehicle for long-term experience and information conveyance. Rather than trying to design an image, sound or video format which is directly readable hundreds or thousands of years hence, they suggest it could be more practical to define and convey a “simple” computer, and then to convey media for that computer.

It’s critical that, to Kay, this computer is simple because it is of small definition or implementation. This is perhaps related to the intuitive notion of complexity which Hickey and Grug appeal to: the size of an implementation is a rough proxy for the dependency tree (the “number of things” Hickey describes) which a programmer must hold in mind in order to conceptualize the program.

SectorLISP, JonesFORTH and SectorFORTH are incredible examples of this train of thought of small implementations. Here you have multiple implementations of entire viable abstract computers in a tiny amount of code. What could be simpler?

Building up, not shrinking down

Steele’s talk “Growing A Language” (transcript (PDF)) begins to get at what I think is the crux of a refutation of this claim. While the presentation is dated and some of the productions Steele uses don’t pass muster, it’s a really phenomenal example of what it’s like to live in one of these simple-and-yet-not environments. I’ve always enjoyed watching his face in the first nine minutes; trying not to laugh on stage as he produces obvious definitions one after the other until he can explain what he’s doing.

As Steele put it so eloquently

If you want to get far at all with a small language, you must first add to the language to make a language which is more large. In some cases, we will find it convenient to add “er” to the end of a word to mean more. Thus we could take “larger” to mean “more large” or “better” to mean “more good”.

The joke, and also the point, being that he would have used the word “better” but hadn’t yet defined the required rule of meaning and so couldn’t.

To summarize, Steele posits that a language must BOTH be “large enough” to be “useful”, and yet “small enough” to learn. The conceit of the talk, and its core premise, is that being of “small” size and being “simple” amount to the same thing.

Steele suggests the trick in balancing size against utility is that a language must empower users to extend (if not change) the language by adding words and maybe by adding new rules of meaning. The hope then is that the language need only grow a little bit. Users can grow a language where and when they judge best. Meanwhile the core language need only change to further enable users to grow the language, if at all. The core language then is but a common set of meanings and a framework for defining rules of meaning which allow users to begin to speak together.

Throughout the talk, Steele raises a question which the kernel language projects highlight, and which reflects on the general problem of defining, let alone managing, complexity.

How can a small (simple) language (or tool) truly be better by dint of being small if one must grow it before one can use it?

Just as Steele does not distinguish between words defined by a language and words defined by a user, neither does this question. A language which users must extend in order to use effectively is as suspect as a library so incomplete that it must be extended or wrapped to be useful. One can even extend this argument to whole programs such as UNIX tools, or software systems.

This nicely refutes the better-future-computing-through-simplicity aspirations I’ve long held, and that kernel projects evince.

Simplicity in one component is not enough. Simplicity as discussed so far is a local property. This specification is small and thus simple; that implementation is small and thus simple. Calling tools of small size and focused functionality simple is self-fulfilling, and fails to provide a meaningful theory of whether or not things built with simple tools are themselves simple.

Perhaps we should instead be considering how languages or computers let their users say what they wish to say, and compute what they wish to compute without imposing on them undue costs.

Convenience? Or simplicity through design?

Clojure itself is an interesting case study in this trade-off between what one could perhaps call internal vs exposed or demanded complexity. Clojure is not self-hosting, leaning instead on a Java compiler, core data structures written in Java, and a pair of large Java classes that provide bridges from Clojure to Java. Clojure’s implementation is reasonably complex, but it spends that complexity budget trying to paper over intricacies of the JVM and present users with a cleaner slate. For instance, despite the conceptual simplicity of data that doesn’t change, the concrete implementation of fast immutable data structures is considerable.

The authors of Clojure don’t think you should be using classes to represent or encapsulate data. As a result, there are good literals for writing immutable maps, sets, vectors, strings, numbers, symbols and composites of these data types without interacting with their Java underpinnings. Meanwhile it would be difficult and far more verbose to write out comparable structures of objects as one would in Java.
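
For a concrete (if invented) illustration, nested immutable data is something you simply write down, with no class definitions involved:

(def order
  {:id       42
   :customer "Ada"
   :items    [{:sku "A-1" :qty 2}
              {:sku "B-7" :qty 1}]
   :tags     #{:gift :express}})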

Limiting the state space of data through immutability helps Clojure users manage complexity. Having convenient notation for complex forms of data helps Clojure users make the “right choice” by default. Having an interactive environment where users can explore transformations on data helps users build up their programs incrementally in a functional and compositional style.
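
Here is roughly what that incremental exploration looks like at the REPL; each step is a pure function over immutable data, so it can be tried and composed piecemeal (the data is again invented for the example):

(->> [{:sku "A-1" :qty 2} {:sku "B-7" :qty 1} {:sku "A-1" :qty 3}]
     (filter #(= "A-1" (:sku %)))   ; keep one SKU
     (map :qty)                     ; project out the quantities
     (reduce +))                    ; and total them
;; => 5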

Would making the tool itself simpler (do less) enhance these properties? After a lot of experimentation, I don’t think so.

Another interesting case study comes from the other side of the “better than Java” debate – Kotlin. Kotlin’s extension functions are an incredible hack which provides enormous ergonomic leverage without introducing fancy language features. In a LISP, nobody would bat an eye at defining a new WITH-FOO macro that sets some foo context and runs a body. Take for instance a closing macro, which runs a body “with” a closeable resource.

;; ~  enunciated 'unquote' is for inserting an expression
;; ~@ enunciated 'unquote-splicing' is for inserting many expressions
;; & is Clojure syntax for accepting zero or more extra arguments
(defmacro closing [[name init-expr] & body]
  `(let [~name ~init-expr]            ; evaluate the init-expr (eg. open)
     (try ~@body                      ; do the body with `name` bound
       (finally (.close ~name)))))    ; close

(closing [f (clojure.java.io/writer "~/scratch.txt")]
   (.write f "hello, world"))

WITH-FOO macros are incredibly convenient because, like with x: in Python, try-with-resources (try (File f = ...) { }) in Java, or even let statements, they create local, lexically delimited scopes for resources or state. There’s nothing a (well-behaved) WITH-FOO macro does that couldn’t be implemented by properly balanced push and pop operations, but creating a syntactic pattern for setting and unsetting context, or using and disposing of resources, is incredibly powerful.
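
For comparison, here is the manually balanced version that a well-behaved closing macro (or Kotlin’s use, below) spares you from writing out each time:

(let [f (clojure.java.io/writer "~/scratch.txt")]  ; acquire
  (try
    (.write f "hello, world")                      ; use
    (finally
      (.close f))))                                ; always release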

In Kotlin, this is “just” a function which happens to take a callback (which looks like a normal {} block body) and calls it.

inline fun <C : AutoCloseable, T> C.use(body: (C) -> T): T {
    try {
        return body(this)
    } finally {
        this.close()
    }
}

// Note that {} is syntax sugar for a lambda function
// Note that .f {} is syntax sugar for .f({})
// (FileWriter is Closeable, and so AutoCloseable)
java.io.FileWriter("~/scratch.txt").use { f ->
    f.write("hello, world")
}

A tiny bit of syntax sugar has effectively opened up much of the same design space as a macro to users who want to make their language “more large”, without at the same time making the language fully self-modifying or meaningfully more difficult to analyze. It’s a bit complex to understand that .use {} is a fancy function call, but the ergonomic benefits are enormous, and an IDE or compiler can still analyze it because it’s just a function, not a true macro with its own rules of meaning. Furthermore, this general pattern of being able to add structured, type-dispatched behavior to a type, interface or intersection of such enables a controlled form of code injection that’s generally useful and has predictable scope in comparison to some other forms of behavior injection.

The risk of giving users access to real macros is that users can and will build their own entire languages. Take the infamous Common Lisp LOOP macro as a somewhat extreme example.

Being able to write

(loop for x in '(1 2 3)
      do (print x))

is perhaps well and good, but even “simple” examples begin to show the monster lurking in the LOOP DSL:

 (loop repeat 10
       for x = (random 100)
       if (evenp x)
          collect x into evens
          and do (format t "~a is even!~%" x)
       else
          collect x into odds
          and count t into n-odds
       finally (return (values evens odds n-odds)))

This laughably complex macro is no fault of Common Lisp the language. Well. Maybe the fault of the language standards committee who should have known better than to include it in the language. But the tools for building such macros are fundamental to the language, and even were this macro not standardized users could and did write their own.

Perhaps this macro is “easy” in terms of the notation being familiar to programmers who have used other languages. It certainly seems like much of a traditional Pascal has been crammed into this macro. But in no sense is it “simple” or consistent with other notation within the framework of Lisp. This macro is unlike anything else in the language, and its incredibly complex interpretation is defined by an equally large implementation.

By relying on user-defined language extensions even for what a larger language would consider the standard library or core features, small languages can actually be far more difficult to build tooling around. Larger languages such as Java or Kotlin are able to lean on well understood syntax and either limited or no space for syntactic extensions to be precise about what is and isn’t valid.

The fact that one can build an async/await engine as a userland Lisp library defeats the kind of analysis required to enable many helpful compiler errors. Perhaps one could make errors less bad in the presence of macros, and many have tried, but the user experience remains poor. Simplicity of implementation can be counterproductive to managing the complexity users experience.

Building on Steele’s deliberate blurring of the line between a “language” and a “library”, I suggest this train of thought applies to libraries as well. Libraries – language extensions really – help us say and do more but can fail to help us manage complexity or impose costs on their users in exactly the same ways as language features.

I’d suggest that ORMs alone provide all the evidence of this one could ever want. Fundamentally, ORMs exist to try to automatically build bridges between however your program runs and how SQL (or some other database language) runs over there in a different interpreter on the database. This is an incredibly hard problem, and implementing it well requires getting an incredible number of design and ergonomic tradeoffs right. Doing the job well may even depend on having a sufficiently malleable base language, going by the nearly unique success ActiveRecord has achieved by (ab)using metaclass hacking.

A large enough stack of T-shirts

All of this brings me, and I hope you, to the counter-intuitive conclusion that simple tools do not necessarily do better at helping users manage complexity than more complex tools. If anything, simple tools seem to do worse, because by being locally simple they push more concerns out to the user to manage rather than participating in managing them.

A language or tool which prioritizes its own implementation or specification over the interface it presents to users will never be easy or enable its users to achieve simplicity as they must wrangle the remainder of complexity from the incomplete tool. Such a tool is at best superficially simple.

The real question – the unanswered question – is what tools effectively help users manage “complexity”, how and why.

^d