Docs Sketches

At present I'm working on developing Shelving, a clojure.spec(.alpha) driven storage layer with a Datalog subset query language. Ultimately, I want it to be an appropriate storage and query system for other projects of mine, like Stacks and another major version of Grimoire.

In building out this tool, I've been trying to be mindful of what I've built and how I'm communicating it. What documentation have I written? Is there a clear narrative for this component? Am I capturing the reason I'm making these decisions and the examples that motivated me to come here?

Writing good docs is really hard, doubly so when you're trying to write documentation largely in docstrings. Docs you write in the docstrings drift rapidly from docs written elsewhere. The only surefire way I've come up with to keep the two in sync is to consider your docstrings to be authoritative and export them to other media. So I wanted a tool for that.

Another major problem is documentation organization. In Clojure, namespaces have long-known limitations as the unit of structure for your API. Clojure's lack of support for circular references between namespaces and the need to use explicit forward declarations lead to large namespaces whose order reflects the dependency graph of the artifact rather than the intended API or the relative importance of its members. In my first iterations of Shelving, I was trying to just use a large .core namespace rather than maintain separate namespaces. I wanted a tool that would let me take a large API namespace and break its documentation up into small subsections which had conceptual relationships.

The result of these two needs is the following script, which I used for a while to generate the docs/ tree of the shelving repo. To work around the issues of having a large, non-logical .core namespace, I annotated vars with ^:categories #{} metadata. This allowed me to build up and maintain a partitioning of my large namespace into logical components.
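
For illustration, here's roughly what such an annotated var looks like. example-fn is a made-up name; the categories (and stability) metadata is the real mechanism the script keys on.

(ns shelving.core)

(defn example-fn
  "A do-nothing example function."
  {:categories #{::schema}
   :stability  :stability/unstable}
  [x]
  x)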

Documentation files are generated by opening an existing file, truncating any previously generated API docs in that category out, and appending newly generated docs to it. This allows me to have, e.g., a Markdown title, some commentary, examples and what have you, then automatically splat the somewhat formatted and source-linked docs for the relevant category at the end, after that header.

I won't say the results are good. But they beat the heck out of not having any docs at all. Mainly they succeeded in being easier to browse and cross-check, which consequently made copy editing docstrings much easier.

With the introduction of distinct "user" and "implementer" APIs to Shelving which happen to share some vars, this categorization is no longer so clear cut. Some docs now belong in one place, some in multiple places. This is an inadequate tool for capturing those concerns. Also it still sorts symbols by line numbers when generating docs 😞

So I'll be building another doodle, but thought that this one was worth capturing before I do.

The script -

(ns compile-docs
  "A quick hack for building the doc tree based on `^:category` data."
  (:require [shelving.core :as sh]
            [clojure.java.io :as io]
            [clojure.string :as str]))

(def category-map
  {::sh/basic  (io/file "docs/")
   ::sh/schema (io/file "docs/")
   ::sh/rel    (io/file "docs/")
   ::sh/util   (io/file "docs/")
   ::sh/query  (io/file "docs/")
   ::sh/spec   (io/file "docs/")
   ::sh/walk   (io/file "docs/")})

(defn ensure-trailing-newline [s]
  (if-not (.endsWith s "\n")
    (str s "\n") s))

(defn relativize-path [p]
  (str/replace p (.getCanonicalPath (io/file ".")) ""))

(defn compile-docs [category-map nss]
  (let [vars      (for [ns              nss
                        :let            [_ (require ns :reload)
                                         ns (if-not (instance? clojure.lang.Namespace ns)
                                              (the-ns ns) ns)]
                        [sym maybe-var] (ns-publics ns)
                        :when           (instance? clojure.lang.Var maybe-var)]
                    maybe-var)
        groupings (group-by #(-> % meta :categories first) vars)]
    (println groupings)
    (doseq [[category vars] groupings
            :let            [category-file (get category-map category)]
            :when           category-file
            v               (sort-by #(-> % meta :line) vars) ;; FIXME: better scoring heuristic?
            :let            [{:keys [categories arglists doc stability line file]
                              :as   var-meta} (meta v)]]
      (println v)
      (with-open [w (io/writer category-file :append true)]
        (binding [*out* w]
          (printf "## [%s/%s](%s#L%s)\n"
                  (ns-name (.ns v)) (.sym v)
                  (relativize-path file) line)
          (doseq [params arglists]
            (printf " - `%s`\n" (cons (.sym v) params)))
          (printf "\n")
          (when (= stability :stability/unstable)
            (printf "**UNSTABLE**: This API will probably change in the future\n\n"))
          (printf (ensure-trailing-newline
                   (-> doc
                       (str/replace #"(?<!\n)\n[\s&&[^\n\r]]+" " ")
                       (str/replace #"\n\n[\s&&[^\n\r]]+" "\n\n"))))
          (printf "\n"))))))

(defn recompile-docs [category-map nss]
  (doseq [[_cat f] category-map]
    (let [buff      (slurp f)
          truncated (str/replace buff #"(?s)\n+##.++" "\n\n")]
      (spit f truncated)))

  (compile-docs category-map nss))

(defn recompile-docs!
  "Entry point suitable for a lein alias. Usable for automating doc rebuilding."
  [& args]
  (recompile-docs category-map
                  '[shelving.core]))
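
And some sample output - roughly what the script splats into a category file for a single var. The var name, path and docstring here are illustrative placeholders reconstructed from the printf calls above, not copied from the real docs tree:

## [shelving.core/example-fn](/src/shelving/core.clj#L10)
 - `(example-fn x)`

**UNSTABLE**: This API will probably change in the future

A do-nothing example function.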


Arch Package Transaction Log Fun

Those of you following me on twitter have probably seen me kibitzing about my development environment having been utterly destroyed by an Arch Linux package upgrade in the last ~24h.

It's been really fun.

Arch Linux unfortunately doesn't provide a tool for resetting your computer's package state to whatever it was previously following a bad update. However, the Arch package manager pacman does helpfully write a log file which records everything that it's done as a good old fashioned append only commit log.

This means that getting my computer back to a sane state was obnoxious, but a pretty trivial hack: write a program that takes a date stamp, scans the transaction log for everything that happened at or after that date stamp, and rolls it back.

While my well-polished emacs install was unusable due to the aforementioned awesomeness, I still remember enough vim to be dangerous.

And so, the hack:

#!/usr/bin/env python3

import os
import re
import sys

def main(opts, args):
    """Usage: python date

    Parse /var/log/pacman.log, enumerating package transactions since the
    specified date and building a plan for restoring the state of your system to
    what it was at the specified date.

    Assumes that:

    - /var/log/pacman.log has not been truncated

    - /var/cache/pacman/pkg has not been flushed and still contains all required
      packages

    - The above paths are Arch default and have not been customized

    - That it is not necessary to remove any "installed" packages

    Note: no attempt is made to inspect the dependency graph of packages to be
    downgraded to detect when a package is already transitively listed for
    downgrading. This can create some annoying errors where eg. systemd will be
    downgraded, meaning libsystemd will also be downgraded, but pacman considers
    explicitly listing the downgrade of libsystemd when it will already be
    transitively downgraded an error.
    """

    date, = args

    print("Attempting to roll back package state to that of {0}...\n"

    # These patterns can't be collapsed because we want to select different
    # version identifying strings depending on which case we're in. Not ideal,
    # but it works.

    # Ex. [2017-04-01 09:51] [ALPM] upgraded filesystem (2016.12-2 -> 2017.03-2)
    upgraded_pattern = re.compile(
        r".*? upgraded (?P<name>\w+) \((?P<from>[^ ]+) -> (?P<to>[^\)]+)\)")

    # Ex: [2018-02-23 21:18] [ALPM] downgraded emacs (25.3-3 -> 25.3-2)
    downgraded_pattern = re.compile(
        r".*? downgraded (?P<name>\w+) \((?P<to>[^ ]+) -> (?P<from>[^\)]+)\)")

    # Ex: [2017-03-31 07:05] [ALPM] removed gdm (3.22.3-1)
    removed_pattern = re.compile(
        r".*? removed (?P<name>\w+) \((?P<from>[^ ]+)\)")

    checkpoint = {}
    flag = False

    with open("/var/log/pacman.log") as logfile:
        for line in logfile:
            if date in line:
                flag = True
            elif not flag:
                continue

            match = re.match(upgraded_pattern, line)\
                or re.match(downgraded_pattern, line)\
                or re.match(removed_pattern, line)

            if match:
                package = match.group("name")
                from_rev = match.group("from")
                if package not in checkpoint:
                    checkpoint[package] = from_rev

    print("Checkpoint state:")
    for k in checkpoint.keys():
        print("{0} -> {1}".format(k, checkpoint[k]))

    pkgcache = "/var/cache/pacman/pkg"
    pkgs = os.listdir(pkgcache)
    pkgnames = ["{0}-{1}".format(k, v) for k, v in checkpoint.items()]

    selected_pkgs = [os.path.join(pkgcache, p)
                     for n in pkgnames
                     for p in pkgs
                     if n in p]

    print("\n\nSuggested incantation:\nsudo pacman -U {}"
          .format("\\\n  ".join(selected_pkgs)))

if __name__ == "__main__":
    main(None, sys.argv[1:])

Over and out from my recovered emacs setup 😊


Architecture and Microarchitecture

From a recent paper review I wrote -

So @ztellman is writing a book The Elements of Clojure which would better be titled The Elements of Software. It’s almost entirely about exploring ideas about abstraction and how abstraction and engineering interact. Maybe worth perusing since this paper seems to hit a bunch of related ideas. To crib some of those:

Science is about dealing with timeless things. The lambda calculus has no engineering constraints. It’s a great substrate for proof, and takes some head standing to become an adequate substrate for doing “real” work because doing “real” work is characterized by trying to fit within a finite machine and finite time.

Engineering is about shoveling the constants around to optimize for how many transistors we can fab at what density and how many we can drive / cool at what power without melting the whole thing. That is, it's entirely an exercise in factors which are far from timeless and thus not suitable for long lasting abstractions. Architectures like the Itanium, MIPS and TTAs you allude to are the final form of this kind of thinking. It may be momentarily convenient, say for performance, to expose, as in the MIPS, a pipeline without interlocks or the full physical register set, but as the paper conveys well, this constitutes a fragile API which will prove limiting in the future when the engineering constraints which motivated those decisions change.

The bit you don't seem to hit on yet is what architecture is, and how we do it. I suggest, still cribbing from Zach, that architecture both in software and in hardware is all about trying to find the perfect layer of indirection between the engineering constraints of the present (time, memory, chip area, heat, etc.) and the universals/existentials that represent a stable target. That is, you’re trying to find what set of cost trade-offs you can presume are constant-ish for as long as you hope your design will be relevant. Will we still be monocore and care about single cache density? Will the NUMA gap continue to increase? Will we continue to be able to rely on speculation for performance wins, or will we switch to more communication based models of computation where privilege level context switches must become more expensive and communication must be efficient?

To cherry-pick your CAR / CDR example of an architecture feature which has lasted far too long: while the names are absurd and now long irrelevant, the notion of the first/next pair has proven to be a general, timeless representation for arbitrary graph structures. It has been repeated time and time again, silly names included, even in the face of memory architecture [1] changes which render the original conception less and less appropriate.

Straight stack, thunk or SECD VMs follow hot on the heels of really trying not to assume anything. Maybe that's fine. Maybe as you hint at the end, an abstract register machine with some support for generic operations in the style of the Cray-1 which may be specialized or implemented with accelerators later is appropriate.

The central idea that this paper seems to be toeing is that as implementers with our eyes on what we’ve built our natural tendency when “specifying” is to write down the specific thing we’ve envisioned. From a cognitive laziness perspective it’s the natural thing to do. Taking a step back and realizing what’s essential versus what’s incidental is really goddamn difficult because their (our) job is to have our heads so far down in the microarchitecture that we’re counting clock cycles and cache lines in our heads before we even have a simulator. This sort of clock cycle counting - engineering and the failure of architecture - is what leads to the microarchitecture-as-architecture trap sketched above.

Dunno how much of this is useful without turning it into a philosophy paper, but it does seem like there’s plenty of room to hit on these issues a little more directly and be concrete about the possible advantages of abstract structures like the admittedly absurd x86 floating point stack unit.

牛: How static is enough?

Ox has been and will continue to be a long-haul project in part because it's a dumping ground of ideas for me, and in part because I don't desperately need it. There's plenty of bad in the software industry, but I'm able to swing the hammers that already exist. Ox isn't so much a hammer as it is an exercise in the ongoing search for what a better hammer would be.

Part of this search comes from my judgments of the tools I use on the daily and their failures. The most significant part however is how the decisions I've made in what prototypes I have built have shaken out.

It would have been easy enough to crank out an RⁿRS interpreter and call it Ox. It just wouldn't have added anything to the software ecosystem or solved any problems I actually had. In fact most of the problems I've tried in one way or another to approach in Ox are directed by my programming experience.

So let's start with the motivation and what I was trying to get at in the last cut I took at Ox.

Clojure is a nice language which does a good job at trying to provide sane defaults. Its data structures are immutable by default, it features destructuring - macro sugar for binding values in structures to locals, it interoperates nicely with the JVM and it has a namespace system which makes code organization fairly predictable.

Because Clojure uses a single pass compiler and its compilation unit is a single form rather than a file/module or group thereof, it can be difficult to write Clojure whose module structure reflects its logical structure. Cyclic dependencies between modules/namespaces are difficult to achieve and have poor ergonomics. In fact, cyclic dependencies within a single namespace are - if not difficult to achieve - then at least require an effort which tends to prohibit their use.

This has a couple of consequences, well exemplified by the language implementation itself. The namespace clojure.core is a huge namespace containing many groups of related functions which would do just as well if not better in their own namespaces. Big namespaces which don't have a unifying concept are difficult to approach as a new user and difficult to document. Furthermore, they create artificial contention over the "good" names, and tend to lead to the use of prefixes.

Perhaps most interesting are the consequences the lack of circular dependency support has for the language's own implementation. The first several thousand lines of clojure.core serve to bootstrap the language, using mainly JVM inter-operation with Java interfaces and the Java implementation of Clojure's various built-ins. Some of this bootstrap code is unavoidable, but parts of it could be avoided if the implementation were able to make forward use of symbols which aren't defined yet.

With explicit forward declarations, in some cases this could be made possible today within a single namespace such as clojure.core. The most interesting use cases of circular dependencies however require circular dependencies across module boundaries. For instance, what if we could have a clojure.core.seqs namespace containing all the operations related to the sequence abstraction, and so forth? A large number of small namespaces would be more modular and easier to break out of the standard library into library packages where possible.

And after all, Ox is a pie in the sky project so what would it take?

Well - circular dependencies on functions are pretty easy. They don't have a data dependency on each-other at compile time; okay they do but letrec is easy to write and rewriting modules as letrec to support mutual function recursion is well understood and a traditional Scheme trick.

For instance if I had the nonsensical module

(define (f a b)
  (c a))

(define (g a b)
  (c b))

(define (c x)
  (+ x 1))

This could be implemented instead by re-writing it to the letrec form

(letrec ((f (lambda (a b) (c a)))
         (g (lambda (a b) (c b)))
         (c (lambda (x) (+ x 1))))
  ;; ... the rewritten module's body/exports go here, eg.
  (list f g c))

Implementing circular dependencies Scheme style via letrec works because the top level special forms (define etc.) are well understood and can be re-written into letrec bindings as above.

The idea I had at the time was that REALLY the core problem here is that the evaluator can't figure out the dependency graph between functions and macros because we have an indeterminable number of un-expanded macros. A solution to this problem is to simply make the dependency graph statically visible.

In Scheme, the above letrec conversion is trivial because the top level form define is well understood by the implementation and so can be rewritten into a form where circular references will function as desired. The take on this in #29 is to do so by restricting top level forms to a small set of forms which explicitly state what definitions they provide. For instance def, defn, deftype and so forth all explicitly name the definitions they create.

(ns example)

;; This provides the definition example/foo. `def` is well known, so
;; this form can be understood as a definition at read time without
;; any effort to evaluate the body or docstring just like `define`.
(def foo
  "Some docs"
  1) ;; any value will do here

;; Same as above, `defn` is well known and can be statically parsed
;; at least to the degree we need to understand that this defines
;; the symbol example/qux for dependency analysis
(defn qux
  "Some docs"
  ([x] "arity 1")
  ([x y] "arity 2")
  ([x y z & more] "arity 3+"))

;; `defmacro` is just `defn` with the macro flag and some tweaks to
;; accept implicit arguments like `&form` and `&env`.
(defmacro deftype
  ;; ...
  )

;; deftype, definterface and friends all nicely fit this pattern to
;; the extent one may choose to implement them as special forms.

This all works pretty well. The problem is what happens when a macro occurs at the top level? Doing macros this way doesn't work, especially when there are sequential dependencies between the macros and the functions in the module. While the macros can be re-written as letrec bindings, the semantics of macro expansion don't fit that way because the macros have to be expanded before the functions (or the other macros!) using them become executable.

Consider something like this...

(deftype Foo "A foo record." [a b c])
;; We now have the functions `->Foo`, `Foo?` and some friends

(defn foo-or-bar [a]
  (or (Foo? a)   ;; Where does `Foo?` come from? Nothing at the top
                 ;; level trivially provides `Foo?`, but we have to
                 ;; find it in order to compile `foo-or-bar`.
      (Bar? a)))

I thought about this for a long time because I think there's a clear case for supporting mutually recursive definitions with the most generality possible. And the first solution I came up with is the one implemented in ox-lang/ox#29.

By defining a new top-level special form providing, users can just annotate their top level macros with the symbols they're expected to produce! Problem solved! This lets Ox determine what top level forms need to be evaluated to see if they provide macros and perform lazy loading by doing functionally free dependency analysis before any evaluation.

So for instance

(providing [->Foo, Foo?]
  (deftype Foo "A foo record." [a b c]))

(defn foo-or-bar [a]
  (or (Foo? a) ;; The reader understands `providing` and can determine
               ;; that to get the needed `Foo?` definition the
               ;; previous form needs to be evaluated.
      (Bar? a)))

The problem with this approach - and the reason I call it a mistake - is that it greatly complicates load time and neuters definition writing macros. As noted earlier, there's plenty of good reason to want to write macros that write definitions. However a good bit of the syntactic power of such macros is lost if you have to provide a list by hand of definitions the macros will create for you.

Frankly, this made implementing code loading so complicated that, what with trying to bootstrap Ox in Java, I couldn't do it. This decision proved absolutely fatal to that particular run at the Ox project.

There are, now that I've seen the failures of this approach and had some hammock time, other ways to achieve this same end. For instance, if you don't allow macros to write macros, you can play games where macro definitions are letrec rewritten as above, and the definitions of "normal" functions or expressions are delayed until forced by a macro expansion. This lets functions and macros mutually recur so long as you can ground out a macroexpansion to some concrete form or function of forms that can be evaluated without unbounded mutual recursion between functions and macros. Which, honestly, is the best I think you can do.
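
To make that concrete, here's a toy sketch in Clojure of the "delay definitions, force them on demand" idea. None of these names are Ox's; it's just an illustration of the loading strategy under discussion.

(def definitions
  "Map of symbol -> delay wrapping the not-yet-evaluated definition."
  (atom {}))

(defn declare-def!
  "Register a definition without evaluating it."
  [sym thunk]
  (swap! definitions assoc sym (delay (thunk))))

(defn resolve-def
  "Force a definition's evaluation the first time something - say a
  macroexpansion - actually needs it."
  [sym]
  (if-let [d (get @definitions sym)]
    (force d)
    (throw (ex-info "Undefined symbol" {:symbol sym}))))

;; Usage: `Foo?` is registered lazily and only evaluated when the expansion
;; of some macro reaches for it.
(declare-def! 'Foo? (fn [] (fn [x] (map? x))))
((resolve-def 'Foo?) {:a 1}) ;; => true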

The advantage of Clojure's single pass approach here is a huge complexity win which avoids all these considerations, and the fact that the Clojure read/evaluate/print loop has exactly the same structure as code loading.

Anyway. Languages are hard. That's what makes them fun.

More when I've got better ideas.


Intermediate Abstraction

This talk was presented at the Bay Area Clojure meetup, hosted by Funding Circle. These notes are presented here with thanks to the organizers and attendees.

1 Preface

Hey folks! For those of you who don't know me, I'm Reid McKenzie. I've been writing Clojure on and off for about four years now. These days I work for Twitter as a Site Reliability Engineer.

Twitter Site Reliability is really small. We're not even 150 engineers tasked with ensuring that all the infrastructure and services required to support the business stay online. It's clear that as the business continues to grow, we in SRE can't continue to be traditional reactive system operators. We must be software engineers developing automation and designing systems capable of intervention-less recovery.

Achieving quality in software engineering means we must develop an understanding of how to exercise our powers of abstraction, and that we appreciate and manage the complexity with which we are surrounded. This talk presents some formal and philosophical concepts intended to elevate the ways we approach problem solving, which I hope will be useful to you.

2 Context

Programming is a young profession all things told.

If we trace the roots of our profession to Lovelace - as she is the first to have written of a machine which could be used for general purposes and perhaps program or modify itself - then our profession, such as it is, dates to the 1830s. Just shy of the two century mark.

Date     Who              Event
1928     Hilbert          Entscheidungsproblem
1933     Gödel, Herbrand  μ-recursive functions (simple arithmetic)
1936     Church           λ calculus, models Entscheidungsproblem with reduction
1936     Turing           Turing machine, models Entscheidungsproblem with halting
1941     Zuse             First programmable, general purpose computer
1945-46  Von Neumann      Von Neumann single memory architecture
1950     USGOV, Banks     Computers become commonplace for census & data processing
1956     Bell Labs        Transistors are invented
1970     IBM, Commodore   First transistorized PCs come to the market
1976     Wozniak          Apple I
2017     Us, here, today  Floundering variously

Computation as a science is grounded in the concepts of evaluation and analysis. The science is in the theoretical limits of what a machine could do. The question of how to achieve something on a given machine is the province of software engineering.

Software engineering is a younger discipline still. We've only really been writing software since the '50s (Excepting Lovelace's programs which never had a machine), so call it 60 years of recognizable programming during the course of which huge shifts have occurred in the industry.

In retrospect one can only wonder that those first machines worked at all, at least sometimes. The overwhelming problem was to get and keep the machine in working order.

– Dijkstra "The Humble Programmer" EWD340 (1972)

The problem with software engineering is, well, for the most part we never get to do it. The profession of programming as adopted in industry is rooted in the now seemingly long tradition of just getting the machine to work at all and then "debugging" it, a tradition which has carried since the first programs in the '50s. The operations tradition from which SRE programs are usually developed is particularly guilty of this.

Rather than have a concrete body of science to reference and with which to inform decisions, every working programmer develops an aesthetic taste for what seems to them like the right way to solve problems; what kept the computer or systems of computers in working order and solved the problem best. The problem with taste is that it is deeply subjective. It reflects our personal experiences in education, what javascript frameworks we've played with on the side and how we did it at $LASTJOB. When tastes conflict, they can't inform each-other except with great difficulty because there's no underlying theory which can be used to relate and moderate them.

The question is how do we earn the E in our titles. And the answer is not to continue artistically operating the systems we have or continuing to do what seems to work. We must develop and communicate formal understandings for the tools and tactics we use.

3 Complexity

Simple; Simplex; To have but a single strand, to be one.

Complex; To be woven from several strands.

Complect (archaic); To plait together.

Easy; from the French aisie, meaning comfortable and tranquil. Not difficult, requiring no great labor or effort.

– Rich Hickey "Simple Made Easy" (2012)

Often, we misuse simple to mean easy.

Ideally, we want to write code which is both simple and easy. This requires that we be conscious both of complexity and of ergonomics. However, we only rarely have objective criteria for these two critical dimensions.

The fact that a perfectly reasonable engineer may consider a tool sufficient and easy while another may just as reasonably consider the same tool baroque and overbearing is a core motivator for this talk. Sharing an objective metric of simplicity clarifies otherwise subjective questions and enables us to agree upon measurements of solution quality.

Thankfully there is prior art on the subject of attempting to estimate complexity. Karl Popper's Logic of Scientific Discovery builds a framework of formal logic with which to formalize the scientific process itself.

Popper bases his project upon falsifiability and corroboration through experiment. A hypothesis for Popper is a formal statement which is falsifiable because it implies testable outcomes. An experiment cannot confirm a hypothesis, because to do so would imply that no other experiment (state of the entire universe) could possibly falsify the hypothesis. However, we can fail to falsify the claims of the hypothesis, and we can even show that the experiment produced a state which corroborates the prediction(s) of the hypothesis.

Popper proposes that the complexity of a theory can be measured by reasoning about the set of states of the universe which would falsify the theory. These sets can be related to each other and relative complexity measured by establishing implication and subset relations, but this is far from enough to be a really working rule of measurement. Comparing infinities (sets of states of the universe) is hardly intuitive.

The good news is that there are ways we can do this!

Cyclomatic complexity (McCabe) measures the number of control paths through a program. This approach works, and even gives an obvious heuristic for recognizing complexity in the simple number of branches, but it is uni-dimensional and doesn't capture the fact that our programs have and manipulate state.

Dataflow complexity is a related family of metrics which try to measure the possibility space of the data (state) in a program. Unfortunately, beyond giving an intuition for the idea that complexity grows with the memory footprint of a program there's not an efficient way to quantify or relate this kind of complexity.

We get into this mess of having to count states in order to talk about complexity because both of these complexity estimators use the model of a Von Neumann finite state machine whose states cannot easily be related to each other.

If we go back to Popper, if we have two logical reduction statements P and Q, we can estimate their relative complexity by simply comparing the number of degrees of freedom of each expression. Likewise using the μ calculus (simple arithmetic and reduction) or the λ calculus (function application and reduction) we instantly regain the property that we can understand the complexity of an expression or program merely by looking at the number of terms it involves and counting them.

This intuition is supported by cyclomatic complexity and dataflow complexity metrics because (for programs which halt under reduction) the complexity of a program written with reduction is, well, one. Each datum and program point is unique and occurs only once.

(λx.xx) (λx.xx)

→ (λx.xx) (λx.xx)

Note that this must be an estimate, not a precise metric, as a divergent combinator such as ω (above) will mess this whole thing right up. But per the halting problem there's not much we can do here, so if we accept (as we must) that this is an incomplete analysis and restrict ourselves to the domain of terminating expressions, we can still get a lot of utility out of this rule of thumb.

Now, we don't write purely functional programs, and when we do write programs which use functional tools and abstractions there's still global state lying around because we compute on Turing machines not graph reduction engines (those are really hard to build). How do we estimate complexity for real programs then?

It has been suggested that there is some kind of law of nature telling us that the amount of intellectual effort needed grows with the square of program length. But, thank goodness, no one has been able to prove this law. And this is because it need not be true.

– EWD, "The Humble Programmer"

There is room for reasonable disagreement here, but I'd propose a very simple heuristic for estimating complexity: the product of the following properties of a program or program segment.

  1. 1 + Number of input parameters
  2. 1 + Number of branches
  3. 1 + Number of back-edges
  4. 1 + Amount of REACHED CAPTURED state which is accessed (maybe 2x cost of a parameter)
  5. 1 + Amount of REACHED CAPTURED state which is modified (maybe more than 2x the cost of a parameter)

(defn map [f coll]
  (when coll
    (cons (f (first coll))
          (map f (rest coll)))))

That is, a function which accepts more parameters and takes many branches using them is more complex than a function which accepts a function and a few parameters as parameters and merely delegates to the argument function. The real branching behavior of such an applied higher order function may be complex, but the function itself is simple. It captures little state and does little directly.

(defn log*
  "Attempts to log a message, either directly or via an agent; does not check if
  the level is enabled.
  For performance reasons, an agent will only be used when invoked within a
  running transaction, and only for logging levels specified by
  *tx-agent-levels*. This allows those entries to only be written once the
  transaction commits, and are discarded if it is retried or aborted.  As
  corollary, other levels (e.g., :debug, :error) will be written even from
  failed transactions though at the cost of repeat messages during retries.
  One can override the above by setting *force* to :direct or :agent; all
  subsequent writes will be direct or via an agent, respectively."
  [logger level throwable message]
  (if (case *force*
        :agent  true
        :direct false
        (and (clojure.lang.LockingTransaction/isRunning)
             (*tx-agent-levels* level)))
    (send-off *logging-agent*
              (fn [_#] (impl/write! logger level throwable message)))
    (impl/write! logger level throwable message)))


(declare ^:dynamic *logger-factory*)


(defmacro log
  "Evaluates and logs a message only if the specified level is enabled. See log*
  for more details."
  ([logger-factory logger-ns level throwable message]
   `(let [logger# (impl/get-logger ~logger-factory ~logger-ns)]
      (if (impl/enabled? logger# ~level)
        (log* logger# ~level ~throwable ~message)))))

An expression which does little, but does so using values which it draws from global state such as a configuration value, may still be simple, but it is not so simple as a function which accepts that same configuration structure directly as a parameter. This complexity becomes evident when testing: a test for the simple function merely requires calling it with the appropriate configuration value, whereas testing the globally configured function requires manipulating the state of the entire application so that the shared mutable configuration state conforms to the test's requirements.

We can see that the log macro is actually quite complex because it reaches a significant amount of global state, despite the fact that the log macro itself doesn't actually do that much. The log macro has to check a global mutable registry mapping namespaces to logger implementations for a logger, check whether that logger is enabled, and then do the logging, potentially manufacturing a new logger using the factory from global state if it has to.
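
To put rough numbers on the two with the heuristic above - the counts here are judgment calls rather than gospel, but the ratio is the point:

;; map: two parameters, one branch (the when), one back-edge (the recursive
;; call), and no captured state read or written.
(* (+ 1 2) (+ 1 1) (+ 1 1) 1 1)
;; => 12

;; log*: four parameters, the case plus the if for branches, no back-edges,
;; four pieces of reached global state (*force*, *tx-agent-levels*,
;; *logging-agent*, the transaction status) and one modification (the send-off).
(* (+ 1 4) (+ 1 4) 1 (+ 1 4) (+ 1 1))
;; => 250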

(defn mongo!
  "Creates a Mongo object and sets the default database.
      Does not support replica sets, and will be deprecated in future
      releases.  Please use 'make-connection' in combination with
      'with-mongo' or 'set-connection!' instead.
       Keyword arguments include:
       :host -> defaults to localhost
       :port -> defaults to 27017
       :db   -> defaults to nil (you'll have to set it anyway, might as well do it now.)"
  {:arglists '([:db ? :host "localhost" :port 27017])}
  [& {:keys [db host port]
      :or   {db nil host "localhost" port 27017}}]
  (set-connection! (make-connection db :host host :port port)))

A function which changes state may be complex compared to a function which produces no effects and returns a value, but it introduces far more whole program complexity if global state is modified, especially if it is modified many times.

This supports the intuition that sorting a list received as an argument in place perhaps doesn't add so much whole program complexity, because the effect is restricted in scope to the caller and wherever the list to be modified was sourced from, whereas reading and manipulating the errno global value in a C program may directly impact any number of other program behaviors. Such modification defeats referential transparency, and forces us to use sequential logics at the level of the whole program to achieve understanding.

We use multiplication rather than addition to reflect that really what we're trying to approximate is the VOLUME of a multi-dimensional state space, which is measured by the product of the lengths of the sides, not their sum.

With apologies to Dijkstra, I note that it is quite possible for a program's complexity by this metric alone to grow at least in proportion to the square of the program's length. However there's also still plenty of room at the bottom for simple programs to achieve near-linear complexity growth.

4 Abstraction

We all know that the only mental tool by means of which a very finite piece of reasoning can cover a myriad cases is called “abstraction”; as a result the effective exploitation of his powers of abstraction must be regarded as one of the most vital activities of a competent programmer. In this connection it might be worth-while to point out that the purpose of abstracting is not to be vague, but to create a new semantic level in which one can be absolutely precise.

– EWD, "The Humble Programmer"

What is an abstraction?

Abstractions can be considered to consist of a model, interface and an environment.

The model of an abstraction is the set of operational semantics which it exposes to a user. For instance a queue or channel as an abstraction provides the model that a producer writes to it and a consumer reads from it. Reads may be in some order such as FIFO or LIFO, or un-ordered with respect to writes. A bounded queue may provide the additional semantics that, if the writer outpaces the reader, it can only get so far ahead before the queue put operation blocks the writer. A Communicating Sequential Processes (CSP) queue provides the operational semantics that when a write occurs the reader is awakened, and so only the reader or the writer is ever operating at once.

The interface is the set of operations which the abstraction provides to the user. For instance the channel can have elements enqueued and dequeued. Perhaps the queue can also be closed at one end, denoting that either the source or the destination is finished whereupon the other end will eventually (or immediately) also close.
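
As a sketch, the interface of that queue might be written down as a Clojure protocol. The names here are illustrative rather than from any particular library; the model lives in the docstrings, and the environment in everything the docstrings leave unsaid.

(defprotocol IQueue
  (put!   [q x] "Enqueue x. In a bounded queue this may block the writer.")
  (take!  [q]   "Dequeue the next element per the queue's ordering model.")
  (close! [q]   "Close this end; the other end eventually observes the close."))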

Abstractions don't exist in a vacuum. They have expectations which they may demand of the outside world (externalities) such as restrictions on how they are used or in what context they are employed. Everything that the model omits is ultimately an assumption about the environment.

It may be helpful to note that the environment is the dual of the model. Anything that is not part of the model must be part of the environment, and by growing the model we can shrink dependency on the environment.

This is deeply significant, in that mismatch between environments and reality results in friction for the user, and tends to be resolved not by abandoning the abstraction but by humans adapting to the restrictions imposed upon them by inadequate tools.

Abstractions may be compared - one said to be simpler than the other - on the basis of the size of the model, interface and expectations imposed on the environment. The model after all is a space of states, the interface a set of functions above, and externalities may be either a set of propositions or a space of states as the two are equivalent.

The ergonomics of abstractions may also be compared - one said to be more consistent than the other - on the basis of the consistency of the names and structure of the abstraction's interface with each-other and predictability of the relationship between the interface, the model and its externalities.

4.1 Examples of Abstraction

Per the Dijkstra quote above, abstraction is all about changing or choosing semantics. I'll give a treatment of linguistic abstraction shortly, but by naming things we can build semantic levels - languages - which convey what we want.

We can abstract over control flow by using higher order functions which provide, for instance, conditional execution - giving us a tool with which we can talk about conditionally applying a transformation.

We can abstract over values by choosing types or decomposing values into their components, using aggregate logical values and references which enable us to refer to parts of our data.

However it is not abstractive to perform computation. When we "compute" (reduce) an expression, we discard information irrecoverably. For instance an average is not meaningfully an abstraction over a time series. It is a datom, the product of some other irrecoverable data under a function. By computing this new value, we've given ourselves an aggregate view which no longer carries the same semantics and cannot meaningfully be considered to be equivalent to its source.
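
A two-line sketch of that information loss - distinct series collapse onto the same average, so there is no way back from the datom to the data:

(defn average [xs]
  (/ (reduce + xs) (count xs)))

(average [1 3]) ;; => 2
(average [2 2]) ;; => 2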

4.2 Limits of Abstractions

Unfortunately, abstractions do have limits.

Traditional types which you may be familiar with, your int and char, are really just raw presentations of what the physical machine can do with a given value. Rarely however do these presentations of machine capabilities map to the senses which are meaningful to us. They are the poorest possible of abstractions - none at all.

#include <stdio.h>
#include <stdint.h>

#define EVAL(...) printf("%s: %d\n", #__VA_ARGS__, __VA_ARGS__)

int main(int argc, char** argv) {
  EVAL((int32_t)(1<<31)-1);
  EVAL((int32_t)(~(1<<31)));
  EVAL((int32_t)(1<<31));
  return 0;
}

Which prints:

(int32_t)(1<<31)-1: 2147483647
(int32_t)(~(1<<31)): 2147483647
(int32_t)(1<<31): -2147483648

How many times in Java have you actually wanted precisely a value with 2³² states, of which 2³¹-1 states are said to represent positive numbers, 2³¹ states represent negative numbers and the zeroed bit string represents … zero. And arithmetic wraps around, flipping the sign should we exceed the bounds of [-2³¹, …, +2³¹-1]. We also know that while we have a vector of bits … somewhere and one of them is a sign, and we can count on the 32nd bit being the sign, we can't actually make endianness assumptions about the bytes in our integer(s). So much for bitmasking tricks.

This abstraction of an int type provides addition, subtraction, multiplication, division, absolute value and of course comparison. All of which we associate with the concept of an integer. Furthermore this abstraction supplies, by specifying the model of the machine representation, all the operations we customarily associate with a vector of bits - and, or, xor, shifting, inversion.

If all we want is addition and comparison, the rest of this behavior is not just semantic baggage, it adds complexity. What if I want to count to 2³¹? I have to be aware that this abstraction's model says I'll get -2³¹, not the value I expect. Or perhaps an out of band signal that an IntegerOverflowException occurred, which constitutes another cross-cutting concern because now the abstraction implies a system of exceptions with which I must be familiar.

For a large domain of problems, this abstraction does well enough. However we must be constantly aware of its failure modes. We as professionals bear a constant cognitive cost in conforming ourselves to the limits which this abstraction presents to us.

It should be recognized here that Haskell, Lisp(s) and other languages which provide arbitrary precision integer types out of the box capture a simpler programming model by default. A model which has more implementation complexity to be sure, and consequently expects more of the environment - for instance the presence of a memory manager - but the interface is smaller and less thorny.
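
For what it's worth, Clojure exposes this trade-off directly at the REPL: the primed arithmetic operators promote to arbitrary precision, the default operators throw on overflow, and the unchecked operators give you the machine's wrap-around behavior.

(+' Long/MAX_VALUE 1)            ;; => 9223372036854775808N  (promotes to BigInt)
(+  Long/MAX_VALUE 1)            ;; => ArithmeticException: integer overflow
(unchecked-add Long/MAX_VALUE 1) ;; => -9223372036854775808  (the machine's wrap-around)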

4.3 Inheritance, Composition & Re-use

Software quality is defined first by solving the business needs of the here and now. If software or strategy or any other tool doesn't do that, it's worthless. Software is a form of physical capital in the same sense as a steam engine or a factory. A factory which can be re-tooled to produce articles of a different design is more valuable than a factory which will require full replacement should the product change. Likewise an engine or assembly line which must be significantly reworked in order to add capacity or change task is less valuable than a design which naturally features a pattern along which it can be changed without significant alteration.

Object Oriented Programming (that is, programming via encapsulation and inheritance) promised that it would improve among other things software reusability when it broke into the scene.

The problem with inheritance as a strategy in practice is that any tool which wants to participate in some abstraction is forced to conform to the expectations of the interface established by the parent which the child wishes to participate in. The interface must accumulate.

Furthermore, reuse through inheritance has the problem of capturing mutable state. An extension or wrapper class must be intimately familiar with the state of the class it wraps and the expectations it may make. The model accumulates.

In "Simple Made Easy" Rich gave a good treatment of this, pointing out that code partitioning doesn't imply decoupling and pointing to Interface Oriented Programming is the response to this problem, because interfaces only bring with them their direct set of expectations.

To be precise about what's happening here - the model of inheritance oriented programming forces us to accrete interfaces and externalities. Thankfully this is not an essential property of abstraction, merely a property of this particular technique.

(defn mapcat [f & colls]
  [f & colls]
  (apply concat (apply map f colls)))

When we take two abstractions and put them together, we say that we've composed them. For instance two functions compose together to make a third. The map function composes with the concatenate function to give us mapcat. map lets us apply a function over many values, and concat allows us to join sequences together. We can thus define mapcat to be a function which lets us join the results of applying a function which produces sequences to many values. The composite model of mapcat requires only the additional constraint that f: a → [b], that is the function to be mapped produces results which can be concatenated.

(def evens #(filter even? %1))

We can build abstractions which have smaller interfaces. For instance if we take a function of a configuration and a datom and we partially apply it with a configuration, we now have a fixed function of a datom. Its model is smaller - we know how it will behave because we've specified whatever configuration(s) the larger base model depends on and the interface is smaller - it consists only of the datom to manipulate.
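
A sketch of that configuration-and-datom shape - render and its config key are made-up names, the point is just what partial application does to the interface:

(defn render
  "Render a datom according to the given configuration."
  [config datom]
  (str (:prefix config) (name datom)))

(def render-with-defaults
  ;; Fixing the configuration shrinks both the model and the interface;
  ;; callers now supply only the datom.
  (partial render {:prefix "item-"}))

(render-with-defaults :foo) ;; => "item-foo"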

Building a small wrapper around a large Java API would be a good example of such a composite abstraction with a smaller interface.

We can also build composite abstractions which traverse the model and externality trade-off. For instance we can build abstractions which simplify the set of externalities by providing input or state validation where previously we had unchecked expectations. This expands the model, since now the expectations are explicitly modeled and checked. We can also reduce the complexity of the model by choosing to make assumptions.
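
A small sketch of that trade-off. The connect function and its keys are hypothetical; the preconditions show an expectation moving out of the environment and into the checked model:

(defn connect
  "Open a connection to the given host and port.
  The preconditions turn what were previously unchecked expectations of the
  caller into part of the checked model."
  [{:keys [host port] :as config}]
  {:pre [(string? host) (pos-int? port)]}
  [host port])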

4.4 Naming

Credit: Zachary Tellman, Elements of Clojure (unreleased).

Linguistic abstraction in the form of naming is the most fundamental tool available to us as both artisans and engineers.

Names being the tools we use to understand our tools, choosing good names (whatever that means) is of primary importance. But what characterizes a "good" name?

Frege suggests that a name consists of three properties - the sign, being the textual representation of the name; the referent, being the entity referred to by the sign; and finally the sense, being the properties ascribed to the referent.

The traditional example is that Phosphorus (the morning star) and Hesperus (the evening star) are both Greek names for celestial bodies which we now understand to name the planet Venus. The sign and referent are, according to Frege, insufficient to understand these two terms, which are prima facie synonyms and yet don't actually interchange.

A good name must satisfy two properties - it must be narrow and consistent.

A narrow name excludes things it cannot represent; its sense is only as broad as the context of its use requires. For instance if the only expectation to be made of a reference is that it will be a mapping, then map or its contraction m is a perfectly acceptable name if that is the only sense in which the referent is used.

A consistent name shares the same sense for a referent as is used for that referent in other related contexts. To continue the example of a mapping, to subsequently sign the same referent as dict elsewhere would be less than ideal, because dict implies a specific kind of mapping and thus no longer shares the same sense. A simple mapping m may only imply clojure.lang.ILookup, whereas hashmap implies java.util.HashMap and side-effectful update. Naming and naming conventions are tremendously important in the context of dynamically checked systems, wherein names largely depend on their sense to communicate the type, or at least the intent of the use, of the referent.

Let's consider some examples (adapted from Zach's fine book)

;; This is a Very Bad expression.
;; The left referent is very broad, when we clearly use it in a narrow sense. It
;; should be more narrowly named.
;; The value we get out of our broad referent is named very narrowly. The single
;; string contains far too much sense.
(get m "sol-jupiter-callisto")

;; This is a better equivalent expression, because we communicate a more
;; appropriate sense for our previously over-broad sign, and we unpacked the
;; previously overly precise key into a set of more general structures each of
;; which communicates less sense.
(get-in starsystem ["jupiter" "callisto"])

;; Better yet capture the logical operation, the fact that Callisto is a moon of
;; Jupiter is an assumption in all of the above expressions.
(get-body-by-name system "callisto")

;; It would also be better to be completely precise about the sense of the name
;; Callisto by doing something like
(-> (get-system "sol")     ; This communicates that we want to get a "system" named sol
    (get-body "callisto")) ; And we care about the component "body" named callisto

;; This expression further communicates the sense that Callisto is a moon of Jupiter.
(-> (get-system "sol")
    (get-body "jupiter")
    (get-moon "callisto"))

;; Both of these two latter expressions are better because more of the relevant
;; sense of the name callisto is captured explicitly, rather than being implied
;; upon or expected of the seemingly broad mapping referent.

4.5 Achieving Abstraction

As suggested in the discussion of the consistency of signs, part of the problem here is that we're forced by the tools we use to project our sense directly upon the sign.

Types: A heresy

                   static sense        dynamic sense
static dispatch    Haskell, C, Java    Java (class casting)
dynamic dispatch   C (dyn linking)     Smalltalk, Python, Ruby

Static type systems provide a model with which we can talk about the sense we associate with a name, by capturing at least some of the sense in which we intend to use the referent as a type. If we can determine a type (and the type must conform to a type of which we knew ahead of time), then we can begin to understand the behavior of our systems through the lens of the types we've assigned to our data.

Some systems allow us to use this information statically to check that at least some part of the sense or type is consistently respected in the use of a referent.

Some systems enable you to capture more sense than others, by encoding the sense statically in what we usually call a type.

Type systems can be compared in terms of how much sense they enable you to capture, and how difficult it is to do so.

Let's say that we wanted to count up from zero, to continue the example of positive integers from above. We could choose a sign with which to reference our value which indicates that it's an index, such as the traditional i, or idx or even index depending on individual taste. But sometimes we want to play clever tricks with the fact that sometimes negative indexes are defined, or we want to be tricky and reach back to some location "before" where we're currently indexed in memory, and so forth. The choice of sign implies only sense, it cannot constrain the referent.

However, we can choose abstractions which let us do so. For instance, we could build an abstraction which only allows us to capture ℕ, the natural or counting numbers.

from collections import namedtuple
from functools import total_ordering

@total_ordering
class N(namedtuple("N", ["value"])):
  """A numeric value constricted to N (the counting numbers)."""

  def __new__(cls, value):
    if value < 0 or isinstance(value, float):
      raise ValueError("N is constrained to [0, 1, ...)")
    return super(N, cls).__new__(cls, value)

  def __int__(self):
    return self.value

  def __add__(self, other):
    return N(int(self) + int(other))

  def __sub__(self, other):
    return N(int(self) - int(other))

  def __eq__(self, other):
    return int(self) == int(other)

  def __lt__(self, other):
    return int(self) < int(other)

This isn't so different from the full Haskell implementation of Numeric.Positive.

There are techniques which can be used to try and capture more sense. One such technique is best known as "smart constructors". Rather than restrict the sense to the sign, we've defined a type which can only take on values conforming to our desired sense.

The additional semantic restrictions we can impose on ourselves via the type free us from either having to check the sense ourselves over and over again, which is the most verbose and fragile path, or from settling for the integer sense we get "for free" and hoping that it's enough.

This tactic can be applied to any domain where we want to give a type to a subset of some other space of values. For instance strings of a known format. It is just window dressing, but frequently it's enough that we can approximate the semantics we actually want.

To take another example, what if we were to try and make a type for a Start of Authority (SOA) version? We could use a simple string, after all that's what the SOA becomes when we lay it out in a zone file, and the only requirements we have of an SOA is that it increase to define an ordering among zone file versions.

Generally however, strings (even constrained strings!) should be avoided. Text is the least structured form of data. Parsing is, even with the aid of parser combinators and regular expressions, difficult and error prone. It's far better to use fully realized data representations which can be rendered to strings than to keep data as a string and decode it.

Which carries more semantic meaning - the string "20170706001", or the tuple

from collections import namedtuple
from scanf import scanf  # 3rdparty

class SOA(namedtuple("SOA", ["year", "month", "day", "version"])):
  """A tuple capturing a DNS SOA format."""

  PATTERN = "%04d%02d%02d%03d"

  def __new__(cls, text):
    """Decode a SOA string, returning a SOA tuple"""
    return super(SOA, cls).__new__(cls, *scanf(cls.PATTERN, text))

  def __str__(self):
    """Encode this SOA as a string"""
    return self.PATTERN % self  # self is an appropriate 4-tuple already

print(SOA("20170706001"))
# SOA(year=2017, month=7, day=6, version=1)

The tuple can be rendered to and parsed from text as required, and above all it captures the sense in which we use SOAs, which leaving the SOA as a plain string does not. If strings are just passed through and not manipulated, then perhaps it's acceptable to keep them represented as strings and skip decoding, but structures like this which capture more of the sense of our data should be preferred.

Unfortunately there's often lots of other sense we may care about - for instance that we have a number within the index bounds of some structure, or that there exists a record in the database with an id primary key of this value for any possible value. Checking these senses statically may well be impossible. There could be a concurrent change to the database such that a once valid identifier becomes dangling, and so forth.

Static types in the Haskell sense of things help: they allow you to capture some of the relevant sense - enough in my experience that software development may be easier because the type system helps you, especially when you need to change abstractions - but thanks to the incompleteness theorem we know all too well that they can never check every property we may care about.

4.6 Utility Is Contextual


It's tempting to think that we can, through great skill or sheer luck, discover perfect abstractions; Platonic ideals or Martian crystalline forms of a concept which satisfy all our needs for all time. The problem with this view is that different contexts have different needs. Many objects may cast the same shadow on the wall of Plato's cave.

For instance above I attacked the "stock" numeric abstractions which include hardware implementation details, suggesting that arbitrary precision math is a better default. I stand by that argument, but it must be recognized that there are occasions when we must make different tradeoffs in choosing our models.

Sometimes we need to interface with the machine. Sometimes we want to use (or author) bit vectors. Sometimes we don't have a memory manager and can't afford truly arbitrary precision arithmetic. Sometimes the performance degradation of operating on a double precision floating point number as opposed to a single or half width one is meaningful, and we must choose to increase complexity and sacrifice ergonomics in the name of utility. That some machine type is the shared interface of some other abstraction, or that the externality of the machine's particular wall time performance matters in our context, changes what is appropriate.

Just as we should exercise caution in choosing our names so that we don't choose overly narrow or broad names, so we must exercise caution in choosing our abstractions. The simplest possible abstraction with the fewest possible externalities may well not be the most appropriate to the domain, and worse still a simple abstraction is not likely to lend itself to change. Like other crystals, abstractions shatter or shear rather than bend from their natural shape.

Coupling of abstractions - allowing the tools you choose to share externalities and to expect one another - is dangerous because it means that even a small deviation from the original requirements and context may prove cause for significant change, with rippling consequences for your entire system.

5 Performance

The greatest performance improvement of all is when a system goes from not-working to working.

– John Ousterhout, "My Favorite Sayings"

Performance and efficiency generally are factors in the measurement of how well a tool solves or responds to the needs of the "user", whether they be a person, business or other entity.

A code breaking machine which can decode a daily message in 32 hours isn't sufficient. It can't decode messages fast enough for them to be relevant. Such a machine is not useless, but it doesn't adequately solve the problem set for it. Three such machines together could provide, at a 32 hour delay each, the previous day's message, and perhaps a fourth machine or even pairs of machines could be supplied for reliability to give a more-or-less continuous feed of intelligence - but it would still be yesterday's data, delivered today at great cost.

In this example we have an objective criterion of how much performance is "enough". The cracking machine must sustain a rate of at least 1 msg per 24 hrs, and we have an objective measurement that we're only able to crack at the rate of 1 msg per 32 hrs.

There are two primary approaches to achieving performance where it is required. The first approach is to simply do less work - either by reducing the problem using algorithmic optimization, or by reducing the size of inputs by firing customers or negotiating system scope. When optimization must occur, algorithmic or organizational optimization should be by far preferred in the interest of overall program simplicity with all its other virtues.
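
As a throwaway sketch of the first approach - the names and sizes here are arbitrary - an algorithmic change often does less work than any amount of tuning of the original loop:

# Membership tests against a list scan the whole list on every query;
# converting to a set once makes each subsequent query effectively constant time.
haystack = list(range(100000))
needles = range(0, 100000, 7)

hits_scanning = sum(1 for n in needles if n in haystack)

haystack_set = set(haystack)
hits_hashed = sum(1 for n in needles if n in haystack_set)

assert hits_scanning == hits_hashed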

The second approach is to tune what work must be done to better suit the machinery we use to do it. This is what we usually mean when optimization comes up, although in general we should prefer to find ways to simplify and reduce the problem.

Two opinions about programming date from those days. … The one opinion was that a really competent programmer should be puzzle-minded and very fond of clever tricks; the other opinion was that programming was nothing more than optimizing the efficiency of the computational process, in one direction or the other.

– EWD, "The Humble Programmer"

To be clever means to be "sharp", "quick" and "skillful". The Old English roots suggest a sense of "cutting" and "keenness". It would be cleverness to design a tool which does the difficult or seemingly impossible within resource constraints by leveraging intimate knowledge of the system.

This formulation of a "clever trick" implies that a really clever trick must also have a high complexity. This is the root of the trouble with optimizations of the second kind. Clever tricks, while perhaps relevant in the moment, are in the long term likely to prove troublesome.

Worse, clever tricks of this kind require specializing some part or all of your program so that it reflects performance considerations. This has the particular negative consequence of meaning that your code cannot be so simple and algorithmic anymore - it must be reflective of some state(s) or trick(s) used to achieve acceptable performance because performance is now a crosscutting concern which must be accounted for by your abstractions.

Frequently, reasoning from first principles is not enough to produce fully optimized code. Modern high performance sorting algorithms are absolutely fascinating because, while asymptotic complexity may suggest that algorithms with optimal average-case behavior such as mergesort or quicksort should be universally adopted, it turns out there are frequently ways to cheat and do better. Modern collections libraries use adaptive sorting strategies which benefit from existing sortedness in the input collection, and frequently contain multiple different sorting implementations which are applied to different sizes of collection.
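
For a rough illustration - this sketch leans on the fact that Python's built-in sorted happens to be an adaptive Timsort - already-sorted input sorts markedly faster than a shuffled copy of the same data:

import random
import timeit

data = list(range(1000000))
shuffled = data[:]
random.shuffle(shuffled)

# Timsort detects and exploits existing runs, so pre-sorted input is far cheaper to sort.
print(timeit.timeit(lambda: sorted(data), number=10))
print(timeit.timeit(lambda: sorted(shuffled), number=10))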

Even apparently simple statements like "doing two string concatenations will be slower than doing one" are impossible to evaluate without a real effort at profiling. The JVM, among other platforms, recognizes that string manipulation is "expensive" due to the need to resize and copy buffers. However, the Java compilers shipped with both the Oracle JDK and OpenJDK detect trivial string concatenations and silently replace them with equivalent code using the StringBuilder class, which delays buffer resizing as long as possible.

If you want fast, start with comprehensible. You can't make it fast if you can't change it.

– Paul Phillips, "We're Doing It All Wrong" (2013)

Unless there are known constraints - such as the hard throughput requirement above of a message a day - the question "is it fast enough?" and the statements "this isn't performant" or "we need to do this for performance" are totally meaningless. Worse, they detract from the goal, which should be to ship a tool which serves a purpose, and they introduce artificial pressure to conform the choice of abstractions to arbitrary false goals.

6 Fin: Purism

Niklaus Wirth's work is notable superficially because Wirth is himself a prodigious hacker. He's responsible for, among other projects, no fewer than five separate programming languages and two operating systems. What's the trick? How does he do it?

In the implementation of the Modula-2 compiler, Wirth and his students used two metrics to estimate and track its quality over the course of its evolution. One was the simple line count of the program - a poor proxy for complexity but a data point none the less. Another was the time which it took the compiler to compile itself.

One of Wirth's students implemented an elegant hash table backed by binary trees for use in mapping strings to symbols - to look up local variables and implement lexical scope. Doing so introduced a new, complex data structure, increased the line count of the compiler, and added at best no performance benefit, for it turned out that most scopes in real programs are small enough that a simple association list could be faster. Wirth ripped the hash table out.

Wirth was also a great user of tools, such as Backus-Naur Form and parser generators for expressing precisely the meaning which he wished to ascribe to a given set of symbols and the notation which his languages should represent. Even in these, Wirth's ascetic attachment to simplicity is clear - all of his languages are simple and can be efficiently implemented without any clever tricks whatsoever.

Simplicity and quality aren't properties which code or systems acquire by osmosis or by grant of the gods. They are the product of deliberate effort by programmers to resist the entropic pressure applied by reality and customer requirements on systems and, in spite of that pressure, to carve out cohesive, appropriate abstractions which leave room for growth.

It would be equally absurd, then, to expect that in unjust, cowardly, and voluptuous action there should be a mean, an excess, and a deficiency; for at that rate there would be a mean of excess and of deficiency, an excess of excess, and a deficiency of deficiency. But as there is no excess and deficiency of temperance and courage because what is intermediate is in a sense an extreme, so too of the actions we have mentioned there is no mean nor any excess and deficiency, but however they are done they are wrong; for in general there is neither a mean of excess and deficiency, nor excess and deficiency of a mean.

– Aristotle, "Nicomachean Ethics", Book II, Ch. 6

In the "Nichomachean Ethics", Aristotle (or rather his student Plato writing in his master's name) develops a system of ethics on the basis that one must always choose the mean between excess and deficiency.

There is a fashion in the software community to be "pragmatic" - that is, to be tempered or practical with respect to an attitude or policy. Unfortunately, if you temper temperance with an added dose of reality, you no longer have "pragmatism"; you have a regression. Software as an industry is built on an incrementalism of "progress" which is informed by experience but largely unaware of, and often dismissive of, the possibility space.

Before we can claim to be pragmatic, we must have a vision of the perfection from which we turn aside in practical interests. It is my opinion that the overwhelming majority of working programmers are not conscious of what could be; of how higher order tools could help them; of how the pains they feel are the fault of specific failures of their tools and are not essential.

My goal with this talk was to share with you some of the philosophy and purism which I carry with me and which I hope helps me to strive for more perfect solutions. I hope that it helps you too.