Last Mile Maintainership

So Daniel Compton (@danielwithmusic) is a good bloke. We've been co-conspirators on a number of projects at this point, and I just wanted to share a quick vignette before I pack it in for the night.

Almost a year ago, James Brennan (@jpb) was kind enough to offer up a pull request (#158) to the kibit tool which Daniel and I currently maintain. We're both relatively inactive maintainers, all told. Kibit largely does what it's supposed to do and is widely used, for which neither of us can take much credit. We're stewards of an already successful tool.

Between Daniel's day job and my move out to San Francisco, #158 just got lost in the shuffle. It's an awesome feature. Using the excellent rewrite-clj library, it enables kibit to automatically refactor your code for style. If kibit can find a "preferred" replacement expression, thanks to James's work #158 enables kibit to make the replacement for you. While Daniel and I kinda just watched, James pushed it to feature completeness and found a significant performance win which made it not just a compelling feature but fast enough that you'd actually want to use it.

Then a couple months passed. Daniel and I had other things to do and presumably so did James.

About 11 hours ago now, I saw a GitHub notification about a new comment - "Feature: lein kibit fix or similar that applies all suggestions" (#177). Cody Canning (@ccann) was suggesting exactly the feature James had offered an implementation of.

At this point James' patch adding exactly this feature had sat idle for many months. In the meantime other changes had come in and been merged, and James' changeset now had conflicts.

Following the GitHub help docs for how to check out a pull request (spoiler: git fetch $UPSTREAM_REPO pull/ID/head:$NEW_LOCAL_BRANCHNAME), I had James' patches on my laptop in less than a minute. git merge immediately showed two sources of conflict. The first was that the kibit driver namespace had been refactored for style and a docstring had been added to the main driver function which James' patches touched. The other was that dependencies had been bumped in the project.clj.

Fixing this took... five minutes? Tops?

The test suite was clean, and within 11 minutes Daniel had merged my trivial patch, making James' awesome work done and live.

The whole process of about 10 months was overwhelmingly waiting. James finished the patch in like four days (April 20 '16 - April 26 '16). Daniel and I were just bad maintainers at getting it shipped.

Were Daniel and I worse maintainers, we could have seen #177 come in and asked either Cody or James to update the patch. It would have taken maybe five minutes tops to write that mail and maybe it would have saved me 15 minutes and Daniel 5.

After months of waiting? Why?

I've written before about my own thoughts on code review, after working in an organization which is high trust, high ownership, and sometimes feels high process anyway. In this case, and I'm sorry to say almost a year late, I went by what I've come to believe: that reviewers should make an effort to take responsibility for merging code rather than requiring the primary author to do all the leg work.

Sure, I could probably have pinged James or suckered Cody into writing that merge commit, but why? What does that buy anybody? It was so, so easy to just take James' changes and do the merge myself rather than asking someone else for trivial revisions. And it makes for a better process for contributors. It's not their fault that your project has grown merge conflicts with their changes.

If there had been a huge conflict, or James' changes had seemed somehow deeply unreasonable it would have been a different story.

But going the last mile for your contributors is worthwhile.


Nihilist Reviewboard

Let's talk about another concept that's as old as the hills - code review.

"Design and code inspections to reduce errors in program development" [Fagan 1976] (pdf) introduced the notion for a structured process of reviewing programs & designs. The central argument which Fagan presents is that it is possible to quantitatively review software for flaws early in the development cycle, and to iterate on development while the cost of change is low compared to cost of iterating on software which had been deployed to customers.

The terminology is a little archaic, but in all the elapsed time the fundamental idea holds. Code review for Fagan is as much an architectural design review as it is anything else.

This shouldn't be terribly surprising, given some of the particular concerns Fagan's process is designed to address. While many of those concerns remain, some of them, such as the specifics of register restoration, reflect the paper's age.

While the underlying goal of the code review process, to examine software for flaws early and often, has not changed meaningfully in the intervening decades, many of the particulars of the process Fagan describes reflect a rigidity of process and a scale of endeavor which is no longer reflective of the state of the industry at large.

Fagan's process is designed to prevent architecture level mistakes through intensive review, as well as to detect "normal" bugs en masse and provide a natural workflow for iteratively searching for and fixing bugs until the artifact is deemed of sufficient quality. This is a work of process engineering optimized for code being slow & difficult to write, and for software being slow & risky to ship.

So what does a modern code review system look like? What makes for a good code review? What has changed?

With the cheapening of computer time, advent of integrated testing systems, generative testing, continuous integration and high level languages, many of the properties which previously required extensive deliberate human review can now be automatically provided. Likewise, modern linters & formatters can provide extensive stylistic criticism and enforce a degree of regularity across entire codebases.

Continuous delivery systems and incremental deployment methodologies also serve to mitigate the expensive "big bang" releases which informed Fagan's process. Continuous or near continuous delivery capabilities mean that teams can be more focused on shipping & testing incremental products. Artifacts don't have to be fully baked or finalized before they are deployed.

Similarly, linters & other automatic code inspection together with the advantages of high level languages at once make it possible to make meaningful change to artifacts much more rapidly and to automatically detect entire classes of flaws for remediation before an author even begins to engage others for review.

Ultimately, the job of every team is to deliver software. In many contexts, incomplete solutions, delivered promptly & iterated on rapidly, are superior to a fuller solution on a longer release cadence.

So. What does this mean for code reviews?

Reid's Rules of Review

True to the style of The Elements of Style, these rules are hard, fast, and have exceptions. They're deeply motivated by the tooling & engineering context described above. If your team has missile-launching or life-ending reliability concerns, you'll want a different approach. If you can't trivially test, re-deploy, or partially roll out your code, you'll also want a different approach. This is just the way I want to work & think people should try to work.

1. Ensure that the artifact is approachable.

If you are not able to understand a changeset or a new artifact, let alone the process by which its author arrived at the current choice of design decisions & trade-offs, that is itself a deeply meaningful criticism: it means both that the code is unclear and that the motivating documents are deeply lacking.

As the reviewee, the onus is on you to enable your reviewers to offer high level criticism by removing low level barriers to understanding.

1.1. Corollary: Write the docs.

As the reviewee, how are your reviewers supposed to understand what problem(s) you're trying to solve, or the approach you're taking, if you don't explain it? Link the ticket(s). Write docstrings for your code. Include examples so that the what and the how are obvious. Write a meaningful documentation page explaining the entire project so that the why is captured.

1.2. Corollary: Write the tests.

Those examples you wrote? They should be tests.

I'm totally guilty of the "it's correct by construction! Stop giving me this 'tests pls' crap!" attitude, but it's a real anti-pattern.

As the reviewee, even to the extent that you may succeed in producing or composing diamonds, someone will eventually come along and refactor what you've written, and if there aren't tests covering the current behavior, who knows what will happen then. You may even be the person who introduces that regression, and won't you feel silly then.

Furthermore, tests help your reviewers approach your code by offering examples & demonstrations. Tests aren't a replacement for documentation and examples, but they certainly help.
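As a sketch of docs-as-tests (the function and its behavior here are hypothetical, not from kibit): Python's doctest module runs the examples embedded in docstrings as tests, so the documentation and part of the test suite become the same artifact.

```python
def chunks(xs, n):
    """Split the list xs into successive chunks of at most n elements.

    The examples below are both documentation and, under doctest, tests:

    >>> chunks([1, 2, 3, 4, 5], 2)
    [[1, 2], [3, 4], [5]]
    >>> chunks([], 3)
    []
    """
    return [xs[i:i + n] for i in range(0, len(xs), n)]

if __name__ == "__main__":
    # Verify every docstring example in this module.
    import doctest
    doctest.testmod()
```

Running the module directly exercises the examples; a reviewer reading the docstring sees exactly what the function does without leaving the code.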

1.3. Corollary: Run the linter.

If you have a style guide, stick to it insofar as is reasonable. As the reviewee, if you've deviated from the guide to which your coworkers are accustomed, you've just made it harder for them to meaningfully approach and criticize the changes you're proposing. You've decreased the review's value for everyone involved.

2. Criticize the approach.

Algorithmic or strategic commentary is the most valuable thing you can offer to a coworker. Linters and automatic tooling can't really help here. Insight about the future failings of the current choice of techniques, benefits of other techniques and available tools can all lead to deeply meaningful improvements in code quality and to learning among the team. This kind of review may be difficult to offer since it requires getting inside the author's head and understanding both the problem and the motivations which brought them to the decisions they made, but this can really be an opportunity to prevent design flaws and teach. It's worth the effort.

3. Don't bike shed your reviewee.

If the code works and is close to acceptable, leave comments & accept it. The professional onus is on the reviewee to determine and make appropriate changes. It's not worth your or their time to go around and around in a review cycle with a turn around time in hours over this or that bikeshed.

3.1. Corollary for the reviewer: Style guides are something you apply to yourself, not something you do to others.


If someone clearly threw the style guide out the window or didn't run the linter, then a style guide review is appropriate. Style guides should be automatically enforced, or if tooling is not available then they should be mostly aspirational.

What's the point of wasting two or more humans' time doing syntactic accounting for minor infractions? If it takes half an hour to review a change, maybe another hour before the reviewee can respond, half an hour or more to make the requested changes, and then the updated changeset has to be reviewed again, syntax and indentation bike sheds in code review can easily consume whole work days.

3.2. Corollary for the reviewer: Don't talk about performance concerns unless you have metrics in hand.

Seriously. Need to push thousands of requests per second? Yeah, you may care about the performance of an inner loop somewhere. There, performance criticisms are meaningful, and you should already have performance metrics in hand.

Got a service that'll see a few hundred requests at the outside? Who cares! It can probably be quintic and still get the job done.

It is far better to write inefficient code which is easy to understand & modify, ship it and iterate. If legitimate performance needs arise, code which can be understood and modified can always be refactored and optimized.

Code which is optimized early at the expense of understanding and modifiability is a huge mistake because, much as it may make the reviewee or the reviewer feel clever to find some speedup, that speedup may or may not add value, and the semantic cost of the optimizations increases the maintenance or carrying cost of the codebase as a whole.

3.3. Corollary for the reviewer: Don't be dogmatic.

There are exceptions to every rule. There are concrete time and morale costs to requesting change. Be mindful of these, and remember that you're all in the same codebase together with the same goal of shipping.

4. Hold your coworkers accountable after the fact and ship accordingly.

If their service is broken, they who broke it get to be on the front lines of fixing it. The consequence of this is that you should trust your coworkers and prefer shipping their code thanks to a culture of ongoing responsibility. Your default should be to accept code and ship code more or less whenever possible.

In short, don't be this guy.


Composition and Diamonds

In software, there is an ever present temptation to declare that something is finished. To look upon an artifact, to pronounce it perfect, and to believe that it will persist unchanged for all time. This is the model of "martian computing" which begat the Urbit project. And it's wrong.

A specification is a precise description of what an entity is, typically written in terms of decomposition. An abstraction is an attempt to describe an entity, or class of entities, in more general terms. Where a specification will define precisely how something happens, an abstraction will merely state that it will happen.

Abstractions may be judged by their hardness -- that is, the strength of the invariants they enforce internally or provide externally, and those which they require but leave to their environment to ensure.

Some abstractions, like the idea of a generator or a stream, are weak in that they require little and provide little. All the notion of a generator exports is a contract or pattern for getting more values and by which the source will signal when its end has been reached. Yet this is a convenient model for the sequential consumption of any number of chunked sequential or eventual value sources which presumes nothing about how the values are generated.

We can define the abstraction of

filter :: (λ a → Bool) → [a] → [a]

(Note: in Haskell notation, that's "filter is a function from a function which returns a Boolean for any a, and a source of as, to a source of as") to be x for x in xs if f(x). In Python, this exact formulation is an explicitly sequential generator which preserves the order of elements. But what does filter actually have to do? Does the order of elements matter? Should it? When should an element's membership in the result be determined? Does it matter? Why would it matter?

The type of filter is part of the abstraction, but it is a weak contract compared to either of the operational formulations above. Consider what other functions could be defined that satisfy the type signature (λ a → Bool) → [a] → [a] as above. You could define a function which repeats the first element for which the provided function is true forever. You could define a function which repeats the 2nd element for which the provided function is true only as many times as there are elements in the input sequence. You could define a function which ignores the function argument and returns the input sequence. You could define a function which ignores the function argument and returns the input sequence reversed. And on and on and on and on.

A more precise definition of filter would be ∄x∈filter(f, xs) | f(x) is false. (Note: to unpack the notation, that is "there is no x in filter(f, xs) such that f(x) is false") This is a far better, more general abstraction. At an operational semantics level, filter could shuffle. It could operate in parallel on subsequences and return a parallel "first to deliver" concatenation. It could be lazy or any manner of other things.
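A sketch in Python of how loose that contract is (function names here are illustrative): both implementations below satisfy the ∄-spec, but only one preserves order.

```python
import random

def filter_ordered(f, xs):
    # The explicitly sequential formulation: preserves element order.
    return [x for x in xs if f(x)]

def filter_shuffled(f, xs):
    # Equally valid under the spec: no retained x has f(x) false,
    # but the order of the result is arbitrary.
    ys = [x for x in xs if f(x)]
    random.shuffle(ys)
    return ys

def satisfies_spec(f, xs, impl):
    # There is no x in impl(f, xs) such that f(x) is false.
    return all(f(x) for x in impl(f, xs))
```

Both pass satisfies_spec; any caller which additionally relies on ordering is depending on an operational detail the abstraction never promised.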

Let's consider another abstraction - the (first, next) or cons cell.

+-------+------+    +-------+------+
| first | next | -> | first | next | -> null
+-------+------+    +-------+------+
   |                    |
   v                    v
  val                  val

This is, honestly, a really bad abstraction because it's quite explicit about the details. Heck, the names "cons", "car" and "cdr" are all historical baggage. However, this is an abstraction. It provides the notion of the first of a list, the next or rest of the list, and the end of the list being nil. In doing so it provides a model for thought to be sure, but it hides none of the details of the machine. As processor core speed has outstripped memory access speed and as caches have become more and more important for circumventing the von Neumann bottleneck, it has become a progressively less relevant abstraction because it is precise about machine details which are less and less appropriate to modern machines.
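A minimal sketch of the cell in Python (a hypothetical class, just to make the shape concrete): every cell is its own heap allocation, which is precisely the cache-hostility described above.

```python
class Cons:
    """A (first, next) cell; next is another Cons, or None at the end."""
    __slots__ = ("first", "next")

    def __init__(self, first, next=None):
        self.first = first
        self.next = next

def to_list(cell):
    # Walk first/next links until the nil terminator.
    out = []
    while cell is not None:
        out.append(cell.first)
        cell = cell.next
    return out
```

`to_list(Cons(1, Cons(2, Cons(3))))` yields `[1, 2, 3]`, each element a pointer chase away from the previous one.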

For this reason many Lisp family systems chose to provide what are referred to as CDR-coded or chunked lists. These are list-like structures wherein a number of value links are grouped together with a single next link.

 | first | second | third | fourth | fifth | sixth | // | next | -> null
     |       |
     v       v
    val     val

For instance a list of eight elements could fit entirely within a single chunk, occupying a contiguous block of memory which provides more cache locality for linear traversals or for adding elements to the end. However, this chunked model makes splicing sub-lists, slicing, or explicitly manipulating next links expensive, because the next link doesn't exist! For instance, given (0, 1, 2, ..., 10) as a CDR₇ encoded list, were one to try and slice out the sub-list [1...5], one could build a "sub-list" structure which refers to the substructure of the source list. The instant one tries to alter a link pointer within the extracted sub-list, the entire sub-list must be copied so that there exists a link pointer to be manipulated. However, all these approaches to chunking, slicing, and manipulation still easily provide a common first, next, end sequence traversal abstraction.
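A sketch of chunking in Python (the chunk size and names are illustrative, not any particular Lisp's layout): values sit contiguously within a chunk, yet traversal still presents the same first/next/end abstraction.

```python
CHUNK = 7  # value slots per chunk, in the spirit of the CDR coding above

class Chunk:
    __slots__ = ("values", "next")

    def __init__(self, values, next=None):
        self.values = values  # up to CHUNK contiguous values
        self.next = next      # following Chunk, or None at the end

def from_list(xs):
    # Build chunks back to front so each links to its successor.
    head = None
    for i in reversed(range(0, len(xs), CHUNK)):
        head = Chunk(xs[i:i + CHUNK], head)
    return head

def traverse(chunk):
    # Same first/next/end sequence abstraction as a cons list.
    while chunk is not None:
        for v in chunk.values:
            yield v
        chunk = chunk.next
```

Traversal is oblivious to the chunking; only an operation that wants to splice a next link mid-chunk discovers that no such link exists.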

So what does this mean about abstractions generally? Abstractions are models for computation, and are relevant in a context. For instance, big-O analysis of an algorithm is an analysis of asymptotic performance with respect to an abstract machine. It is not a precise analysis of the performance of the algorithm with respect to the average or worst cases on a physical machine. These details, however, are the things which programmers care about. O(N) could mean T(100*N) or T(N/2). In order to be useful for practicing programmers, abstractions must eventually become more detailed than they need be as tools for proof. It is not enough to know that f(xs) will be sorted; programmers are accustomed to expectations that f(xs) will run in such and such time and space. Were those expectations to be violated or suddenly change, program architecture decisions which presumed those performance properties would have to be revisited.

Church numerals are an interesting case of this mismatch between tools for thought and tools for implementation. They're a useful tool for expressing abstract arithmetic and repetition in a proof, divorced from any practicable machine. You can express division, remainders, negatives, and even imaginary numbers this way. Church numerals provide a natural representation for arbitrarily large values in the context of the lambda calculus. But they're grossly mismatched with the realities of finite binary machines which work on fixed length bit vectors. Bit vector machines can't capture the entire unbounded domain of Church numerals, but neither can we build a machine which performs arithmetic on Church numerals with the performance of a bit vector machine. It's fundamentally a trade-off between a tool for thought and a tool for implementing and reasoning about a physical machine.
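A sketch of Church numerals in Python makes the mismatch concrete: a numeral n is a function that applies f n times, so even reading a value back out costs n function applications, where a bit vector machine reads a register in constant time.

```python
# A Church numeral n encodes n as "apply f to x, n times".
zero = lambda f: lambda x: x

def succ(n):
    # One more application of f than n performs.
    return lambda f: lambda x: f(n(f)(x))

def add(m, n):
    # m applications of f on top of n applications of f.
    return lambda f: lambda x: m(f)(n(f)(x))

def to_int(n):
    # Collapse a numeral by counting applications of f: O(n) work.
    return n(lambda k: k + 1)(0)
```

With `three = succ(succ(succ(zero)))`, `to_int(add(three, three))` gives 6, but only after six function applications rather than one machine add.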

This pattern has consequences for the decisions we make when designing software. It may be hubristically tempting to conceive of the artifacts we develop as generation ships: constructs which will long survive us without significant structural change, if we but exercise appropriate art and find the right Martian gem. Reality is far less forgiving. Rarely is there a diamond-hard artifact so divorced from business concerns that it can adequately weather the ravages of time unchanged.

Rather than seek such gems -- or, in failing to produce such a gem, making an excessive number of trade-offs -- good software engineering should be characterized by using and producing a number of small abstractions. Small abstractions are advantageous because they provide little and expose little, thus involving a minimum number of externalities and minimizing vulnerability to crosscutting concerns. In order to build a system of any size or complexity, composing several such abstractions is required. If, due to a change in externalities, one or several such small abstractions become inappropriate, replacing a small abstraction, in the worst case, involves no more impact to the system as a whole than replacing a larger -- or, worse, monolithic (no) -- abstraction. Due to small changing surface area it is likely that reuse between the initial and successor system states will be maximized and the cost to transition the system will be lower than if there were a large or so-large-as-to-be-no abstraction which must be replaced almost entirely.

So. Write better software by decoupling. Seek to prevent or at least minimize crosscutting concerns. Take advantage of control flow abstraction and compose abstractions together. Mars doesn't have a monolithic diamond. It is a field of small gems.

Spoiler warning: this document is largely a product of an evening arguing with @ztellman, being fixated, and barely sober enough to write when I got home. @argumatronic egged me on to finish this and subsequently was kind enough to copy edit early versions for me. I've been told that many of the ideas appearing here will soon appear in his book, and I wanted to note that, for the most part, Zach got here first.


Of Inertia and Computing

Let's talk about Stanislav, probably best known for his blog loper-os. He's interesting in that I'm indebted to him for his writing, which was very personally influential, and yet I consider him useless.

I didn't realize this at first, but Stanislav is rabid about the ergonomics of computing. BruceM was kind enough to make this comment the last time I voiced my objection to Stanislav's writings,

The Lisp Machines have pretty much nothing to do with a “lost FP nirvana”. Working in Common Lisp, or the Lisp Machine Lisps like ZetaLisp or Interlisp, had little to do with functional programming. Lisp Machines are about an entire experience for expert developers. It isn’t just the OS. It isn’t just the hardware (compared to other hardware of the time). It isn’t just the applications that people ran on them. It was the entire environment, how it worked, how it fit together and how it was built to be a productive environment.

I'm a software developer by training and by trade. Maybe on a good day I'm a software engineer, on a rare day a computer scientist. For the most part my job consists of slinging code. Lots of code. Code that interacts with ugly, messy but somehow important systems and software. For me, the goal of computing and the goal of my work is to take the dreck out of software development.

The lisp machines of which Stanislav and, above, Bruce write predate me. I've been fortunate to work once with someone who developed software for them, but that's hardly the same. I've never seen one, and I've never used one. All I know, having grown up on them, is the current generation of x86 and ARM laptops running Windows, Linux and MacOS.

In short, I've never had an ergonomic computing experience of the kind they describe. No chorded keyboards. No really innovative -- forget convenient or helpful -- operating systems. Maybe at best I had error messages which tried to be helpful, Stack Overflow, and on a good day, the source code.

Stanislav's essays, much in the same spirit as the work from Engelbart (Mother of all Demos) and Bush (As We May Think) which he references, suggest that computers could and should be vehicles for thought. They could be tools by which humans can wrangle more information and explore ideas far beyond the capabilities of an unaided human. This is, as I understand it, the entire thrust of the field of Cybernetics.

And indeed, for computer experts, computers can be a huge enabler. I've seen people extract amazing results about society and individual behavior from simple SQL databases sampling social network data. I've personally been able to at least attack if not wrangle the immense complexity of the tools we use to program, largely because I have had immensely fast computers with gigabytes of RAM on which I can play with even the most resource-inefficient of ideas without really worrying about it. But these capabilities are largely outside the reach of your average person. You and I may be able to hammer out a simple screen scraper in an hour, but just consider explaining all of what's involved in doing so to a lay person. You've got to explain HTTP, requests, sessions, HTML, the DOM, parsing, tree selectors and on and on -- forget the tool you're using -- to do all of this.

So by and large, I agree with Stanislav. We've come too far and software is too complex for anyone but trained professionals to really get anything done, and even to trained professionals computers hardly offer the leverage which it was hoped they would. Consider how much time we as programmers spend implementing (explaining to a computer) solutions to business problems, compared to how much time is required to actually devise those solutions. Computers enable us to build tools and analyses on an otherwise undreamed of scale. But that very scale has enabled the accumulation of hidden debt. Decisions that made sense at the time. Mistakes that lurk unseen beneath the surface until we see an 0day in the wild.

Given, as Joe Armstrong said, "the mess we're in", what of the industry is worth saving? Has the sunk cost fallacy sunk its hooks in so deep that we are unable to escape, fallacy as it may be?

Stanislav among others seems to believe that the only way forwards is to lay waste to the industry from the very silicon up; to find a green field and in it build a city on a hill abandoning even modern CPU architectures as too tainted.

My contention is this - that Stanislav is off base. From the perspective of hardware we've come so far that "falling back" to simpler times just doesn't make sense. Where will someone find the financial resources to compete with Intel and ARM as a chipmaker with a novel simplified instruction set? Intel has the fabs. They have the brain trust of CPU architects. They have the industry secrets, a culture of the required art, and most importantly, existing infrastructure around proof and implementation validation. The only challenger in this field is Mill Computing who seem to have been struggling between filing patents and raising funding since their first published lectures two years ago. Furthermore adopting a new CPU architecture means leaving behind all the existing x86 tuned compilers, JITs and runtimes. Even with all their resources Intel wasn't able to overcome the industry's inertia and drive adoption of their own green field Itanium architecture.

Nor will such a revolution arise from hobbyists. Who would be willing to accept a sub-gigahertz machine, or the otherwise crippled result of a less than professional CPU effort? Where would a hobby CPU effort find the financial resources for a small, high cost-per-unit production run, or the market for a large one?

Even if the developer story were incredibly compelling, how would another platform rise to dominance given the existing investments of billions of dollars, not only in literal dollar spend but in the time of really excellent programmers writing nontrivial software like Clang, the JVM and modern browsers? It's a huge investment in throwing away mostly working software, one few people can or will make. But I'm OK with that. Intel and ARM chips are really good. It's the software ergonomics and software quality that's the real problem here today in 2016, not the hardware we run our shoddy programs on.

Likewise, network protocols by and large are ossified. There is so much value in UDP, TCP/IP, HTTP and the Unix model of ports, just from being able to interact with existing software deployments, that I think they'll be almost impossible to shake. Consider the consumer side. Were a company to have so much courage and such resources that it were to begin down this path, the first question it would be met with is "does this machine run Facebook?" How can a competitor network stack even begin to meet that question with an answer of "we can't interoperate with the systems you're used to communicating with"?

Microsoft for a while ran a project named Midori, which Joe Duffy wrote some excellent blog posts about, available here. Midori was supposed to be an experiment in innovating across the entire operating system stack using lessons learned from the .NET projects. Ultimately Midori was a failure in that it never shipped as a product, although the lessons learned therein are impacting technical direction. But I'll leave that story for Joe to tell.

I propose an analogy. Software and organizations which rely on software can be considered as Newtonian objects in motion. Their inertia is proportional to their mass and their velocity. Inertia here is a proxy for investment. A small organization moving slowly (under low pressure to perform) can change tools or experiment without much effort. A small organization moving faster (under high pressure to execute some other task) will have much greater difficulty justifying a change in tooling because such a change detracts from their primary mission. The larger an organization grows, the more significant the "energy" costs of change become. Other people in the organization build tooling, command line histories, documentation, and organizational knowledge, all of which rides on the system as it is. The organization size here extends beyond the scope of the parent organization to encompass all users.

The Oracle Java project has explicitly made the promise of stability and binary backwards compatibility, and is thus a great study in the extremes of this model. Now the Java project's guardians are forever bound to that expectation, and the cost of change is enormous. Brian Goetz gave an amazing talk on exactly this at Clojure/Conj in 2014. Java's value proposition is literally the juggernaut of its inertia. It won't change because it can't. The value proposition and the inertia forbid it, except in so far as change can be additive. While immensely convenient from a user's perspective, this is arguably a fatal flaw from a systems perspective. Object equality is fundamentally unsound and inextensible. The Java type system has repeatedly (1, 2, 3) been found unsound, at least statically. The standard library is not so much a homogeneous library, but rather a series of strata -- Enumerations from way back when, interfaces with unrelated methods, interfaces which presume mutability... the list of decisions which made sense at the time and in retrospect could have been done better goes on and on. And because of the stability guarantees of the JVM, they'll never be revisited because to reconsider them would break compatibility -- an unthinkable thing.

As you may be able to tell at this point, I firmly believe that software is in a sense discovered rather than invented, and that creative destruction applies directly to the question of developing software. As suggested by Parnas et al., we can't rationally design software. We can only iterate on design, since the final design requirements for an artifact are unknowable and change as the artifact evolves together with its users and their understanding of it. Thus systems like the JVM, which propose a specification which can only be extended, are as it were dead on arrival, unless, as with Scheme, they are so small and leave so much to libraries that they hardly exist. The long term system consequences of architectural decisions are unknowable, which admits of keeping mistakes forever. Moreover, even in the optimistic case, better design patterns and techniques when discovered cannot be applied to legacy APIs. This ensures that there will be a constant and growing mismatch between the "best practice" of the day and the historical design philosophies of the tools to hand.

Eick et al. propose a model of "code decay" which I find highly compelling, and which feeds into this idea. Eick's contention is that as more authors come onto a project, and as the project evolves under their hands, whatever conceptual integrity existed at the beginning slowly gets eaten away. New contributors may not share the original vision or appreciation of the codebase, or the underlying architecture may be inappropriate to new requirements. These factors all lead back to the same problem: code and abstractions rot unless actively maintained and curated with an eye to their unifying theories.

All of this leads to the thought which has been stuck in my mind for almost a year now. Stanislav is right in that we need a revolution in computing, but it is not the revolution he thinks he wants. There is no lost golden past to return to, no revanchist land of lisp to be reclaimed from the invaders. The inertia of the industry at large is too high to support a true "burn the earth" project, at least in any established context. What we need is a new generation of smaller systems which, by dint of small size and integrity, are better able to resist the ravages of time and industry.

I found Stanislav's writing useful in that it's a call back to cybernetics, ergonomics, and an almost Wirthian vision of simplicity and integrity in a world of "flavor of the week" Javascript frameworks and nonsense. But in its particulars, I find his writing at best distracting. As I wrote previously (2014), I don't think we'll ever see "real" lisp machines again. Modern hardware is really good, better than anything that came before.

Max really hit this one on the head -

Stanislav's penchant for focusing on the hardware we use to compute seems to reductively blame it for all the ills of the modern software industry. This falls directly into the usual trap we programmers set for ourselves. Being smart, we believe that we can come along and reduce any problem to its essentials and solve it. After all, that's what we do all day long to software problems, so why can't we do the same for food and a hundred other domains? The problem is that we often miss the historical and social context around the original solutions -- the very things which made them viable, even valuable.

The future is in improving the ergonomics of software and its development. The hardware is so far removed from your average programmer -- let alone user -- that it doesn't matter.

Right now, it's too hard to understand what assumptions are being made inside of a program. No language really offers contracts on values in a seriously usable format. Few enough languages even have a real concept of values. It's too hard to develop tooling, especially around refactoring and impact analysis. People are wont to invent their own little DSL and, unable to derive "for free" any sort of syntax analysis or linter, walk away, leaving behind them a hand rolled regex parser that'll haunt heirs of the codebase for years. Even if you're lucky enough to have a specification, you can easily wind up with a mess like JSON, or a language the only validator for which is the program which consumes it. This leaves us in the current ever-growing tarpit of tools which work but which we can't understand and which we're loath to modify for compatibility reasons.

I don't have an answer to this. It seems like a looming crisis of identity, integrity, and usability for the software professions generally. My goal in writing this rant isn't just to dunk on Stanislav or make you think I have any idea what I'm doing. My hope is that you find the usability and integrity problems of software as it is today as irksome as I do, and that you consider how to make the software you work on suck less, and how your tools could suck less. What we need is a new generation of smaller, actively maintained and evolved software systems which, by dint of small size and dedication to integrity, are better able to resist the ravages of time and industry. In this field hobbyist efforts are useful. It's fundamentally a greenfield, but one running atop and interfacing with the brownfield we all rely on. These dynamics, I think, explain much of the "language renaissance" we're currently experiencing.

There are no heroes with silver swords or bullets to save us and drag us from the tarpit. The only means we have to save ourselves is deliberate exploration for a firm path out, and a lot of lead bullets.


A Better VM

For the last couple of years I've been working with Clojure, a lisp which runs on top of the JVM. My reservations with Clojure itself, and with Clojure's maintainership, are at this point fairly well established. However, I'd be lying if I said that, after thinking long and hard about the way I want to develop software, I'd come up with anything better that's incrementally achievable. Clojure's syntax is convenient. Its datastructures are clever. Its immutable defaults are sane compared to those of any other language. Its integration with the JVM, while fatal to its own semantics, ensures unmatched leverage. In short, I don't know if it's possible to do better atop the JVM.

But why use the JVM?

The JVM itself is a wonder of engineering, compiler technology and language research. But the JVM standard library is deeply mutable and makes assumptions about the way that we can and should program that aren't true anymore. While the JVM itself may be awesome, I'm just not convinced that the object/class library it comes with is something you'd actually want around as a language designer. Especially as the designer of a functionally oriented language, or a language with a different typing/dispatch model than Java's.

The conclusion I personally came to was that, faced with what already exists on the JVM, I couldn't muster the wherewithal to work in that brownfield. I needed a greenfield to work with. Somewhere I could explore, have fun exploring, and make mistakes.


Dirt

I honestly don't remember why I chose this name, but it's stuck in my head and adorns the git repo where all of this lives. Dirt isn't something I'm gonna be releasing on github, although you can totally browse the code. Simply put, dirt isn't intended for anyone else to use or contribute to, now or in the foreseeable future. It's my experiment, and I think that my total control over the project is important to, well, finishing it sometime.

So what's the goal?

Versioning is something I think is really important both at the package and compilation unit level. I previously wrote about artifact versioning, and experimented with versioned namespaces in my fork of Clojure. Unfortunately, as a user experience versioned namespaces didn't pan out. There were too many corner cases, and too many ways that partial recompilation could occur and generate disruptive, useless warnings. So versioning is one major factor in Dirt's design.

Another is functional programming. After working with the JVM, I'm pretty much convinced that design by inheritance is just flat-out wrong. While I was working with Dr. Perry, he shared an awesome paper with me: Object-Oriented Programs and Testing. The essential point of the paper is that adequately testing inheritance-structured programs is really hard and everybody does it wrong. Testing mutable programs is already hard enough, and single inheritance poses so many design difficulties that it just doesn't seem worthwhile. In my time using Clojure, I found that I never actually needed inheritance. Interfaces sufficed, and functions which operated against interfaces were better captured as functions in namespaces than as inherited static functions packaged away in some base class or helper package.

The failure mode of interfaces as presented by Java, however, is that they are closed. Only the implementer of a type can make it participate in an interface. This is far too restrictive. Haskell-style typeclasses get much closer to my idea of what interfaces should look like, in terms of being open to extension/implementation over other types and carrying contracts.
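To illustrate that closedness, here's a hedged Java sketch. Pretty is a hypothetical interface invented for this example; since we can't retroactively declare that an existing compiled type like java.math.BigInteger implements it, we're forced into an adapter, where a typeclass would let us declare the instance ourselves.

```java
import java.math.BigInteger;

// Pretty is a hypothetical interface invented for this example.
interface Pretty {
    String pretty();
}

// BigInteger is already compiled (and final), so we can't add
// "implements Pretty" to it. The only recourse is a wrapper, which
// forfeits the original type's identity and forces boxing at every
// call site. A typeclass would let us declare the instance directly
// over BigInteger instead.
class PrettyBigInteger implements Pretty {
    private final BigInteger value;
    PrettyBigInteger(BigInteger value) { this.value = value; }
    @Override public String pretty() { return "#bigint \"" + value + "\""; }
}

class ClosedInterfaceDemo {
    public static void main(String[] args) {
        Pretty p = new PrettyBigInteger(BigInteger.valueOf(42));
        System.out.println(p.pretty()); // #bigint "42"
    }
}
```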

Which brings me to my final goal: contracts, or at least annotations. I like types. I like static dispatch. While I can work in dynlangs, I find that the instant my code stops being monomorphic, at least with respect to some existential type constructed of interfaces/protocols, I start tearing my hair out. Types are great for dispatch, but there's lots and lots of other static metadata about programs with which one could potentially do dataflow analysis. For all that I think Shen is batshit crazy and useless, the fact that it provides a user-extensible type system which can express annotations is super interesting. Basically, I think that if programmers were given better-than-Clojure program metadata, and tools for interacting with it, it would be possible to push back against @Jonathan_Blow's excellent observation that comments and documentation always go stale and become worthless.

So what's the architecture?

DirtVM so far is just a collection of garbage collectors and logging interfaces I've built. All of the rest of this is hot air I hope to get to some day.

The fundamental data architecture is something like this:

    Module:
      Group, Package
      Terms: (GPL1 | GPL2 | EPL | MIT | Other)
      Metadata: (bag of attributes)
      Dependencies: List[(Name, Version)]
      Imports: List[Import]
      Exports: Types, Ifaces, Fns

    Import:
      Namespace, Alias

The essential idea is that, because versioning is so hard, it's easier to fix the runtime to allow co-hosting of multiple versions of artifacts than to somehow solve the many, many difficulties of software versioning and artifact development. Java 9 modules look really, really good and come close to being an appropriate solution, but the Java team has abandoned the idea of versioned modules. In Dirt, code compiled within a namespace has access only to what that namespace has explicitly imported. Imports are restricted to the contents of the module and the module's direct dependencies. It is not possible for a namespace to import from a transitively depended-on module. This means that at all times a user is in direct control of which version of a function or a type they are interacting with. There is no uncertainty.
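As a point of comparison, Java 9's module system enforces a similar, if unversioned, discipline. In this sketch (module names invented for illustration), a module reads only its declared direct dependencies; a transitive dependency's packages aren't visible unless the intermediary re-exports them:

```java
// module-info.java for a hypothetical module "com.example.a"
module com.example.a {
    requires com.example.b;  // direct dependency: its exported packages are importable
    // com.example.c (a dependency of b) is NOT readable here unless b
    // declares `requires transitive com.example.c`.
}
```

Dirt's design goes one step further by making the module graph version-aware, which is exactly the part Java 9 dropped.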

This gets a little sticky for datastructures. If module A depends directly on B and B depends directly on C, it's possible that B will return into A a data structure, function or closure which comes from the module C. This turns out to work fine. Within a single module, protocol/interface dispatch is done only against the implementation(s) visible in that module scope. Because A has no knowledge at compile time of any such type from C, it can't do anything with such a type except hand it back to B which can use it.
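A rough Java analogy for this visibility rule (invented for this post; Dirt itself has no such code yet): "module" B hands out handles whose concrete type, standing in for something from "module" C, is hidden, so a caller in A can store the handle and pass it back to B, but can't reach inside it.

```java
// "Module B"'s public surface: an opaque Handle plus functions that
// accept it. The concrete type is private, simulating a type from a
// transitive dependency that callers can't name.
class OpaqueHandles {
    // Stand-in for module C's type; invisible outside this "module".
    private static final class Counter { int n; }

    public interface Handle {}

    private static final class CounterHandle implements Handle {
        final Counter c = new Counter();
    }

    public static Handle make() { return new CounterHandle(); }
    public static Handle bump(Handle h) { ((CounterHandle) h).c.n++; return h; }
    public static int read(Handle h) { return ((CounterHandle) h).c.n; }

    public static void main(String[] args) {
        // "Module A": holds the handle, but can only hand it back to B.
        Handle h = make();
        bump(bump(h));
        System.out.println(read(h)); // 2
    }
}
```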

Types and interfaces are very Haskell-style. Mutability will be supported, but avoided in the standard library wherever possible. Interface dispatch will be typeclass-style pattern matching, not determined by the call target. This makes real contracts like Eq possible and extensible, rather than totally insane like non-commutative object equality. Types are just records, internally named and made distinct by package, version, namespace, and name. This makes it possible to co-host multiple consistent implementations of the same interface against different versions of the same type. Much in the Haskell style, the hope is that for the most part interface dispatch can be statically resolved.

Why record types instead of a Lua- or Python-style dynamic object dispatch system? Because after working with Clojure for a while now, it's become clear to me that whatever advantages dynamic typing may offer in the small are entirely outweighed by static typing's advantages in the large, and that packaging functions into objects and out of namespaces buys you nothing. While dynamic typing and instance dispatch can enable open dispatch, they also defeat case-based reasoning when reliability is required. Frankly, my most reliable Clojure code would have translated directly to Haskell or OCaml. Refactoring matters, especially as projects grow. Being able to induct someone else into your code matters. Producing meaningful errors someone can understand and trace to a root cause requires information equivalent to types. Dynamic typing just obscures static contracts and lets violations occur inadvertently, leaning on exhaustive test coverage instead. Dynamic typing introduces concrete runtime costs, and slows down program evolution because building tools is simply harder. Tools matter, so static typing ho.

In addition to interfaces/typeclasses, there are also fns (fn in Clojure), which are statically typed, non-extensible, single-arity procedures. Despite their impurity, the term "function" is used for these in keeping with industry convention.

The namespaces are very much Clojure-style, because I've been really happy with how Clojure's namespaces work out for the most part, and I want to support a language which, at least syntactically and in its namespace system, isn't that distant from Clojure. Import renaming is awful, but qualified imports are fine, hence imports supporting aliases.

The ultimate goal of this project is to be able to present a virtual machine interface which is itself versioned. Imagine if you could write software which used dependencies themselves targeting incompatible/old versions of the standard library! That would solve the whole problem of language and library evolution being held to the lowest common denominator.

Dirt itself will be a garbage-collected, mostly statically typed bytecode VM, much like the JVM. It'll probably get an SSA/dataflow bytecode-level representation rather than a stack machine structure, but that level of detail I'll figure out when I get to it. For now I'm having fun writing C and garbage collectors. The next step will probably be some pre-dirt language to help me generate C effectively.

Here's to log cabins and projects of passion!