Techwriting 101

At Twitter we have regular employee training courses on a number of subjects. These are my notes from our class on writing technical documentation for developers.


What are docs

Anything from diagrams to stained bar napkins. Could be a book, could be a public website, could be an internal resource.

Docs are the way that experts re-visit subject domains. They're how your customers / other service owners read in on your system. They're how new hires will start orienting themselves to the Twitter environment.

Why docs

Google study: 1/3rd of all code searches are looking for examples or specific API information.

Django: roughly 120k lines of English to 80k lines of code. If you haven't written your docs, you haven't written your product.

Outline

Gonna go over three main sections...

  1. The secret to writing good docs (know thy audience)
  2. Writer's block
  3. Polishing (patterns & antipatterns)

1. Know Thy Audience

Who is the document you're preparing for? Other developers? Non-technical users? Yourself in six months? An SRE at 2 a.m. when things are on fire? People with no context (a far-distant team, or new hires), and remotes who need to be able to find information without constantly leaning on someone.

Common kinds of documents & audiences (think job to be done)

  1. Introductory material
  2. Runbooks
  3. Service docs
  4. Design docs
  5. Cookbooks
  6. Team pages
  7. Troubleshooting
  8. Style guides
  9. Policy pages
  10. Roadmaps

What do your readers know coming in?

Different sets of documentation serve different purposes. What knowledge do your readers bring with them, and what can you assume?

Most developer docs err on the side of assuming knowledge. Lots of possible reasons for this, but it's an antipattern to be aware of.

What's the takeaway your document should communicate?

It doesn't have to be a five-paragraph essay in MLA format, but you should have a TL;DR or thesis. This helps avoid problems like the interminable FAQ.

Audience Creep

Know the scope of the doc. Often you'll incrementally expand the vision of the document until it tries to be everything to all people at which point it's kinda useless.

2. Writer's Block

JUST. FRIGGIN. START. WRITING.

A metaphor: activation energy. Anything you can do to lower the activation energy encourages the reaction to occur. Once you get into a project you're "in the flow" and it's easy to keep hacking/writing, and you want to do whatever you can to stay there.

Forcing functions: pomodoro, The Most Dangerous Writing App, etc. Lifehacks and so forth abound here.

Often the way to get started is to just braindump.

2½. The documentation lifecycle

  1. Rapid development - think of visual artists & cartoonists: you start with shapes and just start doodling, iteratively refining. Someone dumped a wall of text, someone else organized it, someone else added examples... etc.

  2. Feedback - People read it and comment. Someone else goes on-call and has comments / discovers things which aren't written down. Even coming back to it the next day helps, but really it's about getting someone who doesn't have the same context to look at the doc. This is why docbird has code review as part of the workflow: it ensures that someone else has to put eyes on the document. You've got to build this into your team's workflow.

  3. Maintenance - Documentation which isn't true to the live system is pretty much worthless.

3. Writing really super amazing docs!

Antipattern: docs that aren't docs

It's not documentation to tell someone to ask someone else. Guideposts of where to do discovery are helpful but they aren't sufficient.

Antipattern: nerdview

Defined by the Language Log crowd

Nerdview is the tendency for people with any kind of technical or domain knowledge to get hopelessly stuck in their own frame of reference

Imagine reserving a rental car online, being asked to enter a date in DD/MM/YYYY format, and getting the error message "Please pick a date greater than today". Nobody says "greater than today". That sentence only makes sense when you're thinking about dates as integer values to be compared, as in a computer system.

This is really just an instance of knowing your audience.

Antipattern: noun piles, "crash blossoms" and "garden path sentences"

Just adding hyphenation to clarify qualification can help a lot.

Antipattern: Academese/technicalese

Don't slip into formalism just because you're writing docs.

Antipattern: slang

Speakers whose first language isn't English exist. We employ lots of 'em. Know thy audience and try not to overuse slang, if only for the sake of clarity.

Antipattern: Buzzwords

Type-safe! Modular! Parameterized!

Sometimes these help, sometimes these just add volume. Be aware.

Antipattern: Acronyms

Define them! Did you know that we have TSA and ACID at Twitter, neither of which means what you think it means in common outside-Twitter usage?

Antipattern: Birds

DON'T FUCKING NAME YOUR THING AFTER A BIRD. STOP IT. THE JOKE IS DEAD. IT'S NAILED TO ITS PERCH. IT'S SLID DOWN THE CURTAIN AND JOINED THE CHOIR INVISIBLE. IT IS AN EX-PARROT.

Pattern: Simplicity all the way down

Simple pages

10-20min ABSOLUTE LIMIT on a given page of docs. Some say 5-10min is better... it depends on what your use case is and how you expect your audience to read. Again think what is the job of the document and what kind of attention do you expect your reader to direct to it?

Simple sections

Sections should be single-topic and targeted.

"Tasks not tools". Title your section in terms of what problem your reader is likely trying to solve, becuase only rarely is your audience reading to lear rather than reading trying to solve a problem. Can you title your question as a question or a gerrund rather than as a spare noun?

Remember that while you may write your docs front-to-back people will tend to discover them via some search tooling, so you shouldn't be afraid of repetition and certainly not of linking back to things you've already written about.

Simple paragraphs

Don't be afraid of single sentence paragraphs.

Don't present information as prose when it's better rendered as a table or a diagram, just because prose is what's convenient / easy to write in your editor of choice.

Simple Sentences

"Someone said that words are a lot of inflated money - the more of them you use the less each one is worth"

~ Malcolm Forbes

Simple Words

The beginning of knowledge is to call something by its proper name

~ Confucius

Coherence vs. Completeness

Your doc should hang together. You should be able to sum up its topics. Coherence sits in natural tension with completeness.

Completeness is saying everything; coherence is saying just enough, or just what is needed. Same as #shipit vs #berigorous. What are the top three use cases for this doc? What are the next three? Can you hide complexity with links?

You can kinda only keep five things in your working memory at once… three is kinda this nice magical number. Linking is a great way to reference complexity without becoming massively monolithic.

Templates & metadata

Oftentimes you'll open a document and kinda have no idea who the author was, what the status was, what the changelog was, and so on and so forth.

By using templates which provide those empty headings/sections/fields it becomes easier to get it right and write all that down.

Templates also help you be consistent in the structure of your documentation, which means that users can predict and rely on the structure of your content which makes docs more approachable.
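
For instance (the fields here are invented for illustration, not a prescription), a template might stub out a header block like:

Title:        <what this page covers>
Author(s):    <who to bug with questions>
Status:       <draft / current / deprecated>
Last updated: <date>
Changelog:    <link or bullet list>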

Correctness, Maintenance & Culture of Documentation

  • Docs have to be treated as a first class priority & are part of the product.
  • Products aren't shipped unless you have docs &c.
  • Teams recognize that docs take time and allocate time for building documentation.

Formatting & Variation

In lots of docs you have this "wall of text" antipattern. This is particularly bad in Confluence because the editor is weak and lines are long.

Formatting helps make the high-level structure of the document easier to understand / visualize / work with.

Lists, boldface, italics and break-out boxes help display structure tastefully. BUT DON'T OVERDO IT. If everything has a warning box, your users will ignore all of them, and so forth.

Aside: Overdoing it. If you can't simplify the tech docs, it's commonly a smoking gun that the thing you're documenting is too complex.

Edward Tufte's The Visual Display of Quantitative Information.

Pie charts are the worst. Bar charts and tables are better almost universally.

Closing bits

  • Examples! Write them! They should be as the user will use them!
  • Interleave code and prose! Interactive examples are awesome!
  • Permalinks! Twitter has its own nice internal permalinks system, but these can make it a lot easier to find and reference the document you want.
  • Use a prose linter! (http://hemingwayapp.com etc.)
  • Teach a class / make a video!

Chowderheads [2004]

Volker Armin Hemmann wrote:

if we are talking about people who are only able to install gentoo because of an automated, graphical installer, then we will get: a lot more 'bug reports' that are not ones. a lot more really really stupid questions (I wait for the day, someone asks, where he can find the gentoo-homepage), and no new developers (but a lot more work for the existing ones).

One might also imagine we'd get less questions on actually installing Gentoo and more on doing stuff with it. There I go again with my tricky logic.

All successful and useful projects get new people. It's a fact of life and frankly if you aren't, you're doing something wrong. That holds true from the Gentoo project to the Roman Empire. If you can not integrate new people successfully into your organization, it will fail.

Gentoo has in fact from the very start been about making things easier. Easier to pick the packages you want, easier to upgrade, easier to custom build packages, easier to update etc files, etc. Gentoo has even gone out of its way to make better how-tos and is known in the Linux community at large for having just about the most useful and friendly forums.

Gentoo can either continue extending the infrastructure to support the people being attracted to a damn useful distro. Or clowns like you can attempt to keep Gentoo all to yourself with Jim Crow style exclusionary tactics.

kashani

Via marc.info

Puppet, I guess

So I have, or rather should say had, a problem.

At work, I'm partially responsible for managing a large puppet deployment. Puppet is largely configured in terms of providers packaging resources & templates which are used to lay down configuration files on disk to manage host behavior. It is after all a configuration management tool. And it works great for the problem domain of "I have lots of servers with lots of state to configure".

But I have dotfiles. The way that I manage dotfiles needs to capture the idea that I want to merge many different components together into the state of the given machine. Sounds like puppet...

But my dotfiles have to be self-bootstrapping, and they can't presume access to any particular master node. These two requirements are actually pretty firm, because whenever I bring up a new laptop or a new VPS it's basically clone down my dotfiles, run an install script in the repo and done.

GNU Stow has served me really well so far. It totally solved my initial dotfiles problem, which was "I want to just symlink a bunch of stuff out of a repo".
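
For the unfamiliar, a typical invocation looks something like this (the repo path and package name are just placeholders):

stow -d ~/dotfiles -t ~ zsh    # symlink the contents of ~/dotfiles/zsh into ~
cd ~/dotfiles && stow zsh      # same thing; the target defaults to the stow dir's parent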

When I wanted to start organizing the repo, stow became less ideal and I wound up with a wrapper script around stow which had some notion of sets of repository delivered stow packages to install on a given host. And that worked for quite a while on my personal machines.

Then I had a problem at work. I use ~/.zsh/conf/* as a holding pen for a bunch of configuration files regarding variables, functions and general shell bringup. When I was only configuring my personal machine(s), stow worked fine because I had the same settings everywhere. Stow installs a whole zsh package from the repo, which includes a .zsh and a .zsh/conf and you're all set.

The problem came when I realized that some of my configurations only apply to my work machine. Because I use a Mac at work, I have some extra stuff I need to do with the path and homebrew to get things... quite right and I don't want to burden my Linux install(s) with all that. A monolithic zsh package won't cut it anymore.

But Stow only deals with packages consisting of whole directories! It won't let you manage single files! Really what I want is not a stow which lays down symlinked directories, but a tool which makes real directories and symlinks in the individual files. That way I could write several packages, say zsh and work_zsh, which would merge together into the ~/.zsh tree the way I want.
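
Concretely, the end state I'm after looks something like this (file names invented for the sake of example):

mkdir -p ~/.zsh/conf                                                  # a real directory, owned by no one package
ln -s ~/dotfiles/zsh/.zsh/conf/aliases.zsh ~/.zsh/conf/aliases.zsh    # contributed by the zsh package
ln -s ~/dotfiles/work_zsh/.zsh/conf/brew.zsh ~/.zsh/conf/brew.zsh     # contributed by the work_zsh package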

So let's start laying out files on disk.

function arrdem_installf () {
  # $1 is the logical name of the file to be installed
  # $2 is the absolute name of the file to be installed
  # $3 is the absolute name of its destination
  #
  # Installs (links) a single file, debugging as appropriate

  if [ -n "$DEBUG" ]; then
      echo "[DBG] $1 -> $3"
  fi
  ln -s "$2" "$3"
}

export -f arrdem_installf

Okay so that's not hard. Just requires ln, makes appropriate use of string quoting and has support for a debugging mode. We're also gonna need a thing to lay out directories on disk.

function arrdem_installd () {
  # $1 is an un-normalized path which should be created (or cleared!)

  dir="$(echo $1 | sed 's/\.\///g')"
  if [ ! -d "$dir" ]; then
      if [ -n "$FORCE" ]; then
          rm -rf "$dir"
      fi

      mkdir -p "$dir"
  fi
}

export -f arrdem_installd

Okay so this function will accept an un-normalized path (think a/./b/c), normalize it (a/b/c) and ensure it exists. Because we're interacting with a file/directory tree which Stow used to own, it's entirely possible that there used to be a symlink where we want to put a directory. So we have to support a force mode wherein we'll blow that away.
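
For example (purely illustrative):

arrdem_installd "./a/./b/c"       # normalizes to a/b/c and mkdir -p's it
FORCE=1 arrdem_installd "./old"   # if something that isn't a directory is squatting at ./old, remove it first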

Alright... so now let's write a stow alternative that does what we want.

function arrdem_stowf () {
  # $1 is the stow target dir
  # $2 is the name of the file to stow

  f="$(echo $2 | sed 's/\.\///g')" # Strip leading ./
  TGT="$1/$f"
  ABS="$(realpath $f)"

  if [ -h "$TGT" ] || [ -e "$TGT" ]; then
      if [ "$(realpath $TGT)" != "$ABS" ]; then
          if [ -n "$FORCE" ]; then
              if [ -n "$DEBUG" ]; then
                  echo "[DBG] Clobbering existing $ABS"
              fi
              rm "$TGT"
              arrdem_installf "$f" "$ABS" "$TGT"
          else
            echo "[WARNING] $TGT already exists! Not replacing!"
            echo "          Would have been replaced with $ABS"
          fi
      else
        if [ -n "$DEBUG" ]; then
            echo "[DEBUG] $TGT ($f) already installed, skipping"
        fi
      fi
  else
    arrdem_installf "$f" "$ABS" "$TGT"
  fi
}

export -f arrdem_stowf

So if we just want to install a single file, we normalize the file name and compute the path we want to install the file to. Now it's possible, since this is a symlink-based configuration management system, that the target file already exists. The existing file could be a dead (old) symlink, or it could be a link we already placed on a previous run. We can use realpath to resolve symlinks to their targets and determine whether we have a link that already does what we wanted. In the case of an existing file and $FORCE we'll clobber it; otherwise we'll only install a new link if there isn't something there.
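
Called from inside a package directory it behaves something like this (paths hypothetical):

( cd ~/dotfiles/zsh && arrdem_stowf "$HOME" "./.zshrc" )           # link ~/.zshrc at the repo's copy
( cd ~/dotfiles/zsh && FORCE=1 arrdem_stowf "$HOME" "./.zshrc" )   # same, but clobber whatever was already there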

Great so this deals with installing files, assuming that we want to map from ./foo to ~/foo.

Now we can really write our stow, which will eat an entire directory full of files & subdirectories and emplace them all.

function arrdem_stow () {
  # $1 is the install target dir
  # $2 is the source package dir
  #
  # Makes all required directories and links all required files to install a given package.

  # If a target directory doesn't exist, create it as a directory.
  # For each file in the source, create symlinks into the target directory.
  #
  # This has the effect of creating merge mounts between multiple packages, which gnu stow doesn't
  # support.
  ( cd "$2"

    # Make all required directories if they don't exist
    #
    # If force is set and something is already there blow it the fsck away
    find . -mindepth 1 \
         -type d \
         -exec bash -c 'arrdem_installd "$0/$1"' "$1" {} \;

    # Link in all files.
    #
    # If the file already exists AND is a link to the thing we want to install, don't bother.
    # Else if the file already exists and isn't the thing we want to install and force is set, clobber
    # Else if the file already exists emit a warning
    # Else link the file in as appropriate
    #
    # Note that this skips INSTALL, BUILD and README files
    find . -type f \
         -not -name "INSTALL" \
         -not -name "BUILD" \
         -not -name "README.*" \
         -exec bash -c 'arrdem_stowf "$0" "$1"' "$1" {} \;
  )
}

export -f arrdem_stow

It turns out that the ONLY really bash-safe way to support filenames and directory names containing arbitrary whitespace or other characters is to use find -exec bash, rather than parsing the output of find. If you try to parse find's output, you wind up having to designate some character as the special delimiter. I thought that whitespace was a safe assumption and found out I was wrong, so I wound up taking this approach of using the -exec option to spawn child bash processes calling my exported functions. Which is why I've been exporting everything all along.
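
Roughly, the difference is this (with "$HOME" standing in for the target dir):

# DON'T: word splitting mangles any path containing whitespace
for f in $(find . -type f); do arrdem_stowf "$HOME" "$f"; done

# DO: each path reaches the child bash as a single intact argument
find . -type f -exec bash -c 'arrdem_stowf "$0" "$1"' "$HOME" {} \;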

So given ~ as the install target, and ./foo/ as the package to install, we'll cd into foo in a subshell (so we don't leak CWD state), then find & arrdem_installd all the required directories. Then it will arrdem_stowf all the files, with a couple of exceptions: README, BUILD and INSTALL files are exempted from this process so we don't litter ~ with files which aren't logically part of the package.

Okay. So we can install packages. Great.

This lets me build out a tree like

./install.sh
./zsh/README
./zsh/.zsh/.....
./zsh-work/README
./zsh-work/.zsh/.....
./emacs/README
.....

Which will get the job done, but still leaves me with the problem of picking and choosing which packages to install on a given host. I can solve this problem by going full puppet, and defining a concept of a profile, which consists of requirements of other profiles or packages. When I ./install.sh on a host, it's gonna try to install the profile named $(hostname) first, falling back to some default profile if I haven't built one out for the host yet.

So really the tree will look like

./install.sh
./packages.d/emacs
./packages.d/vim
./packages.d/zsh
./packages.d/work-zsh
./profiles.d/work
./profiles.d/work/requires
./profiles.d/home
./profiles.d/home/requires
./profiles.d/default
./hosts.d/$WORK_HOSTNAME
./hosts.d/$WORK_HOSTNAME/requires
./hosts.d/$LAPTOP_HOSTNAME
./hosts.d/$LAPTOP_HOSTNAME/requires

where requires files are special and let a profile list out the other profile(s) and package(s) which it should be installed with. This lets me say, for instance, that the profiles.d/work profile is defined to be profiles.d/home plus the package work-zsh. Or rather, that profiles.d/work is profiles.d/default plus a bunch of stuff, and profiles.d/home is entirely independent and includes configuration(s) like my Xmonad setup which are irrelevant to a Mac.
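
So a hypothetical profiles.d/work/requires might just read:

profiles.d/default
packages.d/work-zsh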

So first we need to be able to install something we consider to be a package

function install_package() {
  # $1 is the path of the package to install
  #
  # Executes the package build script if present.
  #
  # Then executes the install script if present, otherwise using arrdem_stow to install the built
  # package.

    echo "[INFO - install_package] installing $1"

    if [ -x "$1/BUILD" ]; then
        ( cd "$1";
          ./BUILD)
    fi

    if [ -e "$1/INSTALL" ]; then
        ( cd "$1";
          ./INSTALL)
    else
        arrdem_stow ~ "$1"
    fi
}

export -f install_package

Packages are directories which may contain the magical files README, BUILD and INSTALL. If there's an executable BUILD file, we execute it before trying to install the package. This gives packages the opportunity to do host-specific setup, such as compiling fortune files with ctags.

The INSTALL script gives packages an escape hatch out of the default package installation behavior, say installing OS packages rather than emplacing resources from this directory.

Otherwise, we just treat the directory as a normal stow package and install it.
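
As a sketch of what the INSTALL escape hatch could look like (the package name and its contents are made up, not from my actual repo):

#!/usr/bin/env bash
# packages.d/brew-tools/INSTALL - install OS packages instead of symlinking files
set -e
brew install tmux ripgrep

Note that install_package only runs a BUILD file that's executable (-x) and invokes INSTALL as ./INSTALL, so both need their exec bit set.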

Okay, so now we need to support profiles.

function install_profile() {
  # $1 is the path of the profile to install
  #
  # Reads the requires file from the profile, installing all required profiles and packages, then
  # installs any packages in the profile itself.

  if [ -d "$1" ]; then
      echo "[INFO] installing $1"

      # Install requires
      REQUIRES="$1/requires"

      if [ -e "$REQUIRES" ]; then
          cat "$REQUIRES" | while read -r require; do
              echo "[INFO] $require"
              case "$require" in
                  profiles.d/*)
                      echo "[INFO - install_profile($1)] recursively installing profile $require"
                      install_profile "$require"
                      ;;

                  packages.d/*)
                      echo "[INFO - install_profile($1)] installing package $require"
                      install_package "$require"
                      ;;
              esac
          done
      fi

      # Install the package(s)
      find "$1" \
           -maxdepth 1 \
           -mindepth 1 \
           -type d \
           -exec bash -c 'install_package "$0"' {} \;
  else
    echo "[WARN] No such package $1!"
  fi
}

So if there is a directory with the given profile name, then if there is a requires file, pattern match profiles & packages out of the requires file & install them. For good measure, install any packages which may be included in the profile's directory.

Now we just need a main to drive all this.

function main() {
  # Normalize hostname

  HOSTNAME="$(hostname | tr '[:upper:]' '[:lower:]')"
  HOST_DIR="hosts.d/$HOSTNAME"

  if [ -d "$HOST_DIR" ]; then
      # Install the host profile itself
      #
      # It is expected that the host requires default explicitly (or transitively) rather than getting
      # it "for free".
      echo "[INFO - main] installing profile $HOST_DIR"
      install_profile "$HOST_DIR"
  else
    # Otherwise just install the default profile
    echo "[INFO - main] installing fallback profile profile profiles.d/default"
    install_profile "profiles.d/default"
  fi
}

main

At this point we've built out a shell script which depends only on find, bash and realpath but can support some really complex behavior in terms of laying down user config files. As hinted above, this could install OS packages (or homebrew).

By making heavy use of foo.d directories, it becomes super easy to modularize configurations into lots of profiles & merge them together for emplacement.

Best of all, in debug mode it becomes pretty easy to sort out what's coming from where with a grep, or you can just stat the emplaced symlink(s), which will give you a fully qualified path back to the resources they alias.
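
For instance (hypothetical session):

DEBUG=1 ./install.sh   # [INFO] lines show which package is being installed, [DBG] lines show each link made
readlink ~/.zshrc      # or just ask an emplaced link where it points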

Not bad for a one-day garbage puppet implementation.

The code for this monstrosity is available here as a gist, but comes with a disclaimer that it's a snapshot of the working state of my dotfiles repository as of this article's writing and may be suitable for no purpose including my own usage.

^d

Last Mile Maintainership

So Daniel Compton (@danielwithmusic) is a good bloke. We've been co-conspirators on a number of projects at this point, and I just wanted to share a quick vignette before I pack it in for the night.

Almost a year ago, James Brennan (@jpb) was kind enough to offer up a pull request (#158) to the kibit tool which Daniel and I currently maintain. We're both relatively inactive maintainers all things told. Kibit largely does what it's supposed to do and is widely used for which neither of us can take much credit. We're stewards of an already successful tool.

Between Daniel's day job and my move out to San Francisco, #158 just got lost in the shuffle. It's an awesome feature: using the excellent rewrite-clj library, it enables kibit to automatically refactor your code for style. If kibit can find a "preferred" replacement expression, thanks to James's work #158 lets kibit make the replacement for you. While Daniel and I kinda just watched, James pushed it to feature completeness and found a significant performance win, which made it not just a compelling feature but fast enough that you'd want to use it.

Then a couple months passed. Daniel and I had other things to do and presumably so did James.

About 11 hours ago now, I saw a github notification about a new comment - "Feature: lein kibit fix or similar that applies all suggestions" (#177). Cody Canning (@ccann) was suggesting exactly the feature James had offered an implementation of.

At this point James' patch adding exactly this feature had sat idle for many months. Other changes had come in and been merged in the meantime, and James' changeset now had conflicts.

Following the GitHub help docs for how to check out a pull request (spoiler: git fetch $UPSTREAM_REPO pull/ID/head:$NEW_LOCAL_BRANCHNAME), I had James' patches on my laptop in less than a minute. git merge immediately showed that there were two sources of conflict. One was that the kibit driver namespace had been refactored for style and a docstring had been added to the main driver function, which James' patches touched. The other was that dependencies had been bumped in project.clj.
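
In concrete terms the dance was roughly this (the remote and branch names are my own choices, not gospel):

git fetch upstream pull/158/head:pr-158   # pull the PR's head into a local branch
git checkout master
git merge pr-158                          # surfaces the two conflicts above; resolve, commit, push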

Fixing this took.... five minutes? Tops?

The test suite was clean, and within 11 minutes Daniel had merged my trivial patch on top of James' awesome work - done and live.

The whole process of about 10 months was overwhelmingly waiting. James finished the patch in like four days (April 20 '16 - April 26 '16). Daniel and I were just bad maintainers at getting it shipped.

Were Daniel and I worse maintainers, we could have seen #177 come in and asked either Cody or James to update the patch. It would have taken maybe five minutes tops to write that mail and maybe it would have saved me 15 minutes and Daniel 5.

After months of waiting? Why?

I've written before about my own thoughts on code review, after working in an organization which is high trust, high ownership, and sometimes it feels high process anyway. In this case, and I'm sorry to say almost a year late, I went by what I've come to believe - that reviewers should make an effort to take responsibility for merging code rather than requiring the primary author to do all the legwork.

Sure I could probably have pinged James or suckered Cody into writing that merge commit but why? What does that buy anybody? It was so, so easy to just take James' changes and merge myself rather than asking someone else for trivial revisions. And it makes for a better process for contributors. It's not their fault that your project has grown merge conflicts with their changes.

If there had been a huge conflict, or James' changes had seemed somehow deeply unreasonable it would have been a different story.

But going the last mile for your contributors is worthwhile.

^d

Nihilist Reviewboard

Let's talk about another concept that's as old as the hills - code review.

"Design and code inspections to reduce errors in program development" [Fagan 1976] (pdf) introduced the notion for a structured process of reviewing programs & designs. The central argument which Fagan presents is that it is possible to quantitatively review software for flaws early in the development cycle, and to iterate on development while the cost of change is low compared to cost of iterating on software which had been deployed to customers.

The terminology is a little archaic, but in all the elapsed time the fundamental idea holds. Code review for Fagan is as much an architectural design review as it is anything else.

This shouldn't be terribly surprising, given some of the particular concerns Fagan's process is designed to address. While many of these things haven't fundamentally changed, some of these concerns, such as the specifics of register restoration, reflect the paper's age.

While the underlying goal of the code review process - to examine software for flaws early and often - has not changed meaningfully in the intervening decades, many of the particulars described by Fagan reflect a rigidity of process and a scale of endeavor which is no longer reflective of the state of industry at large.

Fagan's process is designed to prevent architecture level mistakes through intensive review, as well as to detect "normal" bugs en-masse and provide a natural workflow for iterative searching for and fixing of bugs until the artifact is deemed of sufficient quality. This is a work of process engineering optimized for code being slow & difficult to write, and for software being slow & risky to ship.

So what does a modern code review system look like? What makes for a good code review? What has changed?

With the cheapening of computer time, advent of integrated testing systems, generative testing, continuous integration and high level languages, many of the properties which previously required extensive deliberate human review can now be automatically provided. Likewise, modern linters & formatters can provide extensive stylistic criticism and enforce a degree of regularity across entire codebases.

Continuous delivery systems and incremental deployment methodologies also serve to mitigate the expensive "big bang" releases which informed Fagan's process. Continuous or near continuous delivery capabilities mean that teams can be more focused on shipping & testing incremental products. Artifacts don't have to be fully baked or finalized before they are deployed.

Similarly, linters & other automatic code inspection together with the advantages of high level languages at once make it possible to make meaningful change to artifacts much more rapidly and to automatically detect entire classes of flaws for remediation before an author even begins to engage others for review.

Ultimately, the job of every team is to deliver software. In many contexts, incomplete solutions, delivered promptly & iterated on rapidly, are superior to a fuller solution on a longer release cadence.

So. What does this mean for code reviews?

Reid's Rules of Review

True to the style of The Elements of Style, these rules are hard, fast and have exceptions. They're deeply motivated by the tooling & engineering context described above. If your team has missile-launching or life-ending reliability concerns, you'll want a different approach. If you can't trivially test, re-deploy, or partially roll out your code, you'll also want a different approach. This is just the way I want to work & think people should try to work.

1. Ensure that the artifact is approachable.

If you are not able to understand a changeset or a new artifact, let alone the process by which its author arrived at the current choice of design decisions & trade-offs, that is itself a deeply meaningful criticism, because it means both that the code is unclear and that the motivational documents are lacking.

As the reviewee the onus is on you to enable your reviewers to offer high level criticism by removing low level barriers to understanding.

1.1. Corollary: Write the docs.

As the reviewee, how are your reviewers supposed to understand what problem(s) you're trying to solve or the approach you're taking if you don't explain it? Link the ticket(s). Write docstrings for your code. Include examples so that it's obvious what and how. Write a meaningful documentation page explaining the entire project so that the why is captured.

1.2. Corollary: Write the tests.

Those examples you wrote? They should be tests.

I'm totally guilty of the "It's correct by construction! Stop giving me this tests pls crap!" but it's a real anti-pattern.

As the reviewee, even to the extent that you may succeed in producing or composing diamonds, someone will eventually come along and refactor what you've written, and if there aren't tests covering the current behavior, who knows what will happen then. You may even be the person who introduces that regression, and won't you feel silly then.

Furthermore tests help your reviewers approach your code by offering examples & demonstrations. Tests aren't a replacement for documentation and examples, but they certainly help.

1.3. Corollary: Run the linter.

If you have a style guide, stick to it insofar as is reasonable. As the reviewee, if you've deviated from the guide to which your coworkers are accustomed, you've just made it harder for them to meaningfully approach and criticize the changes you're proposing. You've decreased the review's value for everyone involved.

2. Criticize the approach.

Algorithmic or strategic commentary is the most valuable thing you can offer to a coworker. Linters and automatic tooling can't really help here. Insight about the future failings of the current choice of techniques, benefits of other techniques and available tools can all lead to deeply meaningful improvements in code quality and to learning among the team. This kind of review may be difficult to offer since it requires getting inside the author's head and understanding both the problem and the motivations which brought them to the decisions they made, but this can really be an opportunity to prevent design flaws and teach. It's worth the effort.

3. Don't bike shed your reviewee.

If the code works and is close to acceptable, leave comments & accept it. The professional onus is on the reviewee to determine and make appropriate changes. It's not worth your or their time to go around and around in a review cycle with a turnaround time in hours over this or that bikeshed.

3.1. Corollary for the reviewer: Style guides are something you apply to yourself, not something you do to others.

⚠⚠ EXCEPT IN THE MOST EGREGIOUS CASES ⚠⚠

If someone clearly threw the style guide out the window or didn't run the linter, then a style guide review is appropriate. Style guides should be automatically enforced, or if tooling is not available then they should be mostly aspirational.

What's the point of wasting two or more humans' time doing syntactic accounting for minor infractions? If it takes half an hour to review a change, maybe another hour before the reviewee can respond, half an hour or more to make the requested changes, and then the updated changeset has to be reviewed again, syntax and indentation bike sheds in code review can easily consume whole work days.

3.2. Corollary for the reviewer: Don't talk about performance concerns unless you have metrics in hand.

Seriously. Need to push thousands of requests per second? Yeah you may care about the performance of an inner loop somewhere. Performance criticisms are meaningful and you should have performance metrics already.

Got a service that'll see a few hundred requests at the outside? Who cares! It can probably be quintic and still get the job done.

It is far better to write inefficient code which is easy to understand & modify, ship it and iterate. If legitimate performance needs arise, code which can be understood and modified can always be refactored and optimized.

Code which is optimized early at the expense of understanding and modifiability is a huge mistake: much as it may make the reviewee or the reviewer feel clever to find some speedup, that speedup may or may not add value, and the semantic cost of the optimizations increases the maintenance or carrying cost of the codebase as a whole.

3.3. Corollary for the reviewer: Don't be dogmatic.

There are exceptions to every rule. There are concrete time and morale costs to requesting change. Be mindful of these, and remember that you're all in the same codebase together with the same goal of shipping.

4. Hold your coworkers accountable after the fact and ship accordingly.

If their service is broken, they who broke it get to be on the front lines of fixing it. The consequence of this is that you should trust your coworkers and prefer shipping their code thanks to a culture of ongoing responsibility. Your default should be to accept code and ship code more or less whenever possible.

In short, don't be this guy

^d