Docs Sketches

At present I'm working on developing Shelving, a clojure.spec(.alpha)-driven storage layer with a Datalog-subset query language. Ultimately, I want it to be an appropriate storage and query system for other projects of mine like Stacks and another major version of Grimoire.

In building out this tool, I've been trying to be mindful of what I've built and how I'm communicating it. What documentation have I written? Is there a clear narrative for this component? Am I capturing the reason I'm making these decisions and the examples that motivated me to come here?

Writing good docs is really hard, doubly so when you're trying to write documentation largely in docstrings. Docs you write in the docstrings drift rapidly from docs written elsewhere. The only surefire way I've come up with to keep the two in sync is to consider your docstrings to be authoritative and export them to other media. So I wanted a tool for that.

Another major problem is documentation organization. In Clojure, namespaces have long-known limitations as the unit of structure for your API. Clojure's lack of support for circular references between namespaces, together with the need for explicit forward declarations, leads to large namespaces whose order reflects the dependency graph of the artifact rather than the intended API or the relative importance of its members. In my first iterations of Shelving, I was trying to just use a large .core namespace rather than maintain separate namespaces. I wanted a tool that would let me take a large API namespace and break its documentation up into small, conceptually related subsections.
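To make the ordering constraint concrete, here is a minimal sketch. The names `public-entry-point` and `internal-helper` are hypothetical and have nothing to do with Shelving's actual API. A function can only refer to a var that already exists, so either the helper comes first in the file or it needs a forward declaration:

```clojure
;; Without this `declare`, compiling `public-entry-point` fails because
;; `internal-helper` has not been defined yet at that point in the file.
(declare internal-helper)

(defn public-entry-point
  "The function a reader of the API cares about most."
  [x]
  (internal-helper x))

(defn internal-helper
  "An implementation detail that would otherwise have to come first."
  [x]
  (inc x))

(public-entry-point 41)
```

Since nobody wants to sprinkle `declare` everywhere, files tend to end up sorted helpers-first, which is the dependency-graph ordering described above.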

The result of these two needs is the following script, which I used for a while to generate the docs/ tree of the Shelving repo. To work around the issues of having a large, non-logical .core namespace, I annotated vars with ^:categories #{} metadata. This allowed me to build up and maintain a partitioning of my large namespace into logical components.
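As a rough sketch of what that annotation can look like, assuming the map form of metadata: the vars `put!` and `lookup` and the `:demo/...` category keywords below are made up for illustration and are not Shelving's real vars or categories. The category set ends up in the var's metadata next to `:doc` and `:arglists`, where it can be read back and used for grouping:

```clojure
;; Hypothetical vars, annotated with a set of categories.
(defn ^{:categories #{:demo/basic}} put!
  "Store a value under a key."
  [store k v]
  (assoc store k v))

(defn ^{:categories #{:demo/query}} lookup
  "Fetch the value stored under a key."
  [store k]
  (get store k))

;; Metadata on the defining symbol is carried over to the var.
(:categories (meta #'put!))

(defn vars-by-category
  "Group a namespace's public vars by their first category,
  mirroring the group-by in the script. Unannotated vars land under nil."
  [ns]
  (group-by #(-> % meta :categories first)
            (vals (ns-publics ns))))

(keys (vars-by-category *ns*))
```

Keeping the categories in metadata means the partitioning lives next to the docstring it describes, so the two are edited together.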

Documentation files are generated by opening an existing file, truncating out any previously generated API docs in that category, and appending newly generated docs to it. This allows me to have, e.g., a Markdown title, some commentary, examples and what have you, then automatically splat the somewhat formatted and source-linked docs for the relevant category at the end, after that header.
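The truncate-then-append step can be tried on a plain string, without touching any files. This is only a sketch: `truncate-generated`, `regenerate` and the sample `page` are hypothetical names, though the regular expression is the one the script uses.

```clojure
(require '[clojure.string :as str])

(defn truncate-generated
  "Drop everything from the first `##` heading onward, keeping the
  hand-written preamble above it."
  [text]
  (str/replace text #"(?s)\n+##.++" "\n\n"))

(defn regenerate
  "Return new page contents: the preserved preamble plus fresh entries."
  [text entries]
  (str (truncate-generated text) (str/join "\n" entries)))

;; A page with a hand-written header and one stale generated entry.
(def page "# Basic API\n\nSome commentary.\n\n## old/entry\nstale docs\n")

(regenerate page ["## new/entry\nfresh docs\n"])
```

One consequence of this scheme is that the hand-written part of a page cannot contain a `##` heading of its own, since everything from the first one onward is treated as generated.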

I won't say the results are good. But they beat the heck out of not having any docs at all. Mainly, they succeeded in being easier to browse and cross-check, which made copy editing docstrings much easier.

With the introduction of distinct "user" and "implementer" APIs to Shelving, which happen to share some vars, this categorization is no longer so clear-cut. Some docs now belong in one place, some in multiple places. This is an inadequate tool for capturing those concerns. Also, it still sorts symbols by line numbers when generating docs 😞
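The limitation is easy to see with plain maps standing in for var metadata. Everything here is hypothetical: the entries, the `:user` and `:impl` keywords, and `by-every-category`, which is just one possible direction and not something the script does.

```clojure
(def entries
  [{:name 'put  :categories #{:user}}
   {:name 'open :categories #{:user :impl}}])

(defn by-first-category
  "What the script does: each entry lands in exactly one bucket, and for a
  multi-category entry the bucket depends on the set's iteration order."
  [items]
  (group-by #(-> % :categories first) items))

(defn by-every-category
  "A possible alternative: list an entry once under each category it declares."
  [items]
  (reduce (fn [acc item]
            (reduce (fn [acc category]
                      (update acc category (fnil conj []) item))
                    acc
                    (:categories item)))
          {}
          items))

(map :name (:impl (by-every-category entries)))
```

With `by-first-category`, `open` shows up under only one of its two categories, which is exactly the "some in multiple places" case the script cannot express.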

So I'll be building another doodle, but thought that this one was worth capturing before I do.

Some sample output -

(ns compile-docs
  "A quick hack for building the doc tree based on `^:categories` data."
  (:require [shelving.core :as sh]
            [clojure.java.io :as io]
            [clojure.string :as str]))

(def category-map
  {::sh/basic  (io/file "docs/")
   ::sh/schema (io/file "docs/")
   ::sh/rel    (io/file "docs/")
   ::sh/util   (io/file "docs/")
   ::sh/query  (io/file "docs/")
   ::sh/spec   (io/file "docs/")
   ::sh/walk   (io/file "docs/")})

(defn ensure-trailing-newline [s]
  (if-not (.endsWith s "\n")
    (str s "\n") s))

(defn relativize-path [p]
  (str/replace p (.getCanonicalPath (io/file ".")) ""))

(defn compile-docs [category-map nss]
  (let [vars (for [ns              nss
                   :let            [_ (require ns :reload)
                                    ns (if-not (instance? clojure.lang.Namespace ns)
                                         (the-ns ns) ns)]
                   [sym maybe-var] (ns-publics ns)
                   :when           (instance? clojure.lang.Var maybe-var)]
               maybe-var)
        groupings (group-by #(-> % meta :categories first) vars)]
    (println groupings)
    (doseq [[category vars] groupings
            :let            [category-file (get category-map category)]
            :when           category-file
            v               (sort-by #(-> % meta :line) vars) ;; FIXME: better scoring heuristic?
            :let            [{:keys [categories arglists doc stability line file]
                              :as   var-meta} (meta v)]]
      (println v)
      (with-open [w (io/writer category-file :append true)]
        (binding [*out* w]
          (printf "## [%s/%s](%s#L%s)\n"
                  (ns-name (.ns v)) (.sym v)
                  (relativize-path file) line)
          (doseq [params arglists]
            (printf " - `%s`\n" (cons (.sym v) params)))
          (printf "\n")
          (when (= stability :stability/unstable)
            (printf "**UNSTABLE**: This API will probably change in the future\n\n"))
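          ;; `print`, not `printf`: a docstring containing % is not a format string.
          (print (ensure-trailing-newline
                  (-> (str doc) ; tolerate vars without a docstring
                      (str/replace #"(?<!\n)\n[\s&&[^\n\r]]+" " ")
                      (str/replace #"\n\n[\s&&[^\n\r]]+" "\n\n"))))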
          (printf "\n"))))))

(defn recompile-docs [category-map nss]
  (doseq [[_cat f] category-map]
    (let [buff      (slurp f)
          truncated (str/replace buff #"(?s)\n+##.++" "\n\n")]
      (spit f truncated)))

  (compile-docs category-map nss))

(defn recompile-docs!
  "Entry point suitable for a lein alias. Usable for automating doc rebuilding."
  [& args]
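  ;; Assumption: the namespace list is cut off in the source. shelving.core is
  ;; the only namespace this file requires, so it is used here as a stand-in.
  (recompile-docs category-map '[shelving.core]))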

;; In lein:
;; :profiles {:dev {:aliases {"build-docs" ["run" "-m" "compile-docs/recompile-docs!"]}}}
;; Works best with lein-auto to keep your doctree fresh