13 Jun 2016
As with some of my other posts, this was originally an email which turned into enough of a screed I
thought it would be of general interest and worth posting. Some op/ed material has been removed, but
the technical details are unaltered if not clarified.
So here's the rundown I've got so far on a lisp with immutable environments. I'm merely gonna sketch
at some stuff, because giving a full treatment would require dusting off and finishing a bunch of
stuff I came up with for Ox and put down again.
I also threw out the Haskell sketch I was dinking with because it
was just distracting from writing this :P
So lets start from the top.
Clojure is a form-at-a-time language which happens to support reading files of forms (or rather
Readers which produce textual representations of forms, where they come from is magical).
Binding is achieved through a global, thread shared, transactional, mutable mapping from
Namespaces, which are themselves transactional mappings from
Vars, which are
themselves a triple
(QualifiedSymbol, Metadata, Binding).
Vars happen to also support dynamic
binding (pushing and popping), but this is less used. From here on out I'll treat them simply as a
global mostly static binding mechanism, which is their primary use case anyway.
Control and local bindings are achieved entirely using lambda/fn forms compiled to JVM methods,
produced by the opaque compiler as opaque IFn/AFn objects. Top level forms are compiled and executed
for effect by being wrapped in an anonymous
(fn  <form>) which can be blindly invoked by the
compiler to realize whatever effects the form may have.
Circular dependencies are achieved via
Var indirection. A
Var is first created with no binding,
so that it can be referenced. With that environment binding, code which will depend on (call) the
Var's final bound value can be compiled since it just needs the target
Var to exist in order to
compile. The depended on
Var may then be redefined, having both itself and its dependent visible
in the compilation context and so the two
Var bindings can be mutually recursive.
While traditional and lispy, this approach has a number of problems which I'm sure I don't need to
reiterate here. The goals of an immutable namespace system then should be to
- Make Namespaces the compilation unit rather than Forms
- Make Namespaces reloadable (change imports/exports/aliases w/o system reboot)
- Enable Namespace compilation to fail sanely
- Enable refactoring/analysis
The sketch of this I've been playing with is as follows:
Lets assume an interpreter. If you can interpret you can compile as an implementation detail of eval
and interpretation is easier to implement initially. Assume a reader. So we read a form, then we
have to evaluate it. Evaluation is first macroexpansion, then sequential evaluation of each
resulting form in the environment. Evaluation at the top level is
eval Env Form -> Env (we discard
the result) which is fine. Because we aren't really doing compilation, only interpretation,
Clojure's tactic of having Vars and analyzing against Var bindings works 1:1. Create an unbound
symbol, analyze/bind against it, then add a binding later.
The (global) environment structure I had in mind is pretty simple: one unqualified symbol names the
current namespace (
*ns*), a mapping exists from unqualified symbols to Namespaces.
Namespaces are essentially compilation contexts, and really just serve to immutably store the
alias mapping, import mapping, the mapping of the
Vars in the namespace to bindings, and a list of
Vars/exports from the namespace.
Vars are a fully qualified name, metadata and a value. That's it.
Compilation occurs at the namespace level. All the forms in a namespace are sequentially read and
evaluated in interpretation mode. All macros expand normally and type hints are processed they just
don't do anything.
Once the whole namespace has been successfully loaded in interpretation mode, a second pass may be
made in which static linking is performed. Because everything in the namespace has been fully
realized already it's possible to statically link even mutually recursive calls within the same
namespace. Full type inference within the namespace is also possible here, etc.
Aside: Interestingly because it is single pass, the existing Clojure compiler doesn't (can't)
handle hinted mutually recursive function calls at all. It relies on Var metadata to store arglists
(and hints for primitive invocations), so in order to emit a hinted recursive call the forward
declaration of the var has to carry the same
^:arglists metadata (hints and all!) which will be
defn whenever it is finally defined.
Once a namespace has been fully loaded and compiled (including the second pass) the global
environment with the resulting namespace and var mappings is the result of whatever caused
loading. If an error occurs during compilation, we just abort and return the environment state
before anything else happened. I think there are tricks to be played with generated garbage class
names and classloader isolation here so that this works even during the 2nd static compile pass, but
it should be possible to fully clean up a failed compile.
So this works really great for module loading, and even for macros which expand into multiple def
forms. This model of
eval :: env -> form -> (env, result) starts to break down when you want to
talk about macros that do evaluation during code loading, and likewise the REPL. Basically your
interpreter winds up holding an additional reference,
*env* or something, which is intrinsically
mutable, but which references an immutable environment and holds shall we say the state continuation
of the entire REPL. Consider user code which actually calls
eval during execution. Whatever state
changes that evaluation creates need to be preserved in the "global" continuation.
In this model when you compile a form at the REPL, the runtime could obviously interpret (this may
be appropriate for macroexpansion) and can also compile. This is only tricky when the user
recompiles a symbol which was already bound/compiled. In this case, the runtime could either eagerly
recompile all the code which links against this function using the same interpret then statically
link tactic or could just invalidate any compiled bytecodes and lazily recompile later. The former
is probably more predictable.
Once a namespace has been reloaded/altered, all namespaces importing it must also be recompiled in
topsort order using the same tactics. That we already have per-symbol dependency information and
per-module dependency information helps with building refactoring tools which otherwise have to
derive all this themselves. Ideally top level expressions/definitions would also expose local
variable tables tables and local variable dataflow graphs so that analysis there is also possible.
Aside: It may be possible to allow cyclic dependencies between namespaces (see
submodules in racket for some study
on this problem). In the common case it may well be that macros are well behaved and that cyclic
dependencies between modules work out just fine. However because macros perform arbitrary
computation it's also pretty easy to cook up pathological cases where macro generated code never
reaches a fixed point in repeated interpretive analysis which can be statically compiled. For this
reason I'm inclined towards formalizing a
module special form and throwing out
import and friends as legal top level forms altogether. Namespaces should be declarative
as in Java/Haskell.
There's probably a way which I just haven't seen yet to treat updates to the "global" namespace
mapping and updates within a namespace the same way via the same mechanism since they're both
dependent on the same dependency tracking and recompile machinery.
Writing this has made me think about having an ImmutableClassLoader which is derived entirely from a
immutable environment and loads classes by lazily (statically) compiling forms.
That's kinda all I got. ztellman gets credit for the mutable
*env* in the repl bit which I spent
literally weeks trying to do without last year. Maybe something with monad transformers can get you
there but I couldn't figure it out. mikera has some sketches of all this with types in his
I've already prototyped an environment system that looks a lot like this in the Ox repo if you want
to go dig. Originally there was this
then I started dinking with a Java implementation of the environment structure
which also has a couple different stack based contextual binding types for the interpreter.
As written about in the Jaunt essays, I've kinda concluded that whatever this immutable ns language
winds up looking like is so divorced from what Clojure or at least the way that most people write
Clojure that you'll loose so much value in the changeover it won't pay off for a very long time. The
mount library is a perfect example of this in that it's deeply
imperative and takes advantage of code loading order to achieve start and stop. It's not the only
bit of code I've seen which does effects at load time either. Hence Jaunt which tries to be less
flawed around namespace reloading/refactoring without going whole hog on ns immutability.
The more I kick this stuff around the more I find that the whole endeavor verges on a Haskell clone
in sexprs and the more I decide I'd rather take the time to learn Haskell properly than mess with an
immutable lisp or Jaunt. Besides then you get nice stuff like typeclasses and fully typed instance
dispatch and equality that actually works and theorems and native bytecode and yeah.
31 May 2016
For those of you just tuning in, this is part of my ongoing journey of migrating to SF to work for
Twitter. The lessons contained herein are a direct result of my coming to the area pretty much
blind and paying the various prices for doing so.
So I visited eight apartments/complexes today. Pretty tired although better informed at the end of
it, but emotionally exhausted enough by the process to be ranting about it. Prices may vary because
I'm going by memory, pretty tired, horrified and generally uninterested in many of these places after
Older building. Mixed use. Very tired. Rent controlled. New landlord for the residential portion
in the last 4 years. An effort has been made to put new paint on and renovate, but it wasn't a
really quality job. Bad vibes. $2750 for an unrenovated studio across the street from the office.
NEMA. Newer building. Feels like and is priced like a
resort. Justifies its rent on amenities (gym, pool, social shit, rooftop terraces, trainers and
staff). Nothing under $3100, and that doesn't really buy you much in terms of the living space
itself. Largely access to the amenities it feels like. For an apartment you'd want in the
building it'll run $3400.
SOMA residences. Newer building. Not much for staff or
amenities. Lovely inner terrace on the 2nd floor which most of the apartments open in onto. The
place I saw was 2nd of 3 floors, so only window was out onto the terrace and it came off as
dark. Baldly perfumed, but spacious enough. Couple blocks from the office. $2750 or thereabouts.
The Civic. Nothing under $3400. Not even high service like NEMA, just
nicer rooms with more space.
Argenta. l m a o. No studios in the whole building, $3800 and
up. Not more space then the civ, lower service than NEMA. Just no.
Olume. Same parent company as Argenta, but studios and "more reasonably"
priced (start at $3300 (on promo! $3225!)). Lovely building. Really cute brunette sales lady. Low
service but awesome rooftop space and much more attractive rooms than NEMA for the price.
Fell & Gough. Weird little studio facing
an inner courtyard in a 1912 building. Great old wood floors that creak with every step, but in
an awful location IMO. Kitchen was microscopic/narrow, bathroom didn't look like it'd ever
scrub clean from simple age. Maybe for $2300 or $2200, for the asked $2400 no.
Fell & Webster. On the same busy street
(Fell) as 7, but actually faces the street. The main living area has noticeable street noise
that'll never go down because the street is always high traffic. The bedroom is partitioned from
the living area and doesn't really hear the street, but shares walls with 1) outside walkway, 2)
next apartment's bedroom 3) neighbor below's bedroom. Kitchen isn't super spiffy. Not that I'll
use it a ton but still, at the same price point Fox's was nicer. For $2675 not so much. For $2.6k
or $2.5k maybe.
NEMA or Olume represents a commitment of 48% of my income >.> no. Today's lesson is that I'm pretty
clearly gonna have to craigslist this one out and find a studio in order to get a lease in my $2.7k
and under goal budget.
So. Yeah. Lots of running about. Stressful because today was a negative lesson in that Fox which I
thought was a safe choice really isn't somewhere I want to be and everything else like/near it is
too damn expensive.
The good news is that the gays I'm staying with got me connected with a local Realtor who does
rentals and validated that my goal budget & areas were reasonable. So if I don't find something by
Friday I've got an appointment with him. At the low low price of 80% of one month's rent if I sign
on something he shows and $500 if I don't sign on anything.
End Of Rant
08 Mar 2016
This week I'm working on getting Jaunt's 1.9.0 release nearer
the door, on which note I'm happy to properly demo one the features of this first release:
A bare REPL will do for these demos.
$ cd $(mktemp -d)
$ wget https://clojars.org/repo/org/jaunt-lang/jaunt/1.9.0-RC4/jaunt-1.9.0-RC4.jar
$ java -jar jaunt-1.9-RC4.jar
Jaunt #2 (not my best ticket ever in terms of hygiene)
was the first work I did on what became Jaunt and was really pretty critical for getting my teeth
into the project and validating that there were tractable incremental improvements to be made over
This change introduced four new switches to the Jaunt compiler, the most interesting of which is
:warn-on-deprecated which enables or disables compiler warnings in support the use of
^:deprecated metadata on functions and namespaces.
:warn-on-deprecated is on (
default, although it can be disabled globally with the JVM system property
clojure.compiler.warn-on-deprecated=false, or by
(set! *compiler-options* (dissoc *compiler-options* :warn-on-deprecated)) in a namespace
where you want these checks to be disabled.
The semantics of deprecation are simple:
- A namespace is deprecated if and only if it has the
- A definition (Var) is deprecated if either it occurs within a
^:deprecated namespace, or is
Vars and namespaces may become deprecated in any SemVer minor version or major version, but may only
legally be deleted only on a major version.
Deprecation warnings are only emitted when a deprecated namespace or definition is accessed from a
context which is not deprecated. Deprecated contexts are namespaces which are deprecated, and the
bodies of deprecated definitions.
So for instance, in the following code:
(defn ^:deprecated bad-idea [x]
(println "Oh noez!")
(throw (new Exception "Bad code is bad")))
;; => #'user/bad-idea
(defn ^:deprecated also-bad-idea [x]
;; => #'user/also-bad-idea
(defn almost-better-idea [x]
;; Warning: using deprecated var: #'user/bad-idea ...
;; => #'user/also-bad-idea
The definition of
bad-idea itself is innocuous. Simply evaluating deprecated definitions is
fine. Warning that
bad-idea would be superfluous, because that fact is
innocuous so long as neither is used. Suppressing warnings in this case allows Jaunt to load
arbitrarily much deprecated code without drowning a user with false positive warnings.
Users should only have to care about deprecation at the edge between current code and deprecated
code. Thus compiling
almost-better-idea will emit a warning that it makes use of
almost-better-idea is not itself deprecated. Likewise at the repl invoking either deprecated
function would also generate a warning.
As to namespaces, similar principles apply. Consider the two namespaces and trace:
(ns ^:deprecated com.proj.code)
;; => nil
(def some-const 3)
;; => #'com.proj.code/some-const
(:require [com.proj.code :as old :refer [some-const]]))
;; Warning: aliasing deprecated ns: com.proj.code ...
;; Warning: referring deprecated var: #'com.proj.code/some-const ...
;; => nil
(def g 4)
;; => #'com.proj.new-code/g
(def h (+ old/some-const g))
;; Warning: using deprecated var: #'com.proj.code/some-const ...
;; => #'com.proj.new-code/h
As the whole
com.proj.code namespace is deprecated, merely referencing it and requiring that it be
loaded is cause for a warning. Should that namespace ever be removed, the require would break
regardless of whether anything from the required namespace is used or not.
If symbols which are deprecated are referred into a namespace which is not deprecated, each one of
those will generate a warning as well for the same reason. However Jaunt tries to be smart about
this, suppressing such warnings if a
(require ' [... :refer :all]) or a
(use ...) directive has
been issued which constitutes an indefinite referral.
com.proj.code/some-const, is deprecated by dint of being defined in a deprecated
namespace. Consequently a warning is emitted when it is used outside of a deprecated context, here
in the definition of
One last demo, disabling warnings!
;; => nil
(dissoc *compiler-options* :warn-on-deprecated))
(def ^:deprecated a 3)
;; => #'com.proj.no-warnings/a
(def b (+ a 4))
;; => #'com.proj.no-warnings/b
There's more than just this coming down the line in Jaunt, so if you're interested check out the
1.9 CHANGELOG, watch
the repo or stay tuned here for more demos. Issues, ideas and
especially pull requests are welcome
22 Feb 2016
I've started a fork of Clojure which I intend to operate in a
community directed manner;
Jaunt. Check out the
demo script or just keep reading for
The Long Version
After several months of dinking with
Oxlang on and off I came to a
realization. Ox, while an annoyingly persistent daydream was doomed
to be one of two things. Either it would be something akin to Haskell
in S expressions, or it would be indistinguishable from Clojure with
some tweaks under the hood.
I'd gotten Ox's reader sort of working, and was starting to experiment
with notations, explore notions of higher kinded types and generally
was having a ball. However even getting that far was a highly
educational exercise in how much Clojure already does fairly well. To
continue down the road of Ox was to admit that I'd have to reinvent
all of Clojure's datastructures, most of the Clojure compiler and a
whole boatload of the standard library. It was also to admit that due
to limitations of the JVM as a target, I wouldn't be able to play some
of the many tricks which the GHC folks get to play even were I to go
down the road of a pure language.
Why do all this? Because it's hard? I'm no Heracles seeking labors
or yaks for their own sake. I'd rather be
learning something new or playing Dota or... a thousand other things.
I swear I would.
Besides, even though Michiel Trimpe was
to mention Ox
alongside a number of other "conversational" programming languages in
his ClojureX talk, I'm not convinced that it's a good idea.
In looking back at that particular talk, one thing struck me. Sure in
a pure functional programming language such as Haskell you could stick
an arbitrary AST in a Merkle tree and have a globally distributed
version control and distribution system with strong guarantees about
tree uniqueness and reuse safety.
Unfortunately, Clojure is no such beast. We have no language level
concept of effects (okay we have the
macro but that's hardly the same). While you could implement immutable
namespaces (and in fact they are backed by immutable maps already) the
concept doesn't make a whole lot of sense to me.
The very notion of a conversational language is that you can reference
any definition from any version of the program as if it were the
current version. What does this look like at a REPL? Are namespaces
(or rather full states) identified with a commit ID and compilation
occurs in terms of a fully qualified "commit" and Var? Now we need
notation for that. Maybe version/name/namespace or something. Some
sort of explorer to browse older commits is in order... the list goes
on and on. In short, I think you wind up reinventing a bunch of stuff
that git already does pretty well.
This is not to say that Clojure's existing mutable namespaces are
ideal. For me, they work against the ns or file level code loading
style I use when developing Clojure code. Old habits from C die hard
as it were. A common failure mode I encounter is that I'll rename a
function but not all of its uses. IntelliJ and refactor-nrepl can be
of some help here, but I've had them break down and I'm not always in
a "full ide" configuration. Because namespaces are mutable, even if I
were to perform a rename correctly, the old name is still present in
the namespace. If I got it wrong, my code reloads, all symbols
resolved then I get Really Cool Unexpected Behavior when parts of my
code call the new function and parts call the old function. Or worse
my code works fine until I restart my repl or the test server and
everything explodes. Or the old name just sticks around cluttering up
This becomes a real UX problem for me, because I rely heavily on
refactoring. I start by sketching out what amount to types, or
deriving operations from types I already have, and keep working
renaming and moving code around until I get something I'm happy with.
And Yet I Shave
The thing which came to me now a few weeks ago is that there's an
interesting halfway house between these two extremes of mutable and
immutable namespaces. What if we gave Namespaces and Vars version
numbers? When a Namespace is re-defined (the ns form is evaluated)
then the Namespace's version is incremented. When a
def is evaluated,
the version of the Var bound by the
def is set to be the version of
the Namespace in which that Var exists.
This allows traditional REPL style interaction with Namespaces. You
may enter a Namespace, add bindings, aliases, imports and so forth as
you will. However the version numbers give you important static
information about the state of the Var in its Namespace.
If I have some
def which I entered at the REPL, when I do something
ns backing file and reload it, that
def persists. But, its
version is now out of sync with the Namespace it exists within. When
we reloaded the file, the Namespace's version was incremented and then
every definition was reevaluated. If my code is still using an old
symbol which is not textually present in the file and was was not
reloaded, its version doesn't increase. Now the compiler can detect
the version disparity and emit warnings that I'm using old code.
While this solves dependency freshness issues within a single
and a single Namespace, solving them globally requires more leg
work. The compiler needs to be adapted to track the use and reach sets
of each compiled expression, and to inter that information into the
Var binding forms so that it's possible for a newly compiled form to
emit warnings about transitive use of stale vars.
These changes don't alter the semantics of the Namespace at all
outside of reloading. If you evaluate a new
def or a single form in
a Namespace it works just as before. However if you take this whole
file reloading approach, the language can offer added support which
papers over a rather painful pitfall.
A related change is that Namespaces can be made to purge their imports
and aliases on reload. Right now, Namespaces persist imports and
aliases until the VM is restarted. This leads to a nasty set of
refactoring problems where you want to change the name of an alias,
but can't reuse a name because aliases are never cleared. If we
consider a whole Namespace and its source file(s) as the unit of
compilation we can again do better. This is another language level
change which would meaningfully improve my day to day use of the
language without breaking existing use patterns.
Both of these changes turn out to be quite simple to implement. In the
course of only a few days I went from the concept of these changes to
a build of Clojure (pr,
which features Namespace clearing and Var versions. I'm using now
daily because it better supports the way that I want to work with the
Rich, Stuart Sierra, Stu Halloway, Alex and the rest of the
Clojure/Core crew (see
built Clojure and have shepherded it this far. I'm immensely grateful
to them for the work they've done to date in building a tool which
I've had great fun using. But I think our priorities diverge.
From what I've been able to gather by talking to people who've been
around longer and
what Rich has written,
it should be clear that Clojure is not a community project in the same
way Python or Rust is. Clojure is Rich's personal project, which he is
so kind as to share with the rest of us and in in its refinement Rich
chooses to accept some amount of input from us as users. I'm somewhat
ashamed to admit that it took me two years to realize this.
Discussion of particulars, and the reasons for the status quo is I
think a somewhat useless exercise. Rich has his project and
administers it as he sees fit. As Clojure is Rich's project, the
priority of its governance is to conserve Rich's time and it has
repeatedly been made clear that there is no desire to change
this. This is an eminently reasonable goal I can find no fault
in. Rich doesn't owe me or anyone else jack, let alone time. However
the consequence of this structure and of not delegating is the slow
and steady path which frustrates my desires for more rapid iteration.
A Fork in the Road
With loyalty the path of the last 18 months and clearly one of
frustration, and voice ineffective due to the (reasonable!) priorities
of Clojure/Core, The only course of action left to me is some form
I think that there's a lot of possible improvements that can still be
made to the language. Currently there doesn't seem to be a lot of
appetite for exploring that in Clojure itself. The namespace reloading
stuff sketched above is just the first thing that popped to mind when
I started thinking about how to improve my day to day experience and
implementing that alone has shaken loose a number of other small but
The consequence of this opinion is that I've started
Jaunt is my personal fork of Clojure which seeks to deal with the
various organizational and technical limitations expressed above.
Technically, with Jaunt I want to explore a space of refinements to
the language such as the Namespace reloading changes sketched above
which add value without breaking compatibility with the ecosystem of
libraries and tools like CIDER which I've come to rely on every day.
Organizationally, I want Jaunt to be a community driven affair. I
don't claim that my judgment is infallible, nor do I think I'll have
all the good ideas, nor will I have infinite time to consider changes
even were these restrictions lifted. I'd love to involve contributors
who want to bring improvements to the project within the stated bounds
of preserving compatibility at least with the documented parts of
Because otherwise how do you believe that any of this works, here's a
script that'll build a empty
leiningen project in a
temporary directory and run a quick demo of some of the stale Var
If you want a stable language. If you're running code in
production. If you don't want to get down and dirty with the language
implementation or the development process, stick with Clojure.
As the EPL states:
THE PROGRAM IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OR
CONDITIONS OF ANY KIND, EITHER EXPRESS OR IMPLIED INCLUDING, WITHOUT
LIMITATION, ANY WARRANTIES OR CONDITIONS OF ... FITNESS FOR A
Jaunt should be a drop in replacement for Clojure. I want to keep it
compatible. I foresee no reason to break the documented Clojure
API. But some amount of drift is inevitable. I've already moved some
stuff around inside the Java implementation. If you were depending on
those undocumented yet public implementation details, all bets are
changed how metadata behaves on vars
(I think it's a better approach but it does impose behavioral
added deprecated warnings. I
(pr) which the compiler
totally had internally and just didn't expose
forcing weirdness. And
this list of differences (perhaps improvements) is just going to grow.
Whether Jaunt proves to be something I depend on in the future,
whether I wander off into the land of types and leave it to someone
else or whether it diverges too much from Clojure and dies as a result
of the weight of its mutations, I've overheard enough interest in a
project such as this that I think this is a worthwhile endeavor and I
hope you'll join me in this little adventure.
Update: Clojarr has been renamed to Jaunt, updated article accordingly.
24 Dec 2015
Those of you who follow me on Twitter may be aware of my recent
travails with a bogeyman I've named only as "XML". Having banged my
head against this abomination of software for several days, it's
venting time so buckle in for the ride.
The Hell Am I Building
The summer I was working at Calxeda, my friends
@Sweet_Grrl got me hooked on a
tabletop game, Warmachine by
the Privateer Press aka PP. At a high level since PP's digital
presence is pretty last decade, Warmachine is a tabletop miniature
army game in the 2.5cm to 12cm base diameter size range. Models and
units of models have statistics in the classical D&D style, and
players engage in deathmatch and king of the hill style games.
Compared to Warhammer 40k which is probably better known, Warmachine
features a much more steampunk theme & universe, simpler rules of
play, lower entry price point and a greater focus on balance of play
rather than on army construction and model specialization.
In short, Warmachine is a player's game while Warhammer is a modeler's
Unfortunately, a few realities of this game genre assert themselves.
- Games are slow and take up a lot of space
- Games are only possible at the other player's convenience
- Exploring game balance questions is difficult b/c play is time-hard
- You've got to buy before you try much of the time
As a student with homework, I simply haven't been able to attend
Austin's various weekly Warmachine evenings in about a year which
means it's a long time since I've gotten to play a game I enjoy a
lot. Unlike Starcraft or Dota I can't just drop into a game at my
convenience. But what if I could?
I've previously messed with AIs for Magic: the Gathering, so the idea
at some point asserted itself that I could (or even should) try to
build a full up simulator for Warmachine IP issues be damned.
There are two "moving pieces" to this project. Or maybe prior art is
the better term. Sebastien Laforet, here and after referred to as the
an Android app
which is capable of rendering data files encoding the rules of play
and model statistics for the game. Great! Someone else already typed
all that in for me. In English, but typed it in. These files are
and are "just" XML. More on this in a bit.
Two other people, "Hobo" and "PG_Corwin" build
a tile set for
the VASSAL tabletop simulation
engine. Unfortunately VASSAL is just a sprite engine with no
understanding of the rules or this whole project would be
moot. Treating the module as a JAR yields a whole bunch of sprites in
/images/ tree, so there's "free" art to work with as well.
The Devils in the Data
In theory this should be dead easy. The rules files are a full
enumeration of every known piece in the game, and the sprites package
has an image for most of them, so just wire them together into a data
store of some sort and I'd be done from an assets point of view. I'd
just need to well build the graphics engine and figure out how to
simulate the ruleset.
Unfortunately this isn't so simple for a couple reasons.
First of all, the XML files Sebastien put together are designed to do
one thing and one thing only: encode the rule text on the playing
cards associated with each model and unit. Cards are generally
formatted with an image, some fluff and statistics on the front and
rules on the back. So for instance this recently released model,
Coleman Stryker v3 has a model and stat card face as such:
So we can slap all this in XML and it works just fine (with serious
elisions for length)
full_name="Lord General Coleman Stryker"
qualification="Cygnar Epic Cavalry Warcaster"
spd="8" str="6" mat="7" rat="6" def="15" arm="16" cmd="10"
hitpoints="18" immunity_electricity="true" />
<melee_weapon name="Quicksilver MKIII"
<mount_weapon name="Mount" pow="10">
<capacity title="THUNDEROUS IMPACT">
<ranged_weapon name="Quicksilver Blast"
rng="8" rof="1" aoe="-" pow="14"
<feat title="LIGHTNING CHARGE">
<spell name="ARCANE BOLT"
cost="2" rng="12" aoe="-" pow="11" up="NO" off="YES">....</spell>
<spell name="CHAIN BLAST"
cost="3" rng="10" aoe="3" pow="12" up="NO" off="YES">....</spell>
cost="2" rng="SELF" aoe="CTRL" pow="-" up="YES" off="NO">....</spell>
cost="2" rng="6" aoe="-" pow="-" up="YES" off="NO">....</spell>
<spell name="IRON AGGRESSION"
cost="3" rng="6" aoe="-" pow="-" up="YES" off="NO">....</spell>
<capacity title="ELITE CADRE [STORM LANCES]" type="">....</capacity>
<capacity title="FIELD MARSHAL [ASSAULT]">....</capacity>
<capacity title="PLASMA NIMBUS">....</capacity>
So we have some meaning for a
<warcaster> which is a title and a
<model>, we have
<basestats> which is common to
pretty much everything, we have a
<weapons> block with a list of
<weapon> entries each of which has some attributes (rules) named
<capacity> and this is all pretty sane. Writing a set of functions
which can walk this tag tree using
clojure.xml was really simple,
and worked great, and all you have to do is reduce over all this tag
soup with some bindings and a state object to build a representation
of a given model.
That code has been working for about 12 months, because shortly
thereafter batteries of special cases appeared from the woodwork and
fouled my initial efforts on this up.
See as in Magic: the Gathering, Warmachine derives a lot of well value
from novelty and from mixing things up in the ruleset. This means that
it's not uncommon to find things which aren't what they appear to
be. The newest faction in the game has this awesome model:
The text at the bottom "Battle engine and solos" is the important
bit. The TEP itself is what is called a
Battle Engine and has a
whole battery of rules associated with that concept. However it comes
in a box with three other models which aren't part of this
Engine entity, they are independent during play hence the use of the
Solo. So... what the hell is this thing and how do we represent it?
Well this is our Frenchman's answer, again with omissions:
full_name="Transinfinite Emergence Projector & Permutation Servitors"
cost="9" qualification="Convergence Battle engine" fa="2"
<basestats spd="5" str="10" mat="0" rat="4" def="10" arm="19" cmd="10"
hitpoints="20" construct="true" pathfinder="true"/>
<ranged_weapon name="Aperture Pulse"
rng="SP10" aoe="-" pow="10" rof="1" location="-">
<capacity title="AUTO FIRE">....</capacity>
<capacity title="FIRING FORMULAE">....</capacity>
<capacity title="GUN PLATFORM">....</capacity>
<capacity title="SACRIFICIAL PAWN [PERMUTATION SERVITORS]">....</capacity>
<capacity title="SERVITOR SATELLITES">....</capacity>
<model id="Permutation Servitors" full_name="Permutation Servitors">
<basestats spd="6" str="3" mat="5" rat="5" def="12" arm="13" cmd="0"
Did you catch that?
<battleEngine> NESTS another
<model>! From the
WHAC app, which seeks to duplicate the cards that come in the box this
makes sense. There's a card which represents the TEM, and there's
another card which gives the statistics for each of the spawnable
<battleEngine> as a tag is implicitly dealt with as
<model>, same as
<warcaster> was earlier.
Rendering in the app this looks just fine, because you render the
"top" model being the big ol' TEM anyone reading the cards cares about
and the Servitors are secondary and so don't occur as a primary
Unfortunately this is nonsense from my point of view as a consumer
trying to extract structured information. The TEM is a
and of itself, as are the Servitors, and they come together in a
package that PP calls PIP 36028.
A similar problem applies to units. It turns out that the Frenchman
encodes units by using this implicit
<model> behavior. So a unit
will be named "Foo, Leader & Grunts", use the implicit
<basestats>, some weapons and capacities and that'll be
all. Which sorta works.
Unfortunately it falls apart in units with models that aren't all the
same. I've wasted enough length on images and listings here, but go
check out the encoding of the
if you care. The Black 13th is composed of three characters, being
Lynch, Watts and Ryan each of whom is unique. They are encoded using
<model> on their shared
<unit> to describe Lynch, the
leader, and the other two have their own nested
Again from a rendering standpoint this makes sense, but really what I
want is something like this:
<unit name="the Black 13th" id="...">
because that would make sense.
The even more degenerate case of this is unfortunately the common case
of an infantry unit: "Foo, Leader & Grunts" where the leader will get
<model> and there'll be a
<model name="grunts"> or
something in its body. This is really a problem because usually the
officer has a different sprite than the grunts, and the grunts need to
have an ID in order to have a sprite permanently associated with
them. Guess what attribute they may not have.
And don't get me started on the spelling errors and qualification
mistakes and soforth I've found in these files as a result of trying
to automate parsing them.
The Devils in the Sprites
The sprites are their own little shitshow. There isn't a naming
convention to all of the files as they were done by several artists
over the course of years. Most of the time they're named with
substrings or abbreviations for a model's name, so some regex voodoo
is able to recover about a third of the model to file associations but
for most of it pfft I got to spend last night hand matching data file
model IDs to sprite names.
I'm down to about 250 entities which aren't associated with a sprite,
and I built a three page webapp to speed up building these
associations, so it could be worse but it's still painful.
Why do this? Well because the XML file that's in the root of the
VASSAL module and describes all the sprite to model associations isn't
actually valid XML, it's some VASSAL specific nonsense that I can't
seem to parse.
So yeah. XML and other people's data can go burn.
The question now before me is whether I try to do awful things to my
XML file parser so I can continue consuming Sebastien's data file
format non-deal as it is in the hope that as he maintains it I can
"just" import changes or whether I want to for my own sanity just
refactor these files and walk away from any future work of Sebastien's
in the interest of quality and sanity of parsed results.
End Of Rant