牛: the environment model

Disclaimer: Oxlang is vaporware. It may exist some day, and there is some code, but for now it is just my thought experiment in polishing away some aspects of Clojure I consider warts by starting from a tabula rasa. The following represents a mostly baked scheme for implementing ns and require more cleanly in static (CLJS, Oxcart) rather than dynamic (Clojure) contexts.


Unlike Clojure, in which the unit of compilation is a single form, Oxlang's unit of compilation is the "namespace", or a single file. Oxlang namespaces are roughly equivalent to Haskell modules in that they consist of a "header" followed by a sequence of declarative body forms.

In Clojure, the ns form serves to imperatively create and initialize a namespace and binding scope. This is done by constructing a new anonymous function and using it as a class loader context to perform cached compilation of the depended-on namespaces. Subsequent forms are compiled as they occur, and the results are accumulated as globally visible defs.

Recompiling or reloading a file does exactly that. The ns form is re-executed, incurring more side effects, and all forms in the file are re-evaluated, generating more defs. However, this does not discard the old defs from the same file, nor purge the existing aliases and refers in the reloaded namespace. This can lead to interesting bugs where changes in imports and defs create name conflicts with the previous imports and cause reloading to fail. The failure to invalidate deleted defs also creates conditions where, for instance during a refactoring, the old name for a function remains interned and accessible in the program runtime. Code which depends on the old name continues to evaluate successfully until the entire program is reloaded in a fresh runtime, at which point the missing name becomes evident as a dependency fault.
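
A minimal REPL sketch of the hazard (the namespace and function names here are hypothetical):

    ;; foo.core, version one, defines and uses frobnicate.
    (ns foo.core)
    (defn frobnicate [x] (inc x))

    ;; The file is edited, renaming frobnicate to frob, and reloaded with
    ;; (require 'foo.core :reload). Reloading re-evaluates the new forms but
    ;; never unmaps the old Var, so stale callers keep working in this session.
    (defn frob [x] (inc x))

    (foo.core/frobnicate 1) ;; => 2, the stale definition is still interned

    ;; Only a cold boot of the program surfaces the fault: the compiler then
    ;; reports that there is no such var as foo.core/frobnicate.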

Furthermore, the Var mechanism enables extremely cheap code reloading, because all bindings are dynamically resolved anyway. This means that there is exactly zero recompilation cost beyond compiling the new code itself, since the Var lookup operation is performed at invoke time rather than at assemble time.
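
A small illustration of that Var indirection (assuming the default indirect linking, not a direct-linked build):

    (defn greet [] "hello")
    (defn shout [] (.toUpperCase (greet)))

    (shout) ;; => "HELLO"

    ;; Redefining greet does not recompile shout; the call site resolves the
    ;; Var at invoke time, so the new definition is picked up immediately.
    (defn greet [] "goodbye")
    (shout) ;; => "GOODBYE"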

Unfortunately, in my Clojure development experience the persistence of deleted symbols resulted in more broken builds than I care to admit. Building and maintaining a dependency graph between symbols is computationally inexpensive, is a key part of many language-level analyses for program optimization, and, critically here, provides better assurance that REPL development behavior is identical to program behavior from a cold boot.

In order to combat these issues, two changes must be made. First, re-evaluating a ns form must yield a "fresh" environment that cannot be tainted by previous imports and bindings. This resolves the import naming conflict issues by making them impossible. By modeling a "namespace" as a concrete "module" value having dependencies, public functions, and private functions, we can mirror the imperative semantics enabled by Clojure's defs and Vars simply by accumulating "definitions" into the "module" as they are compiled.

This model isn't a total gain, however, due to the second change: reloading entirely (and deliberately) invalidates the previous definitions of every symbol in the reloaded namespace by swapping out the old namespace definition for the new one. This implies that other namespaces/modules which depend on a reloaded module must themselves be reloaded in topological sort order once the new dependencies are ready, requiring dependency tracking and reloading infrastructure far beyond Clojure's (which has none). Naively this must take place on a file-by-file basis as in Scala; however, by tracking the change timestamps of source files and the hash codes of individual def forms, a reloading environment can prove at little cost that no semantic change has taken place and incur only the minimum reload cost. I note here the effectiveness of GHCi at enabling interactive development under equivalent per-file reloading conditions as evidence that this model is in fact viable for the interactive workflow we associate with Clojure development.
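
A sketch of what per-form change detection could look like; the function names are my own illustration, not a proposed Oxlang API:

    (require '[clojure.java.io :as io])

    (defn form-hashes
      "Read every top level form in a source file, pairing each form with a
      hash of its read representation."
      [file]
      (with-open [rdr (java.io.PushbackReader. (io/reader file))]
        (->> (repeatedly #(read {:eof ::eof} rdr))
             (take-while #(not= ::eof %))
             (mapv (fn [form] [form (hash form)])))))

    (defn changed-forms
      "Given the form/hash pairs recorded at the previous load and those read
      now, return the forms whose hashes differ and so must be re-evaluated.
      Naively assumes forms align positionally; a real implementation would
      key defs by name."
      [old-pairs new-pairs]
      (keep (fn [[[_ old-hash] [new-form new-hash]]]
              (when (not= old-hash new-hash)
                new-form))
            (map vector old-pairs new-pairs)))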

With "namespaces" represented as concrete immutable values, we can now define namespace manipulation operations such as require and def in terms of functions which update the "current" namespace as a first class value. A def when evaluated simply takes a namespace and returns a new namespace that "happens" to contain a new def. However the work performed is potentially arbitrary. refer, the linking part of require, can now be implemented as a function which takes some enumeration of the symbols in some other namespace and the "current" environment, then returns a "new" environment representing the "current" environment with the appropriate aliases installed.

This becomes interesting because it means that the return value of load need not be the eval result of the last form in the target file; it can instead be the namespace value representing the final state of the loaded module. Now, given a caching/memoized load (which is require), we can talk about an "egalitarian" loading system in which user-defined loading paths are possible, because refer only needs the "current" namespace, a "source" namespace and a spec. Any function could generate a "namespace" value, including one which happens to load an arbitrary file as computed by the user. Technomancy's "egalitarian ns" proposal, which would allow hosting multiple versions of a single lib simultaneously in a single Clojure instance, is one possible application of this behavior.
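
A sketch of such a loading scheme, with load-ns standing in for any user-supplied loader (these names are hypothetical):

    ;; load-ns may be any user-supplied function from a name to a namespace
    ;; value; require-ns is simply its memoization.
    (defn make-require
      [load-ns]
      (let [cache (atom {})]
        (fn require-ns [ns-name]
          (or (get @cache ns-name)
              (let [ns-val (load-ns ns-name)]
                (swap! cache assoc ns-name ns-val)
                ns-val)))))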

It is my hope that by taking this approach the implementation of namespaces and code loading can be simplified greatly. One advantage of the Var structure that is lost, however, is that it enables forward and out-of-order declarations, which is immensely useful while bootstrapping a language runtime ex nihilo, as done here in the Clojure core.

^d

Compiler introduction of transients to pure functions

In Clojure and other pure functional languages, the abstraction is provided that values cannot be updated; only new values may be produced. Naively, this means that every update to a value must produce a full copy of the original featuring the desired change. More sophisticated implementations opt for structural sharing, wherein updated versions of a structure share backing memory with the original or source value for the substructures where no update is performed. Substructures where there is an update must be duplicated and updated as in the naive case, but for tree-based datastructures this can reduce the cost of a typical update from O(N) in the size of the updated structure to O(log(N)), because the rest of the structure may be shared and only the "updated" subtree needs to be duplicated.

This means that tree-based structures which maximize the amount of sharable substructure perform better in a functional context, because they minimize the fraction of a datastructure which must be duplicated during any given update.

Unfortunately, such structural sharing still carries a concrete cost in terms of memory overhead, garbage collection, and cache performance when compared to a semantically equivalent update in place on a mutable datastructure. A mutable update is typically O(1), with specific exceptions for datastructures requiring amortized analysis to achieve near-O(1) performance.

Ideally, we would be able to write programs that preserve the abstraction of immutable values while enabling the compiler or runtime to detect when intentional updates in place are occurring and to leverage the performance of mutable data in those cases, all while ensuring that no compiler-introduced mutability can be exposed to a user through the intentional API built by the programmer.

In such a "pure" language, there is only one class of functions, functions from Immutable values to Immutable values. However if we wish to minimize the performance overhead of this model four cases become obvious. λ Immutable → Immutable functions are clearly required as they represent the intentional API that a programmer may write. λ Mutable → Mutable functions could be introduced as implementation details within an λ Immutable → Immutable block, so long as the API contract that no mutable objects may leak is preserved.

Consider the Clojure program

(reduce (partial apply assoc)
    {}
    (map vector
       (range 10000)
       (range 10000)))

This program will sequentially apply the non-transient association operation to a value (originally the empty map) until it represents the identity mapping over the interval [0,9999]. In the naive case, this would produce 10,000 full single-update copies of the map. Clojure, thanks to structural sharing, will still produce 10,000 update objects, but as Clojure's maps are implemented as log₃₂ hash array mapped tries, only the array containing the "last" n % 32 key/value pairs, plus the path back to the root node, must be duplicated. This reduces the cost of the above operation from T(~10,000²) to T(10,000*64) ≊ T(640,000), which is a huge performance improvement.

However, a Sufficiently Smart Compiler could recognize that the cardinality of the produced map is max(count(range(10000)), count(range(10000))), clearly 10,000. Consequently a hash table of at worst 10,000 entries is required given ideal hashing; assuming a load factor of 2/3, our brilliant compiler can preallocate a hash table of 15,000 entries (presumed T(1)) and then perform T(10,000) hash insertions with a very low probability of ever having to resize the table, having accounted for hash distribution and sized the allocated table to achieve the chosen load factor.
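
A sketch of what such compiler output might resemble, using java.util.HashMap purely for illustration; in practice the mutable table would have to remain hidden behind the immutable API:

    ;; Preallocate for ~10,000 entries at a 2/3 load factor, then fill in place.
    (let [m (java.util.HashMap. 15000)]
      (doseq [[k v] (map vector (range 10000) (range 10000))]
        (.put m k v))
      m)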

Clearly, at least in this example, the mutable hash table would be an immense performance win: while we splurge a bit on consumed memory due to the hash table's load factor (at least compared to my understanding of Clojure's hash array mapped trie structure), the brilliantly compiled program will perform no allocations it will not use, will perform no copying, and will generate no garbage, whereas the naive structurally shared implementation will produce at least 9,967 garbage pairs of 32-entry arrays.

The map cardinality hack is its own piece of work and may or may not be compatible with the JVM, given that most structures are not parametric in their initial size and instead perform the traditional 2*n resizing, at least abstractly. However, our brilliant compiler can deduce that the empty map which we are about to abuse can be used as a transient and made persistent when it escapes the scope of the above expression.

Consider the static single assignment (SSA) form for the above, assuming a reduce definition which macroexpands into a loop (which Clojure's does not).

    [1 ] = functionRef(clojure.core/partial)
    [2 ] = functionRef(clojure.core/apply)
    [3 ] = functionRef(clojure.core/assoc)
    [4 ] = invoke(1, 2, 3)                   ;; (partial apply assoc)
    [5 ] = {}
    [6 ] = functionRef(clojure.core/map)
    [7 ] = functionRef(clojure.core/vector)
    [8 ] = functionRef(clojure.core/range)
    [9 ] = 10000
    [10] = invoke(8, 9)                      ;; (range 10000)
    [11] = invoke(6, 7, 10, 10)              ;; (map vector [10] [10])
    [12] = functionRef(clojure.core/first)
    [13] = functionRef(clojure.core/rest)
loop:
    [14] = phi(5,  18)                       ;; accumulator map
    [15] = phi(11, 19)                       ;; remaining pairs
    [16] = if(15, cont, end)                 ;; more pairs?
cont:
    [17] = invoke(12, 15)                    ;; (first [15])
    [18] = invoke(4, 14, 17)                 ;; (apply assoc [14] [17])
    [19] = invoke(13, 15)                    ;; (rest [15])
    [20] = jmp(loop)
end:
    [21] = return(18)

Where the phi function represents that the value of the phi node depends on the source of the flow of control. Here I use the first argument to the phi functions to mean that control "fell through" from the preceding block, and the second argument to mean that control was returned to this block via instruction 20.

This representation reveals the dataflow dependence between sequential values of our victim map. We also have the contract that the return, labeled 21 above, must be of an Immutable value. Consequently we can use a trivial dataflow analysis to "push" the Immutable annotation back up the flow graph, giving us that 18, 14 and 5 must be immutable: 5 is trivially immutable, and 18 depends on 14, which in turn depends on 18 and 5, implying that it must be immutable as well. So far so good.

We can now recognize that we have a phi(Immutable, Immutable) on a loop back edge, meaning that we are performing an update of some sort within the loop body. This means that, so long as no Transient value escapes into the Immutable result, we can safely rewrite the Immutable result to be a Transient and add a persistent! invocation before the return operation. Now we have phi(Immutable, Transient) → Transient, which makes no sense, so we add a loop header entry making the initial empty map a Transient, giving us phi(Transient, Transient) → Transient, which is exactly what we want. Now we can rewrite the loop update body to use assoc! : Transient Map → Immutable Object → Immutable Object → Transient Map rather than assoc : Immutable Map → Immutable Object → Immutable Object → Immutable Map.

Note that I have simplified the signature of assoc to the single key/value case for this example, and that the key and value must both be immutable. This is required because the persistent! function renders only the target object itself persistent, not the objects it references.
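
In source terms, the rewritten program corresponds roughly to the following hand-written equivalent (a sketch only; the compiler would perform the rewrite on the SSA form directly):

    (loop [acc   (transient {})
           pairs (map vector (range 10000) (range 10000))]
      (if (seq pairs)
        (recur (apply assoc! acc (first pairs))
               (rest pairs))
        (persistent! acc)))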

This gives us the final operation sequence

    [1 ] = functionRef(clojure.core/partial)
    [2 ] = functionRef(clojure.core/apply)
    [3 ] = functionRef(clojure.core/assoc!)
    [4 ] = invoke(1, 2, 3)                      ;; (partial apply assoc!)
    [5 ] = functionRef(clojure.core/transient)
    [6 ] = {}
    [7 ] = invoke(5, 6)                         ;; (transient {})
    [8 ] = functionRef(clojure.core/map)
    [9 ] = functionRef(clojure.core/vector)
    [10] = functionRef(clojure.core/range)
    [11] = 10000
    [12] = invoke(10, 11)                       ;; (range 10000)
    [13] = invoke(8, 9, 12, 12)                 ;; (map vector [12] [12])
    [14] = functionRef(clojure.core/first)
    [15] = functionRef(clojure.core/rest)
loop:
    [16] = phi(7,  20)                          ;; transient accumulator
    [17] = phi(13, 21)                          ;; remaining pairs
    [18] = if(17, cont, end)                    ;; more pairs?
cont:
    [19] = invoke(14, 17)                       ;; (first [17])
    [20] = invoke(4, 16, 19)                    ;; (apply assoc! [16] [19])
    [21] = invoke(15, 17)                       ;; (rest [17])
    [22] = jmp(loop)
end:
    [23] = functionRef(clojure.core/persistent!)
    [24] = invoke(23, 20)
    [25] = return(24)

Having performed this rewrite, we've won. This transform allows an arbitrary loop using one or more persistent datastructures as accumulators to be rewritten in terms of transients, provided there exists (or can be inferred) a matching Transient t → Transient t updater equivalent to the updater used. Note that if a non-standard-library updater (say, a composite updater) is used, then the updater must be duplicated and, if possible, recursively rewritten from a Persistent t → Persistent t by replacing the standard library operations for which there are known transient counterparts until the rewrite either produces a matching Transient t → Transient t updater or fails. If any such rewrite fails, this entire transform must fail. Also note that this transformation can be applied to subterms: so long as the Persistent t contract is not violated on the keys and values (here, of the map), their computation in a nontrivial example could also be rewritten to use compiler-persisted transients.
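
As a sketch, the substitution table such a recursive rewrite might consult could be as small as a map from persistent operations to their known transient counterparts:

    (def transient-counterparts
      {#'clojure.core/assoc  #'clojure.core/assoc!
       #'clojure.core/conj   #'clojure.core/conj!
       #'clojure.core/dissoc #'clojure.core/dissoc!
       #'clojure.core/disj   #'clojure.core/disj!
       #'clojure.core/pop    #'clojure.core/pop!})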

Now yes you could have just written

(into {}
   (map vector
      (range 10000)
      (range 10000)))

which would have used transients implicitly, but that requires the programmer to manually perform an optimization demanding further knowledge of the language and its implementation details, when a relatively simple transformation would not only reveal the potential for this rewrite but provide it in the general case of arbitrarily many mutable accumulators, rather than into's special case of just one.
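
For instance, here is a sketch of the more general case into cannot express: two persistent accumulators advanced in a single pass, both of which the analysis above could rewrite to transients at once:

    (loop [evens {}
           odds  {}
           pairs (map vector (range 10000) (range 10000))]
      (if (seq pairs)
        (let [[k v] (first pairs)]
          (if (even? k)
            (recur (assoc evens k v) odds (rest pairs))
            (recur evens (assoc odds k v) (rest pairs))))
        [evens odds]))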

^d

牛: a preface

牛 (Ox, or Oxlang) is an experiment, a dream, a thought which I can't seem to get out of my head. After working primarily in Rich Hickey's excellent language Clojure for over two years now and spending a summer hacking on Oxcart, an optimizing compiler for Clojure, it is my considered opinion that Clojure is a language which Rich invented for his own productivity at great personal effort. As such, Clojure is a highly opinionated Lisp, one which makes strong design decisions and has its own priorities.

Ultimately I think that the Clojure language is underspecified. There is no formal parse grammar for Clojure's tokens. There is no spec for the Clojure standard library. Due to this underspecification, Clojure as a language is bound to Rich's implementation for the JVM: between leaking implementation details and the lack of a spec, no other implementation can ensure compatibility. Even ClojureScript, the official effort to implement Clojure atop JavaScript, is at best a dialect due to the huge implementation differences between the two languages.

This is not to say that Clojure is a bad language. I think that Clojure is an excellent language. If there is ever an Ox prototype, it will likely be built in Clojure. It's just that my priorities and Rich's don't line up. Rich, it seems, is happy with the existing JVM implementation of Clojure, while I see a lot of really interesting work that could be done to optimize, type and statically link Clojure or a very Clojure-like language, and such work is very unlikely to become part of the Clojure core.

Clojure's lack of specification makes it futile for me to invest in providing an imperfect implementation, so the only thing that makes sense is to specify my own language. The result is the Ox language. The joke in the name is twofold: the compiler for Clojure I was working on when I originally had the idea for Ox was Oxcart, named after the A-12/SR-71 program. The rest of it is in the name itself. Oxen are slow, quiet, tractable beasts of burden used by many cultures, and that ultimately characterizes the language I'm after in the Ox project. There's also some snarking to be found about the tractability of the ox compared to that of other languages' mascots, like the gopher, the camel and the gnu, which are much less willing.

The hope of the Ox project is to produce a mostly typed, mostly pure language which is sufficiently well specified that it can support both static and dynamic implementations on a variety of platforms. Ox draws heavily on my experience with Clojure's various pitfalls, as well as my exposure to Haskell, Shen, Kiss and OCaml, to produce a much more static, much more sound Lisp dialect which specifies a concrete standard library and host interoperation mechanics in a platform-abstract way amenable to both static and dynamic implementations.

In the forthcoming posts I will attempt to illustrate the model and flavor of the Ox programming language, as well as sketch some implementation details that differentiate Ox from Clojure and, I think, make it a compelling project. Note however that as of now Ox is, and will remain, vaporware. As with the Kernel Lisp project, while I may choose to implement it eventually, the exercise of writing these posts is really one of exploring the Lisp design space and trying to determine for myself what merit this project has over the existing languages from which it draws inspiration.

^d

On Future Languages

As Paul Phillips notes at the end of his 2013 talk We're Doing It All Wrong, a programming language is ultimately incidental to building software rather than critical to it. The "software developer" industry is not paid to write microcode. Rather, software as an industry exists to deliver business solutions. Similarly, computers themselves are by and large incidental; they are agents for delivering data warehousing, search and transmission solutions. Very few people are employed to improve software and computer technology for its own sake, compared to the hordes of industry programmers who ultimately use computers as a magic loom to create value for a business. In this light I think it's meaningful to investigate recent trends in programming languages and software design as they pertain to solving problems, not to writing code.

As the last four or five decades of writing software stand witness, software development is rarely an exercise in first constructing a perfect program, delivering it, and watching, pleased, as the client makes productive use of it until the heat death of the universe. Rather, we see that software solutions, once delivered, are discovered to have flaws, or didn't solve the problem the client thought they would, or just didn't do it quite right, or the problem changed. All of these changes to the environment in which the software exists demand changes to the software itself.

Malleability

In the past, we have attempted to develop software as if we were going to deploy perfect products forged of pure adamantium which would endure forever unchanged. This begot the entire field of software architecture and the top-down school of software design. The issue with this model, as the history of top-down software engineering stands testament, is that business requirements change if they are known, and must be discovered and quantified if they are unknown. This is an old problem with no good solution. In the face of incomplete and/or changing requirements, all that can be done is to evolve software as rapidly and efficiently as possible to meet changes, as Parnas argues.

In the context of expecting change, languages and the other tools used to develop changing software must be efficient to re-architect and change. As Paul Phillips says in the above talk, "modification is undesirable, modifiability is paramount".

Looking at languages which seem to have enjoyed traction in my lifetime, the trend has been that, with the exception of tasks for which the implementation language was itself a requirement, the pendulum of both language design and language use has swung away from statically compiled languages like Java, C and C++ (the Algol family) towards interpreted languages (Perl, Python, Ruby, JavaScript) which trade away some performance for interactive development and immediacy of feedback.

Today, that trend seems to be swinging the other way. Scala, a statically checked language built around extensive type inference with some interactive development support, has been making headway. Java and C++ seem to have stagnated but are by no means dead and gone. Google's Go, Mozilla's Rust, Apple's Swift and others have appeared on the scene, also fitting into this intermediate range between interactive and statically compiled, with varying styles of type inference to achieve static typing while reducing programmer load. Meanwhile the hot frontier in language research seems to be static typing and type inference, as the liveliness of the Haskell ecosystem amply proves.

Just looking at these two trends, I think it's reasonable to conclude that interpreted, dynamically checked, dynamically dispatched languages like Python and Ruby succeeded at providing more malleable programming environments than the languages which came before them (the Algol family & co). However, while making changes in a dynamically checked language is well supported, maintaining correctness is difficult because there is no compiler or type checker to warn you that you've broken something. This limits the utility of malleable environments, because software which crashes or gives garbage results is of no value compared to software which behaves correctly. And, as previously argued, software is not some work of divine inspiration which springs fully formed from the mind onto the screen. Rather, the development of software is an evolutionary undertaking involving revision (which is well suited to static type checking) and discovery (which may not be).

As a checked program must by definition always be in a legal (correct with respect to the type system) state, this precludes some elements of development by trial and error: the type system will ultimately constrain the flexibility of the program, requiring systematic restructuring where an isolated change could have sufficed. This is not argued to be a flaw; it is simply a trade-off which I note between allowing users to express and experiment with globally "silly" or "erroneous" constructs and the strict requirement that all programs be well formed and well typed, even when, with respect to some context or the programmer's intent, the program may in fact be well formed.

As program correctness is an interesting property, and one which static model checking (including "type systems") is well suited to assisting with, I do not mean to discount typed languages. Ultimately, a correct program must be well typed with respect to some type system, whether that system is formalized or not.

Program testing can be used to show the presence of bugs, but never to show their absence!

~ Dijkstra (1970)

Static model checking, on the other hand, can prove the absence of whole classes of flaws with respect to some model. This property alone makes static model checking an indispensable part of software assurance, as it cannot be replaced by any non-proof-based methodology such as assertion contracts or test cases.

Given this apparent trade-off between flexibility and correctness, Typed Racket, Typed Clojure and the recent efforts toward a typed Python are interesting, because they provide halfway houses between the "wild west" of dynamically dispatched, dynamically checked languages like traditional Python, Ruby and Perl and the eminently valuable model checking of statically typed languages. They enable programmers to evolve a dynamically checked system, passing through states of varying soundness towards a better system, and then, once it has reached a point of stability, solidify it with static typing, property-based testing and other static reasoning techniques, without translating the program into another language with stronger static analysis properties.

Utility & Rapidity from Library Support

Related to malleability, in terms of ultimately delivering a solution that gets you paid, is the ability to get something done in the first place. Gone (forever, I expect) are the days when programs were built without external libraries. Looking at recent languages, package/artifact management and tooling capable of trivializing the use of open source software has EXPLODED. Java has mvn, Python has pip, Ruby has gem, Haskell has cabal, Node has npm, and the Mozilla Rust team deemed a package manager so critical to the long-term success of the language that they built their cargo system long before the first official release, or even release candidate, of the language.

Why are package managers and library infrastructure critical? Because they enable massive code reuse, especially of vendor code. Building a webapp? Need a datastore? The investment of buying into whatever proprietary database you may choose has been driven so low by the ease with which $free (and sometimes even free-as-in-freedom) official drivers can be found and slotted into place that it's silly. The same goes for less vendor-specific code: regex libraries, logic engine libraries, graphics libraries and many more exist in previously undreamed-of abundance (for better or worse) today.

The XKCD stacksort algorithm is a tongue-in-cheek reference to the sheer volume of free-as-in-freedom, never mind free-as-in-beer, code which can be found and leveraged in developing software today.

This doesn't just go for "library" support; I'll also include FFI support here. Java, the JVM family of languages, Haskell, OCaml and many others gain much broader applicability by having FFI interfaces for leveraging the decades of C and C++ libraries which predate them. Similarly, Clojure, Scala and the other "modern" crop of JVM languages gain huge utility and library improvements from being able to reach through to Java and leverage the entire Java ecosystem selectively when appropriate.

While it's arguably unfair to compare languages on the basis of the quantity of libraries available, as this metric neglects functionally immeasurable quality and utility, the presence of any libraries at all is a legitimate point of comparison, in terms of potential productivity, against their comparative absence.

What good is a general-purpose building material, capable of constructing any manner of machine, when simple machines such as the wheel or the ramp must be reconstructed by every programmer? Not nearly so much as a building material providing these things on a reusable basis at little or no effort to the builder, regardless of the ease with which one may custom-build such tools as needed.

So What's the Big Deal

I look at this and expect to see two trends coming out of it. The first is that languages with limited interoperability and/or limited library bases are dead. Stone cold dead. Scheme (RⁿRS) and Common Lisp were my introductions to the Lisp family of languages. They are arguably elegant and powerful tools, but compared to other tools such as Python they seem to offer at best equal leverage, due to the prevailing lack of user-friendly, let alone newbie-friendly, library support compared to other available languages.

I have personally written 32KLoC of Clojure. More counting intermediary diffs. And that's just what I can find on my laptop. Why? Because Clojure, unlike my experiences with Common Lisp and Scheme, escapes the proverbial Lisp curse, simply thanks to tooling which makes sharing libraries and infrastructure cheaper than reinvention. Reinvention still occurs, as it always will, but the marginal cost of improving an existing tool vs. writing a new one is in my experience a compelling motivator for maintaining existing toolkits and software. This means that Clojure at least seems to have broken free of the black hole of perpetual reinvention, and is consequently free to attack real, application-critical problems rather than distracting programmers into fighting simply to build a suitable environment.

It's not that Clojure is somehow a better language; arguably it ultimately isn't, since it lacks the built-in facilities for many interesting static proof techniques, but that's not the point. As argued above, the language(s) we use are ultimately incidental to the task of building a solution to some problem. What I really want is leverage from libraries, flexibility in development, optional and/or incremental type systems, and good tooling. At this task, Clojure seems to be a superior language.

The Long View

This is not to say that I think writing languages is pointless, nor that the languages we have today are the best we've ever had, let alone the best we ever will have, at simultaneously providing utility, malleability and safety. Nor is it to say that we'll be on the JVM forever due to the value of legacy libraries, or something equally silly. It is to say that I look with doubt upon language projects which do not have the support of a "major player", a research entity or some other group willing to fund long-term development in spite of short-term futility, simply because the literal price of bootstrapping a "new" language into a state of compelling utility is expensive in terms of man-years.

This conclusion is, arguably, my biggest stumbling block with my Oxlang project. It's not that the idea of the language is bad; it's that a trade-off must be carefully made between novelty and utility. Change too much and Oxlang will be isolated from the rest of the Clojure ecosystem, losing hugely in terms of libraries as a result. Change too little and it won't be interesting compared to Clojure. Go far enough and it will cross the border into hard static typing, entering the land of Shen, OCaml and Haskell, and, as argued above, I think sacrifice interesting flexibility for uncertain gains.

On Student Startups

When I enrolled in UT Austin's "student startup seminar", the guest speaker comment which stood out to me, and has stuck most firmly in my mind, is that "there are no new ideas, only good execution". This particular lecturer described how he kept a notebook full of random ideas for possible businesses, and talked at length about the importance of validating business models through surveys of potential customers as well as discussions with industry peers. The takeaway he left us with was that, rather than operating in "stealth" mode as seems to be fashionable for so many startups developing a product, ideas are so cheap, and the first mover advantage so great due to simple startup execution costs, that attempting to cloak a startup's model and/or product generates no measurable advantage and has a concrete cost: the potential feedback from customers and peers which is lost as a consequence of secrecy.

Of the dozen or so startups I've interacted with so far, both inside and outside the context of the abovementioned startup seminar, I've seen this overvaluing of secrecy over and over again, especially when requesting feedback on an idea. On Freenode's #Clojure channel we have a standing joke: the "ask to ask" protocol. Under the ask to ask protocol, some first-timer will join the channel and ask if anyone knows about some tool X, whereupon some longtime denizen will invoke the ask to ask protocol and tell the newcomer to just ask their real question.

When I see a request from a nontechnical founder for technical feedback over a coffee date or in a private context after an NDA, all I can think of is the ask to ask protocol and the litany against startup secrecy. A coffee date is a commitment of at least an hour of what would otherwise have been paid consulting time, and an NDA is a legally binding commitment. For the privilege of signing a contract and giving advice I get what... coffee? A mention in the credits when you finally "break stealth"? What if I was nursing an equivalent idea? I can't know that until after I sign your silly NDA, which is kind of a problem because now you've robbed me of the ability to capitalize on my own prior art.

An email or a forum comment is free. By asking that an engineer go on a coffee date to hear a pitch, let alone sign an NDA, and then comment, the petitioner (read: nontechnical founder) is at best guilty of asking to ask, limiting themselves to one or two responses when several could have been had were the real question posed instead. Start talking about NDAs and I expect you'll get what you pay for.

^d