Hacking like it's 2288

I’m an a diehard Fallout franchise fan. While school currently precludes me from playing, I took some time to automate solving the hacking minigame which is used to unlock computer terminals and secured areas in-game.

When “hacking”, a player is presented with a list of words one of which is the “password” and has five (or fewer) tries to try and find the password. Every time the user selects a candidate, a response of how many characters match between the answer and the candidate is given so that the user can refine their guess.

The strategy for playing this game is uninteresting at best, you choose an initial word either heuristically or randomly, and then based on the number of characters which that word shares with the answer you can refine the whole list of words by removing all words which do not share exactly that many characters with the word you tried. This simple guess and refine algorithm one need only iterate until only one answer remains.

While an obvious algorithm, it’s a little tedious to do this by hand because there’s no good way to keep track of all the candidates which have been removed from consideration. Also computing string overlaps is boring. Given an algorithmic problem, write an algorithmic solition! What do we need to solve this problem in Clojure?

Well first we need our word “intersection” scoring function. To do this, we’ll take the frequencies of the left word, the frequencies of the right word. Since each character can only correspond to another single character, this means that given N as on the laft and M bs on the right, we have (min N M) common characters. We can simply compute this common character count for every character in the left string (characters occuring only in the right string won’t factor in anyway) and take the sum over those counts.

(defn letters-in-common [left right]
  (let [fl (frequencies left)
        fr (frequencies right)]
    (->> (for [[k vl] fl
               :let   [vr (get fr k 0)]]
           (min vl vr))
         (apply +))))

(letters-in-common "foo" "foooo")
;; => 3
(letters-in-common "cat" "dog")
;; => 0
(letters-in-common "cat" "rat")
;; => 2

So now we need a guess making function which we can write in terms of our scoring function. What word do we want to choose as our guess? Well we want to guess the word which will give us the most information about all the other words, that is to say is most similar to as many other words as possible.

Once we’ve made a guess we can use the word similarity score we get back to refine our dictionary and contrain our search space.

This must be the optimal choice by the usual greedy choice argument, since any other word we could guess would tell us less about the other words, and in the case of a tie (two words with equal similarity to all other words) we can’t know without guessing which one is the better choice so we can choose randomly.

(defn make-guess
  "Given a population of words, locates and returns a word with maximum
  \"matching potential\", that is the word which shares the most
  characters with every other word.

  If there is a tie, since the two words have equal potentials it
  doesn't matter which one we choose and so the choice is arbitrary.

  Makes use of scoped memoization for performance on lots of words,
  but realistically this should never be a factor."
  [words]
  (let [-score-fn (memoize
                   (fn [col]
                     (let [[l r] (seq col)]
                       (letters-in-common l r))))
        score-fn  (fn [word]
                    (let [s (sorted-set word)]
                      (->> (for [w words
                                 :when (not= w word)]
                             (-score-fn (conj s w)))
                           (apply +))))]
    (last (sort-by score-fn words))))

Now we need our dictionary pruning function. If a word does not have exactly n letters in common with the last guess where n is the reported similarity of the last guess with the answer, that word cannot be a solution.

We know that for multiple guesses, no word could be an answer which does not have the reported similarity with all previous guesses. Thus we don’t accidentially exclude any possible answers by simply filtering the dictionary as such after each guess.

(defn trim-pop [words guess commonality]
  (for [w     words
        :when (= commonality (letters-in-common w guess))]
    w))

(trim-pop ["foo" "bar" "cat" "rat"] "cat" 2)
;; => ("rat")

So lets put these two to use! We’ll wire up make-guess to letters-in-common which is our scoring function anyway and just let it run on a small dictionary and a goal word. Just to make the point that this algorithm does converge.

;; An example hacking loop. The guess and refine algorithm with a
;; sample input.
(let [words ["LOWER" "CREED" "JAMES" "CAGES" "CARES" "OFFER" "CAVES" "TIRED"]
      answer "LOWER"]
  (loop [words words]
    (when (= (count words) 1)
      (first words)
      (let [guess  (make-guess words)
            score  (letters-in-common guess answer)
            words' (trim-pop words guess score)]
        (println "Guessed" guess "reduced population to" words')
        (recur words')))))

Yay! So that totally works.

And because I don’t think anyone wants to manually run all that state through a REPL while trying to play Fallout 4, here’s a quick REPL wrapper to help with hacking.

(defn autohack [words]
  (if-not (= 1 (count words))
    (let [guess (make-guess words)
          _     (println "Try>" guess)
          score (read)]
      (recur (trim-pop words guess score)))
    (println "It's" (first words))))

Happy wandering!

^d