Such Confuse: HackMIT 2015 Puzzle Guide

30 Jul 2015

This post is third in a series of articles on HackMIT’s 2015 Puzzle! If you haven’t already, be sure to check out part 1 and part 2.

Every year, HackMIT releases a puzzle as a fun thing for hackers to do over the summer. This year’s puzzle was pretty involved, consisting of six parts that required creativity, perseverance, and computer science skills to solve.

I was pretty heavily involved in the design of the puzzle as a whole, and I built Maze, 0xd09eeffec7.dogemit.party, and Stronghold, 0xd09e5ec.dogemit.party, the final two parts of the puzzle. They were quite fun to design, and it was cool to see how different people approached the puzzles.

Maze

I like puzzles that are possible to solve given enough manual effort, but where applying programming skills can make the process a whole lot faster. While brainstorming puzzle ideas that fit this criteria, I came up with the idea of designing something based on a maze.

In the puzzle, the player would be able to see a single cell of a maze at any given time, and the player would be able to move to adjacent accessible cells. There would be hints hidden throughout the maze, and combining all of the hints would yield the secret code, the solution to the puzzle. Of course, we wouldn’t just tell players that it’s a maze — they’d have to figure that out for themselves.

Design

The initial view of the puzzle looks like this:

Green tiles indicate that it’s possible to move in a given direction, and red tiles represent walls. Clicking on the green tiles moves the player in the given direction by redirecting to current_url + [U/D/L/R]. For example, clicking up, down, up will result in the URL being /UDU.

Players attempting to cheat by rewriting the URL to go through walls are greeted with this not-so-friendly message:

There are nine special locations within the maze. Stumbling upon one will result in finding a maze cell that looks like this:

Clicking on the link leads to a hint:

There are 9 hints scattered throughout the maze.

Implementation

The implementation of this puzzle was actually pretty straightforward. First, we wrote a maze generation tool that used a variant of Prim’s algorithm to generate a random spanning tree. Then, we took the output from this tool and decided where to place the 9 hints in the maze. Here’s a map of the maze:

Finally, we put together a frontend that provided an interface for people to explore the maze and find the hints. We made sure that our interface was secure and required that people actually find their way through the maze without doing things like going through walls.

Solution

The hardest part of this puzzle is probably figuring out that it’s a maze. Observing the URL is probably the best way to determine this.

After that, it’s necessary to find the hints. Some people did this by manually exploring the whole maze, mapping it out on paper, Google Docs, or Microsoft Excel as they explored it. Others wrote a script to do a traversal of the entire maze and automatically find all of the hints.

Here’s a Python script based on a solution by Kate Yu that uses depth-first search to find the URLs of all the hint images in the maze:

import re, requests

BASE_URL = "http://0xd09eeffec7.dogemit.party"
stack = ['/']
found, explored = set(), set()

def loc(path):
    curr = [0, 0]
    for elem in path:
        curr[{'U':0, 'D':0, 'L':1, 'R':1}[elem]] += \
            {'U':1, 'D':-1, 'L':-1, 'R':1}[elem]
    return tuple(curr)

while stack:
    path = stack.pop()
    explored.add(loc(path[1:]))
    response = requests.get('%s%s' % (BASE_URL, path)).text
    m = re.search("/static/[a-z0-9]{32}\.png", response)
    if m:
        found.add(m.group())
    for url in re.findall("/[UDLR]+", response):
        if loc(url[1:]) not in explored:
            stack.append(url)

for key in found:
    print "%s%s" % (BASE_URL, key)

The script outputs the URLs of the hints, each of which contains a single letter. Anagramming the letters lsanhurmc yields the solution: much snarl.

Stronghold

I thought it would be cool to have a puzzle that was related to the “other” kind of hacking¹, so I decided to design a crackme, a challenge written to test a programmer’s reverse engineering skills.

The challenge would be reverse-engineering a password validator in a simple web application in order to retrieve a secret from the server.

Design

The start of the puzzle gives a link to the source code of the website, a simple Node.js application.

The application contains a special route, /secret/:password, which dumps the value of the SECRET_CODE environment variable if the password passes validation:

function validate(password) {

  function alphabetic(str) {
    return /^[a-zA-Z]+$/.test(str);
  }

  function toBytes(str) {
    var arr = [];
    for (var i = 0; i < str.length; i++) {
      arr.push(str.charCodeAt(i));
    }
    return arr;
  }

  function everyEven(arr) {
    var ret = [];
    for (var i = 0; i < arr.length; i++) {
      if (i % 2 == 0) {
        ret.push(arr[i]);
      }
    }
    return ret;
  }

  function cksum(bytes) {
    var a = 0;
    var b = 0;
    for (var i = 0; i < bytes.length; i++) {
      a = (a + bytes[i]) % 0xff;
      b = (b + a) % 0xff;
    }
    return (b << 8) | a;
  }

  if (!alphabetic(password) ||
      password.length < 10  ||
      password.length % 2 != 0) {
    return false;
  }

  var bytes = toBytes(password);
  var half = Math.floor(bytes.length / 2);
  var left = bytes.slice(0, half);
  var right = bytes.slice(half);
  var even = everyEven(bytes);

  return (cksum(even)  === 0x0000 &&
          cksum(left)  === 0xd06e &&
          cksum(right) === 0xf00d);

};

And that’s it. The structure of the puzzle is pretty simple.

In terms of designing the validation function itself, the tricky part was making it hard enough while ensuring that solving it did not require using highly specialized techniques. There was a lot of iteration on this where I wrote multiple solutions for each of many candidate validation functions until I was happy with the result.

Solution

This was probably the hardest challenge in the puzzle, and solving it required math or CS skills. There wasn’t some obvious bug in the code or anything — solving the puzzle actually required figuring out a password that passed the validator. I think some people thought that Stronghold was a little too hard:

Even though the puzzle was pretty difficult, a good number of people solved it, and people used a wide variety of approaches.

It is easy to satisfy the first couple constraints on the password by ensuring that it’s an alphabetic password that has even length and is greater than 10 characters long. Satisfying the checksum constraints is a little more tricky.

A few people used linear algebra, either doing it by hand or using Mathematica, to simplify the validate function so that it could be more easily brute forced. Most people used computational approaches.

A naive offline brute force attack that checked all possible passwords takes a ton of time, and it probably won’t finish in time if done on a single CPU core. I didn’t think anyone would be successful with a straightforward brute force solution, but apparently there were people who wrote parallelized brute force implementations and ran them on computing clusters to get the answer quickly. That’s pretty cool.

A single observation, noticing that candidates for the left and right halves of the password can be guessed independently, allows for an optimized brute force approach that terminates in under a minute on a single CPU core. This approach involves guessing the halves independently and checking pairs to find one that gives the desired hash for the overlapping cksum(even).

The fastest solution that I know of uses an SMT solver to find solutions. The Z3 theorem prover from Microsoft Research has a nice Python interface, so that’s what I used to solve the puzzle. Using Z3, it’s possible to express the validation computation symbolically and ask the SMT solver to find passwords that work.

from z3 import *

def cksum(password):
    a = b = BitVecVal(0, 16)
    for byte in password:
        a = (a + ZeroExt(8, byte)) % 0xff
        b = (b + a) % 0xff
    return (b << 8) | a

def alphabetic(password):
    def capital(byte):
        return And(ord('A') <= byte, byte <= ord('Z'))
    def lower(byte):
        return And(ord('a') <= byte, byte <= ord('z'))
    return And(*(Or(capital(i), lower(i)) for i in password))

length = 10

while True:
    s = Solver()
    password = [BitVec('b%d' % i, 8) for i in range(length)]

    s.add(alphabetic(password))

    s.add(cksum(password[:len(password)/2]) == 0xd06e)
    s.add(cksum(password[len(password)/2:]) == 0xf00d)
    s.add(cksum(password[::2]) == BitVecVal(0x0000, 16))

    if s.check().r == Z3_L_TRUE:
        m = s.model()
        print ''.join(chr(m[i].as_long()) for i in password)
        break

    length += 2

Running that code outputs EkNzzzYLJcNk, and navigating to /secret/EkNzzzYLJcNk gives the solution: amaze algorithm.

Statistics

We did all kinds of cool data analysis related to the puzzle, which is written up in part 4 in this series of articles!

Source Code

The full source code for the entire puzzle is available on GitHub.

When we talk about “hacking” at HackMIT, we are talking about the original meaning of the word. In other contexts, it can mean this kind of hacking. ↩

Maze

Design

Implementation

Solution

Stronghold

Design

Solution

Statistics

Source Code

Footnotes