Of Union-Find and voodoo magic

Logo

Earlier today I ran into a simple connected component problem: ensuring that a graph was fully connected, i.e., that there exists a path from any vertex in the graph to any other vertex. “Ha ha!”, thought I, “this is easy to solve with Union-Find!” I reached for my trusty copy of CLRS and sure enough, the first example in the chapter on disjoint-set data structures is the very problem that I was just thinking about.

Once in a while I run into a situation that calls for Union-Find, and every time it brings me back to my carefree undergraduate days of studying computer science. This was possibly the first data structure that truly fascinated me. More than that: this was my first encounter with the idea of introducing both a data structure and an algorithm that worked together to solve a problem in a relatively straightforward, if not completely obvious manner. I knew about lists and trees, and I knew about quicksort and Dijkstra’s shortest path algorithm; but this was likely my first insight into what programming could really mean (although I doubt that I had the maturity to realize it consciously at the time, and it took quite a while to sink in).

function connected(tiles) {
    const passableTiles = tiles.flat().filter(tile => tile.passable);
    const parents = new Map();
    for (const tile of passableTiles) {
        parents.set(tile, tile);
    }

    function findSet(t) {
        for (; parents.get(t) !== t; t = parents.get(t)) {}
        return t;
    }

    function connect(t, u) {
        if (u.passable) {
            const rt = findSet(t);
            const ru = findSet(u);
            if (rt !== ru) {
                parents.set(rt, ru);
            }
        }
    }

    for (const tile of passableTiles) {
        connect(tile, tiles[tile.x - 1][tile.y]);
        connect(tile, tiles[tile.x][tile.y - 1]);
    }

    const roots = new Set();
    for (const tile of passableTiles) {
        roots.add(findSet(tile));
    }

    return roots.size === 1;
}
Connected components with Union-Find

And although I keep Union-Find close to my heart, its actual implementation is further away from my fingertips (hence the appearance of CLRS in the first paragraph). Above is the version that I wrote earlier today, which I did as I was reading through the map generation part of this Roguelike development tutorial (in preparation of this year’s 7DRL challenge). Here the twist is that the graph is implied by the adjacency of “passable” tiles in a grid; the Union-Find structure is simply a JS Map object that maps a tile to its parent, building the trees of equivalence classes, where two tiles are in the same equivalence class if there is a path from one to the other. Every tile begins as its own parent, meaning that it is in its own class, then all tiles are traversed in grid order (column first, then row; so we need only check the two neighbouring tiles to the right and above) to see whether their neighbours are passable. Note that this is a very simplistic implementation that does not do anything fancy like path compression, but the grid is quite small so that will do for now.

Yesterday was the 29th of February, another occasion to reminisce, as I do every four years, except if the year is a multiple of 100, but unless it is also a multiple of 400. First problem of my first algorithm class in college: writing a predicate that evaluates to true if a given year is a leap year. And every four years, except if the year is a multiple of 100, but unless it is also a multiple of 400, I marvel at all the stories of weird bugs that happen only on that day. Surely all the people who failed to take this day into account, or messed up the computation, are terrible programmers! Maybe they overslept and were late to that first lecture, and never heard about this problem?

A kid showing a hand painting to another kid
“In this piece, I wanted to play on the opposition between man’s desire to create and his impulse to destroy.”

A third meaningful early discovery was Leslie Lamport’s vector clocks, as described in the 1978 paper Time, Clocks, and the Ordering of Events in a Distributed System (of course, I did not read the actual paper until much later). The first mind-blower here was being introduced to what seemed like impossibly complicated distributed systems. My understanding of the core idea, as best as I can tell from reading the paper, but not implementing it, is that it is difficult for distributed systems to agree on the current time, which is necessary to order the messages that they send each other. The solution is for each system to keep track of the timestamps of the received messages and ensure that the local clock is in agreement.

The reason why this idea really stayed with me is that it inspired a board game that a friend and I created using this idea as a core mechanic: players are pirates, sailing around the Caribbean Sea, looting Spanish galleons, and earning infamy as they go. However, at the end of the game, the winner is not the player with the highest score, but the one who the other players think has the highest score. One possible action is for a player to meet another one, and tell them about their exploits (but also about those of their peers); like the timestamps in Lamport’s clocks, each player’s view of the scores of the other players is adjusted when communication happens. The goal was to force the players to interact with each other, and balance the need to broadcast their own score, while also helping their competitors along the way. We playtested it a couple times, but the game did not really go anywhere. Evidently we were trying to cram too many other ideas, such as having to bury members of your crew on desert islands with your treasure in order to score points; spending some of your loot as a way to bend the rules; and, of course, voodoo magic. ⚄⚁