Randomized floodfill

Aside from the more prosaic graph traversals (breadth first, depth first, and the all-encompassing best first), we have a class of randomized traversals that come in useful in roguelikes. The best understood of these we call randomized floodfill.

In its simplest form, imagine a breadth first traversal, your common floodfill. With each node you visit, you push all unvisited neighbors onto the frontier; to visit the next cell, you pop one off the other end of the frontier. If you pick that cell off the frontier at random, you have implemented randomized floodfill.

The simplest class of these is useful for ponds and lakes and other natural formations. Just keep a count of the number of cells you want to visit and stop adding any more to the frontier when you have visited enough (but do be sure to run through the rest of the frontier or you’ll end up with gaps.) A uniform distribution does just fine, but if you experiment with unusual distributions you can get some really exceptional shapes.
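
Here is a minimal sketch in Lua; the function name, the callback interface, and the 4-way connectivity are all my own choices, not anything canonical:

    local function random_flood(width, height, sx, sy, budget, visit)
      local function key(x, y) return x * (height + 1) + y end
      local visited = { [key(sx, sy)] = true }
      local frontier = { { sx, sy } }
      local count = 0
      while #frontier > 0 do
        -- pop a random cell off the frontier; this is the whole trick
        local i = math.random(#frontier)
        frontier[i], frontier[#frontier] = frontier[#frontier], frontier[i]
        local cell = table.remove(frontier)
        count = count + 1
        visit(cell[1], cell[2], count)
        -- once the budget is spent we stop pushing, but we still drain the
        -- frontier, so the finished shape has no gaps
        if count < budget then
          for _, d in ipairs{ {1,0}, {-1,0}, {0,1}, {0,-1} } do
            local nx, ny = cell[1] + d[1], cell[2] + d[2]
            if nx >= 1 and nx <= width and ny >= 1 and ny <= height
                and not visited[key(nx, ny)] then
              visited[key(nx, ny)] = true
              frontier[#frontier + 1] = { nx, ny }
            end
          end
        end
      end
    end

    -- carving a small pond:
    local pond = {}
    random_flood(80, 25, 40, 12, 60, function(x, y) pond[#pond + 1] = { x, y } end)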

If instead of simply pushing neighbors onto the frontier you push edges, you can distinguish among the ways of reaching a cell when two or more of its neighbors have already been visited; in this way you can trivially generate labyrinths, for instance.

You can maintain more state than just the number of cells visited so far. You can make some pretty good mountains (although you need to fiddle with the distribution to make great ones) by simply decrementing the height of each new cell you visit. The random order tends to make this work out as a nicely craggy but generally smooth construction.
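
Building on the random_flood sketch above, a mountain might come out like this (the dimensions and the falloff constant are arbitrary):

    -- Each cell visited gets the current height, which drops a little per
    -- visit; the random order keeps the slope craggy but roughly smooth.
    local heights = {}
    local h = 200
    random_flood(80, 25, 40, 12, 500, function(x, y)
      heights[x .. "," .. y] = h
      if h > 1 then h = h - 0.25 end
    end)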

It’s an open question at the moment whether random visitation can help us solve any of our major pathfinding problems. A best-weighted search, for instance, might fail to find the very best path at times, but it would very probably perform better than A* when the best available heuristic isn’t very good. Using randomness to break ties introduces no risk of missing the best solution and might be especially useful for collaborative pathfinding; most possible sets of collaborative moves are indistinguishable from one another on any given turn, and expanding a node at random is a better bet than always proceeding in the same order.

Give randomized floodfill a try as part of your procedural generation — you’re likely to find that you rarely need anything better than the simplest form.

Engineering PDS

With Dijkstra we have a tried and proven technique for writing powerful AI easily. It does basically everything we need. It’s easy to understand and it’s easy to modify. The result (assuming we don’t provide goals for early termination) can be processed in many interesting ways. This reprocessing leads to chances to recompute with new edge weights, new edge connections, or new starting points; because we don’t just find our path costs and call it quits, this is the Progressive Dijkstra Scan, and in order to make it work, we need the right data structures.

Overlays

Some terminology: a cell is the space in which a monster can stand; a cell contains at least one tile; it exists at a location; in a grid-based game, it is embedded in a grid; it is, in an abstract sense, equivalent to a node or vertex in the graph of the dungeon, and its connections to its neighbors are edges (which is kind of counterintuitive, because you’d think edges would be boundaries if you didn’t know the language of graph theory.)

We’ll start with a graph that represents all the positions in the dungeon, and we won’t worry about the edges yet. We’ll number all the nodes, giving each one its own index. It’s ok if we waste an index on a cell like a wall where nothing can stand and where, for some purposes, we wouldn’t want to have a node; for the use we’ll make of it, it’s really quite good to have one node for every cell. One cell corresponds to one easily computed index corresponds to one node. This doesn’t change from turn to turn or monster to monster.

It doesn’t get simpler than that.

To actually allocate one number for every node, we just make an array of length width * height. We’ll be doing that a lot, so our implementation will treat these as a primitive type, called an overlay. I imagine it as a transparent sheet of plastic I lay over the dungeon. It has exactly the same structure as the map, but it’s not stored in the map. We’ll have lots of convenient methods defined for them, and we’ll use them for all kinds of things: distance/cost/Dijkstra maps; scent maps; edge weight modifiers; gas and fluid maps; masks; field of view scans; map memory; the dungeon map itself; an inverse lookup from node numbers to coordinates; representing derived topologies.

If your language has templates, you’ll be using them for this; you’ll have something like class Overlay<TTopology, TType>. In a language like LuaJIT, with its powerful ffi, you’ll write something like overlay.new(topology, ctype), and internally you’ll allocate cells = ffi.new(ctype .. "[?]", 1 + topology.length). (The topology is the shape of your dungeon; more on that in a moment.)
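
As a concrete sketch of that LuaJIT flavor; everything beyond overlay.new is my own invention, and the topology interface it assumes is sketched in the next section:

    -- overlay.lua, sketched. Assumes a topology with a numeric length and
    -- an index(x, y) method, as sketched in the next section.
    local ffi = require "ffi"

    local overlay = {}
    overlay.__index = overlay

    function overlay.new(topology, ctype)
      local self = setmetatable({ topology = topology }, overlay)
      self.cells = ffi.new(ctype .. "[?]", 1 + topology.length)
      return self
    end

    function overlay:get(x, y) return self.cells[self.topology:index(x, y)] end
    function overlay:set(x, y, v) self.cells[self.topology:index(x, y)] = v end

    function overlay:fill(v)
      for i = 0, self.topology.length do self.cells[i] = v end
    end

    return overlay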

With the right set of operations on overlays, you can trivially make any of these things interact with each other. That’s powerful.

Graph Topology

An overlay is made to fit a topology, the shape of the dungeon. So far we’ve only really looked at the fundamental topology, which captures just the raw grid of positions. Think of it as a conversion between the abstract world of nodes-and-edges and the more concrete world of coordinates and rooms. We’d like the same techniques to work on square maps, hex maps, Penrose tilings, arbitrary non-geometric maps. The fundamental topology is all about extracting the simple, predictable part of the geometry, without worrying about details pertinent to the current dungeon. It needs to know (1) what a coordinate looks like, (2) how to convert a coordinate to an index, (3) how many nodes it has, and (4) how to convert an index back to a coordinate. But since we can just create an overlay of coordinates, (4) is trivial and doesn’t depend on topology at all.
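
A fundamental topology for a plain rectangular grid can be tiny. A sketch, with field and method names of my own choosing:

    -- grid_topology.lua, sketched. (1) a coordinate is an (x, y) pair;
    -- (2) index(x, y) converts it to an index; (3) length counts the nodes;
    -- (4) is served by an overlay of coordinates, as noted above.
    local grid = {}
    grid.__index = grid

    function grid.new(width, height)
      return setmetatable({
        width = width, height = height,
        length = width * height,
      }, grid)
    end

    function grid:index(x, y) -- 0-based, to suit the ffi arrays above
      return (y - 1) * self.width + (x - 1)
    end

    return grid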

If we have more complicated rules to deal with, they’ll be introduced as derived topologies. Derived topologies exist in a hierarchy like an inheritance hierarchy; if two overlays share the same ancestor, it’s easy to intermingle them. That’s one reason we want the fundamental topology at hand; without it, we wouldn’t be able to use interesting results together.

We have different edges depending on what’s moving around the map. An eel can’t move where a gorilla can. So even without worrying about complications like time and slippery ice, there’s no single answer. That fact ends up being fantastically important, because edges only matter when we actually path through the map. We can apply Dijkstra scans repeatedly to the same set of nodes, but respecting different edges, and get interesting compound results.

We’ll represent edges by code, not data; they’ll list out the index and cost of each reachable neighbor of a given node (at a given time). Most of the time we’ll use one specialized to a particular type of monster that knows where a waterbound monster or a flying monster can go, and bases its decisions on the map data we already have sitting around for the dungeon. With a good trace compiler like LuaJIT, these functions won’t even cost you anything, because they’ll just get inlined. Consider ways you can compose several functions to represent edges, such as starting with a really basic one that returns all neighbors and a filter that makes sure a particular creature can actually move over the edges it returns.
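
A hedged sketch of that composition; the emit-callback interface and the water lookup are assumptions, not a fixed API:

    -- Edges as code: an edge function takes a node and a callback, and emits
    -- (nx, ny, cost) for each edge leaving that node.
    local DIRS = { {1,0}, {-1,0}, {0,1}, {0,-1},
                   {1,1}, {1,-1}, {-1,1}, {-1,-1} }

    local function basic_edges(topo, x, y, emit)
      for _, d in ipairs(DIRS) do
        local nx, ny = x + d[1], y + d[2]
        if nx >= 1 and nx <= topo.width and ny >= 1 and ny <= topo.height then
          emit(nx, ny, 1) -- uniform step cost, for simplicity
        end
      end
    end

    -- wrap any edge function in a passability filter for one kind of creature
    local function filtered(edges, can_enter)
      return function(topo, x, y, emit)
        edges(topo, x, y, function(nx, ny, cost)
          if can_enter(nx, ny) then emit(nx, ny, cost) end
        end)
      end
    end

    -- e.g. an eel; 'water' stands in for whatever map data you already keep
    local water = {} -- water[x .. "," .. y] = true, filled in by mapgen
    local eel_edges = filtered(basic_edges, function(x, y)
      return water[x .. "," .. y]
    end)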

Deriving Topologies

Now when we want to find paths in the presence of more complicated features, like ice, we need to build on top of the foundation we just laid. The rule concerning ice in Cogs of Cronus, to recap, is that you can move in the direction you just moved, or 45 degrees off of it, but you can’t double back without first skating in a circle. That means there are nine different states you can be in while standing on the same tile of ice: you just moved left, you just moved up-left, you just moved up, you just moved up-right, you just moved right, you just moved down-right, you just moved down, you just moved down-left, or you weren’t moving at all. We need one node for each of those states. We could just make nine nodes for every cell on the map, but that’s bloated and inflexible.

You can look at a cell and tell immediately how many different states you need to distinguish; if it’s ice you need nine; if it’s a bridge you can stand on or go under you need two; for most cells you only need one, and for walls you don’t really need any. This is a one-to-many relationship, where each cell corresponds to zero or more nodes. The simple trick to represent it is to count the total number of states you need and allocate a derived topology with that many nodes. Then you create an overlay on the fundamental topology, and in every cell you store the index into the derived topology of the first node in it.

To recap: Create a numerical overlay of the fundamental topology. Keep a counter and loop over all of the cells in order. For each cell, store the current value of the counter in the overlay, then increase the counter by the number of states you need for the cell. When you’re done, the counter will be the size of a brand new derived topology, and the overlay will tell you how to index it. Given the coordinate (x, y, s), you’ll pass x and y through to the fundamental topology, ending up with something not unlike derived_index = one_to_many_overlay:get(x, y) + s. (The one_to_many_overlay has a reference to the fundamental topology, so it already knows how to calculate the index from x and y.)
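
In code, the recap comes out to something like this; states_needed is a hypothetical rule mapping a cell to its state count, and a plain table stands in for the one-to-many overlay:

    -- Build a derived topology from a fundamental one; states_needed(x, y)
    -- returns 9 for ice, 1 for ordinary floor, 0 for walls, and so on.
    local function derive(fundamental, states_needed)
      local first_node = {} -- plays the part of the one-to-many overlay
      local counter = 0
      for y = 1, fundamental.height do
        for x = 1, fundamental.width do
          first_node[fundamental:index(x, y)] = counter
          counter = counter + states_needed(x, y)
        end
      end
      local derived = { length = counter }
      function derived:index(x, y, s) -- s is a 0-based substate
        return first_node[fundamental:index(x, y)] + s
      end
      return derived
    end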

In addition to handling cases like ice, derived topologies offer a kind of compression when the dungeon is full of walls, or when you’re looking at a creature that can only move in a rare kind of cell. You derive the topology once and it’s good until the terrain changes. Since that’s a cheap operation, you can do it every turn if your terrain changes that often. And since all overlays of the derived topology make use of the same one-to-many map, it’s not costly in terms of memory.

If you have an overlay on a derived topology, you can convert it to an overlay of the fundamental topology by reducing the various states into one. You can take the lowest value in any of them, or you can take the value from one particular state, or whatever makes sense; the translation should accept a function that controls how it happens.
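
A sketch of that translation, with the reducing function passed in; math.min gives the take-the-lowest-value behavior:

    -- Collapse a derived overlay's substates onto the fundamental topology;
    -- get(x, y, s) reads the derived overlay, reduce combines two values.
    local function flatten(fundamental, states_needed, get, reduce)
      local flat = {} -- keyed by fundamental index
      for y = 1, fundamental.height do
        for x = 1, fundamental.width do
          local n = states_needed(x, y)
          if n > 0 then
            local acc = get(x, y, 0)
            for s = 1, n - 1 do acc = reduce(acc, get(x, y, s)) end
            flat[fundamental:index(x, y)] = acc
          end
        end
      end
      return flat
    end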

You also need to write new edge functions for the derived topology, although you can certainly start from existing ones. For ice, the edge function will need to pick one of the nine target cells based on which direction it’s moving to get there, and it’ll have to consider backwards motion as being blocked entirely. You can start with a function that enumerates the fundamental topology and then filter it further and decide which substate of the target cell you’re actually moving to.

Timing

Edge functions can handle one simple class of timing problems easily, as long as the measure of cost you’re using is a time cost. Imagine a door that will open on turn 10. From a neighboring cell that’s reachable on turn 1, it will take until turn 11 to move into it, for a cost of 10; from a neighboring cell that’s reachable on turn 9, it’ll take two turns to move into it, for a cost of 2. As long as you’re not dealing with spaces becoming dangerous (crushers and the like), this simple scheme will always work: When you pop a node N off the frontier and try to visit a neighbor D that has a timed door, fetch the cost c of reaching N, and calculate the next moment after c at which the door will be open; use that instead of c as you continue with the scan. Instead of leaving the node the very moment we reach it, we account for having to wait until the neighbor we’re going to becomes available.

The more general intuition is that we don’t really care how long it takes to cross an edge; we really care when we’ll get to the other side. It’s actually really helpful to return the end time instead of the delta from the edge function. The edge function can handle everything for you, and your implementation of Dijkstra doesn’t need to understand doors at all.
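
A sketch of a timed-door edge function in that style. The opens_at lookup is hypothetical, and the trailing now argument carries the arrival time at the node being expanded:

    -- Emits arrival times, not deltas: wait (if we must) for the door to
    -- open, then take one turn to step in.
    local doors = {} -- doors[x .. "," .. y] = the turn the door opens
    local function opens_at(x, y) return doors[x .. "," .. y] end

    local function timed_edges(topo, x, y, emit, now)
      for _, d in ipairs(DIRS) do -- DIRS as in the earlier sketch
        local nx, ny = x + d[1], y + d[2]
        if nx >= 1 and nx <= topo.width and ny >= 1 and ny <= topo.height then
          local ready = now
          local opens = opens_at(nx, ny)
          if opens and opens > ready then ready = opens end
          emit(nx, ny, ready + 1) -- when we arrive, not what it costs
        end
      end
    end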

What if you have two adjacent cells in a corridor, slightly out of sync, and instead of blocking you they would kill you? This is really pushing the bounds of best-first search: The soonest you get to the first cell might, in fact, be too soon, and we can’t allow that. So you do a bit of fudging and you pretend that the first cell can’t be entered until it will actually line up with the second one; it’s a dirty hack, but it’s the only way to do it without giving up on best-first search. The problem, of course, is that with nowhere to wait you can’t assume it’s always best to reach the first one as soon as possible.

Algorithms on Overlays

How about a simple operation like taking the sum or product of two overlays? You don’t even need an edge function for that! You just make sure they both use the same topology and loop over the array. This is exactly what overlays are for: The great majority of operations are just as simple as this. Finding minima and maxima, totals, sums and products, clipping one overlay against another, using a condition on one overlay to pick a value from two other overlays — they’re all trivial.
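
For instance, a cell-wise sum, building on the overlay sketch from earlier (the "double" ctype is an arbitrary choice):

    -- one loop over the raw arrays; no edge function in sight
    function overlay:add(other)
      assert(self.topology == other.topology, "overlays must share a topology")
      local out = overlay.new(self.topology, "double")
      for i = 0, self.topology.length do
        out.cells[i] = self.cells[i] + other.cells[i]
      end
      return out
    end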

Given an overlay (which has a reference to its topology) and an edge function, we can write even more interesting algorithms. Often we’ll want another overlay or two for a workspace — we can mark nodes as visited, for instance, or keep a doubly linked list of them (Brogue’s pathfinding actually works this way.) With that trick in hand, it’s trivial to write floodfill; add the first cell to a frontier list, and then repeatedly remove one cell from the frontier, mark it as visited in the workspace, use the neighbor function to visit all the neighbors, and add all the ones that aren’t actually visited onto the frontier. Repeat until the frontier is empty.

You can even do fluid simulation this way, but you’ll want your edge function to indicate how wide each edge is, too. (In a later article on fluid simulation, I’ll lay out the details.)

To implement Dijkstra, start with the same strategy as floodfill, but instead of visiting the frontier in arbitrary order, keep the frontier sorted. That’s all it takes. And if you are treating your costs as time, you can pass the time associated with the node you’re processing through into your edge function, so it can use it for dynamic dungeon features like timed doors. The costmap, which is your output, can also double as the ‘visited’ map, so you don’t even need an extra workspace for that.
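
Putting the pieces together, a compact sketch; a sorted array stands in for a real priority queue, and a plain table for the cost overlay:

    -- Dijkstra over a topology and an edge function: floodfill with a sorted
    -- frontier. The cost table doubles as the visited set.
    local function dijkstra(topo, edges, sources)
      local cost = {} -- keyed by node index
      local open = {} -- kept sorted so the cheapest entry is last
      local function push(x, y, c)
        local i = topo:index(x, y)
        if c < (cost[i] or math.huge) then
          cost[i] = c
          open[#open + 1] = { x = x, y = y, c = c }
          table.sort(open, function(a, b) return a.c > b.c end)
        end
      end
      for _, s in ipairs(sources) do push(s[1], s[2], s[3] or 0) end
      while #open > 0 do
        local n = table.remove(open) -- cheapest open node
        if n.c <= cost[topo:index(n.x, n.y)] then -- skip stale entries
          -- n.c rides along so time-aware edge functions can use it; plain
          -- edge functions simply ignore the extra argument
          edges(topo, n.x, n.y, function(nx, ny, step)
            push(nx, ny, n.c + step)
          end, n.c)
        end
      end
      return cost
    end

    -- e.g. a distance map from one source, with the sketches above:
    -- local dist = dijkstra(grid.new(80, 25), basic_edges, { { 40, 12 } })

Re-sorting the whole array on every push is just the simplest thing that works; a binary heap, or the doubly linked list Brogue uses, is the usual refinement. And if your edge functions return arrival times, as in the timed-door sketch, push the emitted value directly instead of adding it to n.c.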

Make sure that workspaces are cheap; it’s a good idea to have a caching or pooling strategy instead of relying on your language’s built-in allocator. You’ll be making and destroying a whole lot of them otherwise.

A Pathfinding Primer

There’s a thick mist to cut through on the way to good game AI. A data structure, after all, means nothing without a host of algorithms, and the algorithms can’t be described without their data. Everything depends on knowing everything else.

A graph consists of nodes and edges; an edge takes you from one node to another. Usually by a node we really mean a state that our system can enter; maybe it’s a position a monster can be in (quite concrete) or maybe it’s the position of everything in the game (a little more abstract). The edges just describe the way we can move from one state to another; they’re decisions we can make. That’s why graphs matter so much in game AI. They’re important everywhere, but in games we want to make decisions, and graphs make that easy to talk about.

Much of the cleverness comes in when deciding what counts as a node and what counts as an edge. That’s up to you. Sometimes you want edges to go both ways and sometimes you don’t; maybe you can only jump down a cliff, or maybe you don’t allow backwards motion because time only runs one direction. Often it doesn’t even matter (for reasons we’ll see) since the question you’re asking carries with it a sense of direction. There are lots of transformations you can make without changing what your graph means. There are bipartite graphs, for instance, where there are two disjoint sets of nodes, and every edge takes you from one set to the other; for example, you might have a node for a person and a node for a club, and edges tell you who’s in which clubs. If it’s easier you can turn the edges in any graph into nodes and end up with a bipartite graph, or turn one set of nodes in a bipartite graph back into edges, and you can even swap edges and nodes. Or sometimes you want to take a big complicated graph and simplify it by merging lots of nodes into one, losing some detail. The point is that if you ask how it has to be, you’re going to get a lot of waffling. You can do whatever you want with graphs, but that doesn’t help until you know what you can do.

Your graph might not even exist. Oh, sure, in the mathematician’s sense it exists, but it might not exist anywhere in computer memory. It might not even be possible to compute the whole thing before the universe dies. Sometimes the nodes actually exist, but you’ll just work out the edges in code; on a grid-based map, there’s really no reason to construct data representing the step from one cell to another. (If it helps, remember that data doesn’t mean anything without code to read it; so pretending that an edge has to be stored as a pointer or an instance of a class is pretending that it somehow doesn’t take code to follow that edge. In fact, you should always start with the assumption that edges are code, and only do something special if that’s clearly not true. This turns out to be especially convenient if you can easily undo an edge, since certain really thorny problems will then succumb to brute force.) It’s ok for it to be infinite if you don’t need to look at the whole thing to answer your question, but I’ll leave that for the section on Dijkstra.

What You’re Searching For

You’re looking for a path. A path is a sequence of edges. Each edge leads away from the node where the last one left you, and the first edge sets out from the source and the last one leads into the destination, if there is one. The informal language lines up with the formal language here.

When you want to find a path, you’re not just looking for the destination. If you were, you wouldn’t need a pathfinder at all; the path has to tell you how to get there, but the destination is just a location, and usually you’re starting with a destination in mind. (But not always! You can pick a path even when you have no destination in mind, so you should be able to search for a path in that case, too.) A path is a sequence of steps you plan to take, which means it’s a sequence of edges. It will never intersect itself, so it’s actually just a set of edges; but though a set is simpler in math, sequences are usually easier in computing. Still, the intuition that it doesn’t have to be a sequence will sometimes come in useful.

How you represent the path in memory as you build it is up to you, too. You might keep a sequence of steps to take (that is, a literal sequence of edges), or you might mark cells on the map to indicate which step you’re supposed to take out of them, or maybe you’ll keep the sequence of cells the path visits and figure out which direction to go based on that. If you use Dijkstra maps, you might not even have that: instead, you just have a score on each cell, and you construct a path as you go by moving to the best neighboring score. You can think of random walks and Brownian motion as being sequences of edges represented by the random seed!

(This isn’t any different from other data structures. Practice on a few. If you read that a list is “a first element and the rest of the list,” it means that there’s a chunk of code you can run to get the first element, and a chunk of code to get the rest of the list, and maybe all that code does is look something up in memory but that’s an implementation detail. If the rest of the list is represented by code and some state, you get a lazy list instead. Try it with the definition of any data structure you’re familiar with.)

Your search space is the set of all reasonable possibilities — so in the case of paths, you’re not searching “on a map” or “in a dungeon.” You’re searching through the set of all connected sequences of edges. But the practical sense of searching for a path through the map is identical, so you don’t usually need to think about it until you set about hybridizing multiple algorithms.

The Edges of a Dungeon

What do the edges actually look like? Everything depends on game rules. There are no edges leading from a floor into a wall cell, for instance, or through a corner that’s too “narrow” for the player to pass through. But there might be edges leading across teleports, and there can even be edges that depend on the time component of the node (remember that there will be many nodes for each cell if you need to track more state, and that that state might include time).

In Cogs of Cronus, I’ve got ice, which is slippery. That means that in addition to location you have momentum, and momentum affects where you can move next and is determined by where you moved from. So, for ice cells at least, there are actually nine different nodes: one for each direction of motion, plus one for standing still. From any of the eight neighboring cells, stepping onto the ice will lead you (for pathfinding purposes, at least) onto the node that represents that direction of motion; if you are able to stop and you wait then you move to the ninth node, which represents stillness. A little bit of thought will tell you a lot about how your monsters can think about the dungeon.

Try thinking about how you would label each node if you had to drop them all in a bag and then tell them apart later. You might name it by an <x, y> pair, or you might want to have a time component too, <x, y, t>, or you might have something like <x, y, t, <potion a, potion b, potion c>>. Even if some possible labels would never get used, this can give you some hints to help you figure out what you need for your particular problem. You always want to put as little into the label as you can. If you don’t need to know how much time has passed, don’t label the nodes that way. If you find yourself staring at graph paper, confused, you might ask yourself what you’d have to do to keep a path from intersecting itself. Sometimes there’s a bit of state you forgot.

Suppose you’re doing some cooperative pathfinding. A few different graphs make sense, and you’ll use more than one. There’s the graph (call it D) that represents the position of a single entity in the dungeon; the edges are the ways it can move from cell to cell, or wait, or teleport, or whatever. But then you can think of another graph where (let’s take the two monster case for a moment) the label of each node is <monster A is at position a and monster B is at position b>, where positions a and b are actually nodes in the simpler graph. The edges of this cooperative graph, if you imagine the monsters as moving simultaneously, will correspond to all possible pairs of edges of D (one move for each monster). If you imagine that one moves and then the other moves, then you get twice as many nodes (“A is here, B is there, and it’s A’s turn,” “A is here, B is there, and it’s B’s turn.”) but you have far fewer edges. Either way this is a huge search space and it takes some good tricks to do it well.

Search Strategies

A breadth-first search proceeds outward in shells, like an onion, visiting the source first, then all the nodes one edge away, then all the nodes two edges away, and so on out. When roguelike devs talk about floodfill, this is what they mean. To perform a breadth-first search you have to keep track of all of the “open” nodes, which are the ones in the current shell, so you can come back to them and visit their children. It’s a good approach when (as in the floodfill case) you actually want to visit every node in the graph.

A depth-first search just sets out on a path and keeps going until it hits a dead-end; then it backs up one step and tries something else. Backtracking algorithms (like Knuth’s Dancing Links) perform depth-first searches of combinations of constraints. Depth-first needs to remember its path back, which means you either need a stack or you need to mark nodes with the direction to their parent. It’s still up to you what order you pick the children of any node in; making a good guess can save a lot of time.

A best-first search, like a breadth-first search, also keeps track of open nodes; instead of running through all the nodes in one “shell,” though, it repeatedly picks the “best,” or most promising, open node. What the best open node is depends entirely on the problem you’re solving, so best-first search is a huge category. The other strategies can both be treated as forms of best-first, since in the case of breadth-first you can say the best node is the one the fewest steps from the start, and in the case of depth-first you can say the best node is the one that’s the most steps away. But best-first searches really shine when you have a different measure, such as distance, hazard, or time cost. It’s not just that you find your destination faster with a best-first search. This really useful property emerges: The sequence of steps from the source to the destination, when you find it, will also be the best path (or one of several ties for best), because if there had been a better path you would have expanded it first! (You try the best first, so of course the first is the best.)

Practical pathfinding algorithms expand the best nodes first. Doing so is faster, makes it easier to prove that the result is optimal, and is more versatile. But it is useful to see the varieties as being more akin to breadth-first or to depth-first.

Dijkstra is more like breadth-first; it checks all of the options closer to the source before proceeding further. A* is more like depth-first; it sends runners out towards the destination, and only defaults to a Dijkstra-like strategy when they crash into obstacles. A* requires more information about global geometry than Dijkstra, since it has to know which nodes are closer to the destination in order to send runners out towards it (this knowledge is called the “heuristic,” since it’s a rule of thumb for judging how far is left to go). We also normally let Dijkstra keep going after finding the destination, or we don’t even have a destination in mind, in order to get costs to every cell on the map; A* we normally terminate as soon as it finds the destination, and treat the result as the best path.

But you can terminate Dijkstra after finding a destination, or (if there is more than one), after finding a certain number of them, and you can also use a heuristic in order to find that destination faster. Make both of those simple changes and you’ve turned Dijkstra into A*.

In any search you can terminate after visiting a certain number of nodes, or after going a certain maximum distance, although then you can’t guarantee that you will find a result even if one exists; it might happen to be further away than you’re willing to look. For an AI this is generally quite ok. You can get passable results by running shorter searches most of the time, and proceeding to longer searches only if you have to, creating the appearance of indecision.

With Dijkstra we often want to visit every node, so we have to simplify our graph as far as possible. We usually want one node per cell of the dungeon, and we want edges to correspond to steps we take, and we don’t want to worry about things like backtracking. With A*, because we zero in on a path that looks quite likely, we can travel a more elaborate graph, or at least keep more information about each node. The rule of thumb is that with Dijkstra, we get exactly one number per node to treat as state, and that’s the score that we were already using to direct the best-first search.

The score, which is the cost of the best path so far, is the one extra piece of state you can track in Dijkstra without allocating additional space. (Really, though, it’s just that you’ve already allocated extra space for it.) If the number is time, then you can use that very carefully to let edge costs change with time. Monsters can wait for doors to open, or even wait for a floating platform to arrive. If that number represents an accumulated “hazard,” the chance of taking enough damage to be killed, then perhaps we believe that a certain kind of risk no longer matters after a certain threshold is passed (imagine separate shields and health, where some damage only affects shields.)

Neither A* nor Dijkstra allows us to speak of any property of the path as a whole. They cannot, for instance, prefer paths that look the most like straight lines when they move through open rooms; that must be achieved by post-processing. But if you are willing to track some extra state you can certainly minimize twistiness, defined by the number of times you have to change direction, by the same trick I use for ice in Cronus: give yourself eight nodes per dungeon cell and charge extra whenever the next step is in a different direction than the last. You can track the properties of the path, such as whether it crosses a certain kind of cell, and even take that into account in choosing the best step to take next — but only as a tiebreaker, as with lexicographic ordering.

Gradients

It is possible, in some simple cases, to find paths without doing any pathfinding at all, as such. When you have a gradient map of any kind, representing (perhaps) the distance from valuable items, or the most recent time that a cell was visible to the player, or when cells were last traversed by fleeing monsters, it is possible to find a local optimum by simply rolling down (or up, if that’s how you roll) the map. Find the neighbor with the best score and move there by whatever edge connects you.
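
A sketch of one such step; the cost table and the DIRS list follow the earlier sketches:

    -- One step down a gradient: inspect the neighbors, move to the best.
    local function roll_downhill(topo, cost, x, y)
      local best_x, best_y = nil, nil
      local best = cost[topo:index(x, y)] or math.huge
      for _, d in ipairs(DIRS) do -- DIRS as in the edge-function sketch
        local nx, ny = x + d[1], y + d[2]
        if nx >= 1 and nx <= topo.width and ny >= 1 and ny <= topo.height then
          local c = cost[topo:index(nx, ny)]
          if c and c < best then best_x, best_y, best = nx, ny, c end
        end
      end
      return best_x, best_y -- nil, nil when we sit at a local optimum
    end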

If this gradient has been generated by a Dijkstra search, then the result will in fact be an optimal path to the destination. Influence maps and scent maps also produce gradients. It is possible to perform arithmetic on them, cell by cell, adding or multiplying them together; consider that our computers are more than a thousand times faster than they were thirty years ago, when arithmetic on single numbers was cheap enough not to worry about, and conclude that arithmetic on thousands of cells must now be cheap, too.

You can pick a random cell in a gradient map, if you treat each cell as a score, by picking a random number between one and the sum of all cells, then looping over the cells and subtracting each score from that number until it drops to zero or below.
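
In code, assuming non-negative integer scores (math.random wants an integer bound):

    -- Pick a cell at random, weighted by score.
    local function pick_weighted(topo, score)
      local total = 0
      for i = 0, topo.length - 1 do total = total + (score[i] or 0) end
      if total < 1 then return nil end
      local r = math.random(total)
      for y = 1, topo.height do
        for x = 1, topo.width do
          r = r - (score[topo:index(x, y)] or 0)
          if r <= 0 then return x, y end
        end
      end
    end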

You can treat any gradient map as a cost map for each step in another pathfinding scan. You can also treat one as the initial state when pathfinding; the number in each cell tells you how much it costs to start there. (Brogue makes extensive use of this.) Making a Dijkstra scan, manipulating the resulting gradient, and then running another scan on the result, is a potent technique.

Whitespace Reasoning

Our agents need to make sense of a complicated world. They need to outwit the player every so often and they need to make convincingly organic mistakes, not the systematic outcomes of obvious clockwork. There are simpler problems to deal with, too; they need to maneuver and plan in a world that changes, not only with time within a playthrough, but over the scale of development. You can’t afford to hold the hand of every monster you’ve ever added, every time you add another.

(A word about the brain, by way of metaphor. We think the purpose of the brain is to deal with the world, but its first purpose is to deal with the body. It doesn’t know, until it learns, how tall you are, how many colors you can distinguish, or even how many limbs you actually grew. Its job is to cope with whatever the body throws at it. If the brain had to change every time the body changed, every mutation would be abortive and every amputation, mortal. Equate the body with the game world, not with the monster’s body; good AI lets you change the game world without worrying about breaking carefully crafted monster behaviors. It’s about looser coupling.)

I’ll start by discussing the general principle of objectification (how you pretend that your beliefs about the world are actually out in the world, and how that idea can simplify monster AI), move on to the more specific principle of whitespace reasoning (which is the point of bringing up objectification at all), and then start in on Dijkstra scans from the very particular perspective of the roguelike developer. In later posts, I’ll develop these themes more completely; this is something of a cross-section.

Objectification

It is profitable for an actor not to think about what it can do. It should project its capacities onto the world around it, asking what the world can do for it. This is a very practical view, but it seems strange because we use it automatically, without understanding it. An everyday object like a hammer seems to be for driving nails and, if it has a claw, for pulling them; this is the only property of a hammer that sets it apart from other objects of similar heft. But this isn’t an ability of the hammer at all — you’re projecting your own ability (that is, to swing) onto it. Your actors need to objectify the world, and each other, and themselves.

Because video game actors spend most of their time moving around, it makes sense for them to objectify the space around them. You can see this as offloading thought from the actor onto the space around the actor — whitespace reasoning. Instead of imagining that your actors think for themselves, pretend that the space they’re standing in does the thinking for them. The floor knows there’s food in the fridge; the road knows there’s work at the factory; the grass knows there’s a wolf nearby. You can think in terms of ants and their pheromones if it helps you, but I suggest that you not think about it in those terms too long. Instead, think of places that feel strange even when they’re not serving their usual function. Consider how you feel standing in the middle of an empty street, or when you’ve wandered into the wrong neighborhood, or when you’re stuck in the woods. I feel queasy cutting across a queue, during the moment I intersect it geometrically.

You want to imagine that your actors feel this way.

You’ll most often see this done by analogy with ants or hounds, though. In a scent map everything interesting leaves a smell, and smells diffuse and fade with time. Most models you’ll see go in for a very literal interpretation based on physical diffusion, which is curious, and an example of taking the analogy too far. They also tend to stop right here, because the analogy doesn’t get you any further. Influence maps are scent maps by a different name, and I haven’t seen them used in any more sophisticated way. They’re often worse, using an exponential falloff function for which there is simply no justification, but which makes reasoning about the map much more difficult.

Neither of these, of course, does more than compute distance, with a few wrinkles that are hard to formalize. Both, for instance, propagate signals slowly — a more distant change takes longer to affect local values. (The reduction of jittering caused by slower propagation can justly be adduced in its favor, but jitter is likely to reemerge in unusual situations; it is better dealt with directly.) Both perform poorly, but as they are generally used sparingly, this is no great loss. We’ll be using a more general formulation that is faster, easier to integrate with other techniques, and much easier to reason about formally.

Enter Dijkstra

If you haven’t read it, read Brian Walker’s The Incredible Power of Dijkstra Maps. Brogue shows just how exciting a sprinkling of pathfinding techniques can be. The algorithm given in the article performs suboptimally, but it hardly even matters. It’s amazing what you can get away with, and it’s worth it even if you can’t put together the most efficient scan. You can do more with distance maps than immediately presents itself.

Brogue’s safety maps, which are used when monkeys run away from the player, are calculated by scanning the distance from the player, multiplying by a constant smaller than -1, and scanning again. This is a totally different world than anything done with influence maps or scent maps. The first scan detects cells that are most distant from the player; the negation turns those cells into goals, and the constant causes the monster to be willing to dash by the player to get further away. Several nonlocal effects get rolled into one map, and the monsters exhibit global thinking — all without looking further than neighboring cells. Space itself has done the thinking for them.
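
Using the dijkstra sketch from the Engineering PDS notes above, the whole recipe is only a few lines; the -1.2 constant is illustrative:

    -- Safety map: distance from the player, times a constant below -1, fed
    -- back in as per-cell starting costs for a second scan. Lower is safer.
    local function safety_map(topo, edges, px, py)
      local from_player = dijkstra(topo, edges, { { px, py, 0 } })
      local sources = {}
      for y = 1, topo.height do
        for x = 1, topo.width do
          local c = from_player[topo:index(x, y)]
          if c then sources[#sources + 1] = { x, y, c * -1.2 } end
        end
      end
      return dijkstra(topo, edges, sources)
    end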

In Brian’s fourth case (“Desire Driven AI”), you’ll want to make sure to run another scan after adding the desire maps together, and then add some multiple of the original sum back in. Why? You don’t want to get stuck in a local rut, but you really would like to stop by water on your way to a more important goal. These Dijkstra maps treat every cell as a potential source; the scan doesn’t take a single source and minimize its cost, nor even a list of sources. Instead it lets you give every cell an effective cap. If you can’t get there cheaper from somewhere else, then its own influence kicks in. This is absolutely invaluable for safety maps and similar global reasoning.

What’s in a node?

We needn’t assume, even in a grid-based game, that one grid cell is one node. It’s certainly possible to reduce the number of nodes we need to look at; you can set one node at every doorway, for instance, and assume it’s easy to path across rooms. Each node, in that case, would represent more than one grid cell, speeding up pathfinding considerably. But that kind of optimization works against us here; we’re looking for new interpretations of space, and that trimming would lock us into one model.

More than one node can belong to a single grid cell. This is much more useful: It lets us track state other than position. We don’t want to track any kind of global state this way because it results in an exponential explosion of nodes, but we can track local state very easily. So keeping track of which doors have been opened is probably a bad idea, but if doors close automatically after two turns it’s fine (because we don’t need separate nodes for that state when we’re further than two steps from the door.) If there’s slippery ground, we can have nine nodes for one cell: one for each direction of motion, plus one for standing still; stepping from a neighbor onto such a cell actually brings us to the node that includes the proper direction.

For every two sets of state we want to track, we need to track their direct product. If we need to track whether a door is open and which way we’re sliding on ice, we need 2*9=18 nodes. If we need to track eight doors and sliding, we need 256*9=2304 nodes, all for one cell. We obviously need to control this carefully, but the general technique is too powerful to give up entirely; it’s also quite inexpensive, used carefully.

Dijkstra and A*

The explosion of nodes representing various states is the major reason Dijkstra ends up falling flat for us: Every node has to actually exist, and we expect to visit all of them. When that’s just one node per cell on an 80×25 map, that’s 2000 nodes; smaller than a couple lines of a high-resolution bitmap. When it’s sixteen nodes per cell on a 400×300 map, we hit 1.92 million nodes, about a thousand times bigger and a thousand times slower. Or keep the map small, still just 80×25, but give the player 26 potions to drink: each turn, the player can either move or drink one potion. We end up needing at least 2^26 * 2000 nodes, because at any given time each potion will have been used or not; in practice we need even more, so we can track the state of the effects caused by drinking potions. If each cell takes one byte (which is unreasonably compact), that’s 125 gigabytes. Good luck.

Now, one way to simplify things is to construct nodes only when we need them; until your search algorithm tries drinking the Potion of Levitation, don’t actually allocate cells for that case, and then only allocate them on the nodes you visit. But we have to add a cutoff — either limit the number of nodes we expand, or limit the maximum distance we’ll search. Neither is entirely desirable, particularly since we’ll still only explore the immediate neighborhood. If the puzzle is remotely difficult, and the potion-drinking searcher needs to make it across the map, we’re pretty much certain not to find an answer.

A* and Dijkstra solve the same basic problem, but A* relies on a few extra hints in order to avoid visiting every cell. (This isn’t a huge gain for us in the 80×25 case; whitespace reasoning is all about visiting every cell, and it’s really quite cheap to do so, as long as we avoid an explosion of states. We do need A* for some very unusual problems, and we’ll come to those in a moment.) The first hint is a destination; we have a particular node (or set of nodes) that will short-circuit our search, if we happen to bump into them. (We can add a destination to Dijkstra, too; the result is halfway to A*). The second hint has to do with the order in which we search. Instead of looking all around our starting point, we take advantage of geometry to start out walking blindly towards the destination, based on an admissible heuristic, which is a guess as to how much further we need to go before we get to the destination. It’s not actually a guess: It’s got to be the absolute best case. On a simple grid-based map, it’s the number of steps we’d have to take if there were no walls anywhere.

If the heuristic is always exactly right, then the search will always go straight to the destination, giving the best possible path as fast as possible. The worse the heuristic underestimates the real cost, the longer it will take us to find the optimal path, but we will eventually find it. If the heuristic ever, even once, overestimates the real cost, it becomes possible that we will never find an optimal path. A heuristic that can produce an overestimate is inadmissible.

On a map with teleports we cannot, in general, use any better heuristic than the distance to the nearest teleport (plus the distance from the destination to the nearest teleport to it). If the map is totally non-Euclidean and nodes are connected arbitrarily, as they might be in interactive fiction or on a MUD, then all bets are off. We simply cannot generate a valid heuristic; there’s no grid to overlay on the nodes, anymore.

On the other hand, we’re allowed to track extra state with our A* nodes. We actually can attempt to solve a puzzle that involves drinking potions in a certain order. (Risto Saarelma wrote an AI for Vector Racer, in which players manually adjust the x and y components of a car’s velocity to navigate a track drawn on graph paper; it involved an A* search over the quartet <x, y, xv, yv>. This kind of thing is valid and really quite practical.) The only problem is that our heuristic is going to be terrible.

Dijkstra to the rescue. We can actually perform a Dijkstra scan under better-than-possible conditions and use the result as our heuristic — it’s an absolutely glorious possibility. It means that while our A* search is making the hard decisions, it doesn’t have to deal with things it’s bad at, like navigating a maze. Dijkstra, which is great at navigating mazes, has already handled that part. If the A* just follows the result of Dijkstra, whenever it’s possible, it’ll get to its destination. All that A* needs to handle, in this case, is the extra state that Dijkstra couldn’t handle. It’s a perfect pairing; the weakness of one plays off the other.
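
As a sketch of the glue, assuming symmetric edges (so a scan outward from the destination yields cost-to-go) and relaxed edges that are never costlier than the real ones:

    -- Scan from the destination over a relaxed graph (real walls, but none
    -- of the extra state); the result never overestimates the true cost to
    -- go, so it's an admissible heuristic for the full search.
    local function make_heuristic(topo, relaxed_edges, dest_x, dest_y)
      local to_go = dijkstra(topo, relaxed_edges, { { dest_x, dest_y, 0 } })
      return function(x, y)
        return to_go[topo:index(x, y)] or math.huge
      end
    end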

Take group coordination. It’s easy to do approximate group coordination with Dijkstra alone; monsters can path towards their target, try to stay a certain distance from each other, or follow a leader just fine. But they’ll bump into each other and you need to add game rules, like displacement, to resolve it. To do a Dijkstra scan for two monsters, over a map with N nodes, you now need N*(N-1) nodes — each monster can be anywhere. Even if you force them to stay within five steps of each other (the implementation of which is an interesting challenge), it doesn’t help matters much, and you risk missing an otherwise valid solution. But A* can handle the two monster case just fine, and for many more monsters, too, as long as it has useful Dijkstra scans to help it find its way.

(Other refinements to A*, like Jump Point Search, amount to a simplification of the graph the search is being performed on; they’re really only useful when certain assumptions about symmetry hold, and we’re not willing to make those assumptions. Instead, we’ll go on designing around the more general cases.)

Future Topics

In a later post, I’ll be discussing the design and implementation of the PDS (Progressive Dijkstra Scan) suite and the novel possibilities that these techniques open up. Maps no longer have to be static; monsters can understand levers, wait for each other to finish walking through narrow tunnels, and make sensible decisions about their inventories. More important than anything, though, the bad decisions they still inevitably make will make sense to players — and the bugs (for there will be bugs), will be hilarious.