The Intuition Behind NIM and Grundy Numbers in Combinatorial Game Theory

Правка en8, от Shisuko, 2019-04-15 18:13:53

There are already many wonderful resources online for learning the basics to combinatorial game theory, but the intention I have for this guide in particular is to try to make them more intuitive. Why does XOR work for Nim? Why does mex produce the Grundy numbers? When it was taught to me, it was more or less just as 'magic' to memorize, but now I think I have a decent understanding of why they are true, so now that I have a grasp of the intuition behind it, I decided I should record my thoughts into a blog.

Nim games

Suppose we are playing a Nim game. As we know, Nim games usually follow this sort of template:

  • There are $$$n$$$ piles of stones, and each pile contains $$$p_1, p_2, p_3, \cdots p_n$$$ stones, where each of the $$$p_i$$$ is some nonnegative integer. This collection of piles defines our game state.

  • There are two players who take turns removing stones. The current turn player may remove as many stones as they wish, so long as all the stones come from the same pile. More formally, they choose a pile $$$i$$$ and a number of stones to remove $$$0 < s \leq p_i$$$, then replace $$$p_i$$$ with $$$p_i-s$$$. You cannot pass your turn doing nothing. Then, it is the other player's turn to remove stones, and so on.

  • The game ends when there are no stones left on the board. The winner is the one who took the last stone; alternative, the loser is the first one who cannot take a stone on their turn.

  • The game state is called a winning state if there exists a winning strategy for it. That is, the current turn player has some series of moves such that they will always be able to win, no matter what the other player does---a checkmate, so to speak, or more specifically, when a game of chess is locked down and destined to end in a checkmate in a few moves; even if the checkmate isn't in position yet, it is still inevitable. Conversely, a game state is called a losing state if no matter what the current turn player does, their opponent will always have a winning strategy to eventually win.

The question then usually goes something like this: Given an initial game state $$$p_1, p_2, p_3, \cdots p_n$$$, is the first player going to win, or the second player, assuming both play perfectly? I.e., is the initial game state a winning or a losing state?

There exists a fairly standard $$$O(n)$$$ formula to solving the Nim game, though I'm not sure how many people actually know why it is true. While I am normally against memorization without proof, it is far, far easier to explain the actual strategy compared to the effort needed to prove that it works, and understanding why it works doesn't really add too much insight in the contest problems themselves, so I can see why most people don't bother with the proof. But, we are nerds, and we like knowing why the things we do work, and so here is my attempt to naturally and intuitively derive for ourselves what this magic formula is.

You can only take stones from one pile at a time, so it seems like a good idea to get some information from each pile, then find some way to compose that information together to describe the board state as a whole. It's rather mathy, but as an analogy, think about multiplicative functions---to evaluate a multiplicative function at $$$n$$$, we decompose $$$n$$$ into its prime powers, more easily evaluate the function at each prime power $$$p^k$$$, then compose the answers for each $$$p^k$$$ together to get the answer for $$$n$$$.

So, what we will do is:

  • For each game state $$$g$$$, let us assign to it some integer "Nim-value" $$$G$$$ which tells us whether this is a winning state or a losing state.

  • Given two game states $$$a$$$ and $$$b$$$, let $$$a+b$$$ be the game state that is simply the piles of $$$a$$$ together with the piles of $$$b$$$. If their Nim-values are $$$A$$$ and $$$B$$$ respectively, then define $$$A\oplus B$$$ to be the Nim value of the game state $$$a+b$$$. We are assuming (and hoping) that there does in fact exist some operator that allows us to compose together the Nim-values of different games together easily.

Now, let's tackle the simplest possible game---one that consists of a single pile with $$$p$$$ stones in it.

  • Intuitively, since the only information that distinguishes the piles from one another is the number of stones in them, let's say that the Nim-value of a game that consists of only a single pile is $$$p$$$.

  • We see that a single pile is a winning state if $$$p>0$$$, and is a losing state if $$$p=0$$$. That's something we'd like to see reflected in the Nim-values as well---a Nim-value $$$G$$$ corresponds to a winning game state when $$$G>0$$$, and corresponds to a losing game state when $$$G=0$$$.

Next step, let's find a way to compose Nim-values together with this operator $$$A \oplus B$$$. We should notice the following properties about it though:

  • $$$(A \oplus B) \oplus C = A\oplus(B \oplus C)$$$, and $$$A \oplus B = B \oplus A$$$. That is to say, $$$\oplus$$$ is associative and commutative. Since we're just combining the piles together, order shouldn't really matter, so this should be apparent.

  • $$$A \oplus 0 = A$$$. That is to say, the identity for this operator is 0. We obviously want this to be true since if you place an empty pile next to an existing game, you should still effectively have the same game after all.

  • $$$A \oplus A = 0$$$. That is to say, the inverse of each game state is itself. Think about why this is true---suppose we have two piles that each have $$$p$$$ stones in them. The second player always has a winning strategy here, which is to just mirror the first player's move! For some generic game state $$$A$$$, note that the game state $$$A\oplus A$$$ guarantees a respective 'mirror image' for each pile; if the first player makes a move on some pile, the second player can always take the same number of stones from the 'mirror image' of that pile. Thus, the first player will always be the one to run out of moves first, therefore this is a losing state, and therefore $$$A \oplus A = 0$$$.

You'll notice that together, these properties define a commutative group over the nonnegative integers! In fact, take a look at that last property in particular, that $$$A \oplus A = 0$$$. The inverse of every element is itself, i.e. every non-zero element in the group has order 2. This is actually a huge hint because it turns out the only family of groups that have this property is that with the set $$$\{0,1\}$$$ and the operation of addition modulo 2, or some ordered $$$n$$$-tuple of $$$\{0,1\}$$$ with the operation of addition modulo 2 done between each corresponding component; aka, the logical XOR and the bitwise XOR.

Bingo! We have a winner. It seems our operator $$$A\oplus B$$$ is just the bitwise XOR of the two Nim-values. In fact, since XOR is just addition modulo 2, in literature the Nim-values are usually called the Nim-sums of the game state. However most of what I gave consists of heuristics and intuitions---now that we have this operator as a candidate, let's try to formally prove that our magic formula works.

Given a game of Nim whose current game state consists of $$$n$$$ piles with $$$p_1, p_2, p_3, \cdots, p_n$$$, then the current game state is a winning state if the Nim-sum $$$p_1 \oplus p_2 \oplus p_3 \oplus \cdots \oplus p_n$$$ is non-zero, and a losing state if it is equal to zero, where $$$\oplus$$$ is the bitwise XOR operation.

  1. The empty game is a losing state with Nim-sum 0. This follows from our observations earlier.

  2. If the current Nim-sum is 0, then any move will make it non-zero. Recall that a move replaces $$$p_i$$$ with $$$p_i-s$$$. So, keeping in mind that the XOR-inverse of each integer is itself, that means the new Nim-sum after a move is made becomes $$$p_1 \oplus p_2 \oplus p_3 \oplus \cdots \oplus p_n \oplus p_i \oplus (p_i-s)$$$; we remove $$$p_i$$$ from the Nim-sum, then add in $$$p_i-s$$$. But, if the current Nim-sum was 0, then this is just $$$p_i \oplus (p_i-s)$$$, and this is only equal to 0 when $$$p_i=p_i-s$$$, but the rules state you have to always take at least one stone on your turn, so $$$s \neq 0$$$, and it is impossible for this new Nim-sum to still be 0.

  3. If the current Nim-sum is non-zero, then there always exists a move to make it equal to 0. Suppose the current Nim-sum is $$$N>0$$$. We need to choose $$$p_i$$$ and $$$s$$$ such $$$p_i \oplus (p_i-s)=N$$$, subject to the constraint that $$$0 < s \leq p_i$$$. Consider the fact that $$$N$$$, since it is nonzero, must have a greatest significant bit, the $$$g$$$th bit. Furthermore, at least one of the piles $$$p_i$$$ will have the $$$g$$$th bit as a 1 as well; if all the piles had a 0 there, then it couldn't have been the greatest significant bit of their XORsum after all. So, we choose one such pile as our $$$p_i$$$, and we construct our $$$p_i-s$$$ as the following:

  • All bits greater than the $$$g$$$th bit we keep the same as $$$p_i$$$.

  • The $g$th bit is set to 0.

  • All bits less than the $$$g$$$th bit we make match those of $$$N \oplus p_i$$$.

Notice that this construction is always less than $$$p_i$$$, since we know their first difference is at the $$$g$$$th bit, therefore this is a valid choice of $$$p_i-s$$$, from which we can easily recover what the $$$s$$$ should be. We can see that the new Nim-sum is 0 because

  • All bits in $$$N$$$ greater than the $$$g$$$th bit were already 0 (since the $$$g$$$th bit was defined to be the greatest significant bit). Then, we know that all bits in $$$p_i \oplus (p_i-s)$$$ greater than the $g$th bit are also 0, since that was how we constructed it.

  • The $$$g$$$th bit in $$$N$$$ is 1, by definition as the greatest significant bit, and the $$$g$$$th bit in $$$p_i \oplus (p_i-s)$$$ is also 1, since it is 1 in $$$p_i$$$ and 0 in $$$p_i-s$$$.

  • All bits in $$$p_i \oplus (p_i-s)$$$ less than the $$$g$$$th bit are guaranteed to match that of $$$N$$$, since $$$p_i \oplus (p_i-s) = p_i \oplus (p_i \oplus N) = N$$$ in this range.

The best way to convince yourself and see that this does in fact set the new Nim-sum to 0 is to write it out yourself and try it with some concrete values.

Now, with these tools in hand, we can see that:

  • If the current turn player has a zero state, they have no choice but to 'disturb' it into a nonzero state.

  • The next player then, given a nonzero state, always is able to 'restore' it back into a zero state.

  • Thus, if the players know what they are doing, the first player is helpless and always stuck with a game state whose Nim-sum is 0; since this remains invariant and there are only a finite number of moves, eventually when the game ends with the empty game whose Nim-sum is 0, we know the current turn player, and loser, is the first player.

  • Similarly, if the current turn player has a nonzero state, they can restore it to a zero state and pass the board over to the second player, who by our logic, must be the loser, thus the first player is the winner.

This proves our theorem, and we can determine who should be able to win in $$$O(n)$$$.

Extensions on NIM and Grundy Numbers

No self-respecting problem-setter would ever give just a vanilla Nim game in 2019, though. Usually there will be some variation on the rules somehow. Consider the following, for instance.

Say you are similarly given a game state of $$$n$$$ piles with stones $$$p_1, p_2, p_3, \cdots p_n$$$. The normal rules of Nim apply, except there is an extra move: instead of removing stones from a pile, the current turn player may choose to spend their turn adding any number of stones to a pile instead. They can add as many stones as they want, so long as, again, they are all added to the same pile. Assume this addition move may only be done a finite number of times per player.

It turns out that... this is exactly the same as normal Nim! Why? Assume that the current turn player is in a losing state, under the normal set of Nim rules. However, if they add $$$s$$$ stones to some pile, the other player can just spend their turn removing $$$s$$$ stones from that same pile, effectively undoing the last move, and then the losing player is back exactly where they started. A winning player, meanwhile, can just play the game like normal Nim, only deviating from the strategy to undo any additions the opposing player makes, as necessary.

Other games though are not quite as simple. A common tweak is putting a limitation on the number of stones you can get from a pile. What if there is a condition that you can only take a perfect square number of stones from a pile? What if you can take no more than half the stones in a pile?

Let us turn back to our original strategy---decompose the game state into several piles, find some property for each of the piles, then compose them again together to get the answer.

So, given only a single pile with $$$p$$$ stones, how do you know if it is a winning state or a losing stats? Earlier, we simply operated on $$$p$$$, but we'll need something a bit more sophisticated here.

Let's reframe this as a directed graph. Let there be $$$p+1$$$ vertices in this graph, labelled $$$0$$$ to $$$p$$$. Each vertex corresponds to some game state, where its label indicates how many stones are in the pile at this state. Then, draw a directed edge from $$$u$$$ to $$$v$$$ if it is possible to transition from $$$u$$$ stones to $$$v$$$ with a valid move. For instance, if you may only take a square number of stones from the pile, then vertex 11 would have a directed edge to vertices 10, 7, and 2, since I could take 1, 4, or 9 stones from that pile. The terminal vertices---the ones with outdegree 0---are our loss states.

We could imagine solving this with DP then. Let the terminal vertices be, by definition, loss states. Then, if a vertex points to at least one losing state, it is a win (the current turn player takes that moves and puts the opposing player in a loss state). If a vertex only points to winning states, then it is a loss (the current turn player has no ways out).

Let's take this a step further though and treat it like our Nim-values from earlier; instesd of just assigning win/loss, let's assign some integer value to each vertex such that 0 is a loss and nonzero is a win. This is usually called the Grundy numbers, or more cutely, the Nimber of a pile. I will be using Grundy numbers because that's what I'm used to, but I think Nimber is an absolutely fantastic word.

Let the $$$G(u)$$$ be the Grundy number of a pile with $$$u$$$ stones. The Grundy number of a terminal state is 0; otherwise, $$$G(u)$$$ is recursively defined as the minimum excludant of the Grundy numbers of all the states it points to. We usually write minimum excludant as mex, and it basically means "return the smallest nonnegative integer that is not in this set." Let's consider a basic example, such as the Nim game where you can only take a square number of stones from a pile. If there is a pile with 5 stones, then we can take $$$1^2=1$$$ stone and transition to a pile with 4 stones, or take $$$2^2=4$$$ stones and transition to a pile with 1 stone. So, $$$G(5)=mex(G(5-1),G(5-4))=mex(G(4),G(1))=mex(2,1)=0$$$. So, it turns out that a pile with 5 stones in it is a loss state (you can verify the values of $$$G(4)$$$ and $$$G(1)$$$ for yourself, but even without Grundy numbers, you should be able to see why 5 stones is a losing state).

How about when there are multiple piles though? Amazingly, we can apply the same strategy we did earlier for Nim, except on the Grundy numbers. The important Sprague-Grundy theorem states that these games are equivalent to playing Nim, but instead of getting the Nim-sum by taking the XOR of the piles, we take the XOR of their Grundy numbers.

Let's consider that square number Nim again. We see that the Grundy numbers for $$$0, 1, 2, 3, 4, 5, 6, 7$$$ are $$$0, 1, 0, 1, 2, 0, 1$$$, respectively. You can verify this. Therefore, if we have a game state with piles of $$$1, 2, 3, 4, 6$$$ stones, the Nim-sum of this game is $$$G(1)\oplus G(2) \oplus G(3) \oplus G(4) \oplus G(6) = 1\oplus 0 \oplus 1 \oplus 2 \oplus 1 = 3$$$, which is non-zero, therefore the first player has a winning strategy.

Note that in our normal game of Nim, each state points to every state smaller than it (since taking any number of stones is a valid move). Therefore, $$$G(p)=p$$$, which is consistent with our results about the original Nim.

When I first learned this, I thought it was magic! But the helpful commenters over at the math subreddit helped give me this wonderful intuition. The mex function is so weird, what's that all about?

Basically, the mex function is a promise. The promise is that: if $$$G(p)=g$$$, then it means every integer from $$$0$$$ to $$$g-1$$$ is accounted for in the list of valid moves. Imagine that every pile in our game was transformed from having $$$p$$$ stones but with weird rules, to having $$$G(p)=g$$$ stones but with normal Nim rules. You can see that we can do this because in Nim, we can transition from a pile of size $$$p$$$ to any other pile of size less than $$$p$$$; in our transformed Grundy number world, the mex property guarantees that we can transition from our 'pile' of size $$$g$$$ to any 'pile' of size less than $$$g$$$. The mex encodes information about which states are available to us, and having access to every state from 0 to $$$g-1$$$ looks a lot like Nim. So, by looking at the piles this transformed mex world, we removed the weird restrictions and really are just playing Nim!

However, unlike in Nim, it's possible to transition from a pile of size $$$g$$$ to a pile of size greater than $$$g$$$. Consider in the square number game earlier, if we have a pile of size 5 and we take away one stone, we end up with 4; in the transformed Grundy number world, we went from a pile of size 0 to one of size 2!

But not to worry, we already accounted for this! Remember we said that Nim where you can add stones is just the same as normal Nim. Suppose I started with a Grundy number of $$$g$$$, then after making a move I ended up with a larger Grundy number in the new pile. But, recall that the Grundy number of some state is the minimum excludant of all the states it can transition to; it's our promise from earlier. So, if we transition from $$$g$$$ to some larger number $$$h$$$, we have been promised that from $$$h$$$, we can visit any of the numbers from 0 to

Unable to parse markup [type=CF_MATHJAX]

we can always just transition back to $$$g$$$! Just like in our Nim-with-addition from earlier, we can always just undo any additions that the losing player might try.

Therefore, if we use Grundy numbers, Nim with any weird restrictions on the amount of stones we can take per pile can be reduced to Nim with addition, which itself can be reduced to normal Nim.

The complexity in these kinds of programming problems in contests usually then come downs to finding some efficient formula for the Grundy numbers.

I hope this guide helped you understand where the XOR came from (the self-inverse property) and where the Grundy numbers being mex came from (mex is a promise). I was very fortunate to talk to people who gave me a better understanding of this, and I hope to share this knowledge with the community too :)

Теги #game-theory, #nim-game, #xor, grundy, sprague-grundy

История

 
 
 
 
Правки
 
 
  Rev. Язык Кто Когда Δ Комментарий
en18 Английский Shisuko 2020-08-04 12:41:11 0 (published)
en17 Английский Shisuko 2020-07-30 17:18:24 21 Tiny change: 'rundy:\n\n**Project Euler**\n\nhttps:' -> 'rundy:\n\nhttps:'
en16 Английский Shisuko 2020-07-30 17:13:26 733 Added list of problems (saved to drafts)
en15 Английский Shisuko 2020-07-30 11:53:45 204
en14 Английский Shisuko 2020-03-03 14:52:04 1424 Revised the wording of some parts.
en13 Английский Shisuko 2019-04-16 05:53:18 0 Grammar fixes (published)
en12 Английский Shisuko 2019-04-16 05:45:09 39 (saved to drafts)
en11 Английский Shisuko 2019-04-15 18:20:26 0 (published)
en10 Английский Shisuko 2019-04-15 18:18:57 1 Tiny change: ' 0 to $h-1. And, si' -> ' 0 to $h-1$. And, si'
en9 Английский Shisuko 2019-04-15 18:17:56 173
en8 Английский Shisuko 2019-04-15 18:13:53 688 Tiny change: 'I thought ut was magi' -> 'I thought it was magi'
en7 Английский Shisuko 2019-04-15 17:57:42 1232 Tiny change: 'the set $\left\{0,1\right\}$ and th' -> 'the set $\\{0,1\\}$ and th'
en6 Английский Shisuko 2019-04-15 17:30:05 1510
en5 Английский Shisuko 2019-04-15 13:57:07 27
en4 Английский Shisuko 2019-04-15 13:50:06 1 Tiny change: 'nstance, $\mex(0,1,3,' -> 'nstance, $mex(0,1,3,'
en3 Английский Shisuko 2019-04-15 13:49:11 25622
en2 Английский Shisuko 2019-04-15 11:59:56 13982
en1 Английский Shisuko 2019-03-19 18:43:00 4359 Initial revision (saved to drafts)