* * * * * * * * * a b c
k piles of stones
on a move, player selects pile, removes any number stones from there
last player to move wins
use recursion ? pros/cons … at least consider it as an option
avoid code duplication (e.g. negamax)
avoid recomputation (e.g. memoization)
def fib(n): if n<=1: return n return fib(n-2)+fib(n-1) for j in range(40): print(j, fib(j)) how long does this take to execute? memoize, don't recompute def dpfib(n): F = [0,1] for j in range(2,n+1): F.append(F[j-1] + F[j-2]) return F[n]
so far, tic-tac-toe, with simple minimax,alphabeta,negamax
does this work for nim ?
consider game state search tree for nim (10 10 10 10 10)
move from root state (10 10 10 10 10)?
1-10 from 1st pile or 2nd pile or … or 5th pile
so root state has 50 children
among its 50 children, 5 have lost 1 stone, 5 have lost 2, …, 5 have lost 10, so the number of grandchildren is 5*(49 + 48 + ... 40)
this tree is too big … what can we do ?
answer: exploit isomorphisms and memoize
as with tic-tac-toe, use a transposition table, generate only one node per equivalence class (set of isomorphic states)
eg. nim (1 2 3)
x remove 1 from pile a, y remove 1 from pile b
x remove 1 from pile b, y remove 1 from pile a
tree: these 2 paths lead to different nodes, but same state
eg. tictactoe
x a1, o a2, x a3
x a3, o a2, x a1
again, tree: these 2 paths lead to different nodes, but same state
* sort each position in non-decreasing order * here are all possible sorted positions: 0 0 0 0 0 1 0 0 2 0 0 3 0 1 1 0 1 2 0 1 3 0 2 2 0 2 3 0 3 3 1 1 1 1 1 2 1 1 3 1 2 2 1 2 3 1 3 3 start 0 0 0 L 0 0 1 ? 0 0 2 ? 0 0 3 ? 0 1 1 ? ... next: states that get to losing state win 0 0 0 L 0 0 1 W 0 0 2 W 0 0 3 W 0 1 1 ? ... next: states that get only to winning states lose 0 0 0 L 0 0 1 W 0 0 2 W 0 0 3 W 0 1 1 L <- ... next: states that get to losing state win 0 0 0 L 0 0 1 W 0 0 2 W 0 0 3 W 0 1 1 L 0 1 2 W <- 0 1 3 W <- 0 2 2 0 2 3 ... next: states that get only to winning states lose 0 0 0 L 0 0 1 W 0 0 2 W 0 0 3 W 0 1 1 L 0 1 2 W 0 1 3 W 0 2 2 L <- 0 2 3 ... finished: 0 0 0 L 000 0 0 1 W 001 0 0 2 W 002 0 0 3 W 003 0 1 1 L 011 0 1 2 W 012 0 1 3 W 013 0 2 2 L 022 0 2 3 W 023 0 3 3 L 033 1 1 1 W 111 1 1 2 W 112 1 1 3 W 113 1 2 2 W 122 1 2 3 L 123 1 3 3 W 133
dag: directed acyclic graph
instead of search tree, use search dag
ie. different paths from root can lead to same child
above tictactoe example: the 2 paths lead to same node
above nim example: the 2 paths lead to same node
pro: fewer nodes in dag than tree, so less time to examine all nodes
con: dag code usually more complicated than tree code, especially if information from child is backed up to parent
tree: each child has 1 parent
dag: each child can have many parents
compress search space by grouping nodes into equivalence classes
eg. nim: put isomorphic states in same class
eg. (1 2 3) ≡ (1 3 2) ≡ (2 1 3) ≡ (2 3 1) ≡ (3 1 2) ≡ (3 2 1)
number states ? (1+1)*(2+1)*(3+1) = 24 number non-isomorphic states ? 14 000 001 002 003 011 012 013 022 023 111 112 113 122 123
root node (222)
all arcs go from right to left, eg. 222 has children 122, 022
exercise: put an x under each losing state, put an arrow on each winning edge
recompute? no, compute once, then write and lookup
memoization effective if states can be ordered so that value of a state depends only on values of previous states
order states by number of stones
so each move results in state with fewer stones
in order, fewest stones to most,
compute win/loss value for each state
definition
position is losing if every move from that position is to a winning position
vacuously: any position with no moves is losing
position is winning if is not losing, ie. if some move from that position is to a losing position
algorithm
value[ 0 0 ... 0 ] <- Loss (state with no stones is losing)
(consider states by increasing number of total stones)
for j in range[1 .. max possible number of stones]:
for each state k with j stones:
if value[ k ] not yet defined:
for each state w that can move to k: value[ w ] <- win
correctness? because we consider states by increasing number of stones, if the value of a state is not yet defined by the time we get to it, it must be the case that every move from that state is to a state whose value we have already seen, and (since we would have set the value otherwise) all those values must be wins, so the value of the undefined state must be a loss
for each state: value(state) = False for k in range(1, sum_pilesizes): for each k-stone state: if not value(state): for each state t reachable by adding stones to a pile of s: value(t) = True
showing sorted equiv classes only 0 stones: 000 F, update 001 002 003 T 1 stones: 001 T, no update 2 stones: 002 T 011 F 012 013 111 112 113 T 3 stones: 003 T 012 T 111 T [exercise: show trace output for 4 stones] ...
each state causes an update at most one time
number of states at most product_pilesizes
each update takes time at most sum_pilesizes
total time O(product_pilesizes * sum_pilesizes)
- simple/nim/nim.py
when number of positions (or equivalence classes) eventually drops to a manageable number
eg. solving checkers
1994 (1996): Schaeffer's checkers program Chinook first bot to be world champion of some game
2007: Schaeffer's program solves checkers (a draw)
key idea to solving checkers: endgame databases (dp on steroids)
see picture of checkers search space in above paper
quote from above paper:
Using retrograde analysis, subsets of a game with a small number of pieces on the board can be exhaustively enumerated to compute which positions are wins, losses or draws. Whenever a search reaches a database position, instead of using an inexact heuristic evaluation function, the exact value of the position can be retrieved.
theorem a nim position losing if and only if xorsum of pile sizes is 0
eg: xorsum(1 2 3) = 01 ^ 10 ^ 11 = 00, so (1 2 3) is losing.
eg: xorsum(3 5 3) = 011 ^ 101 ^ 111 = 001, so (3 5 7) is winning.
how to find a winning move? from (3 5 7) can move to
(0 5 7) xorsum 010 winning, no
(1 5 7) xorsum 011 winning, no
(2 5 7) xorsum 000 losing, yes
exercise: find all other winning moves from (3 5 7)
show for each state P with xorsum 0, each move is to state with xorsum nonzero
consider move from pile j with p_j > 0
xorsum(rest of piles) xorsum(p_j) = 0, so xorsum(rest of piles) = p_j
so if we change p_j to some other value q_j, we have xorsum(rest of piles) not equal to q_j, so xorsum(rest of piles) xorsum(q_j) is not 0
show for each state P with xorsum nonzero, some move is to state with xorsum 0
let c_j be column of leftmost 1 in binrep xorsum(P)
let p_x be any pile with 1 in c_j
z = xorsum(P ^ p_x) has 0 in all columns c_1 … c_j, so z < p_x, so can reduce p_x to z, leaves state with xorsum zero
def nimreport(P): # all nim winmoves from P, use formula total = xorsum(P) # xor all elements of P if total==0: print(' loss') return for j in P: tj = total ^ j # xor if j >= tj: print(' win: take',j - tj,'from pile with',j) [ code: simple/nim/bignim.py ]
find all winning moves with nim-theorem? example: all win-moves from (1 2 6)? ------------------------------------- is there a win-move from pile 1? what we have what we want 1 0 0 1 ? ? ? 2 0 1 0 0 1 0 6 1 1 0 1 1 0 xsum 1 0 1 0 0 0 unique solution: replace pile 1 with (pile 1 xor+ xsum) = (0 0 1 xor+ 1 0 1) = 1 0 0 can we change pile 1 from 001=1 stone to 100=4 stones? no, not a legal nim-move :( so no win-move from pile 1 ------------------------------------------- similarly, no win-move from pile 2 (exercise) ------------------------------------- is there a win-move from pile 3? what we have what we want 1 0 0 1 0 0 1 2 0 1 0 0 1 0 6 1 1 0 ? ? ? xsum 1 0 1 0 0 0 unique solution: replace pile 3 with (pile 3 xor+ xsum) = (1 1 0 xor+ 1 0 1) = 0 1 1 is this a legal move? change pile 3 from 110=6 stones to 011=3 stones? yes! unique win-move from pile 3: remove 6-3 = 3 stones
all winmoves from (1 2 6) rule: * j <- in xor-total, leftmost column with a 1 * for each pile with a 1 in column n: change the pile-size p <- p xor xor-total example 1 0 0 1 2 0 1 0 6 1 1 0 xor-total 1 0 1 <-- nonzero, so exists winmove ^ | j 1-pile winmove? no: 1-pile has 0 in column j after move, need new total xorsum 0, so need to leave k = (xorsum of all other piles) stones (xorsum of all other piles) = 110 ^ 010 = 100 = 4 remove 4 stones from 1-pile? 4 > 1 *no* 2-pile winmove? no: 2-pile has 0 in column j remove 001 ^ 110 = 7 stones? 7 > 2 *no* 6-pile winmove? yes: 6-pile has 1 in column j change 6-pile to 110 xor xor-total ... to 110 xor 101 = 011 = 3 *yes* after remove 3 from 6-pile: 1 0 0 1 2 0 1 0 3 0 1 1 sum 0 0 0 <--- losing state, as desired - exercise: find all winmoves from (3 5 6 7)
all winmoves from (11 14 37 43) 11 1 0 1 1 14 1 1 1 0 37 1 0 0 1 0 1 43 1 0 1 0 1 1 xor-total 0 0 1 0 1 1 <-- non-zero, so winning move exists ^ | * leftmost column with xor-sum 1 * for each row with a 1 in that column, set the new pilesize to be old-size xor xor-total e.g. 11-pile: p <- p xor xor-total = 11 xor 11 = 0 by removing 11 stones e.g. 14-pile: p <- 14 xor 11 = 5 by removing 9 stones e.g. 43-pile: p <- 43 xor 11 = 32 by removing 11 stones e.g. 37-pile: column j not 0, no winning move
nim is one of the games whose math gave rise to combinatorial game theory
combinatorial game:
includes games where whoever cannot move loses
variant: whoever cannot move wins
math of cgt is essential to solving go endgame puzzles
some cgt games
amazons
breakthrough
clobber
dots and boxes
* * * only win: * take 3 from * * pile c, leave * * * (1 2 3) a b c
4 stones: 013 T 022 F 023 122 222 223 T 112 T ...
3 0 1 1 5 1 0 1 6 1 1 0 7 1 1 1 method: leftmost column with xorsum 1? sum 1 1 1 here, first column ... piles 5,6,7 have 1 in this column, so these three columns each give winning move 3 0 1 1 0 1 1 5 1 0 1 -> 0 1 0 change 5-pile to 2-pile: 6 1 1 0 1 1 0 remove 3 stones 7 1 1 1 1 1 1 sum 1 1 1 0 0 0 3 0 1 1 0 1 1 5 1 0 1 1 0 1 6 1 1 0 -> 0 0 1 change 6 pile to 1-pile: 7 1 1 1 1 1 1 remove 1 stone sum 1 1 1 0 0 0 3 0 1 1 0 1 1 5 1 0 1 1 0 1 6 1 1 0 1 1 0 7 1 1 1 -> 0 0 0 change 7-pile to 0-pile: remove 7 stones sum 1 1 1 0 0 0