recall minimax

for a game-tree, find root minimax value from leaf values

why minimax?

it gives best worst-case score, i.e. against all possible opponent strategies

now let's pick a game and use minimax to solve (find minimax value) arbitrary states

need a game with relatively small state space, say tic-tac-toe

ancient, many variants: terni lappilli 3-men's morris

Ovid: Ars Amatoria III, lines 365-369

Est genus in totidem tenui ratione redactum scriptula, quot menses lubricus annus habet; Parua tabella capit ternos utrimque lapillos, in qua uicisse est continuasse suos. There is another game divided into as many parts as there are months in the year; A small board has three pieces on either side, the winner must get all the pieces in a straight line.

usual minimax pseudocode

good coding practise: avoid code duplication

do we need 2 cases (p1 move? p2 move?)

negamax: at each code, compute minimax value for

**player to move**now only 1 case

if score for other player needed, use this:

score_for_p2(node) == negative( score_for_p1(node) )

so, in negamax format, at each node:

select move that maximizes neg( score(child) )

**warning**

when using negamax, make sure that leaf scores are for

**player-to-move**(minimax assumes all node scores are for 1st-player, so maximizer)to convert leaf scores in a minimax tree to equivalent leaf scores in a negamax tree:

negate leaf scores whose distance-to-root is odd

leaf scores whose distance-to-root is even do not need to be changed (because their player-to-move is max)

minimax algm, negamax format

def eval(s): for player who would move next, return score of s def negamax(s): if terminal(s): return eval(s) else: return max(for all children c of s, -negamax(c))

at each node, score is for *** player-to-move *** p1 to move * / | \ / | \ p2 to move * * * / | \ / \ / | \ p1 to move * * * * * * * * p1 score 2 3 -1 1 6 -4 2 9 p1 score 1 / | \ / | \ p2 score 1 -1 4 / | \ / \ / | \ p1 score * * * * * * * * 2 3 -1 1 6 -4 2 9

for x to move, x . o draw (part of) the x . . game tree o . . then draw (part of) the minimax tree then draw the proof tree, also called solution tree (show best move winner)

part of game tree

x . o x . . o . . xxo x.o x.o x.o x.o x.. xx. x.x x.. x.. o.. o.. o.. ox. o.x xxo xxo xxo xxo x.o x.o x.o x.o xoo x.o x.o x.o xo. x.o x.. x.. xx. xx. xx. xx. x.x xox x.x x.x ... o.. o.. oo. o.o o.. o.. o.. o.. o.. o.. oo. o.o xxo xxo xxo xxo xxo xxo xxo xxo xxo xxo x.o x.o xxo x.o x.o xx. x.x x.. xx. x.x x.. xx. xxx xx. ... o.. ox. o.x oo. oo. oox o.o o.o oxo o.. o.. o.x (two more levels)

part of minimax tree

x . o this state has x . . x minimax o . . value 1 o: 1 -1 1 1 1 xxo x.o x.o x.o x.o x.. xx. x.x x.. x.. o.. o.. o.. ox. o.x x: -1 0 0 0 1 1 1 1 1 -1 1 1 xxo xxo xxo xxo xoo x.o x.o x.o xoo x.o x.o x.o xo. x.o x.. x.. xx. xxo xx. xx. x.x xox x.x x.x ... o.. o.. oo. o.o o.. o.. oo. o.o o.. o.. oo. o.o o: 0 1 1 1 1 1 1 1 1 -1 0 -1 xxo xxo xxo xxo xxo xxo xxo xxo xxo xoo xoo xoo xxo x.o x.o xx. x.x x.. xx. x.x x.. xxx xx. xx. o.. ox. o.x oo. oo. oox o.o o.o oxo o.. ox. o.x (two more levels)

proof tree

for winner, x . o show winning move x . . o . . | for opponent, x . o show all moves x x . o . . / / \ \ / / \ \ x o o x . o x . o x . o x x . x x o x x . x x . o . . o . . o o . o . o | | | | x o o x . o x . o x . o x x x x x o x x x x x x o . . o . x o o . o . o

def negamax(psn, ptm): # position, player-to-move if psn.has_win(ptm): return 1 # previous move created win L = psn.legal_moves() if len(L) == 0: return 0 # board full, so draw so_far = -1 # best score so far for cell in L: psn.brd[cell] = ptm nmx = negamax(psn, opponent(ptm)) so_far = max(so_far,-nmx) psn.brd[cell] = Cell.e # reset brd by erasing cell return so_far # solves 3x3 ttt with 740 170 calls # 9! = 362 880 # nodes at most 986 410 # can we do better ?

ttt search space is not a tree: it has undirected cycles

ttt search space is directed-acyclic (no directed cycles) graph, or

**dag**move sequences that yield the same position are called

**transpositions**

transposition example

x o . o . x o x . / \ x o . x o o o o x o . x o x . o x . | | x o . x o o o o x o . x o x x o x x \ / x o o o o x o x x

how many ttt states need to be examined to find minimax value of corner opening move?

a) about 8! b) far fewer than 8!

answer b) why?

nodes can be pruned

win detection: in child minimax for loop, abort if winning child found (prune remaining siblings)

solve ttt with 129 988 nodes

early draw detection: return draw once neither player can win

save minimax values in

**transposition table**, by positionbefore solving position

is position in table (transposition) ?

is isomorphic position in table (symmetry) ?

how big is ttt search space?

root, 9 children, 9*8 grandchildren, 9*8*7 greatgrandchildren, …

total number nodes < 1 + 9 + 9*8 + 9*8*7 + ... + 9*8*7*6*5*4*3*2*1 = 986 410

why < ?

some terminal nodes not at max depth (because win is a leaf node)

transpositions

if we treat search space as a tree (ignore transpositions, allow a position in multiple nodes) then above is a reasonable estimate

how much of a space reduction do we get if allow each position to appear in at most one node?

number different possible positions = 3

^{9}= 19 683each of 9 board positions is empty/ x / o

represent each location with 0 / 1 / 2

so each board position is represented by a 9-digit base 3 number

good news: many of these positions are not reachable

also good news: many of these states isomorphic

from root state: only 3 non-iso moves, so expect 6 500 nodes

alphabeta search from root examines 3025 non-iso states

ttt board representation example

eg. this position represented by this number: x . o x o . 102 120 010 (base 3) = 8 427 (decimal) . x . an unreachable ttt position: x . o x . o x . o

minimax pruning example, min/max format

recall min/max format: p1 max, p2 min, show p1 values a / \ b c / \ /|\ d e f g h 7 5 3 ? ? assume dfs with children explored left-to-right * / \ 5 <=3 / \ /|\ 7 5 3 ? ? after examining node we know ... d val(b) <= 7 e val(b) = min(7,5) = 5 c val(c) <= 3 now, no need to learn more about val(c) ... val(c) <=3 < val(b) ... p1 prefers b to c ... so prune remaining children of c from our search

recall negamax format: value(t) is for *** player to move from t *** p1_score( t ) = - p2_score( t ) def abnega(v, alpha, beta): if terminal(v): return eval(v) so_far = neg_infinity # best score so far for each child c of v: so_far = max(so_far, - abnega( c, - beta, - alpha) ) alpha = max( alpha, so_far ) if alpha >= beta: break return so_far abnega(root, neg_infinity, infinity)

alphabeta trace example

A / \ B C / \ /|\ D E F G H 7 5 3 1 9 directory /games-puzzles-algorithms/simple/alphabeta python3 alphabeta.py < t1.in A -999 999 -999 B -999 999 -999 D -999 999 -999 D leaf value 7 D now best child of B D improved alpha( B ) to -7 E -999 7 -999 E leaf value 5 E now best child of B E improved alpha( B ) to -5 B -5 999 -5 B now best child of A B improved alpha( A ) to 5 C -999 -5 -999 F 5 999 -999 F leaf value 3 F now best child of C F improved alpha( C ) to -3 alpha >= beta, prune remaining children of C C -3 -5 -3 A 5 999 5

αβ search with no enhancements can be weak

eg. class code,

`ttt -n 4 -m 4 -k 3 alpha_beta`is weak player

good news: many enhancements strengthen αβ

eg. move ordering: to maximize pruning, consider children in order from strongest to weakest

child strength not known apriori; use heuristic to guess

threat-search: check for wins or lose-threats (forced moves) first

win-threat: if player has winning move, make it

lose-threat (ttt): if opponent has next-move-win, player must block at that cell: all other moves lose and can be pruned

tictactoe alphabeta revisions:

initialize (-) infinity = (-) 1 (why ?)

answer: then search stops once win/loss found

before starting search, order children so that win-threats precede lose-threats precede rest of moves (why ?)

answer: that gives this order of execution

if there is a winning move, make it

elif the opponent can win, block one such move,

else find best other move

heuristic + limited depth αβ search basis of many strong chess programs