# jemdoc: addcss{rbh.css}, addcss{jacob.css}
= tictactoe game
~~~
{}{raw}
ttt
eg
ngx
eg
tneg
trans
prune
num_states
αβneg
enhance
~~~
~~~
{}{raw}
tictactoe
~~~
~~~
 recall minimax
 for a gametree, find root minimax value from leaf values
 why minimax?
 it gives best worstcase score,
i.e. against all possible opponent strategies
 now let's pick a game
and use minimax to solve (find minimax value) arbitrary states
 need a game with relatively small state space, say
[https://en.wikipedia.org/wiki/Tictactoe tictactoe]
 ancient, many variants: terni lappilli ~ [https://en.wikipedia.org/wiki/Three_Men%27s_Morris 3men's morris]
~~~
#Mille facesse iocos; turpe est nescire puellam
# ludere: ludendo saepe paratur amor.
#Make up a thousand games; it is a shame for a woman
# not to know how to play: play often brings love.
~~~
{Ovid: Ars Amatoria III, lines 365369}{}
Est genus in totidem tenui ratione redactum
scriptula, quot menses lubricus annus habet;
Parua tabella capit ternos utrimque lapillos,
in qua uicisse est continuasse suos.
There is another game divided into as many parts
as there are months in the year;
A small board has three pieces on either side,
the winner must get all the pieces in a straight line.
~~~
~~~
{}{raw}
negamax
~~~
~~~
 usual
[https://en.wikipedia.org/wiki/Minimax\#Pseudocode
minimax pseudocode]
 good coding practise: avoid code duplication
 do we need 2 cases (p1 move? p2 move?)
 negamax: at each code, compute minimax value for *player to move*
 now only 1 case
 if score for other player needed, use this:
 score_for_p2(node) == negative( score_for_p1(node) )
 so, in negamax format, at each node:
 select move that maximizes neg( score(child) )
*warning*
 when using negamax, make sure that leaf scores are for *playertomove*
(minimax assumes all node scores are for 1stplayer, so maximizer)
 to convert leaf scores in a minimax tree to equivalent leaf scores in
a negamax tree:
 negate leaf scores whose distancetoroot is odd
 leaf scores whose distancetoroot is even do not need to be changed
(because their playertomove is max)
~~~
~~~
{minimax algm, negamax format}{}
def eval(s):
for player who would move next, return score of s
def negamax(s):
if terminal(s):
return eval(s)
else:
return max(for all children c of s, negamax(c))
~~~~
~~~
{}{raw}
minimax example, negamax format
~~~
~~~
{}{}
at each node, score is for *** playertomove ***
p1 to move *
/  \
/  \
p2 to move * * *
/  \ / \ /  \
p1 to move * * * * * * * *
p1 score 2 3 1 1 6 4 2 9
p1 score 1
/  \
/  \
p2 score 1 1 4
/  \ / \ /  \
p1 score * * * * * * * *
2 3 1 1 6 4 2 9
~~~
~~~
{}{raw}
ttt example trees
~~~
~~~
{}{}
for x to move, x . o
draw (part of) the x . .
game tree o . .
then draw (part of)
the minimax tree
then draw
the proof tree, also called solution tree
(show best move winner)
~~~
~~~
{part of game tree}{}
x . o
x . .
o . .
xxo x.o x.o x.o x.o
x.. xx. x.x x.. x..
o.. o.. o.. ox. o.x
xxo xxo xxo xxo x.o x.o x.o x.o xoo x.o x.o x.o
xo. x.o x.. x.. xx. xx. xx. xx. x.x xox x.x x.x ...
o.. o.. oo. o.o o.. o.. o.. o.. o.. o.. oo. o.o
xxo xxo xxo xxo xxo xxo xxo xxo xxo xxo x.o x.o
xxo x.o x.o xx. x.x x.. xx. x.x x.. xx. xxx xx. ...
o.. ox. o.x oo. oo. oox o.o o.o oxo o.. o.. o.x
(two more levels)
~~~
~~~
{part of minimax tree}{}
x . o this state has
x . . x minimax
o . . value 1
o: 1 1 1 1 1
xxo x.o x.o x.o x.o
x.. xx. x.x x.. x..
o.. o.. o.. ox. o.x
x: 1 0 0 0 1 1 1 1 1 1 1 1
xxo xxo xxo xxo xoo x.o x.o x.o xoo x.o x.o x.o
xo. x.o x.. x.. xx. xxo xx. xx. x.x xox x.x x.x ...
o.. o.. oo. o.o o.. o.. oo. o.o o.. o.. oo. o.o
o: 0 1 1 1 1 1 1 1 1 1 0 1
xxo xxo xxo xxo xxo xxo xxo xxo xxo xoo xoo xoo
xxo x.o x.o xx. x.x x.. xx. x.x x.. xxx xx. xx.
o.. ox. o.x oo. oo. oox o.o o.o oxo o.. ox. o.x
(two more levels)
~~~
~~~
{proof tree}{}
for winner, x . o
show winning move x . .
o . .

for opponent, x . o
show all moves x x .
o . .
/ / \ \
/ / \ \
x o o x . o x . o x . o
x x . x x o x x . x x .
o . . o . . o o . o . o
   
x o o x . o x . o x . o
x x x x x o x x x x x x
o . . o . x o o . o . o
~~~
~~~
{xkcd}
 [https://xkcd.com/832/ minimax strategy diagram]
~~~
~~~
{}{raw}
simple ttt negamax
~~~
~~~
{}{}
def negamax(psn, ptm): # position, playertomove
if psn.has_win(ptm):
return 1 # previous move created win
L = psn.legal_moves()
if len(L) == 0:
return 0 # board full, so draw
so_far = 1 # best score so far
for cell in L:
psn.brd[cell] = ptm
nmx = negamax(psn, opponent(ptm))
so_far = max(so_far,nmx)
psn.brd[cell] = Cell.e # reset brd by erasing cell
return so_far
# solves 3x3 ttt with 740 170 calls
# 9! = 362 880
# nodes at most 986 410
# can we do better ?
~~~
~~~
{}{raw}
transpositions
~~~
~~~
 ttt search space is not a tree: it has undirected cycles
 ttt search space is directedacyclic (no directed cycles) graph, or *dag*
 move sequences that yield the same position are called *transpositions*
~~~~
~~~
{transposition example}{}
x o .
o . x
o x .
/ \
x o . x o o
o o x o . x
o x . o x .
 
x o . x o o
o o x o . x
o x x o x x
\ /
x o o
o o x
o x x
~~~
~~~
{}{raw}
pruning ttt game trees
~~~
~~~
 how many ttt states need to be examined to find minimax value of corner opening move?
 a) about 8! ~ ~ b) far fewer than 8!
 answer b) ~ ~ ~ ~ ~ why?
 nodes can be pruned
 win detection: in child minimax for loop,
abort if winning child found (prune remaining siblings)
 solve ttt with 129 988 nodes
 early draw detection: return draw once neither player can win
 save minimax values in *transposition table*, by position
 before solving position
 is position in table (transposition) ?
 is isomorphic position in table (symmetry) ?
~~~
~~~
{}{raw}
number ttt states
~~~
~~~
 how big is ttt search space?
 root, 9 children, 9\*8 grandchildren, 9\*8\*7 greatgrandchildren, ...
 total number nodes {{< 1 + 9 + 9*8 + 9*8*7 + ... + 9*8*7*6*5*4*3*2*1 = 986 410}}
 why < ?
 some terminal nodes not at max depth (because win is a leaf node)
 transpositions
 if we treat search space as a tree
(ignore transpositions, allow a position in multiple nodes)
then above is a reasonable estimate
 how much of a space reduction do we get if allow each position
to appear in at most one node?
 number different possible positions {{=}} {{3^{9} = 19 683}}
 each of 9 board positions is empty\/ x \/ o
 represent each location with 0 \/ 1 \/ 2
 so each board position is represented by a 9digit base 3 number
 good news: many of these positions are not reachable
 also good news: many of these states isomorphic
 from root state: only 3 noniso moves, so expect 6 500 nodes
 alphabeta search from root examines 3025 noniso states
~~~
~~~
{ttt board representation example}{}
eg. this position represented by this number:
x . o
x o . 102 120 010 (base 3) = 8 427 (decimal)
. x .
an unreachable ttt position:
x . o
x . o
x . o
~~~
~~~
{minimax pruning example, min/max format}{}
recall min/max format:
p1 max, p2 min, show p1 values
a
/ \
b c
/ \ /\
d e f g h
7 5 3 ? ?
assume dfs with children explored lefttoright
*
/ \
5 <=3
/ \ /\
7 5 3 ? ?
after examining node we know ...
d val(b) <= 7
e val(b) = min(7,5) = 5
c val(c) <= 3
now, no need to learn more about val(c)
... val(c) <=3 < val(b)
... p1 prefers b to c
... so prune remaining children of c from our search
~~~
~~~
{}{raw}
alphabeta search, negamax format
~~~
~~~
{}{}
recall negamax format:
value(t) is for *** player to move from t ***
p1_score( t ) =  p2_score( t )
def abnega(v, alpha, beta):
if terminal(v):
return eval(v)
so_far = neg_infinity # best score so far
for each child c of v:
so_far = max(so_far,  abnega( c,  beta,  alpha) )
alpha = max( alpha, so_far )
if alpha >= beta:
break
return so_far
abnega(root, neg_infinity, infinity)
~~~
~~~
{alphabeta trace example}{}
A
/ \
B C
/ \ /\
D E F G H
7 5 3 1 9
directory /gamespuzzlesalgorithms/simple/alphabeta
python3 alphabeta.py < t1.in
A 999 999 999
B 999 999 999
D 999 999 999
D leaf value 7
D now best child of B
D improved alpha( B ) to 7
E 999 7 999
E leaf value 5
E now best child of B
E improved alpha( B ) to 5
B 5 999 5
B now best child of A
B improved alpha( A ) to 5
C 999 5 999
F 5 999 999
F leaf value 3
F now best child of C
F improved alpha( C ) to 3
alpha >= beta, prune remaining children of C
C 3 5 3
A 5 999 5
~~~
~~~
{}{raw}
search enhancements
~~~
~~~
 {{αβ}} search with no enhancements can be weak
 eg. class code, +ttt n 4 m 4 k 3 alpha_beta+ is weak player
 good news: many enhancements strengthen {{αβ}}
 eg. move ordering: to maximize pruning,
consider children in order from strongest to weakest
 child strength not known apriori; use heuristic to guess
 threatsearch: check for wins or losethreats (forced moves) first
 winthreat: if player has winning move, make it
 losethreat (ttt): if opponent has nextmovewin,
player must block at that cell: all other moves lose and can be pruned
 tictactoe alphabeta revisions:
 initialize () infinity = () 1 ~ ~ (why ?)
 answer: then search stops once win/loss found
 before starting search, order children so that
winthreats precede losethreats precede rest of moves (why ?)
 answer: that gives this order of execution
 if there is a winning move, make it
 elif the opponent can win, block one such move,
 else find best other move
 heuristic \+ limited depth {{αβ}} search
basis of many strong chess programs
 [http://www.iis.sinica.edu.tw/~tshsu/tcg/2014/slides/slide10.pdf slides: tt etc enhancements]
~~~