two-player games: adversarial search, minimax, alpha-beta

two-player games

in a puzzle, 1 player makes all moves
- to solve puzzle, search state space for best possible outcome
2-player games? chess, Go, hex, … how to solve?
ideas from chess
- 1770 the Turk automaton
- 1913 zermelo on the application of counting theory to the theory of chess
- 1947? Alan Turing, Claude Shannon discussed chess algs, e.g. minimax
- 1997 Deep Blue defeats Kasparov 3.5–2.5
- UAlberta connection Murray Campbell BSc (1979) MSc (1981)

adversarial search

simplest multi-player games? e.g. Go, tic-tac-toe, checkers, chess
- 2 players
- alternate turn
- zero-sum (only 1 winner)
adversarial search: searching a 2-player game tree
consider a player about to move
in order to pick best move, need to know how opponent will respond
worst case for player, but simplest to analyse (requires no opponent modelling)
- assume opponent perfect, always makes best possible move
what does it mean to solve a game?
- state position and player-to-move
- reachable state any state reachable by sequence of legal moves
- strategy for a player, a function that, for any reachable state, returns a legal move
- in 2-player game, to solve a state is to find a strategy with best worst-case performance, ie. that guarantees the best possible score, over all possible opponent strategies (so, assuming opponent always makes a best possible move)
initial move for any such strategy can be found by minimax search, score is player-to-move's minimax score
play that follows a minimax strategy called perfect play
if you missed this lecture, watch this practice this

minimax

to find minimax value, explore game-state tree in any order that finds values of children before value of node
minimax algorithm: search space to find root state minimax value

minimax algm, max/min format
root node is a max node
each child of a max node is a min node
each child of a min node is a max node

def eval(s):
  for player MAX, return score of s

def minimax(s):
  if terminal(s):
    return eval(s)
  if s.player() == MAX:
    return max(for all children c of s, minimax(c))
  if s.player() == MIN:
    return min(for all children c of s, minimax(c))

minimax example, min/max format

node: state
edge (root,    level 1):  p1 move
edge (level 1, level 2):  p2 move
leaf label: p1 score

for each node,
  p1 minimax value ?
best play?
                    *
                /   |   \
              /     |     \
            *       *       *
          / | \    / \    / | \
         *  *  *  *   *  *  *  *
         2  3 -1  1   6 -4  2  9

                    1
                /   |   \
              /     |     \
           -1       1      -4
          / | \    / \    / | \
         *  *  *  *   *  *  *  *
         2  3 -1  1   6 -4  2  9

p1 best play from root: middle

p2 best play at level 1:
  after p1 left:   right (p1 scores -1)
  after p1 middle: left  (p1 scores  1)
  after p1 right:  left  (p1 scores -4)

pruning game trees

when solving a state, usually not necessary to examine the whole tree

once one winning move is found, remaining moves can be ignored
this motivates alpha-beta search
pruning example

minimax pruning example, min/max format
recall min/max format:
  p1 max, p2 min, show p1 values

         a
      /     \
     b       c
    / \     /|\
   d   e   f g h
   7   5   3 ? ?

assume dfs with children explored left-to-right
         *
      /     \
     5      <=3
    / \     /|\
   7   5   3 ? ?

after examining node     we know ...
  d                      val(b) <= 7
  e                      val(b) = min(7,5) = 5
  c                      val(c) <= 3

now, no need to learn more about val(c)
... val(c) <=3   <   val(b)
... p1  prefers   b   to   c
... so prune remaining children of c from our search

alpha-beta search

αβ-search = minimax search with αβ pruning
α: value, best p1 option so far, on path current-to-root
β: value, best p2 option so far, on path current-to-root
minimax, so can be used as a solver
- leaf scores must be true scores
- must be able to reach all leaf nodes in reasonable time
can also be used as heuristic player
- find fast heuristic, use on all leaves at a fixed depth
- eg. simple chess player

examples

video example
to see our algorithm run on this example, go to directory simple/alphabeta and run alphabeta.py on input t2.in
practice here