two-player games: adversarial search, minimax, alpha-beta

2pg   adv srch   mmx   prune   αβ  

two-player games

  • in a puzzle, 1 player makes all moves

    • to solve puzzle, search state space for best possible outcome

  • 2-player games? chess, Go, hex, … how to solve?

  • ideas from chess

adversarial search

  • simplest multi-player games? e.g.  Go, tic-tac-toe, checkers, chess

    • 2 players

    • alternate turn

    • zero-sum (only 1 winner)

  • adversarial search: searching a 2-player game tree

  • consider a player about to move

  • in order to pick best move, need to know how opponent will respond

  • worst case for player, but simplest to analyse (requires no opponent modelling)

    • assume opponent perfect, always makes best possible move

  • what does it mean to solve a game?

    • state position and player-to-move

    • reachable state any state reachable by sequence of legal moves

    • strategy for a player, a function that, for any reachable state, returns a legal move

    • in 2-player game, to solve a state is to find a strategy with best worst-case performance, ie. that guarantees the best possible score, over all possible opponent strategies (so, assuming opponent always makes a best possible move)

  • initial move for any such strategy can be found by minimax search, score is player-to-move's minimax score

  • play that follows a minimax strategy called perfect play

  • if you missed this lecture, watch this   practice this

minimax

  • to find minimax value, explore game-state tree in any order that finds values of children before value of node

  • minimax algorithm: search space to find root state minimax value

minimax algm, max/min format
root node is a max node
each child of a max node is a min node
each child of a min node is a max node

def eval(s):
  for player MAX, return score of s

def minimax(s):
  if terminal(s):
    return eval(s)
  if s.player() == MAX:
    return max(for all children c of s, minimax(c))
  if s.player() == MIN:
    return min(for all children c of s, minimax(c))

minimax example, min/max format

node: state
edge (root,    level 1):  p1 move
edge (level 1, level 2):  p2 move
leaf label: p1 score

for each node,
  p1 minimax value ?
best play?
                    *
                /   |   \
              /     |     \
            *       *       *
          / | \    / \    / | \
         *  *  *  *   *  *  *  *
         2  3 -1  1   6 -4  2  9

                    1
                /   |   \
              /     |     \
           -1       1      -4
          / | \    / \    / | \
         *  *  *  *   *  *  *  *
         2  3 -1  1   6 -4  2  9

p1 best play from root: middle

p2 best play at level 1:
  after p1 left:   right (p1 scores -1)
  after p1 middle: left  (p1 scores  1)
  after p1 right:  left  (p1 scores -4)

pruning game trees

when solving a state, usually not necessary to examine the whole tree

  • once one winning move is found, remaining moves can be ignored

  • this motivates alpha-beta search

  • pruning example

minimax pruning example, min/max format
recall min/max format:
  p1 max, p2 min, show p1 values

         a
      /     \
     b       c
    / \     /|\
   d   e   f g h
   7   5   3 ? ?

assume dfs with children explored left-to-right
         *
      /     \
     5      <=3
    / \     /|\
   7   5   3 ? ?

after examining node     we know ...
  d                      val(b) <= 7
  e                      val(b) = min(7,5) = 5
  c                      val(c) <= 3

now, no need to learn more about val(c)
... val(c) <=3   <   val(b)
... p1  prefers   b   to   c
... so prune remaining children of c from our search

alpha-beta search

  • αβ-search = minimax search with αβ pruning

  • α: value, best p1 option so far, on path current-to-root

  • β: value, best p2 option so far, on path current-to-root

  • minimax, so can be used as a solver

    • leaf scores must be true scores

    • must be able to reach all leaf nodes in reasonable time

  • can also be used as heuristic player

    • find fast heuristic, use on all leaves at a fixed depth

    • eg. simple chess player

examples
  • video example

  • to see our algorithm run on this example, go to directory simple/alphabeta and run alphabeta.py on input t2.in

  • practice here