rock paper scissors

simul   rps   opt-mod   stoch   eg   eg2   mmx   matrix   vn equi  

two-player simulataneous move

  • so far we have discussed only 2-player alternate-turn games (Black move, White move, Black move, White move) such as go, ttt, checkers, chess, hex …

  • let's talk about the classic 2-player simultaneous-turn game: rock-paper-scissors

rock-paper-scissors

how humans play

  • typically, when humans play, each player guesses the opponent's next move and then play that move's killer

  • consider a 3-game match between Alice and Bob

  • Alice knows that Bob likes to play rock, so she plays paper

  • game 1: A paper, B rock, A-B 1-0

  • Bob now thinks that Alice might play paper again, so he plays scissors

  • game 2: A rock, B scissors, A-B 2-0

  • Bob thinks that Alice might play scissors again, so he plays rock

  • game 3: A paper, B rock, A-B 3-0

opponent modelling

  • what happened in that A-B match?

  • here is Alice's model (prediction of Bob's next move based on his previous moves) of Bob's behaviour:

    • in game 1 he plays rock

    • after a game that he loses, Bob will play the move that killed Alice's winning move

  • Alice's opponent model was perfect, so Alice won all games

  • what happens if Bob realizes what Alice is doing, namely, if he learns Alice's opponent model?

  • answer: Bob can change his strategy to one that exploits Alice's strategy

  • for a hilarious illustration of adaptive opponent modelling in a 2-player game, see the Sicilian poisoned chalice scene in The Princess Bride

stochastic

  • the downside of basing your strategy on an opponent model is that you can lose badly if your model is inaccurate

  • is there a way to avoid opponent modelling altogether and just be conservative? can we use the minimax principle? what strategy should Alice follow if she wants to maximize her expected score against all possible opponent strategies ? is this possible?

  • yes! here's how

  • instead of playing deterministically (for each given Bob-game history, play a particular move), Alice will play stochastically

  • before the match, Alice picks probability values r, p, s (each is positive, their sum is one)

  • then, in each game, she plays rock with prob. r, paper with prob. p, scissors with prob. p

Alice stochastic example: .5, .3, .2

  • assume r p s = .5 .3 .2 respectively

  • whenever Bob plays rock

    • A plays rock .5, result draw, expected payoff (wins - losses) for Alice is .5*0 = 0

    • A plays paper .3, result win, expected payoff for Alice is .3*1 = .3

    • A plays scissors, result loss, expected payoff for Alice is .2*-1 = -.2

    • so A's total expected payoff when Bob plays rock is 0 + .3 + -.2 = .1

  • exercise: what is A's expected payoff when Bob plays paper?

  • exercise: what is A's expected payoff when Bob plays scissors?

  • conclusion: with r p s = .5 .3 .2, Alice will have a positive expected payoff if Bob plays rock all the time

Alice stochastic example: 1/3, 1/3, 1/3

  • assume r p s =   1/3   1/3   1/3   respectively

  • exercise: what is A's expected payoff when Bob plays rock?   (ans: 0)

  • exercise: what is A's expected payoff when Bob plays paper?   (ans: 0)

  • exercise: what is A's expected payoff when Bob plays scissors?   (ans: 0)

  • conclusion: with r p s =   1/3   1/3   1/3   Alice will have a 0 expected payoff

does Alice have a better minimax strategy? no

  • assume Alice plays strategy S = 1/3 1/3 1/3

  • with S Alice has expected payoff 0 for each move of Bob

  • so with S Alice has expected payoff 0 against every Bob strategy

  • also, rock-paper-scissors is fair: each player has the same move options and payoffs

  • so Bob has a strategy T = 1/3 1/3 1/3

  • against T, every Alice strategy has expected payoff for Bob 0 (and so also 0 for Alice, because rock-paper-scissors is a zero-sum game)

  • so Alice cannot do better than expected payoff 0 against every possible Bob strategy

  • her minimax score for this game is at least 0 (by following S)

  • her minimax score for this game is less than or equal to 0 (because Bob can play T)

  • so Alice's minimax score for rock-paper-scissors is this: expected payoff 0

  • strategy S gets this score for Alice

payoff matrix

the entries of this matrix show Alice's payoff in rock-paper-scissors

                 R  P  S     <- Bob's moves
Alice's ->    R  0 -1  1
moves         P  1  0 -1
              S -1  1  0

finding a von Neumann equilibrium