# jemdoc: addcss{rbh.css}, addcss{jacob.css} = 2-player 0-sum imperfect information games ~~~ {}{raw} rps   mmx   lp   nash   cfr   ~~~ ~~~ {}{raw}

rock paper scissors

~~~ ~~~ - play with a classmate - is your strategy equiprobabe ? -- yes: why ? -- no: what is your strategy ? how did you model your opponent ? ~~~ ~~~ {rps payoff matrix}{} two players: row, column payoffs for row column's strategies r p s r 0 -1 1 row's strategies p 1 0 -1 s -1 1 0 ~~~ ~~~ {}{raw}

minimax

~~~ ~~~ - minimax value: the minimum, over all player\/opponent strategy pairs, of the maximum the opponent can score - for 2-player 0-sum games ({{so op't score \+ player score = 0}}) -- if each player plays a fixed strategy for the whole game, minimax value is easily computed - e.g. assume row plays equiprobable: rps each 1\/3 - assume col plays rps resp. probs y1 y2 y3 - then expected row payoff {{1/3(y2-y3) + 1/3(y1-y3) + 1/3(y2-y1) = 0}} - so row's minimax value is at least 0 (some other row strategy might do better) - but, by symmetry (payoff matrix is symmetric), col's expected value is 0 by playing equiprobable - so, for this game, row's minimax value is 0, and can be achieved by playing equiprobable ~~~ ~~~ {}{raw}

2p 0sum minimax via linear programming

~~~ ~~~ {}{} - assume col knows row's strategy - then, for this row-strat, some pure col strategy is best for col - so row wants to pick a mixed strategy that maximizes the minimum score row gets, over all pure col strats - so row wants to pick x1 x2 x3 to maximize min x2 - x3, (row payoff against col strat r) x3 - x1, (" " " " " p) x1 - x2 (" " " " " s) - so row wants to solve the LP max z st z <= x2 - x3 z <= -x1 + x3 z <= x1 - x2 x1 + x2 + x3 = 1 x1,x2,c3 >= 0 - similarly, col wants to solve an LP that turns out to be the dual of the above - notice that the above LP has a feasible solution and z is bounded - so the duality theorem of LP tells us that each problem has the same solution - we have already seen that (1/3 1/3 1/3) is the primal solution ~~~ ~~~ {}{raw}

nash equilibrium

~~~ ~~~ - [https://en.wikipedia.org/wiki/Nash_equilibrium wiki] - for 2-player 0-sum games, the above LP result implies that the minimax solution (assuming each player has a fixed mixed strategy for the whole game) is the only Nash equilibrium - [https://en.wikipedia.org/wiki/Nash_equilibrium\#Prisoner.27s_dilemma prisoner's dilemma] -- if player's trust each other, they can cooperate and do better than the minimax value -- but this combination of strategies is not a Nash equilibrium - [https://en.wikipedia.org/wiki/Nash_equilibrium\#Occurrence sufficient conditions] ~~~ ~~~ {}{raw}

counterfactual regret

~~~ ~~~ - [http://modelai.gettysburg.edu/2013/cfr/ intro] - [http://webdocs.cs.ualberta.ca/~hayward/670gga/jem/burch.pdf Burch slides] - [http://jeskola.net/cfr/ tammelin's cfr (with demo)] ~~~