next up previous contents
Next: 4.3 Simulation and Enumeration Up: 4. How Computers Play Previous: 4.1 Expert Systems   Contents

4.2 Game-Theoretic Optimal Strategies

Kuhn [11] along with Nash and Shapley [12] have demonstrated that ``optimal strategies" using randomization exist for simplified poker. An optimal strategy always takes the best worst-case move, and this means two things: ``the player cannot do better than this strategy if playing against a good opponent, and furthermore the player does not do worse even if his strategy is revealed to his opponent" [10]. For example, consider the two-player game of Roshambo (Rock, Paper, Scissors). The optimal strategy is to select a move uniformly at random ( i.e. $[\frac{1}{3}, \frac{1}{3}, \frac{1}{3}]$) irrespective of the game history.

Finding an optimal approach is not so easy in a complex game like poker; there is a major stumbling block. Due to the enormous branching factor (see Figure 4.2), both the calculation and storage of the game-theoretic optimal strategy would be extremely expensive. Additionally the branching factor numbers are only for the two player environment - the multi-player environment is even more complex due to the addition of more imperfect information and many more possible betting interactions. As demonstrated by the attention devoted to multi-player situations in the poker literature, such considerations are quite important.

Additionally, the game-theoretic optimal approach is not necessarily the best. Clearly, as the game-theoretic optimal strategy is fixed, it cannot take advantage of observed weaknesses in the opponent. Doing so would risk falling into a trap and losing. Consider Roshambo for an example. After witnessing your opponent playing Rock 100 times in a row, deciding to play Paper risks your opponent anticipating your action (the situation may be intended as a trap or your opponent may know your strategy). The existence of the risk, no matter how small, would violate the optimality of the strategy (the second guarantee, that the player cannot do worse).

Because of this, even against bad players an optimal strategy is likely to only break even. In contrast, a maximal strategy using opponent modeling (which does not assume perfect play from its opponents) would identify weaknesses and exploit them for profit (significantly more than an optimal strategy). There is some risk, because deviation from the optimal opens the door for your opponent to exploit it. But, if your knowledge of the opponent is good, the potential gains outweigh the risk. A game-theoretic optimal strategy would, however, make an excellent default or baseline to complement such an ``adaptive" strategy.

Figure 4.2: Branching Factor for Structured Betting Texas Hold'em With a Maximum of 4 Bets/Round
% latex2html id marker 1007\footnotesize Assuming only two play...
...yers remaining
($792$\ end in an uncontested win).

next up previous contents
Next: 4.3 Simulation and Enumeration Up: 4. How Computers Play Previous: 4.1 Expert Systems   Contents
Denis Papp