8.4.2 Specific Opponent Modeling (SOM)

Specific opponent modeling is our first attempt at using the observed betting history to distinguish between different types of players. This is clearly what human experts do, although our approach is crude and captures only the essence of their technique. We measure action frequencies based on a rough description of the context, and use these frequencies to adjust the weight array appropriately after future actions.
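The idea above can be sketched as a table of per-opponent action counts keyed by a coarse context. This is only an illustrative sketch under assumed names (the class, the context tuple, and the fallback behaviour are not Loki's actual data structures):

```python
from collections import defaultdict

# Sketch of specific opponent modeling: count each opponent's actions in a
# coarse context (here, betting round and number of bets to call), then use
# the observed frequencies to adjust the weight array on future actions.
class OpponentModel:
    def __init__(self):
        # (round, bets_to_call) -> {action: count}
        self.counts = defaultdict(lambda: defaultdict(int))

    def observe(self, context, action):
        self.counts[context][action] += 1

    def frequency(self, context, action):
        total = sum(self.counts[context].values())
        if total == 0:
            return None  # no data yet: fall back to a generic default model
        return self.counts[context][action] / total

model = OpponentModel()
model.observe(("flop", 1), "call")
model.observe(("flop", 1), "raise")
model.observe(("flop", 1), "call")
print(model.frequency(("flop", 1), "raise"))  # 1 raise in 3 observed actions
```

Returning `None` for an unseen context mirrors the need for a default model until enough specific data has accumulated.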

This experiment focused on two copies of SOM (specific opponent modeling) playing against two copies of GOM. As in the previous experiment, the remaining six players were two copies each of BPL, BPT, and BPM. The results are shown in Figure 8.4.

Figure 8.4: Specific Opponent Modeling Experiment

Very quickly, the two opponent modeling programs asserted their superiority over the non-modeling versions. SOM attains a degree of success comparable to GOM's, based on observed frequencies rather than a good default model. While we cannot conclude from this one experiment that there is a statistically significant difference between the two modeling versions, anecdotal evidence is somewhat promising: on IRC, GOM achieved a winning rate of 0.07 small bets per hand while SOM achieved 0.08 small bets per hand.

The advantage of good opponent modeling is clear. Loki with opponent modeling is a noticeably stronger program than without it. However, our implementation of specific modeling does not appear to produce a significant advantage over generic modeling. We recognize that the granularity of the context used for gathering action frequencies may be so coarse that the resulting error undermines the gain. For example, Loki does not recognize that, for some players, calling a bet is automatic given that they have already called a bet earlier in that round - the second action contains no information. Other variables that may strengthen the identification of the context include the previous action, the number of opponents, and the number of callers remaining. However, the number of cases becomes very large, so it would seem necessary to somehow combine the action frequencies of "similar" scenarios.
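One way to combine sparse specific data with a broader estimate, in the spirit of the paragraph above, is to shrink the opponent-specific frequency toward a generic one until enough observations accumulate. This is a hypothetical sketch; the blending constant and function names are assumptions, not part of Loki:

```python
def blended_frequency(specific_count, specific_total, generic_freq, k=10):
    """Blend a sparse opponent-specific frequency with a generic default.

    With no specific observations this returns generic_freq; as
    specific_total grows, the specific data dominates. The constant k
    (how many observations the generic prior is "worth") is illustrative.
    """
    return (specific_count + k * generic_freq) / (specific_total + k)

# No data: falls back to the generic frequency.
print(blended_frequency(0, 0, 0.4))      # 0.4
# Ample data: close to the observed specific frequency of 0.9.
print(blended_frequency(90, 100, 0.4))   # ~0.855
```

The same mechanism could merge "similar" contexts: pool the counts of related scenarios and use the pooled frequency as the generic term.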

Additionally, the re-weighting system itself is not adjusted for the specific opponent; the model only provides simple action frequencies. It would be better to predict how each opponent handles each specific subcase, rather than applying our value system for assigning strength/potential with a simple linear function. For example, some opponents may over-value flush draws, or may find it more imperative to bet a mediocre hand with low potential. The present re-weighting system also neglects certain information. When a check or call is observed, we could infer that the opponent does not have a hand above a certain threshold, and apply an inverse re-weighting function to the weight array. This would prevent Loki from being too pessimistic and would allow it to bet more often after several checks (currently, since checks are ignored and Loki uses HSn, a checking opponent still poses the same threat as one who has not yet acted).
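The inverse re-weighting suggested above can be sketched as follows: after an opponent checks, discount the weights of hands strong enough that the opponent would usually have bet them. The threshold, floor, and linear ramp here are illustrative assumptions, not Loki's actual parameters:

```python
def reweight_after_check(weights, strengths, threshold=0.7, floor=0.2):
    """Discount weights of hands too strong to be consistent with a check.

    weights: hand -> current weight in the weight array
    strengths: hand -> hand strength in [0, 1]
    Hands at or below the threshold are untouched; stronger hands are
    linearly discounted down to a floor, so no hand is ruled out entirely.
    """
    new_weights = {}
    for hand, w in weights.items():
        s = strengths[hand]
        if s <= threshold:
            new_weights[hand] = w  # a weak hand is consistent with a check
        else:
            discount = max(floor, 1.0 - (s - threshold) / (1.0 - threshold))
            new_weights[hand] = w * discount
    return new_weights

weights = {"AA": 1.0, "T9s": 1.0}
strengths = {"AA": 0.95, "T9s": 0.40}
print(reweight_after_check(weights, strengths))
```

Keeping a nonzero floor reflects that some opponents do slowplay strong hands, so a check only makes them less likely, not impossible.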

Denis Papp