Loki is a poker-playing program developed at the University of Alberta starting in 1997. The name refers to the Norse god of mischief and discord. At the beginning of the work presented in this thesis, Loki (henceforth referred to as Loki-1) was already a intermediate level poker player (see [4], [5], [19]). It had the infrastructure to incorporate advanced features for the next step towards the goal of creating a high-performance poker program capable of defeating the best human players. However, its rigid, deterministic, hand-tuned betting strategy was becoming a limiting factor for its future development. This thesis presents the work done to improve the knowledge representation and the betting strategy of Loki-1.
The first improvement is a probabilistic representation of poker betting decisions. A betting strategy attempts to determine which betting action is most profitable in a given situation, based on the evaluation function of the program. Loki-1's evaluation function was a deterministic strategy since it always returned a single value: the ``best'' betting action. A deterministic strategy is vulnerable to being predictable, which gives a skilled opponent the opportunity to find a counter-strategy that takes advantage of this fact. The new version of Loki (henceforth referred to as Loki-2) returns three probabilities, one for each betting action (fold, call/check, and raise/bet). Loki-2 can then randomly select the betting decision based on this probability distribution. This new evaluation function is a mixed (randomized) strategy that adds unpredictability to Loki-2's play without sacrificing much in immediate expectation. This routine also merges all the expert knowledge components used in Loki-1, since the probability triple representation can be used throughout the program.
The second improvement is the use of simulation (search) to compute the expected value of betting alternatives. The Loki-2's betting strategy uses a simulation-based approach: selective sampling simulation. This approach consists of simulating the outcome of a hand many times. In every simulation trial, a likely instance of the hidden information (opponents' hands) is generated, and the hand is played out once for each betting alternative as the first action of Loki-2 in the trial. The results of all the trials are averaged and the betting action with the highest expectation is returned. To select the most likely (selective sampling) opponents' hands and actions from the sample space during a simulation, Loki-2 uses all the information available about the game and the opponents. The simulation refines the quality of the evaluation function and the selective sampling increases the information gained with each trial.
Experimental results will be presented to demonstrate that both enhancements represent a notable improvement in Loki's playing ability. In all the self-simulation experiments performed, Loki-2 outperformed Loki-1. Loki-2 has also consistently increased its bankroll playing against human opponents on an Internet poker server; at a rate that appears to be significantly higher than Loki-1's.
This thesis is organized as follows. Chapter 2 introduces poker terminology, describes the game of Texas Hold'em (the poker variation played by Loki) and discusses other work done in computer poker. Chapter 3 describes Loki-1 in detail. Chapter 4 discusses the probabilistic representation of betting actions (probability triples) used to improve Loki-2's betting strategy and opponent modeling. It also discusses the new design of the program. Chapter 5 discusses the selective sampling simulations in Loki-2. Chapter 6 is an overview of related work in selective sampling simulations. Conclusions and future work are presented in Chapter 7.