next up previous contents
Next: 2. Poker Up: Dealing with Imperfect Information Previous: List of Tables   Contents

1. Introduction

Game playing is an ideal environment for examining complex topics in machine intelligence because games generally have well-defined rules and goals. Additionally, performance, and therefore progress, is easily measured. However, the field of computer game playing has traditionally concentrated on studying chess, and other two-player deterministic zero-sum games with perfect information, such as checkers and Othello. In these games, players always have complete knowledge of the entire game state since it is visible to both participants. High performance has been achieved with brute-force search of the game trees, although there are some exceptions, such as the game of Go where the game tree is far too large. Although many advances in computer science (especially in searching) have resulted, little has been learned about decision-making under conditions of uncertainty. To tackle this problem, one must understand estimation, prediction, risk management, the implications of multiple opponents, deception, counter-deception, and the deduction of decision-making models of other players.

Such knowledge can be gained by studying imperfect information games, such as bridge and poker, where the other players' cards are not known and search alone is insufficient to play these games well. Poker in particular is a popular and fascinating game. It is a multi-player zero-sum game with imperfect information. The rules are simple but the game is strategically complex. It emphasizes long-term money management (over a session of several contiguous interdependent games), as well as the ability to recognize the potential of one specific game and to either maximize gain or minimize loss. Most games are analogical to some aspect of real life, and poker can be compared to ``policy decisions in commercial enterprises and in political campaigns" [8].

Poker has a number of attributes that make it an interesting domain for research. These include multiple competing agents (more than two players), imperfect knowledge (your opponents hold hidden cards), risk management (betting strategies and their consequences), agent modeling (detecting and exploiting patterns or errors in the play of other players), deception (bluffing and varying your style of play), and dealing with unreliable information (your opponents also make deceptive plays). All of these are challenging dimensions to a difficult problem.

Certain aspects of poker have been extensively studied by mathematicians and economists. There are two main approaches to poker research. One approach is to use simplified variants that are easier to analyze [10] [11] [12]. For example, one could use only two players or constrain the betting rules. However, one must be careful that the simplification does not remove the challenging components of the problem. The other approach is to pick a real variant, but to combine mathematical analysis, simulation and ad hoc expert experience. Expert players, often with mathematical skills, are usually involved in this approach [13] [14] [15].

However, little work has been done by computing scientists. Nicolas Findler worked on and off for 20 years on a poker-playing program for Five-Card Draw [6] [7] [8], however he focused on simulating the thought processes of human players and never achieved a program capable of defeating a strong player. Koller and Pfeffer [10] have investigated poker from a theoretical point of view. They implemented the first practical algorithm for finding optimal randomized strategies in two-player imperfect information competitive games. However, such a system will likely not win much more from a set of bad players than a set of perfect players, failing to exploit the property that human players make many mistakes ( i.e. it presumes the opponent always plays the best strategy).

One of the interesting aspects of poker research is that opponent modeling can be examined. It has been attempted in two-player games, as a generalization of minimax, but with limited success [5] [9]. Part of the reason for this is that in games such as chess, opponent modeling is not critical for computers to achieve high performance. In poker, it is essential for the best results. Working under the assumption that our opponents will make mistakes and exhibit predictability, opponent modeling should be accounted for and built into the program framework.

Although our long-term goal is to produce a high-performance poker program that is capable of beating the best human players, for our first step we are interested in constructing a framework with useful computer-oriented techniques. It should minimize human expert information and easily allow the introduction of an opponent modeling system, and still make a strong computer poker program. If we are successful, then the insights we gain should have wide applicability to other applications that require similar activities.

We will present new enumeration techniques for determining the strength and potential of a player's hand, will demonstrate a working program that successfully plays `real' poker, and demonstrate that using opponent modeling can result in a significant improvement in performance. Chapter 2 will introduce terminology (there is a glossary in Appendix C) and will give the rules of poker and of Texas Hold'em (the poker variant we have chosen to study). Chapter 3 describes how humans play poker. Chapter 4 discusses various ways to approach the problem using a computer, and details the architecture we have selected. Chapter 5 describes the enumeration techniques we use for hand evaluation. Chapter 6 describes our betting strategy. Chapter 7 details the opponent modeling system. Chapter 8 discusses the experimental system and some results. Chapter 9 discusses the conclusions and future work.

Parts of Chapters 5 and 6 have been published in Advances in Artificial Intelligence [4], and parts of Chapter 7 have been published in AAAI-98 [3]. Our poker-playing program is called Loki and has been demonstrated at AAAI-98. It is written in C and C++ and runs with real-time constraints (in typical play, an action should not take more than a few seconds). The primary mechanism for performance testing is self-play, however we also play against human opponents through an Internet poker server. The interface between the program and the server is written in PERL.

next up previous contents
Next: 2. Poker Up: Dealing with Imperfect Information Previous: List of Tables   Contents
Denis Papp