Manfred Dworschak (DER SPIEGEL) wrote:
>
> Darse,
>
> Thanks for your quick reply. I was just wondering if you could give me a
> brief and simple introduction into how PsOpti works. (Perhaps there is a
> written document available so you wouldn't have to bother explaining all
> that stuff to me? I've read your paper on this subject, but that's too
> technical for my purposes.)
>
> Ok. What I'd like to learn is this:
>
> - How does it work?
> - What is it that makes it superior in comparison with its predecessors?

Well, that is a difficult question to answer briefly, but I'll try to give a high-level overview.

Good strategy involves a fine balance of bluffing and betting for value. Excellent play must involve a certain amount of bluffing (including bluff raises), as well as check-raising and slowplaying (where strong hands are played as though they are weak, delaying the raises to trap the opponent). Deciding precisely how to do this is a hard problem.

Older versions of the program were designed to play the multi-player "ring game", typically with 10 players. Good strategy for that kind of game is very different from the 2-player game. When you have only one opponent, there is a lot more deception and trickery involved in good play. We shifted our focus to the 2-player game because the weaknesses of the earlier approaches were very evident in this game.

A game-theoretic solution produces a sophisticated strategy that is well-balanced in all respects. It is also safe and "robust", because it is guaranteed to obtain the theoretical value of the game against *any* other strategy. By "strategy", we basically mean a recipe for how often to raise, call, or fold in any particular situation (this may involve randomness, such as choosing to raise 80% of the time and call 20%). The drawback is that this type of strategy is fixed -- it can't adapt to the style of a particular opponent.
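To make the idea of a randomized (mixed) strategy concrete, here is a minimal sketch in Python. The function name and the exact probabilities are illustrative only, not taken from PsOpti itself:

```python
import random

def sample_action(strategy, rng=random):
    """Sample one action from a mixed strategy.

    strategy: dict mapping an action name to its probability
    (the probabilities should sum to 1).
    """
    r = rng.random()
    cumulative = 0.0
    for action, prob in strategy.items():
        cumulative += prob
        if r < cumulative:
            return action
    return action  # guard against floating-point rounding near r = 1.0

# The example from the text: raise 80% of the time, call 20%.
print(sample_action({"raise": 0.80, "call": 0.20}))
```

Because the action is drawn at random each time the situation arises, an opponent observing many hands cannot tell from any single decision whether the program holds a strong hand or is bluffing.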
Although it will break even against any opponent, it might only win at a slow rate against a weak or predictable player. However, it would be an excellent default to use against a strong or unknown opponent, while making observations that will enable it to make the appropriate adjustments later.

Game theory has been around since the 1940s, and it is now used in a wide variety of applications. It is the foundation of modern economics, and has been applied to a diverse range of topics, such as the inspection of nuclear facilities, models of biological evolution, the FCC auctions for radio and television bandwidth, and much more.

The limitation of game theory is that traditionally it could only be applied to very small problems, or to small models of a given problem. For example, you might be able to completely solve a poker variant having only eight cards in the deck and one betting round. What we did was show one way that a very large problem could be modeled with a set of much smaller problems, each of which is computationally tractable.

It is impossible to compute the complete game-theoretic solution for Texas Hold'em (the poker variant used to determine the World Champion), because the 2-player game has more than a quintillion states (i.e. 10 to the 18th power, or a billion billion). We used abstraction techniques to create smaller games that have about 30 million states, but retain most of the key properties of the real game. The solutions to these smaller games produce an approximation of the game-theoretic optimal solution for the real game.

Since game theory is such an important area of mathematics, and since the techniques we used can be generalized to many other domains with imperfect information, the result may be significant. This may be why our peers in computer science awarded us the Distinguished Paper Award at IJCAI/AAAI (the premier conference in Artificial Intelligence).

The resulting poker program plays *a lot* better than previous attempts.
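The abstraction idea can be illustrated with a toy example: collapsing the 1,326 distinct 2-card starting hands in hold'em into a handful of "buckets". The scoring rule below is a deliberately crude stand-in of my own invention; the real abstractions used far more careful measures of hand strength and potential across all betting rounds:

```python
import itertools

RANKS = "23456789TJQKA"          # rank order, deuce low, ace high
SUITS = "cdhs"
DECK = [r + s for r in RANKS for s in SUITS]

def bucket(hole, n_buckets=6):
    """Map a 2-card starting hand to one of n_buckets abstract classes.

    Crude score: high cards are good, pairs get a large bonus,
    suited hands a small one. Illustrative only.
    """
    r1 = RANKS.index(hole[0][0])
    r2 = RANKS.index(hole[1][0])
    score = r1 + r2
    if r1 == r2:                  # pair bonus
        score += 13
    if hole[0][1] == hole[1][1]:  # suited bonus
        score += 2
    max_score = 12 + 12 + 13      # a pair of aces
    return min(n_buckets - 1, score * n_buckets // (max_score + 1))

# All 1,326 distinct starting hands collapse into just 6 buckets.
hands = list(itertools.combinations(DECK, 2))
print(len(hands), "->", len({bucket(h) for h in hands}))
```

Treating every hand in a bucket identically is what shrinks the game tree: the solver reasons about a few hand classes per round instead of every individual card combination, which is how a 10^18-state game can be modeled by games of roughly 30 million states.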
It mixes up its play very well, and doesn't lose by much, even to very strong opponents. It can be quite frustrating to play against, and it is difficult to learn good counter-strategies against the program.

The next step for us is a program that can quickly learn an opponent's biases and tendencies, and smoothly adapt to exploit those weaknesses. Our recent results have been very encouraging, and we continue to make progress toward our goal of creating a program that plays poker better than all human beings.

- Darse.