Manfred Dworschak (DER SPIEGEL) wrote:
> 
> Darse,
> 
> Thanks for your quick reply. I was just wondering if you could give me a
> brief and simple introduction into how PsOpti works. (Perhaps there is a
> written document available so you wouldn't have to bother explaining all
> that stuff to me? I've read your paper on this subject, but that's too
> technical for my purposes.)
> 
> Ok. What I'd like to learn is this:
> 
> - How does it work?
> - What is it that makes it superior in comparison with its predecessors?
> 


Well, that is a difficult question to answer briefly, but I'll try to
give a high-level overview.


Good strategy involves a fine balance of bluffing and betting for value.
Excellent play must involve a certain amount of bluffing (including
bluff raises), as well as check-raising and slowplaying (where strong
hands are played as though they are weak, delaying the raises to trap
the opponent).  Deciding precisely how to do this is a hard problem.

Older versions of the program were designed to play the multi-player
"ring game", typically with 10 players.  Good strategy for that kind
of game is very different from the 2-player game.  When you have only
one opponent, there is a lot more deception and trickery involved in
good play.  We shifted our focus to the 2-player game because the
weaknesses of the earlier approaches were very evident in this game.

A game-theoretic solution produces a sophisticated strategy that is
well-balanced in all respects.  It is also safe and "robust", because
it is guaranteed to obtain the theoretical value of the game against
*any* other strategy.  By "strategy", we basically mean a recipe for
how often to raise, call, or fold in any particular situation (this
may involve randomness, such as choosing to raise 80% of the time and
call 20%).
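
To make the idea of a randomized strategy concrete, here is a minimal sketch (my own illustration, not code from PsOpti): a strategy for one situation is just a probability distribution over the legal actions, and the player samples from it each time.

```python
import random

# A mixed strategy for one situation: a probability for each action.
# The 80% raise / 20% call split is the example from the text above;
# in practice every situation carries its own precomputed probabilities.
strategy = {"raise": 0.8, "call": 0.2, "fold": 0.0}

def choose_action(strategy):
    """Sample one action according to the strategy's probabilities."""
    r = random.random()
    cumulative = 0.0
    for action, prob in strategy.items():
        cumulative += prob
        if r < cumulative:
            return action
    return action  # guard against floating-point rounding

# Over many trials the empirical frequencies approach 80% / 20% / 0%.
counts = {a: 0 for a in strategy}
for _ in range(10000):
    counts[choose_action(strategy)] += 1
```

Because the choice is randomized, an observer cannot infer the hand from the action taken, which is exactly what makes such a strategy hard to exploit.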

The drawback is that this type of strategy is fixed -- it can't adapt
to the style of a particular opponent.  Although it will break even
against any opponent, it might only win at a slow rate against a weak
or predictable player.  However, it would be an excellent default to
use against a strong or unknown opponent, while making observations
that will enable it to make the appropriate adjustments later.


Game theory has been around since the 1940s, and it is now used in a
wide variety of applications.  It is the foundation of modern economics,
and has been used for a diverse range of topics, such as the inspection
of nuclear facilities, models of biological evolution, the FCC auctions
for radio and television bandwidth, and much more.

The limitation of game theory is that traditionally it could only be
applied to very small problem sizes, or small models of a given problem.
For example, you might be able to completely solve a poker variation
having only eight cards in the deck and one betting round.  What we did
was show one way that a very large problem could be modeled with a set
of much smaller problems, each of which is computationally tractable.

It is impossible to compute the complete game-theoretic solution for
Texas Hold'em (the poker variant used to determine the World Champion),
because the 2-player game has more than a quintillion states (i.e. 10 to
the 18th power, or a billion billion).  We used abstraction techniques
to create smaller games that have about 30 million states, but retain
most of the key properties of the real game.  The solutions to these
smaller games produce an approximation of the game-theoretic optimal
solution for the real game.
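
The flavor of such an abstraction can be shown with a toy example (my own illustration; the actual metric and bucket counts used in PsOpti differ): instead of treating every distinct hand separately, group hands into a handful of "buckets" by a rough strength estimate. The bucketed game is vastly smaller but keeps the essential structure of the original.

```python
import itertools

RANKS = range(2, 15)   # 2 .. 14 (ace high)
N_BUCKETS = 4

def hand_strength(card_a, card_b):
    """Crude, illustrative strength estimate: normalized rank sum
    plus a bonus for a pair, capped at 1.0."""
    s = (card_a + card_b) / 28.0
    if card_a == card_b:
        s += 0.3
    return min(s, 1.0)

def bucket(card_a, card_b):
    """Map a two-card hand to one of N_BUCKETS coarse buckets."""
    return min(int(hand_strength(card_a, card_b) * N_BUCKETS), N_BUCKETS - 1)

# 13 * 13 = 169 rank combinations collapse to just 4 buckets:
all_hands = list(itertools.product(RANKS, repeat=2))
buckets_used = {bucket(a, b) for a, b in all_hands}
```

Solving the game over buckets instead of exact cards is one of the abstractions that shrinks the state count by many orders of magnitude, at the cost of some fidelity to the real game.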

Since game theory is such an important area of mathematics, and since
the techniques we used can be generalized to many other domains with
imperfect information, the result may be significant.  This may be why
our peer computer scientists awarded us the Distinguished Paper Award
at IJCAI/AAAI (the premier conference in Artificial Intelligence).


The resulting poker program plays *a lot* better than previous attempts.
It mixes up its play very well, and doesn't lose by much, even to very
strong opponents.  It can be quite frustrating to play against, and it
is difficult to learn good counter-strategies against the program.

The next step for us is a program that can quickly learn an opponent's
biases and tendencies, and smoothly adapt to exploit those weaknesses.
Our recent results have been very encouraging, and we continue to make
progress toward our goal of creating a program that plays poker better
than all human beings.

   - Darse.