Show Abstract.
Decision-making problems with two agents can be modeled as two player games, and a Nash equilibrium is the basic solution concept describing good play in adversarial games. Computing this equilibrium solution for imperfect information games, where players have private, hidden information, is harder than solving perfect information games. Both domains may technically be classified as easy, with algorithms that require polynomial time and space, but the difference in degree means that for extremely large games, both computation time and storage requirements may exceed available resources.
In this thesis, I present four main contributions towards fast and memory efficient techniques for solving extensive-form games, which describe a class of imperfect information games with sequential decision making. First, the thesis introduces an analysis of counterfactual regret minimisation (CFR), an algorithm for solving extensive-form games, and presents tighter regret bounds that describe the rate of progress. CFR is a popular, state-of-the-art algorithm, and the improved bounds give some indication as to why the algorithm has good practical performance.
Second, I describe and analyse a game-solving algorithm, CFR + , which has faster empirical performance than CFR, and compare it to a number of theoretically faster algorithms. We wrote and released an efficient, distributed implementation of CFR + to solve heads-up limit Texas hold’em, making it the first competitively played imperfect information game to be solved.
Third, the thesis presents a series of theoretical tools for using decomposition, and creating algorithms which operate on small portions of a game at a time. I describe an algorithm, CFR-D, which can solve games without ever storing a complete strategy, and an algorithm for re-solving a portion of a game to complete an incomplete equilibrium within the full game. Both algorithms have a formal guarantee of correctness.
Finally, I describe continual resolving, an algorithm which is an imperfect information analogue to heuristic search in perfect information games. Continual re-solving uses decomposition to play games by doing depth-limited solving with a heuristic evaluation, with a theoretical bound on solution quality. We implemented continual re-solving in the game of no-limit heads-up no-limit Texas hold’em, and beat a group of 33 professional players in a human trial.
Hide Abstract.