Research Projects

Information for Prospective Students

My group offers research opportunities for both undergraduate and graduate students.

If you are already at University of Alberta, please contact me or one of my students. If you want to come to Alberta, please read about Computing Science Graduate Studies and click the "How to Apply" button.

Ongoing Research

The research in our group covers a number of topics in heuristic search, especially in games and in automated planning. Recently, there is much interest in machine learning approaches to these topics.

For recent results, you can look at some of our Publications and Talks.

Project Proposals

Heuristic search and learning with imperfect domain models

Up to now, most work in heuristic search has taken place in domains where the state space is completely known, at least in an implicit way. In reinforcement learning, both model-free and model-based approaches exist. We are interested in how to design efficient forward search algorithms and learning mechanisms in cases where the model of the state space is missing, incomplete, or partially wrong.

Integrating exact solution methods with Alpha Zero type architectures in two player games

Most exact game solvers have traditionally used relatively simple but fast heuristics to control their search. Alpha Zero architectures have been extremely successful for heuristic play, and the quality of their heuristics has far outweighed their orders of magnitude slower speed.

Much less is known about how effective these new machine-learned heuristics are for solvers. Can they overcome their severe speed disadvantages in a setting where complete search of all moves is required in order to guarantee that a proof is correct?

Evaluation of deep RL learning against exact solutions in two player games

Architectures such as AlphaGo and Alpha Zero have brought an unprecedented boost in the playing strength of Go programs. But how close to perfect are they? As far as we know, the limits to further improvement have still not been reached. Judging from self-play data, even the best current programs must still make mistakes.

While most Go positions are far too complicated to allow a complete analysis, full-scale late stage Go endgame puzzles can be solved exactly with the search method of decomposition search. This method implements a divide and conquer approach based on concepts from combinatorial game theory.

I this project we want to study how modern Go programs learn to play such difficult puzzles, and whether there are limits to how well they can learn them.

Utilizing state similarity in Monte Carlo Tree Search

Utilizing the similarity of states is at the core of generalization in search and learning. For example, Alpha Zero architectures learn a deep neural net from self play, which generalizes to play new, unseen positions at a very high level. Less effort has been put into methods that use similarity directly in the search. One recent successful example that does that is our Memory-Augmented Monte Carlo Tree Search (M-MCTS) algorithm (paper). In this project we seek to deepen our understanding of M-MCTS and similar ideas.

Efficient algorithms in combinatorial game theory

We want to study ways of scaling up search algorithms for combinatorial games, for example by using approximations to temperature, or heuristic local searches. Also see our previous work on algorithms in CGT.

Go Endgame Solver

Go on several small boards such as 5x5 and 5x6 has been solved by Erik van der Werf. Develop a stronger endgame solver based on ideas such as our efficient solution to the graph history interaction problem, and combinatorial game theory. First, re-solve these boards more efficiently. Then solve 6x6 and other larger boards.

Random walk methods and machine learning for classical planning

Two topics of interest: Improve our successful line of planners Arvand and Jasper which use random walk ideas. Apply modern machine learning methods to classical planning.

Created: Aug 25, 2012 Last modified: Aug 13, 2019
Martin Müller