Michael Bowling, Peter McCracken, Michael James, James Neufeld, and Dana Wilkinson. Learning Predictive State Representations Using Non-Blind Policies. In Proceedings of the Twenty-Third International Conference on Machine Learning (ICML), pp. 129–136, 2006.
Predictive state representations (PSRs) are powerful models of non-Markovian decision processes that differ from traditional models (e.g., HMMs, POMDPs) by representing state using only observable quantities. Because of this, PSRs can be learned solely using data from interaction with the process. The majority of existing techniques, though, explicitly or implicitly require that this data be gathered using a blind policy, where actions are selected independently of preceding observations. This is a severe limitation for practical learning of PSRs. We present two methods for fixing this limitation in most of the existing PSR algorithms: one when the policy is known and one when it is not. We then present an efficient optimization for computing good exploration policies to be used when learning a PSR. The exploration policies, which are not blind, significantly lower the amount of data needed to build an accurate model, thus demonstrating the importance of non-blind policies.
@InProceedings(06icml-psr-exploration,
  title = "Learning Predictive State Representations Using Non-Blind Policies",
  author = "Michael Bowling and Peter McCracken and Michael James and James Neufeld and Dana Wilkinson",
  booktitle = "Proceedings of the Twenty-Third International Conference on Machine Learning (ICML)",
  year = "2006",
  pages = "129--136",
  AcceptRate = "20\%",
  AcceptNumbers = "140 of 700"
)