Hengshuai Yao
Ph.D. Candidate
Department of Computer Science
University of Alberta
Email: hengshua[at]cs[dot]ualberta[dot]ca
Address: Room 305, RLAI Lab
I am a Ph.D. student in the Reinforcement Learning and Artificial Intelligence Lab at the University of Alberta. My advisor is Professor Csaba Szepesvari. I received my Master of Engineering degree from the Computer Science and Engineering Department of Tsinghua University in 2006, and my Bachelor of Science degree from the Mathematics Department of Shandong University of Technology, China, in 2003.

My research interests include reinforcement learning, machine learning, information retrieval, and data mining. I enjoy studying and inventing efficient algorithms for learning and inference from big data. My past experience includes projects on prediction and control, robot soccer, link analysis, personalized ranking and spam fighting, recommendation systems, trending topic detection, and large-scale computation.
  • Our daughter YingYing was born on March 22, 2014.
  • Our paper on trending topic detection was accepted by WWW 2014.
  • I was on Yahoo! FYI on October 30, 2013. Marissa announced that our team won the CEO challenge award.
  • I presented our internship project "Trending Now" to Marissa with a teammate, October 2013.
  • My internship at Yahoo! was extended for another three months.
  • I started an internship at Yahoo! Search on July 22, 2013.
  • Our approximate policy iteration algorithm based on linear action models was accepted by AAAI 2012.
  • Our son Alexander was born on November 25, 2011.
  1. Yao, H., Szepesvari, Cs., Pires, B. A., and Zhang, X. 2014. Pseudo-MDPs and Factored Linear Action Models. IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (IEEE ADPRL), Best student paper nomination, Orlando, Florida, USA. pdf
  2. Lee, C., Yao, H., He, X., Su, C., and Chang, J-Y. 2014. A System to Predict Future Popularity: Learning to Classify. WWW (poster), Seoul, Korea. pdf
  3. Yao, H., Szepesvari, Cs., Sutton, R., and Bhatnagar, S. 2014. Universal Option Models. NIPS. Montreal, Quebec, Canada. pdf
  4. Yao, H. and Schuurmans, D. 2013. Reinforcement Ranking. arXiv:1303.5988. pdf
  5. Yao, H. 2012. MaxRank: Discovering and Leveraging the Most Valuable Links for Ranking. arXiv:1210.1626. pdf
  6. Yao, H. and Szepesvari, Cs. Approximate Policy Iteration with Linear Action Models. Twenty-Sixth AAAI Conference on Artificial Intelligence (AAAI), Toronto, Canada. 2012. pdf
  7. Yao, H. Off-policy learning with linear action models: an efficient "One-Collection-For-All-Solution". In workshop on "Planning and Acting with Uncertain Models" at the 28th ICML, Bellevue, Washington, USA. 2011. pdf
  8. Yao, H. Linear least-squares Dyna-style planning. Technical Report TR11-04, Department of Computing Science, University of Alberta. 2011.
  9. Yao, H., Bhatnagar, S., and Diao, D. Multi-step linear Dyna-style planning. Advances in Neural Information Processing Systems (NIPS) 22, Vancouver, BC, Canada. 2009. retyped pdf
  10. Yao, H., Bhatnagar, S., and Szepesvari, Cs. LMS-2: towards an algorithm that is as cheap as LMS and almost as efficient as RLS. The Forty-eighth IEEE Conference on Decision and Control (CDC), Shanghai, China. December 2009. pdf
  11. Yao, H., Sutton, R. S., Bhatnagar, S., Diao, D., and Szepesvari, Cs. Dyna(k): A multi-step Dyna planning. Abstraction in Reinforcement Learning. Montreal, Canada. June 2009. pdf
  12. Yao, H., Bhatnagar, S., and Szepesvari, Cs. Temporal difference learning by direct preconditioning. Multidisciplinary Symposium on Reinforcement Learning (MSRL), Montreal, Canada. June 2009. pdf
  13. Yao, H., and Liu, Z-Q. Preconditioned temporal difference learning. The 25th International Conference on Machine Learning (ICML), Helsinki, Finland. June 2008. pdf
  14. Yao, H., and Liu, Z-Q. Minimal residual approaches for policy evaluation in large sparse Markov chains. The Tenth International Symposium on Artificial Intelligence and Mathematics (ISAIM), Fort Lauderdale, USA. January 2008. pdf
  • "Trending Now" Internship winner (CEO challenge award); project presented to Marissa, 2013.
  • Yahoo! Invention Award. December, 2013.
  • J Gordin Kaplan Graduate Student Award, University of Alberta, 2012.
  • Hong Kong Government Scholarship for graduate students. 2007.
  • Fourth place, simulation league of World RoboCup, Lisbon, Portugal, 2004.
  • Champion, Shandong University of Technology Mathematical Modeling Contest, 2001 and 2002.
  • WWW 2015, PC member (web search track).
  • The First Workshop on Heterogeneous Information Access at WSDM 2015, PC member.
  • Neurocomputing, reviewer.
  • IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL) 2014, PC member.
  • Journal of Machine Learning Research, reviewer.
  • Pattern Recognition Letters, reviewer.
Last Updated: November, 2014