Hengshuai Yao
Ph.D. Candidate
Department of Computer Science
University of Alberta
Email: hengshuai[at]gmail[dot]com
Address: Room 305, RLAI Lab, University of Alberta, Edmonton, AB, T6G 2E8
I am a Ph.D. student in the Reinforcement Learning and Artificial Intelligence Lab at the University of Alberta. My advisor is Professor Csaba Szepesvari. I received my Master of Engineering degree from the Department of Computer Science and Engineering at Tsinghua University in 2006, and my Bachelor of Science degree from the Department of Mathematics at Shandong University of Technology, China, in 2003.

My research interests include reinforcement learning, machine learning, information retrieval, and data mining. I enjoy studying and inventing efficient algorithms for learning and inference from big data. My past experience includes projects on prediction and control problems, robot soccer, link analysis algorithms, personalized ranking and spam fighting, recommender systems, trending topic detection, and large-scale computation.
  • Our daughter YingYing was born on March 22, 2014.
  • Our paper for trending topic detection was accepted by WWW 2014.
  • I finished my internship on December 27, 2013.
  • I presented "Trending Now" at Yahoo! FYI on October 30, 2013. Marissa announced that our team had won the CEO challenge award!
  • I presented our internship project "Trending Now" to Marissa with a teammate in October 2013.
  • My internship at Yahoo! was extended by another three months.
  • I will start an internship with Yahoo! Search on July 22, 2013.
  • Our linear action model based approximate policy iteration algorithm was accepted by AAAI 2012.
  • Our son Alexander was born on November 25, 2011.
Conferences
  • Yao, H. and Szepesvari, Cs. 2014. Pseudo-MDPs and a New Method for Nonlinear Feature Construction. In preparation.
  • Lee, C., Yao, H., He, X., Su, C., and Chang, J-Y. 2014. A System to Predict Future Popularity: Learning to Classify. WWW, Seoul, Korea.
  • Yao, H., Szepesvari, Cs., Sutton, R., and Bhatnagar, S. 2013. Universal Option Models. Submitted.
  • Yao, H. and Schuurmans, D. 2013. Reinforcement Ranking. arXiv:1303.5988.
  • Yao, H. 2012. MaxRank: Discovering and Leveraging the Most Valuable Links for Ranking. arXiv:1210.1626.
  • Yao, H. and Szepesvari, Cs. Approximate Policy Iteration with Linear Action Models. Twenty-Sixth Conference on Artificial Intelligence. AAAI. Toronto, Canada. 2012. pdf
  • Yao, H. Off-policy learning with linear action models: an efficient "One-Collection-For-All-Solution". In workshop on "Planning and Acting with Uncertain Models" at the 28th ICML, Bellevue, Washington, USA. 2011. pdf
  • Yao, H. Linear least-squares Dyna-style planning. Technical Report TR11-04, Department of Computing Science, University of Alberta. 2011.
  • Yao, H., Bhatnagar, S., and Diao, D. Multi-step linear Dyna-style planning. Advances in Neural Information Processing Systems (NIPS) 22, Vancouver, BC, Canada. 2009. retyped pdf
  • Yao, H., Bhatnagar, S., and Szepesvari, Cs. LMS-2: towards an algorithm that is as cheap as LMS and almost as efficient as RLS. The Forty-eighth IEEE Conference on Decision and Control (CDC), Shanghai, China. December 2009. pdf
  • Yao, H., Sutton, R. S., Bhatnagar, S., Diao, D., and Szepesvari, Cs. Dyna(k): A multi-step Dyna planning. Abstraction in Reinforcement Learning. Montreal, Canada. June 2009. pdf
  • Yao, H., Bhatnagar, S., and Szepesvari, Cs. Temporal difference learning by direct preconditioning. Multidisciplinary Symposium on Reinforcement Learning (MSRL), Montreal, Canada. June 2009. pdf
  • Yao, H., and Liu, Z-Q. Preconditioned temporal difference learning. The 25th International Conference on Machine Learning (ICML), Helsinki, Finland. June 2008. pdf
  • Yao, H., and Liu, Z-Q. Minimal residual approaches for policy evaluation in large sparse Markov chains. The Tenth International Symposium on Artificial Intelligence and Mathematics (ISAIM), Fort Lauderdale, USA. January 2008. pdf
Journals
  • Yao, H., Rafiei, D., and Sutton, R. 2013. A Study of Temporal Citation Count Prediction using Reinforcement Learning. Accepted by IEEE Transactions on Systems, Man, and Cybernetics, Part B.
  • "Trending Now" Internship winner (CEO challenge award); project presented to Marissa, 2013.
  • Yahoo! Invention Award. December, 2013.
  • J Gordin Kaplan Graduate Student Award, University of Alberta, 2012.
  • Hong Kong Government Scholarship for graduate students. 2007.
  • Fourth place, simulation league of World RoboCup, Lisbon, 2004.
  • First Class Award, Shandong University of Technology Mathematical Contest in Modeling, 2001 and 2002.
  • Excellent Academic Scholarship (first class), Shandong University of Technology, 2002 and 2003.
Last Updated: March 2014