Yasin Abbasi-Yadkori

Science and Engineering Faculty
Queensland University of Technology
Brisbane, Australia
Gmail: "yasin dot abbasi"

I have moved to Queensland University of Technology, but I will maintain this page for a while.

I am a postdoctoral fellow at Queensland University of Technology, working with Dr. Peter Bartlett. I completed my PhD at the University of Alberta under the supervision of Dr. Csaba Szepesvari.

Research interests: Markov decision processes, bandit problems

Peer-Reviewed Publications

Y. Abbasi-Yadkori, P. Bartlett, and A. Malek, Linear Programming for Large-Scale Markov Decision Problems, arXiv:1402.6763 [math.OC], 2014. pdf

Y. Abbasi-Yadkori, P. Bartlett, and V. Kanade, Tracking Adversarial Targets, to appear in ICML 2014. pdf

Y. Seldin, P. Bartlett, K. Crammer, and Y. Abbasi-Yadkori, Prediction with Limited Advice and Multiarmed Bandits with Paid Observations, to appear in ICML 2014.

Y. Abbasi-Yadkori, P. Bartlett, V. Kanade, Y. Seldin, and Cs. Szepesvari, Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions, Neural Information Processing Systems (NIPS), 2013. pdf

    Preliminary Version: Y. Abbasi-Yadkori, P. Bartlett, and Cs. Szepesvari, Online Learning in Markov Decision Processes with Adversarially Chosen Transition Probability Distributions, arXiv:1303.3055 [cs.LG], 2013. pdf

Y. Abbasi-Yadkori, D. Pal, and Cs. Szepesvari, Online-to-Confidence-Set Conversions and Application to Sparse Stochastic Bandits, Artificial Intelligence and Statistics (AISTATS), 2012. pdf

Y. Abbasi-Yadkori, D. Pal, and Cs. Szepesvari, Improved Algorithms for Linear Stochastic Bandits, Neural Information Processing Systems (NIPS), 2011. pdf

    Preliminary Version: Y. Abbasi-Yadkori, D. Pal, and Cs. Szepesvari, Online Least Squares Estimation with Self-Normalized Processes: An Application to Bandit Problems, arXiv:1102.2670 [cs.AI], 2011. pdf

Y. Abbasi-Yadkori and Cs. Szepesvari, Regret Bounds for the Adaptive Control of Linear Quadratic Systems, Conference on Learning Theory (COLT), 2011. pdf

K. Hajebi, Y. Abbasi-Yadkori, H. Shahbazi, and H. Zhang, Fast Approximate Nearest-Neighbor Search with k-Nearest Neighbor Graph, International Joint Conference on Artificial Intelligence (IJCAI), 2011. pdf

Y. Abbasi-Yadkori, J. Modayil, and Cs. Szepesvari, Extending Rapidly-Exploring Random Trees for Asymptotically Optimal Anytime Motion Planning, International Conference on Intelligent Robots and Systems (IROS), 2010. pdf

P. Hooper, Y. Abbasi-Yadkori, R. Greiner, and B. Hoehn, Improved Mean and Variance Approximations for Belief Net Responses via Network Doubling, Conference on Uncertainty in Artificial Intelligence (UAI), 2009. pdf

B. Poczos, Y. Abbasi-Yadkori, Cs. Szepesvari, R. Greiner, and N. Sturtevant, Learning when to stop thinking and do something!, International Conference on Machine Learning (ICML), 2009. pdf

M. Ravanbakhsh, Y. Abbasi-Yadkori, M. Abbaspour, and H. Sarbazi-Azad, A Heuristic Routing Mechanism Using a New Addressing Scheme, International Conference on Bio-Inspired Models of Network, Information and Computing Systems, 2006. pdf

Workshop Papers

Y. Seldin, Cs. Szepesvari, P. Auer, and Y. Abbasi-Yadkori, Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments, European Workshop on Reinforcement Learning (EWRL), 2013. pdf

Y. Abbasi-Yadkori, A. Antos, and Cs. Szepesvari, Forced-Exploration Based Algorithms for Playing in Stochastic Linear Bandits, COLT Workshop on On-line Learning with Limited Feedback, 2009. pdf

Theses

Ph.D. thesis: Online Learning for Linearly Parametrized Control Problems. Department of Computing Science, University of Alberta, September 2012. pdf

M.Sc. thesis: Forced-Exploration Based Algorithms for Playing in Bandits with Large Action Sets. Department of Computing Science, University of Alberta, Spring 2009. pdf