I am interested in all aspects of artificial intelligence. So far,
most of my research has been on decision making under uncertainty, and I
am currently focusing on machine learning. My approach has been often to
study the complexity of the problem, together with design and analysis of
algorithms. However, I am also very interested in applying my research
and seeing for example that the algorithms/systems that I help design/analyze
and build solve important real world problems. I like to contribute both to the
science/theory and to the practice.
I am working on a number of projects at UA with Russ Greiner and other faculty and students during my postdoc:
The Markov decision process (MDP) framework for decision making, planning, and control is surprisingly rich in capturing the essence of purposeful activity in various situations. These models and associated problems arise in many areas, including medical decision making, maintenance planning, robot navigation, and so on, and have been an area of continuing interest and development in operations research and control engineering, and recently in Artificial Intelligence. MDPs are also a source of clean, challenging and a few long standing computational problems! My Ph.D. thesis (abstract in text format ) focuses on the computational complexity and the design and analysis of algorithms for infinite horizon MDPs and related models and problems, such as shortest paths and generalizations such as (monotone) linear feasibility problems (see MDP problems in context).I was advised by Richard Anderson and Steve Hanks. I continue to work on these problems while at U. of Alberta.
Publications:
oTalk on MDP games (research in progress: reduces mean-payoff game to a 2-player shortest paths problem)
oTalk on Augmented Value iteration and Policy Iteration on deterministic MDPs (research in progress)
oTalk on value iteration on deterministic MDPs
The problem arises when during training finding the value of a feature has a
cost (time or money), and there is a budget. This is a problem for
example in clinical trials (e.g. the polyomix project) in which there is an
imposed research budget, but also applies to other active learning scenarios
with feature costs at learning time. The first challenge was to define
the problem so that there is a satisfactory notion of optimality for the
complete range of budgets. Our answer is to address the problem in a Bayesian
framework. With Russ Greiner and Daniel Lizotte
we have explored the problem first in a simplified multi-armed bandit setting,
and then in the context of learning linear classifiers such as Naive Bayes. For recent papers and results, check out the budgeted
learning page.
Adaptive Image Interpretation
(UA)
With Vadim Bulitko, Russ Greiner and Ilya Levner (student), I am investigating the following: given a library of image processing operators, which sequence of operators to choose and execute, based on the given image, so that accurate interpretation, e.g. accurate detection of objects of interest, occurs fast. The problem involves several aspects of both learning and decision making under uncertainty. My contributions so far have been in defining the learning and control problems precisely, and in proposing algorithms with certain performance guarantees. The system is currently being trained to recognize various tree types in forests in the iNRIIS domain. Our ultimate goal for this system is to be fairly easily portable (trainable with sufficient accuracy and speed) to different image interpretation applications.
Comparing model selection criteria (UA)
Efficient
Evaluation Strategies for Stochastic And-Or trees (UA)
How should one evaluate a boolean expression composed of ANDs and ORs only, in order to minimize the number of queries (or cost)? If each variable appears at most once in the expression, then the expression corresponds to an And-Or tree. If there is a success probability associated with each variable evaluation, we ask for a strategy to minimize the expected cost (assume independence). The deterministic And-Or tree (evaluation) problem is solvable by dynamic programming in poly time, and the And-Or graph problem (variables can repeat) is NP-hard. With Magdalena Jankowska (student), Ryan Hayward and Russ Greiner, I am investigating computational complexity and algorithms for stochastic And-Or trees. Here’s a manuscript containing our recent results (submitted).
Together with Oren Etzioni, Dick Karp, and Oren Zamir (then graduate student) we investigated new text clustering algorithms geared especially towards the web. A novelty of one of the algorithms was in using suffix trees for clustering based on common phrases. Versions of the basic algorithm was used in Husky Search and grouper work. Here is the paper presented at KDD97.
In a previous work, Oren Etzioni, Steve Hanks, Tao Jiang and I investigated a fairly new problem: the Metacrawler's problem! Metacrawler is a parallel search engine which queries other information databases (other search engines) on the web. Right now the services provided by the databases are free, so the metacrawler sends out queries to all of them at once, to save time and to increase the chances of finding what the user is after. In a world where the information sources charge for their service, the Metacrawler may have to trade time against monetary costs against the value of the piece of information being sought against the chances of obtaining the information. We developed a model of the problem, investigated a number of optimization criteria under the model, established NP-hardness, and came up with approximation algorithms under some of those criteria. Professors Karp and Waarts worked on a different optimization criterion and our combined paper appears in Foundations of Computer Science'96. The expaned journal version with more results appears in SICOMP, vol. 29, no. 5, 2000.
My Ph.D. qualification project, was in computational biology advised under Larry Ruzzo, and Dick Karp. I investigated the problem of quickly finding approximately matching clones in restriction mapping experiments, a problem in physical mapping of DNA. I came up with a trie based algorithm to speed up the process and analyzed some of the tradeoffs. I think that the ideas behind the algorithm follow a general scheme for approximate matching in other contexts as well. I presented the work at DIMACS/PENN workshop in Computational Biology, in May 1996.
The latest edition of an introductory AI text book called ``The Elements of Artificial Intelligence: An Introduction Using LISP'' by Steven Tanimoto was published in 1995. Professor Tanimoto and I added to the number of solutions in the teacher's manual accompanying the latest edition.