**Learning Effective Performance Systems**
Many learning systems search through a space of possible
performance elements, seeking an element whose expected utility, over the
distribution of problems, is high. As the task of finding the globally
optimal element is often intractable, many practical learning systems instead
hill-climb to a local optimum. Even this is problematic as the learner
typically does not know the underlying distribution of problems, which
is essential for computing the utility of an element, and hence for determining
which element is optimal. This research addresses the task of approximating
this hill-climbing search when the utility function can only be estimated
by sampling. The paper presents a general
algorithm, PALO, that returns an element that is, with provably high probability,
essentially a local optimum; see picture.
That material also summarize how PALO has been to applied to various tasks.
More detailed applications include

The following papers contain related work

*PALO Overview*
#
A Probabilistic Hill-Climbing Algorithm

###
Russell Greiner

###
*Artificial Intelligence*, 84:1--2 (July 1996), p. 177--204.

Many learning systems search through a space of possible performance elements,
seeking an element whose expected utility, over the distribution of problems,
is high. As the task of finding the globally optimal element is usually
intractable, many practical learning systems use hill-climbing to find
a local optimum. Unfortunately, even this is problematic as the underlying
distribution of problems, required to determine an element's expected utility
and hence essential for comparing different elements, is typically unknown.
This paper addresses the task of approximating this hill-climbing search
when the utility function can only be estimated by sampling. We present
an algorithm that returns an element that is, with provably high probability,
essentially a local optimum. We then demonstrate the generality of this
algorithm by sketching three meaningful applications, that respectively
find an element whose efficiency, accuracy or completeness is nearly optimal.
These results suggest approaches to solving the utility problem from explanation-based
learning, the multiple extension problem from nonmonotonic reasoning and
the tractability/completeness tradeoff problem from knowledge representation.
*Expanded from*

*Efficient Strategy*
###
Russell Greiner and Igor Jurisica

###
*Proceedings of the Annual National Conference on Artificial
Intelligence (AAAI-92)*, San Jose, July 1992.

Many "learning from experience" systems use information extracted from
problem solving experiences to modify a performance element PE, forming
a new one PE' that can solve these and similar problems more efficiently.
However, as transformations that improve performance on one set of problems
can degrade performance on other sets, the new PE' is not always better
than the original PE; this depends on the distribution of problems. We
therefore seek the the performance element whose expected performance,
over this distribution, is optimal. Unfortunately, the actual distribution,
which is needed to determine which element is optimal, is usually not known.
Moreover, the task of finding the optimal element, even knowing the distribution,
is intractable for most interesting spaces of elements. This paper presents
a method, PALO, that side-steps these problems by using a set of samples
to estimate the unknown distribution, and by using a set of transformations
to hill-climb to a local optimum: Given any parameters $\epsilon, \delta
> 0$, PALO produces an element PE' whose expected performance is, with
probability $\geq 1-\delta$, within $\epsilon$ of a local optimum. This
process is based on a mathematically rigorous form of utility analysis:
in particular, it uses statistical techniques to determine whether the
result of a proposed transformation will be better than the original system.
We also present an efficient way of implementing this learning system in
the context of a general class of performance elements; and include empirical
evidence that this approach can work effectively.
*Short version:*

*the proof that computing the optimal strategy
is intractable appears in*
###
Russell Greiner

###
*Artificial Intelligence*, 50:1 (1991) 95-115.

A backward chaining process uses a collection of rules to reduce a given
goal to a sequence of data-base retrievals. A "derivation strategy" is
an ordering on these steps, specifying when to use each rule and when to
perform each retrieval. Given the costs of reductions and retrievals, and
the *a priori* likelihood that each particular retrieval will succeed,
one can compute the expected cost of any strategy, for answering a specific
query from a given knowledge base. Smith \cite{ControlBC} presents an algorithm
that finds the minimal cost strategy in time (essentially) linear in the
number of rules, for any disjunctive, irredundant knowledge base. This
paper proves that the addition of redundancies renders this task NP-hard.
Many Explanation-Based Learning systems work by adding in redundancies;
this shows the complexities inherent in their task.

*for a ***database** slant to these results:
###
Russell Greiner

###
*Proceedings of the Eleventh Symposium on Principles of
Database Systems (PODS-92)*, San Diego, June 1992.

A query processor QP uses the rules in a rule base to reduce a given query
to a series of attempted retrievals from a database of facts. The QP's
"expected cost" is the average time it requires to find an answer, averaged
over its anticipated set of queries. This cost depends on the QP's "strategy",
which specifies the order in which it considers the possible rules and
retrievals. This paper provides two related learning algorithms, PIB and
PAO, for improving the QP's strategy, ie, for producing new strategies
with lower expected costs. Each algorithm first monitors the QP's operations
over a set of queries, observing how often each path of rules leads to
a sufficient set of successful retrievals, and then uses these statistics
to suggest a new strategy. PIB hill-climbs to strategies that are, with
high probability, successively better; and PAO produces a new strategy
that probably is approximately optimal. We describe how to implement both
learning systems unobtrusively, discuss their inherent time and space complexities,
and use methods from mathematical statistics to prove their correctness.
We also discuss additional applications of these approaches to several
other database tasks.

*Accurate Theory*
###
*ECAI Workshop on Theoretical Foundations of Knowledge
Representation and Reasoning*, Springer Verlag (LNAI series), 1993.

A default theory can sanction different, mutually incompatible, answers
to certain queries. We can identify each such theory with a set of related
credulous theories, each of which produces but a single response to each
query, by imposing a total ordering on the defaults. Our goal is to identify
the credulous theory with optimal "expected accuracy" averaged over the
natural distribution of queries in the domain. There are two obvious complications:
First, the expected accuracy of a theory depends on the query distribution,
which is usually not known. Second, the task of identifying the optimal
theory, even given that distribution information, is intractable. This
paper presents a method, OptAcc, that side-steps these problems by using
a set of samples to estimate the unknown distribution, and by hill-climbing
to a local optimum. In particular, given any error and confidence parameters
\epsilon, \delta>0, OptAcc produces a theory whose expected accuracy is,
with probability at least 1-\delta, within \epsilon of a local optimum.

*Categorical Horn Approximation*
Proceedings of the Third International Conference on Knowledge Representation
and Reasoning (KR92), p. 383--392, Boston, October 1992.
While the task of answering queries from an arbitrary propositional
theory is intractable in general, it can typically be performed efficiently
if the theory is Horn. This suggests that it may be more efficient to answer
queries using a "Horn approximation"; ie, a horn theory that is semantically
similar to the original theory. The utility of any such approximation depends
on how often it produces answers to the queries that the system actually
encounters; we therefore seek an approximation whose expected "coverage"
is maximal. Unfortunately, there are several obstacles to achieving this
goal in practice: (i) The optimal approximation depends on the query distribution,
which is typically not known a priori; (ii) identifying the optimal approximation
is intractable, even given the query distribution; and (iii) the optimal
approximation might be too large to guarantee tractable inference. This
paper presents an approach that overcomes (or side-steps) each of these
obstacles. We define a learning process, AdComp, that uses observed queries
to estimate the query distribution "online", and then uses these estimates
to hill-climb, efficiently, in the space of size-bounded Horn approximations,
until reaching one that is, with provably high probability, effectively
at a local optimum.

*Effective Landmark Set*
###
Russell Greiner and Ramana Isukapalli

###
*IEEE Transactions on Systems, Man and Cybernetics, Part
B*, June 1996, p 437-449.

To navigate effectively, an autonomous agent must be able to quickly and
accurately determine its current location. Given an initial estimate of
its position (perhaps based on dead-reckoning) and an image taken of a
known environment, our agent first attempts to locate a set of landmarks
(real-world objects at known locations), then uses their angular separation
to obtain an improved estimate of its current position. Unfortunately,
some landmarks may not be visible, or worse, may be confused with other
landmarks, resulting in both time wasted in searching for invisible landmarks,
and in further errors in the agent's estimate of its position. To address
these problems, we propose a method that uses previous experiences to learn
a selection function that, given the set of landmarks that might be visible,
returns the subset which can probably be found correctly, and so provide
an accurate registration of the agent's position. We use statistical techniques
to prove that the learned selection function is, with high probability,
effectively at a local optimal in the space of such functions. This report
also presents empirical evidence, using real-world data, that demonstrate
the effectiveness of our approach.
*Expanded from ...*

*Measuring the Effectiveness
of a Representation*
###
*Proceedings of the Twelfth International Joint Conference
on Artificial Intelligence (IJCAI91)*, August 1991.

This report discusses what it means to claim that a representation is an
effective encoding of knowledge. We first present dimensions of merit for
evaluating representations, based on the view that usefulness is an external
property, and is necessarily relative to a specified task. We then provide
methods (based on results of mathematical statistics) for reliably measuring
effectiveness empirically, and hence for comparing different representations.
We also discuss weak but guaranteed methods of improving inadequate representations.
Our results are an application of the ideas of formal learning theory to
concrete knowledge representation formalisms.

*Probably Approximately Optimal*
###
*Artificial Intelligence*, 82:1-2,
April 1996 .

A "satisficing search problem" consists of a set of probabilistic experiments
to be performed in some order, seeking a satisfying configuration of successes
and failures. The expected cost of performing the experiments depends both
on the success probability of the individual experiments, and on the "search
strategy", which specifies the order in which the experiments are to be
performed. A strategy is optimal if its expected cost is minimal. Earlier
work has provided "optimizing functions" that find optimal strategies for
certain classes of search problems, when given the probabilities. This
paper extends those results by addressing the complexities of identifying
an approximately optimal strategy when the probability values are not known,
within a fully general model. We present an algorithm PAO that first obtains
estimates of the probabilities by observing a computed number of trials
of each undetermined experiment, and then uses these estimates (and the
proper optimization function) to identify a strategy whose cost is, with
high probability, close to optimal. We also show that PAO can gather the
necessary statistics while solving relevant problems.
*Expanded from ...*

*Return to Greiner's
home page*