Theory Revision
Large-scale expert systems have found widespread use.
However, developers have found that the cost of maintaining a knowledge
base, over its lifetime, can be as high as the initial cost of its development.
One response is to use machine learning techniques to correct the knowledge
base as problems emerge. Unfortunately, standard induction methods seem
ill-suited to this task, as they are designed to use training data to construct
a knowledge base from scratch, and the rate at which training data
is generated by field users is typically too low to support regeneration
of the knowledge base each time a revision is needed. Since domain knowledge
is available (in the form of the knowledge base), it makes sense to take
advantage of this information (even if imperfect) to bias the induction
process. Techniques for theory revision are well suited to the
task of knowledge maintenance, since they use training data to improve
an initial domain theory.
This page gives an overview of the theory revision project within
the kme programme at Siemens Corporate Research. The "Implementation"
papers describe the systems we developed for this task. These systems,
in effect, hill-climb to a theory whose accuracy is a local optimum. The
"Theory" papers explain why we seek local optima, by proving that no efficient
system can always find the global optimum.
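To make this concrete, here is a minimal Python sketch of the generic hill-climbing loop these systems share; the names (hill_climb, accuracy, answer), and the purely abstract treatment of a "theory", are illustrative assumptions rather than the interfaces of the actual systems described below.

# Illustrative sketch (not any of the actual systems): generic hill-climbing
# theory revision.  A theory is treated abstractly; `transformations` is a
# list of functions mapping a theory to a neighboring theory (e.g. add or
# delete a rule or antecedent), `labeled_queries` is a list of
# (query, correct_answer) pairs, and answer(theory, query) runs the theory.

def accuracy(theory, labeled_queries, answer):
    """Fraction of the labeled queries that the theory answers correctly."""
    correct = sum(answer(theory, q) == label for q, label in labeled_queries)
    return correct / len(labeled_queries)

def hill_climb(initial_theory, transformations, labeled_queries, answer):
    """Move to the best-scoring neighbor until no neighbor improves on the
    current theory, i.e. stop at a local (not global) optimum."""
    current = initial_theory
    current_score = accuracy(current, labeled_queries, answer)
    while True:
        neighbors = [t(current) for t in transformations]
        if not neighbors:
            return current
        best = max(neighbors, key=lambda kb: accuracy(kb, labeled_queries, answer))
        best_score = accuracy(best, labeled_queries, answer)
        if best_score <= current_score:
            return current              # local optimum reached
        current, current_score = best, best_score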
Theoretic Analysis (Sample/Computational Complexity)
The Complexity of Theory Revision ("pure" theory: adding/deleting antecedents/rules)
The Complexity of Revising Logic Programs ("impure" theory: reordering antecedents/rules)
Fault Hierarchies
Pat Langley, George Drastal, R. Bharat Rao and Russell Greiner
The Fifth International Workshop on Principles of Diagnosis
(Dx94), New York, October 1994.
The fault hierarchy representation is widely used in expert systems for
the diagnosis of complex mechanical devices. On the assumption that an
appropriate bias for a knowledge representation language is also an appropriate
bias for learning in this domain, we have developed a theory revision method
that operates directly on a fault hierarchy. This domain presents several
challenges: A typical training instance is missing most feature values,
and the pattern of missing features is significant, rather than merely
an effect of noise. Moreover, features that are present may be corrupted.
Finally, the accuracy of a candidate theory is measured by considering
both the sequence of tests required to arrive at a diagnosis, and its agreement
with the diagnostic endpoints provided by an expert. We describe the DELTA
algorithm for theory revision of fault hierarchies and its application in
knowledge base maintenance, and report on experiments with a diagnostic
system in current use.
(updated version)
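For readers unfamiliar with the representation, the Python sketch below shows one simple way a fault hierarchy can drive a diagnosis, recording both the sequence of tests performed and the diagnostic endpoint reached; the node structure, the treatment of missing values and the traversal policy are assumptions made for illustration, not DELTA's actual data structures.

# Illustrative fault-hierarchy traversal (not DELTA's actual representation).
# Each node is guarded by a test (a feature name); its children are tried in
# order, and the node's own diagnosis is reported if the test succeeds but no
# child fires.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class FaultNode:
    test: str                        # feature examined at this node
    diagnosis: str                   # endpoint reported if no child fires
    children: list = field(default_factory=list)

def diagnose(node: FaultNode, instance: dict, tests_run: list) -> Optional[str]:
    """Return a diagnosis, recording the sequence of tests performed.
    A missing feature value (absent key) is treated here as a failed test;
    in real training instances most features are missing, and the pattern
    of what is missing itself carries information."""
    tests_run.append(node.test)
    if not instance.get(node.test, False):
        return None
    for child in node.children:
        result = diagnose(child, instance, tests_run)
        if result is not None:
            return result
    return node.diagnosis

# Example: a two-level hierarchy for a hypothetical device.
root = FaultNode("power_fault", "generic power fault", [
    FaultNode("fuse_blown", "replace fuse"),
    FaultNode("battery_low", "recharge battery"),
])
tests = []
print(diagnose(root, {"power_fault": True, "battery_low": True}, tests), tests)
# -> recharge battery ['power_fault', 'fuse_blown', 'battery_low']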
Optimized
Russell Greiner, R. Bharat Rao and Glenn Meredith
In progress
Essentially all theory revision systems use a set of theory-to-theory transformations
{ t_k } to hill-climb from a given initial theory to a new theory whose
empirical accuracy, over a given set of labeled training instances { q_j
}, is a local optimum. At the heart of each such process is an "evaluator",
which compares the accuracy of the current theory KB with that of each
of its "neighbors" N(KB) = { t_k(KB) }, each formed by applying one transformation
to the KB, with the goal of determining which neighbor has the highest
score. One obvious way to implement such an evaluator involves simply running
each individual neighbor theory KB_k = t_k(KB) on each instance q_j; this
corresponds to the "wrapper" model [JKP'93]. In practice, however, this
approach can be prohibitively slow. We therefore built an alternative system
DD, which employs a smarter evaluator that can quickly compute the score
of a transformed theory t_k(KB), relative to the current theory KB, by
"looking inside" KB and reasoning about the effects of the t_k transformation.
This paper presents data showing that DD runs around 35 times faster than
the naive wrapper-based system DELTA, while attaining the same accuracy. We also discuss
DD's source of power, and generalize from this specific implementation
to suggest other situations to which these ideas may be applicable.
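The contrast between the two evaluator styles can be sketched as follows; the helper names (answer, affected_queries) are hypothetical, and DD's actual analysis of a transformation's effects is considerably more involved than this outline.

# Sketch of the two evaluator styles discussed above (hypothetical helpers).

def wrapper_score(kb, transform, labeled_queries, answer):
    """'Wrapper' evaluator: build the neighbor theory and rerun it on every
    labeled training instance."""
    new_kb = transform(kb)
    return sum(answer(new_kb, q) == label for q, label in labeled_queries)

def incremental_score(kb, base_score, transform, labeled_queries, answer,
                      affected_queries):
    """Delta-style evaluator: re-answer only the queries the transformation
    can possibly affect, and adjust the cached score of the current theory
    by the difference."""
    new_kb = transform(kb)
    delta = 0
    for q, label in affected_queries(kb, transform, labeled_queries):
        delta += (answer(new_kb, q) == label) - (answer(kb, q) == label)
    return base_score + delta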
Complex - Pure
Russell Greiner
Artificial Intelligence Journal,
107:2 (February 1999), 175-217.
A knowledge-based expert system uses its database (a.k.a. its "theory")
to produce answers to the queries it receives. Unfortunately, these systems
will produce inaccurate answers if their theories include erroneous information.
Standard "theory revision" systems use a given set of "labeled queries"
(each a query paired with its correct answer) to incrementally transform
the given theory, by adding and/or deleting rules and/or antecedents,
into a related one that is as accurate as possible. This paper uses both
sample and computational complexity arguments to analyze this process.
After formally defining the theory revision task, we use the observation
that we have access to only a limited number of labeled queries to constrain
the space of new theories that can be considered; in particular, we describe
when a polynomial number of such queries will be sufficient to obtain the
information required to identify a theory whose accuracy is close to optimal
with high probability. We then consider the computational complexity of
finding the best theory within this space, and prove that, unless P=NP,
no polynomial time algorithm can identify this near-optimal theory, even
given the exact distribution of queries. We also present conditions under
which a polynomial-time algorithm can produce a theory whose (in)accuracy
is even close (i.e., within a particular polynomial factor) to optimal. These
results justify the standard practice of using these types of transformations
to hill-climb to a locally-optimal theory, based on a given set of labeled
samples.
Expanded from The Complexity of Theory Revision, Proceedings of the
Fourteenth International Joint Conference on Artificial Intelligence
(IJCAI95), Montreal, August 1995.
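As a concrete, if much simplified, picture of the transformation space analyzed here, the sketch below enumerates the neighbors of a small propositional rule base under the delete-rule, delete-antecedent and add-antecedent transformations; the head/antecedent representation and the fixed vocabulary are assumptions chosen for illustration (adding whole new rules is omitted for brevity).

# Enumerate the "neighborhood" of a propositional theory under three of the
# revision transformations: delete a rule, delete an antecedent, or add an
# antecedent drawn from a fixed vocabulary.

def neighbors(theory, vocabulary):
    """theory: list of rules, each a (head, frozenset_of_antecedents) pair."""
    result = []
    for i, (head, ants) in enumerate(theory):
        # Delete rule i entirely.
        result.append(theory[:i] + theory[i+1:])
        # Delete one antecedent of rule i.
        for a in ants:
            result.append(theory[:i] + [(head, ants - {a})] + theory[i+1:])
        # Add one new antecedent to rule i.
        for a in vocabulary - ants - {head}:
            result.append(theory[:i] + [(head, ants | {a})] + theory[i+1:])
    return result

kb = [("alarm", frozenset({"smoke"})),
      ("evacuate", frozenset({"alarm", "workday"}))]
print(len(neighbors(kb, {"smoke", "heat", "alarm", "workday", "evacuate"})))
# -> 10 candidate neighbor theories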
Complex - Impure
Russell Greiner
Journal of Logic Programming, 40:2-3
(Aug-Sept 1999), pp. 273-298.
A rule-based program will return a set of answers to each query. An impure
program, which includes the Prolog cut "!" and not(.) operators, can return
different answers if its rules are re-ordered. There are also many reasoning
systems that return only the first answer found for each query;
these first answers, too, depend on the rule order, even in pure rule-based
systems. A theory revision algorithm, seeking a revised rule-base whose
expected accuracy, over the distribution of queries, is optimal,
should therefore consider modifying the order of the rules. This paper
first shows that a polynomial number of training "labeled queries" (each
a query paired with its correct answer) provides the distribution information
necessary to identify the optimal ordering. It then proves, however, that
the task of determining which ordering is optimal, once given this distributional
information, is intractable even in trivial situations; e.g., even if each
query is an atomic literal, we are seeking only a "perfect" theory, and
the rule base is propositional. We also prove that this task is not even
approximable: Unless P=NP, no polynomial time algorithm can produce an
ordering of an n-rule theory whose accuracy is within n^γ of optimal,
for some γ > 0. We next prove similar hardness and non-approximability
results for the related tasks of determining, in these impure contexts,
(1) the optimal ordering of the antecedents; (2) the optimal set
of new rules to add; and (3) the optimal set of existing rules
to delete.
Expanded from The Challenge of Revising Impure Theories, Proceedings of
the Twelfth International Conference on Machine Learning (ML95), Lake
Tahoe, July 1995.
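The order sensitivity under "first answer" semantics can be seen with a toy propositional interpreter; the rule encoding and the negation-as-failure handling below are simplifications for illustration, not the formal model used in the paper.

# Toy illustration of why rule order matters under "first answer" semantics.
# Each rule is (conclusion, body); a body literal prefixed with "not "
# succeeds when the corresponding fact is absent, giving the
# negation-as-failure behavior that makes a program "impure".

def first_answer(rules, facts):
    """Return the conclusion of the first rule (in order) whose body holds."""
    for conclusion, body in rules:
        holds = all(
            (lit[4:] not in facts) if lit.startswith("not ") else (lit in facts)
            for lit in body)
        if holds:
            return conclusion
    return None

rules = [("replace_fuse", ["power_fault", "not battery_ok"]),
         ("check_wiring", ["power_fault"])]
facts = {"power_fault"}

print(first_answer(rules, facts))                  # -> replace_fuse
print(first_answer(list(reversed(rules)), facts))  # -> check_wiring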