# The Complexity of Theory Revision

### Russell Greiner

A knowledge-based expert system uses its database (a.k.a. its "theory")
to produce answers to the queries it receives. Unfortunately, these systems
will produce inaccurate answers if their theories include erroneous
information. Standard "theory revision" systems use a given set of "labeled
queries" (each a query paired with its correct answer) to incrementally
transform the given theory, by adding and/or deleting rules and/or
antecedents, into a related one that is as accurate as possible. This paper
uses both sample and computational complexity arguments to analyze this process.
After formally defining the theory revision task, we use the observation that
we have access to only a limited number of labeled queries to constrain the
space of new theories that can be considered; in particular, we describe when
a polynomial number of such queries will be sufficient to obtain the
information required to identify a theory whose accuracy is close to optimal
with high probability. We then consider the computational complexity of
finding the best theory within this space, and prove that, unless P=NP, no
polynomial time algorithm can identify this near-optimal theory, even given
the exact distribution of queries. We also present conditions under which no
polynomial-time algorithm can produce a theory whose inaccuracy is even
close (i.e., within a particular polynomial factor) to optimal. These results
justify the standard practice of using these types of transformations to
hill-climb to a locally-optimal theory, based on a given set of labeled
samples.
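
To make the revision process concrete, the following is a minimal, hypothetical sketch (not the paper's algorithm) of hill-climbing theory revision over a propositional Horn-clause theory. It assumes each rule is a `(head, antecedents)` pair, a query is answered "yes" iff its atom is derivable by forward chaining from the query's given facts, and, for brevity, considers only deletion transformations (the paper's framework also allows adding rules and antecedents).

```python
# Hypothetical sketch: greedy hill-climbing theory revision.
# A theory is a list of (head, frozenset-of-antecedents) Horn rules;
# a labeled query is (facts, query_atom, correct_boolean_answer).

def derive(theory, facts):
    """Forward-chain: return all atoms entailed by `theory` plus `facts`."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for head, body in theory:
            if head not in known and all(a in known for a in body):
                known.add(head)
                changed = True
    return known

def accuracy(theory, labeled_queries):
    """Fraction of labeled queries the theory answers correctly."""
    correct = sum((atom in derive(theory, facts)) == label
                  for facts, atom, label in labeled_queries)
    return correct / len(labeled_queries)

def neighbors(theory):
    """Theories one deletion transformation away from `theory`."""
    for i, (head, body) in enumerate(theory):
        yield theory[:i] + theory[i + 1:]          # delete rule i
        for a in body:                             # delete one antecedent
            yield theory[:i] + [(head, frozenset(body - {a}))] + theory[i + 1:]

def revise(theory, labeled_queries):
    """Hill-climb to a locally optimal theory on the labeled queries."""
    best, best_acc = theory, accuracy(theory, labeled_queries)
    improved = True
    while improved:
        improved = False
        for cand in neighbors(best):
            acc = accuracy(cand, labeled_queries)
            if acc > best_acc:
                best, best_acc, improved = cand, acc, True
                break
    return best, best_acc
```

For instance, given the single rule `q :- p, r` and labeled queries saying `q` should follow from `{p}` alone but not from `{s}`, the climber deletes the spurious antecedent `r`, reaching a theory that answers both queries correctly. The sample-complexity results above say roughly when such a labeled-query set suffices; the hardness results explain why a greedy local search like this, rather than a global optimizer, is the standard practice.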