# The Challenge of Revising an Impure Theory

### Russell Greiner

A *pure* rule-based program will return a set of answers to each query, and
will return the same answer set even if the rules are re-ordered. To work
effectively, however, many real-world programs are "impure", i.e., they include
the Prolog cut `!` and `not(.)` operators. The answers returned by such programs
*do* depend on the rule ordering. There are also many reasoning
systems that
return only the *first* answer found for each query; these first answers, too,
depend on the rule order, even in *pure* rule-based systems. A theory
revision algorithm, seeking the revised rule-based system that works best in
such impure contexts, should therefore consider modifying the order of the
rules, in addition to adding and deleting rules and antecedents. Such
revision algorithms apply a sequence of modifications, seeking a rule-base
whose expected accuracy, over the distribution of queries, is optimal. This
paper first shows that a polynomial number of training labeled queries (each a
query coupled with its correct answer) is sufficient to provide the
distribution information necessary to identify the optimal ordering. It then
proves, however, that the task of determining which ordering is optimal, once
given this information, is intractable even in trivial situations; e.g., even if
each query is an atomic literal, we are seeking only a "perfect" theory, and
the knowledge base is propositional. We also prove that this task is not even
approximable: Unless P=NP, no polynomial-time algorithm can produce an
ordering whose accuracy is within a factor of n^{\epsilon} of optimal, for some
\epsilon > 0, where n is the number of rules. We also prove similar
hardness and non-approximability results for the related tasks of
determining, in these impure contexts, (1) the optimal ordering of the
antecedents; (2) the optimal set of rules to add or (3) to delete; and (4) the
optimal priority values for a set of defaults.
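The abstract's central observation — that a "first answer" reasoner makes the returned answer depend on rule order, even for pure rule bases — can be illustrated with a minimal sketch. The rule representation, the `first_answer` function, and the bird/penguin example below are all hypothetical illustrations, not from the paper:

```python
# Hypothetical sketch: a propositional rule base evaluated top-down,
# where the reasoner commits to the FIRST applicable rule for a query.
# Re-ordering the rules changes the returned answer.

def first_answer(rules, facts, query):
    """Return the conclusion of the first rule matching `query` whose
    antecedents are all satisfied by `facts`; None if no rule fires."""
    for head, antecedents, conclusion in rules:
        if head == query and all(a in facts for a in antecedents):
            return conclusion
    return None

facts = {"bird", "penguin"}

# Two orderings of the same pair of rules about whether something flies.
rules_a = [
    ("flies?", ["penguin"], False),  # exception rule listed first
    ("flies?", ["bird"], True),      # default rule listed second
]
rules_b = list(reversed(rules_a))    # default rule listed first

print(first_answer(rules_a, facts, "flies?"))  # False
print(first_answer(rules_b, facts, "flies?"))  # True
```

Both rule bases contain identical rules, yet the first-answer semantics yields opposite conclusions, which is why a revision algorithm operating in such contexts must treat rule order itself as something to optimize.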