Title: Resolution
1. Resolution. How can one automate the deductive
reasoning procedure that we informally discussed
in the previous lecture?
2. At the knowledge level, an ideal deductive procedure is as follows. Given a knowledge base KB and a sentence α, we need a procedure that can determine whether or not KB ⊨ α. Also, if β[x1, ..., xn] is a formula with free variables among the xi, we would want a procedure that would find terms ti (if they exist) such that KB ⊨ β[t1, ..., tn].
No computational procedure can fully satisfy this specification. What we are after is a deductive reasoning procedure that operates in as sound and complete a manner as possible, in a language as close as possible to full FOL.
3. There are several ways of formulating the deductive reasoning task. For a KB that is a finite set of sentences {α1, ..., αn}:
KB ⊨ α
iff ⊨ [(α1 ∧ ... ∧ αn) ⊃ α]
iff KB ∪ {¬α} is not satisfiable
iff KB ∪ {¬α} ⊨ ¬TRUE, where TRUE is any valid sentence, e.g., ∀x(x = x).
So, we can find entailments of a finite KB if we have a procedure for testing the validity of individual sentences, or for determining the satisfiability of sentences, or for determining the entailment of ¬TRUE.
4. The resolution procedure determines whether
certain sets of formulas are satisfiable.
First, we discuss propositional
resolution. Then we generalize the description to
deal with variables and quantifiers.
5. Resolution at the propositional level. The procedure works on logical formulas in a special form. Every formula α of propositional logic can be converted to another formula α′ such that ⊨ (α ≡ α′) and where α′ is a conjunction of disjunctions of literals (that is, atoms or their negations). Terminologically, α and α′ are logically equivalent, and α′ is in conjunctive normal form (CNF). In the propositional case, CNF formulas look as follows:
(p ∨ ¬q) ∧ (p ∨ r ∨ ¬s) ∧ (¬p ∨ r)
6. Conversion to CNF
- Eliminate ⊃ and ≡ (they are shorthand for formulas with only ∧, ∨ and ¬)
- Make sure that ¬ appears only in front of an atom: ¬¬α ≡ α; ¬(α ∧ β) ≡ (¬α ∨ ¬β); ¬(α ∨ β) ≡ (¬α ∧ ¬β).
- Distribute ∨ over ∧: (α ∨ (β ∧ γ)) ≡ ((β ∧ γ) ∨ α) ≡ ((α ∨ β) ∧ (α ∨ γ)).
- Collect terms: (α ∨ α) ≡ α; (α ∧ α) ≡ α.
- The resulting formula can be exponentially larger than the original.
7. A shorthand representation of CNF: a clausal formula
- A clausal formula is a finite set of clauses, where a clause is a finite set of literals. The clausal formula is understood as a conjunction of its clauses, while each clause is understood as a disjunction of its literals (and a literal, as always, is an atom or its negation).
- Some notational conventions:
- If p is a literal, then p̄ is its complement: the complement of an atom p is ¬p, and the complement of ¬p is p.
- To stress the difference between clausal formulas and clauses, square brackets will sometimes be used to delimit the latter.
- A clause with a single literal is called a unit clause.
- Examples, p. 51
8. The overall task, recapitulated
- To determine whether or not KB ⊨ α, it will be sufficient to:
- Put the sentences in the KB and ¬α in CNF, and
- Determine whether or not the resulting set of clauses is satisfiable.
9. A statement about what formulas can be inferred from other formulas is called a rule of inference. We will use one rule, binary resolution: Given a clause of the form c1 ∪ {p} for some literal p and a clause of the form c2 ∪ {p̄} containing the complement of p, infer the clause c1 ∪ c2 consisting of the literals in the first clause other than p and those in the second clause other than p̄. Either c1 or c2 or both can be empty. We say that c1 ∪ c2 is a resolvent of the two input clauses with respect to p.
10. Examples.
[w, p, q] and [s, w, ¬p] have the resolvent [w, s, q] wrt p.
[p, q] and [¬p, ¬q] have the resolvent [q, ¬q] wrt p and [p, ¬p] wrt q.
Note that [] is not a resolvent for these two clauses. The empty clause is the resolvent only of two complementary unit clauses, e.g., [p] and [¬p].
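The rule and the examples above can be checked with a short Python sketch. The representation is an assumption made for illustration: a clause is a frozenset of strings, and a negated atom carries a "~" prefix.

```python
def complement(lit):
    """Complement of a literal: p <-> ~p (the "~" prefix is our own convention)."""
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolvents(c1, c2):
    """All binary resolvents of clauses c1 and c2 (frozensets of literals)."""
    return {(c1 - {p}) | (c2 - {complement(p)})
            for p in c1 if complement(p) in c2}

# The examples from the slide:
r1 = resolvents(frozenset({"w", "p", "q"}), frozenset({"s", "w", "~p"}))
r2 = resolvents(frozenset({"p", "q"}), frozenset({"~p", "~q"}))
r3 = resolvents(frozenset({"p"}), frozenset({"~p"}))
```

Here `r1` contains the single clause [w, s, q], `r2` contains [q, ~q] and [p, ~p] but not the empty clause, and `r3` contains only the empty clause.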
11. A resolution derivation of a clause c from a set of clauses S is a sequence of clauses c1, ..., cn where the last clause is c and where each ci is either an element of S or a resolvent of two earlier clauses in the derivation. If there is a derivation of c from S, it is denoted S ⊢ c.
Resolution derivation is a symbol-level operation on finite sets of literals. But it has a direct connection to knowledge-level work.
12. A resolvent is always entailed by the input clauses:
- {c1 ∪ {p}, c2 ∪ {¬p}} ⊨ c1 ∪ c2.
- Indeed, let ℑ be any interpretation, and suppose that ℑ ⊨ c1 ∪ {p} and ℑ ⊨ c2 ∪ {¬p}.
- If ℑ ⊨ p then ℑ ⊭ ¬p. But since ℑ ⊨ c2 ∪ {¬p}, ℑ must satisfy c2 and, therefore, c1 ∪ c2.
- Similarly, if ℑ ⊭ p, then, since ℑ ⊨ c1 ∪ {p}, it must be the case that ℑ ⊨ c1, and, therefore, again, ℑ ⊨ c1 ∪ c2.
13. It can be shown that any clause derivable by resolution from S is entailed by S, that is, if S ⊢ c, then S ⊨ c. This is so because each ci in the derivation is either a member of the set S or, by definition of derivation, a resolvent of two earlier clauses, and so is entailed by those clauses.
14. However, it is not the case that if S ⊨ c, then S ⊢ c. For example, let S consist of the single clause [¬p] and let c be [¬q, q]. S entails c (because c is always true) even though S has no resolvents. This means that as a form of reasoning, resolution is sound but not complete.
But resolution is both sound and complete when c is the empty clause: S ⊢ [] if and only if S ⊨ []. That is, S is unsatisfiable if and only if S ⊢ []. So, in order to determine the satisfiability of any set of clauses, we only need to search for a derivation of the empty clause!
15. An entailment procedure is a symbol-level procedure for determining whether KB ⊨ α. The steps are: a) put KB and ¬α in CNF; b) check whether the resulting set of clauses S is unsatisfiable by searching for a derivation of the empty clause. S is unsatisfiable if and only if KB ∪ {¬α} is unsatisfiable, which holds if and only if KB ⊨ α.
16. procedure RESOLUTION
- input: a finite set S of propositional clauses
- output: binary (satisfiable or unsatisfiable)
- Step 1: Check whether [] ∈ S; if so, return unsatisfiable
- Step 2: Otherwise, check whether there are two clauses in S that resolve to produce a clause not already in S; if not, return satisfiable
- Step 3: Otherwise, add the new resolvent clause to S and go to Step 1
17. The procedure repeatedly adds resolvents to the input clauses S until either the empty clause is added (in which case there is a derivation of the empty clause) or no new clauses can be added (in which case there is no derivation of the empty clause). The process is guaranteed to terminate because any new clause that is added to S consists only of literals present in the original set S, and there are only finitely many clauses with just these literals.
The procedure can be made deterministic by deciding how to choose the pair of clauses for producing a new resolvent when there is more than one candidate pair. One possibility is to select the first pair encountered. Another is to prefer the pair that produces the shortest resolvent.
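A minimal Python sketch of this saturation loop follows. It rests on assumptions of our own: clauses are frozensets of strings with "~" marking negation, and, unlike the step-by-step procedure, each pass adds every new resolvent at once rather than a single one. The termination argument is unchanged.

```python
from itertools import combinations

def complement(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolvents(c1, c2):
    return {(c1 - {p}) | (c2 - {complement(p)})
            for p in c1 if complement(p) in c2}

def resolution(clauses):
    """Saturate the clause set; report whether the empty clause is derivable."""
    s = {frozenset(c) for c in clauses}
    while True:
        if frozenset() in s:              # the empty clause has been derived
            return "unsatisfiable"
        new = set()
        for c1, c2 in combinations(s, 2):
            new |= resolvents(c1, c2) - s
        if not new:                       # nothing left to add
            return "satisfiable"
        s |= new

result = resolution([["p"], ["~p", "q"], ["~q"]])
```

On this small input `result` is "unsatisfiable": [p] and [~p, q] give [q], which resolves with [~q] to the empty clause.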
18. The resolution procedure does not distinguish between clauses that come from the KB and those that come from the negation of α (α is usually called the query). It is clear that to be able to use this procedure it is worth one's while to keep the KB in CNF.
19. Example 1. KB:
Toddler
Toddler ⊃ Child
Child ∧ Male ⊃ Boy
Infant ⊃ Child
Child ∧ Female ⊃ Girl
Female
We can read the sentences as if they describe a particular person: the person is a toddler; if the person is a toddler then the person is a child; etc. Suppose the query is: Is the person a girl?
[Figure: resolution derivation for this KB and query]
20. Comments, p. 55
21. Summary. Remember that (α ⊃ β) ≡ (¬α ∨ β). We use a dashed line to separate clauses in the KB (and the negation of the query) from those obtained through the operation of resolution. The diagram shows, for each resolvent, the two clauses it was resolved from. The resulting graph is acyclic because input clauses must always appear earlier in the derivation. Note that two of the KB clauses were not used in the derivation.
22. Example 2. KB:
Sun ⊃ Mail
(Rain ∨ Sleet) ⊃ Mail
Rain ∨ Sun
These formulas can be interpreted as talking about mail deliveries on a particular day. Let us demonstrate that KB ⊨ Mail.
[Figure: resolution derivation for this KB and query]
23. Note that ((Rain ∨ Sleet) ⊃ Mail) converts into two clauses in CNF. If we wanted to show that the KB does not entail Rain, we would use a similar graph with ¬Rain placed above the dashed line instead of ¬Mail, and show that the set of all possible resolvents does not include [].
24. Variables and Quantifiers
- The first step, as before, is to convert formulas into an equivalent clausal form. Let us first examine the case when no existential quantifiers remain after negations have been moved inward.
- Eliminate ⊃ and ≡, as before.
- Move ¬ inward so that it appears only in front of an atom, using the equivalences used earlier: ¬¬α ≡ α; ¬(α ∧ β) ≡ (¬α ∨ ¬β); ¬(α ∨ β) ≡ (¬α ∧ ¬β); and the following two: ¬∀x.α ≡ ∃x.¬α; ¬∃x.α ≡ ∀x.¬α.
25.
- Standardize variables, that is, ensure that each quantifier is over a distinct variable by renaming them as necessary. This uses the following equivalences (provided x does not occur free in α): ∀y.α ≡ ∀x.α[y/x] and ∃y.α ≡ ∃x.α[y/x] (recall that α[y/x] is the formula in which all free occurrences of y are replaced by x)
- Eliminate all remaining existentials (see slide 43)
- Move universals outside the scope of conjunction and disjunction using the following equivalences (again, provided x is not free in α): (α ∧ ∀x.β) ≡ (∀x.β ∧ α) ≡ ∀x(α ∧ β); (α ∨ ∀x.β) ≡ (∀x.β ∨ α) ≡ ∀x(α ∨ β)
- Distribute ∨ over ∧: (α ∨ (β ∧ γ)) ≡ ((β ∧ γ) ∨ α) ≡ ((α ∨ β) ∧ (α ∨ γ)).
- Collect terms: (α ∨ α) ≡ α; (α ∧ α) ≡ α.
26. After this procedure is completed, we obtain a quantified version of CNF: a universally quantified conjunction of disjunctions of literals that is logically equivalent to the original formula (modulo the treatment of existentials).
Again, it is convenient to use a clausal form of CNF. We simply drop the quantifiers (all of them are universal) and are left with a set of clauses, each of which is a set of literals, each of which is either an atom or its negation. An atom now is of the form P(t1, ..., tn), where the terms ti may contain variables, constants and function symbols. Clauses are understood in the normal way, except that variables in them are interpreted as universally quantified. So, for example, the clausal formula {[P(x), ¬R(a, f(b,x))], [Q(x,y)]} stands for the CNF formula ∀x∀y((P(x) ∨ ¬R(a, f(b,x))) ∧ Q(x,y)).
27. Additional notation and terminology for substitutions. A substitution θ is a finite set of pairs {x1/t1, ..., xn/tn} where the xi are distinct variables and the ti are arbitrary terms. If θ is a substitution and ρ is a literal then ρθ is the literal that results from simultaneously replacing each xi in ρ by ti. For example, if θ = {x/a, y/g(x,b,z)} and ρ = P(x,z,f(x,y)), then ρθ = P(a, z, f(a,g(x,b,z))). Similarly, if c is a clause, cθ is the clause that results from performing the substitution on each literal.
We say that a term, literal or clause is ground if it contains no variables. We say that a literal ρ is an instance of a literal ρ′ if for some θ, ρ = ρ′θ.
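Applying a substitution is easy to sketch in Python. The encoding below is an assumption for illustration only: a compound term or literal is a tuple (symbol, arg1, ..., argn), a constant is a plain string, a variable is a string starting with "?", and a substitution is a dict from variables to terms.

```python
def substitute(t, theta):
    """Apply theta to t, replacing all variables simultaneously."""
    if isinstance(t, tuple):                       # (symbol, arg1, ..., argn)
        return (t[0],) + tuple(substitute(a, theta) for a in t[1:])
    return theta.get(t, t)                         # variable or constant

def is_ground(t):
    """True if t contains no variables (strings starting with '?')."""
    if isinstance(t, tuple):
        return all(is_ground(a) for a in t[1:])
    return not t.startswith("?")

# The example from the slide: theta = {x/a, y/g(x,b,z)}, rho = P(x,z,f(x,y))
theta = {"?x": "a", "?y": ("g", "?x", "b", "?z")}
rho = ("P", "?x", "?z", ("f", "?x", "?y"))
rho_theta = substitute(rho, theta)
```

`rho_theta` is P(a, z, f(a, g(x,b,z))): because the replacement is simultaneous, the x inside g(x,b,z) is not replaced again.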
28. First-Order Resolution. The main idea is that since clauses with variables are implicitly universally quantified, we want to allow resolution inferences that can be made from any of their instances. For example, consider the clauses [P(x,a), ¬Q(x)] and [¬P(b,y), ¬R(b,f(y))]. Then, implicitly, we have clauses [P(b,a), ¬Q(b)] and [¬P(b,a), ¬R(b,f(a))], which resolve to [¬Q(b), ¬R(b,f(a))]. We will define the rule of resolution so that this clause is a resolvent of the two original clauses themselves, not just of their instances.
29. The general rule of (binary) resolution. Suppose we are given a clause of the form c1 ∪ {ρ1} containing some literal ρ1 and a clause of the form c2 ∪ {ρ̄2} containing the complement of some literal ρ2. Suppose we rename the variables in the two clauses so that each clause has distinct variables, and there is a substitution θ such that ρ1θ = ρ2θ. Then we can infer the clause (c1 ∪ c2)θ consisting of those literals in the first clause other than ρ1 and those in the second clause other than ρ̄2, after applying θ. We say in this case that θ unifies ρ1 and ρ2 and that θ is a unifier of the two literals.
30. The definition of a derivation stays the same: A resolution derivation of a clause c from a set of clauses S is a sequence of clauses c1, ..., cn where the last clause is c and where each ci is either an element of S or a resolvent of two earlier clauses in the derivation. Ignoring the case of equality (=), as before, S ⊢ [] if and only if S ⊨ [].
31. Example 1. KB:
∀x. GradStudent(x) ⊃ Student(x)
∀x. Student(x) ⊃ HardWorker(x)
GradStudent(sue)
Let's show that KB ⊨ HardWorker(sue).
[Figure: resolution derivation for this KB and query]
32. Example 2. The three blocks problem.
Query: Is there a green block on top of a nongreen block?
[Figure: three stacked blocks; a is on b and b is on c; a is green, the colour of b is unknown (?), c is not green]
The answer is not immediately obvious. Let a, b, and c be the blocks. Let G be a predicate symbol meaning green. Let O be a predicate symbol meaning on. The facts in S (the KB) are S = {O(a,b), O(b,c), G(a), ¬G(c)}. We claim that S ⊨ α, where α (the query) is ∃x∃y. G(x) ∧ ¬G(y) ∧ O(x,y). The KB is already in CNF! And the negation of the query contains neither existentials nor equalities.
34. Example 3. Arithmetic. We can use the constant zero to stand for 0 and succ to stand for the successor function. Every natural number can then be written as a ground term (a term containing no variables) using these two symbols. For instance, succ(succ(succ(succ(succ(zero))))) means 5. Plus(x,y,z) can be introduced as a predicate standing for the relation x + y = z. The KB can formalize the properties of addition as follows:
∀x. Plus(zero,x,x)
∀x∀y∀z. Plus(x,y,z) ⊃ Plus(succ(x), y, succ(z)).
The following schema shows that 2 + 3 = 5 follows from this KB.
36. A derivation for an entailed existential formula like ∃u. Plus(2,3,u) is similar. Note the renaming of variables! The variables in the input clauses must be distinct.
By examining the bindings of the variables we can locate the value of u: it is bound to succ(v), where v is bound to succ(w), and w to 3. In other words, the answer for the addition is correctly determined to be 5.
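The chain of bindings can be mimicked by a small recursive function. This is not a resolution prover, only a sketch of what the two Plus axioms compute; the tuple encoding of succ terms and the helper names are our own assumptions.

```python
def numeral(n):
    """Ground term for n: succ applied n times to zero."""
    term = "zero"
    for _ in range(n):
        term = ("succ", term)
    return term

def plus(x, y):
    """Return the term z such that Plus(x, y, z) follows from the two axioms."""
    if x == "zero":
        return y                          # Plus(zero, y, y)
    # Plus(x', y, z') implies Plus(succ(x'), y, succ(z')): the answer is
    # succ of the answer for the smaller problem, as with u = succ(v) above.
    return ("succ", plus(x[1], y))

def value(term):
    """Read a numeral term back as a Python integer."""
    n = 0
    while term != "zero":
        term, n = term[1], n + 1
    return n

u = plus(numeral(2), numeral(3))
```

Here `value(u)` is 5: the outer succ comes from the first use of the second axiom, the next succ from the second use, and the innermost term is the numeral for 3.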
37. Answer Extraction. It is not always easy to find answers by tracing variable bindings. It can happen that a KB entails some ∃x. P(x) without entailing P(t) for any specific t. For example, in the three-block problem, the KB entails that some block must be green and on top of a nongreen block, but not which! Answer extraction helps to deal with answers even in cases like this.
The main idea is to replace a query such as ∃x. P(x), where x is the variable we are tracking, by ∃x. P(x) ∧ ¬A(x), where A is a new predicate symbol occurring nowhere else, called the answer predicate. Since A is not in the original KB, it will not be possible to derive the empty clause from the modified query. Instead, we posit that the derivation terminates as soon as we produce a clause containing only the answer predicate.
38. The first example will have a definite answer. KB:
Student(john)
Student(jane)
Happy(john)
Query: ∃x. Student(x) ∧ Happy(x) (at least one student is happy).
The final clause can be interpreted as: An answer is John.
39. Note that we say an answer is John. There can be many answers, but each derivation only deals with one. For example, if the KB were
Student(john)
Student(jane)
Happy(john)
Happy(jane)
then in one derivation we might end up with Jane and in another with John.
The answer extraction process is especially useful with indefinite answers. Suppose the KB were
Student(john)
Student(jane)
Happy(john) ∨ Happy(jane)
We can determine that there is a student who is happy, only we will not know who exactly.
40. The final clause can be interpreted as saying: An answer is either Jane or John.
41. Answer extraction can result in clauses containing variables. For example, for the KB
∀w. Student(f(a, w))
∀x∀z. Happy(f(x, g(z)))
we get a derivation whose final clause is [A(f(a, g(z)))], which can be interpreted as saying: An answer is any instance of the term f(a, g(z)).
42. Skolemization. It's time to deal with existentials. So far, we could not handle a KB with facts like ∃x∀y∃z. P(x,y,z) because we could not convert them into CNF.
The remedy: Because some individuals are claimed to exist, we introduce names for them (they are called Skolem constants and Skolem functions) and represent facts using those names. If these names are used nowhere else, all the entailments that do not mention them will be exactly like the original entailments of the fact with an existential.
How can one do this? For the above formula, for example, an x is claimed to exist, so let's call it a; for each y a z is claimed to exist, so let's call it f(y). Now, instead of reasoning with ∃x∀y∃z. P(x,y,z) we use ∀y. P(a,y,f(y)), where a and f are Skolem symbols (a constant and a function) appearing nowhere else.
43. We will now fill a lacuna in our description of
the conversion to CNF.
Skolemization replaces each existential variable by a new function symbol with as many arguments as there are universal variables dominating the existential. In other words, we start with
∀x1(... ∀x2(... ∀x3(... ∃y[... y ...] ...) ...) ...)
where the existentially quantified y appears in the scope of the universally quantified x1, x2 and x3 and only these variables, and we end up with
∀x1(... ∀x2(... ∀x3(... [... f(x1, x2, x3) ...] ...) ...) ...)
where f appears nowhere else.
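A compact Python sketch of this replacement is given below. It assumes, as our own conventions, that negations have already been moved inward and variables standardized apart, that formulas are tuples such as ("forall", var, body), ("exists", var, body), ("and", f, g) and ("or", f, g), that variables are strings starting with "?", and that fresh Skolem symbols are named sk1, sk2, and so on.

```python
from itertools import count

def substitute(t, theta):
    """Replace variables in a tuple-encoded formula or term according to theta."""
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, theta) for a in t[1:])
    return theta.get(t, t)

def skolemize(f, universals=(), fresh=None):
    """Remove existentials, replacing each by a term over the enclosing universals."""
    if fresh is None:
        fresh = count(1)                  # source of new symbol names (assumed scheme)
    op = f[0]
    if op == "forall":
        return ("forall", f[1], skolemize(f[2], universals + (f[1],), fresh))
    if op == "exists":
        name = f"sk{next(fresh)}"
        # A Skolem constant if no universal dominates, otherwise a Skolem function.
        term = (name,) + universals if universals else name
        return skolemize(substitute(f[2], {f[1]: term}), universals, fresh)
    if op in ("and", "or"):
        return (op, skolemize(f[1], universals, fresh),
                    skolemize(f[2], universals, fresh))
    return f                              # a literal: nothing to do

# Exists x. Forall y. Exists z. P(x, y, z)
formula = ("exists", "?x", ("forall", "?y", ("exists", "?z", ("P", "?x", "?y", "?z"))))
skolemized = skolemize(formula)
```

The result is the encoding of ∀y. P(sk1, y, sk2(y)), matching the ∀y. P(a, y, f(y)) of the earlier example up to the names of the Skolem symbols.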
44. If α is the original formula and α′ is the result of converting it to CNF with Skolemization, then it is no longer the case that ⊨ (α ≡ α′). For example, ∃x. P(x) is not equivalent to P(a), its skolemized version. But it can be shown that α is satisfiable if and only if α′ is satisfiable, which is exactly what is needed for resolution.
Independent work: Skolemization and universal variables, BL p. 65
45. Equality. So far, we have ignored formulas with equality. If we were to treat equality as a regular predicate, we would miss unsatisfiable sets of clauses, e.g., {a = b, b = c, a ≠ c}. All special properties of equality should be taken into account. We need clausal versions of the axioms of equality:
reflexivity: ∀x. x = x
symmetry: ∀x∀y. x = y ⊃ y = x
transitivity: ∀x∀y∀z. x = y ∧ y = z ⊃ x = z
substitution for functions: ∀x1∀y1...∀xn∀yn. x1 = y1 ∧ ... ∧ xn = yn ⊃ f(x1,...,xn) = f(y1,...,yn)
substitution for predicates: ∀x1∀y1...∀xn∀yn. x1 = y1 ∧ ... ∧ xn = yn ⊃ P(x1,...,xn) ≡ P(y1,...,yn)
46. It can be shown that with these axioms, equality can be treated as a binary predicate, so that the soundness and completeness of resolution for the empty clause will be preserved.
A simple example of the use of the axioms of equality. The KB:
∀x. Married(father(x), mother(x))
father(john) = bill
The query: Married(bill, mother(john)).
47. Two axioms are used: substitution for predicates and reflexivity. The suggested treatment of equality is very inefficient.
48. Resolution does not provide a general effective solution to the reasoning problem. The first-order case: Suppose the KB is ∀x∀y. LessThan(succ(x), y) ⊃ LessThan(x,y), just a single formula. Suppose the query is LessThan(zero, zero). This should fail because the KB does not entail the query or its negation. But resolution can get derivations as follows:
49. This is an infinite sequence.
50. This means that we cannot use a depth-first procedure to search for the empty clause because of the possibility of infinite branches. Is there a way to detect that we are following a branch of this kind? No, there is not. This means that there can be no procedure that, given a set of clauses, returns satisfiable when the clauses are satisfiable and unsatisfiable otherwise.
However, we know that resolution is refutation-complete: if the set of clauses is unsatisfiable, some branch will contain the empty clause. So, a breadth-first search is guaranteed to report unsatisfiable when the clauses are unsatisfiable. When the clauses are satisfiable, the search may or may not terminate.
51. In the propositional case, resolution runs to completion.
Resolution in FOL sometimes reduces to the propositional case. Given a set S of clauses, the Herbrand universe of S is the set of all ground terms formed using just the constants and function symbols in S. For example, if S mentions just constants a and b and unary function symbol f, then the Herbrand universe is the set {a, b, f(a), f(b), f(f(a)), f(f(b)), f(f(f(a))), f(f(f(b))), ...}. The Herbrand base of S is the set of all ground clauses cθ where c ∈ S and θ assigns the variables in c to terms in the Herbrand universe. Herbrand's theorem states that a set of clauses is satisfiable if and only if its Herbrand base is satisfiable.
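Since the Herbrand universe is usually infinite, a program can only enumerate it up to some nesting depth. The Python sketch below does that; the depth bound, the tuple encoding of terms, and the function name are assumptions made for illustration.

```python
from itertools import product

def herbrand_universe(constants, functions, depth):
    """Ground terms with function nesting at most `depth`.

    constants: iterable of constant names
    functions: dict mapping each function symbol to its arity
    """
    terms = set(constants)
    for _ in range(depth):
        new = set(terms)
        for f, arity in functions.items():
            for args in product(terms, repeat=arity):
                new.add((f,) + args)      # the term f(args...)
        terms = new
    return terms

level1 = herbrand_universe({"a", "b"}, {"f": 1}, 1)
no_functions = herbrand_universe({"a", "b"}, {}, 3)
```

`level1` is {a, b, f(a), f(b)}, and each further level adds the next layer of f applications. With no function symbols, as in `no_functions`, the universe stays {a, b} at every depth, which is the finite case where the Herbrand base is a finite set of propositional clauses.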
52. To reason with the Herbrand base one does not need unifiers and other complications: the clauses are ground, so the propositional procedure applies, and on a finite set of clauses it is sound, complete, and guaranteed to terminate. The catch is that the Herbrand base is typically an infinite set of propositional clauses (there are no miracles: we know that no procedure can decide the satisfiability of arbitrary sets of clauses). But when the Herbrand universe is finite (when there are no function symbols and a finite set of constants in S), then the Herbrand base is, too.
Another device for trying to keep the universe finite is to take into account the types of the arguments and values of functions and include a term like f(t) only if the type of t is appropriate for the function f. For example, if our function is birthday: person → date, we may be able to check for and exclude meaningless terms like birthday(birthday(john)).
53. If we can get a finite set of propositional clauses, we know that the first resolution procedure we introduced will terminate. However, this does not make the procedure practical. It was shown that there are unsatisfiable propositional clauses c1, ..., cn such that the shortest derivation of the empty clause is on the order of 2^n steps. So, no matter how clever we are, on such clauses resolution is exponential. Can we find a method better than resolution? This is not known. The satisfiability problem is NP-complete.
54. So, resolution does not solve all our
problems. For knowledge representation purposes
it is necessary to be able to produce entailments
of a KB for immediate action, but determining the
satisfiability of clauses may be computationally
too expensive for this purpose. What are the
options? One is to give more control over
reasoning to the user. Another option is to use
representation languages that are less expressive
than FOL. But resolution is good when there are
no constraints on time, e.g., in mathematical
theorem proving. So, the practical concern is
looking for methods of derivation that
eliminate as many unnecessary steps as possible.
55. SAT solvers. Instead of searching for a
derivation that would show a set of clauses to be
unsatisfiable, search procedures may search for
an interpretation that would make the clauses
satisfiable. Such procedures are called SAT
solvers. They are often applied to clauses that
are known to be satisfiable but where the
satisfying interpretation is not known. But
this is not a real departure. Our first
resolution procedure can be adapted to finding a
satisfying interpretation.
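To make the contrast concrete, here is a minimal backtracking search for a satisfying interpretation in Python. It is a simple case-splitting sketch under our own conventions (clauses as collections of strings, "~" for negation), not the adapted resolution procedure mentioned above and not a description of any production SAT solver.

```python
def complement(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def find_model(clauses, model=None):
    """Return a dict atom -> bool satisfying all clauses, or None if none exists."""
    model = dict(model or {})
    clauses = [frozenset(c) for c in clauses]
    if not clauses:
        return model                      # every clause has been satisfied
    if frozenset() in clauses:
        return None                       # some clause has been falsified
    lit = next(iter(clauses[0]))
    for choice in (lit, complement(lit)):
        # Assume `choice` is true: drop satisfied clauses and remove the
        # now-false complementary literal from the remaining ones.
        reduced = [c - {complement(choice)} for c in clauses if choice not in c]
        atom = choice.lstrip("~")
        result = find_model(reduced, {**model, atom: not choice.startswith("~")})
        if result is not None:
            return result
    return None

kb = [["~Sun", "Mail"], ["~Rain", "Mail"], ["~Sleet", "Mail"], ["Rain", "Sun"]]
model = find_model(kb + [["~Rain"]])      # KB plus the negated query Rain
no_model = find_model(kb + [["~Mail"]])   # KB plus the negated query Mail
```

For the mail KB, `model` is an interpretation (so the KB does not entail Rain) while `no_model` is None (so the KB entails Mail). Atoms missing from a returned model can take either truth value.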
56. Most General Unifiers. Consider two clauses c1 and c2 where c1 contains the literal P(g(x), f(x), z) and c2 contains ¬P(y, f(w), a). These literals can be unified by the substitution θ1 = {x/b, y/g(b), z/a, w/b} and also by θ2 = {x/f(z), y/g(f(z)), z/a, w/f(z)}. If we cannot derive the empty clause using θ1, then we need to consider θ2 or even another substitution.
The above substitutions are too specific. Indeed, any unifier must give w the same value as x, and y the same as g(x), but it is possible not to commit (yet) to the value of x.
57.
- The substitution θ3 = {y/g(x), z/a, w/x} unifies the two literals without making an arbitrary choice that might hinder the path to the empty clause.
- This is a most general unifier (MGU).
- Formally, a most general unifier θ of literals ρ1 and ρ2 is a unifier such that for any other unifier θ′ there is a further substitution θ* such that θ′ = θ·θ*.
- The operator · is interpreted as follows: θ·θ* is the substitution such that for any literal ρ, ρ(θ·θ*) = (ρθ)θ*, that is, θ is applied to ρ and θ* to the result.
58. Thus, one can get to other unifiers through additional substitutions. For example, given θ3 = {y/g(x), z/a, w/x}, we can get to θ1 = {x/b, y/g(b), z/a, w/b} by applying {x/b}, and to θ2 = {x/f(z), y/g(f(z)), z/a, w/f(z)} by applying {x/f(z)}.
Note that an MGU need not be unique. For example, θ4 = {y/g(w), z/a, x/w} is also one for c1 and c2. (A reminder: c1 contains the literal P(g(x), f(x), z) and c2 contains ¬P(y, f(w), a).)
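The composition of substitutions used in these examples can be sketched in Python as follows. The encoding is an assumption for illustration: terms are tuples (symbol, args...), variables are strings starting with "?", and a substitution is a dict.

```python
def substitute(t, theta):
    """Apply theta to term t, replacing all variables simultaneously."""
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, theta) for a in t[1:])
    return theta.get(t, t)

def compose(theta, sigma):
    """Substitution with the same effect as applying theta first, then sigma."""
    out = {v: substitute(t, sigma) for v, t in theta.items()}
    for v, t in sigma.items():
        out.setdefault(v, t)              # bindings of sigma not overridden by theta
    return {v: t for v, t in out.items() if v != t}   # drop trivial pairs x/x

theta3 = {"?y": ("g", "?x"), "?z": "a", "?w": "?x"}
theta1 = compose(theta3, {"?x": "b"})
theta2 = compose(theta3, {"?x": ("f", "?z")})
```

`theta1` comes out as {x/b, y/g(b), z/a, w/b} and `theta2` as {x/f(z), y/g(f(z)), z/a, w/f(z)}, the two specific unifiers from the earlier slide.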
59.
- The key fact about MGUs is that we can limit the resolution rule to MGUs without loss of completeness. This helps immensely in the search because it dramatically reduces the number of resolvents that can be inferred from two input clauses.
- Moreover, an MGU of a pair of literals ρ1 and ρ2 can be calculated efficiently as follows:
- Step 1: Start with θ = {}
- Step 2: Exit if ρ1θ = ρ2θ
- Step 3: Otherwise, get the disagreement set, DS, which is the pair of terms at the first place where the two literals disagree, e.g., if ρ1θ = P(a, f(a, g(z), ...)) and ρ2θ = P(a, f(a, u, ...)), then DS = {u, g(z)}
- Step 4: Find a variable v ∈ DS and a term t ∈ DS that does not contain v; if none, fail
- Step 5: Otherwise, set θ to θ·{v/t}, and go to Step 2.
60. This procedure works well for non-pathological cases. Because MGUs greatly reduce the search and can be calculated efficiently, all resolution-based systems implemented to date use them.
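The MGU computation via disagreement sets can be sketched in Python as below. This is a straightforward, unoptimized rendering under assumptions of our own: atoms and terms are tuples (symbol, args...), variables are strings starting with "?", constants are other strings, and the function names are hypothetical. The two literals are passed as atoms, with the negation sign stripped.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def substitute(t, theta):
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, theta) for a in t[1:])
    return theta.get(t, t)

def occurs(v, t):
    """True if variable v appears anywhere in term t."""
    if isinstance(t, tuple):
        return any(occurs(v, a) for a in t[1:])
    return v == t

def disagreement(s, t):
    """First pair of subterms where s and t differ, or None if they are identical."""
    if s == t:
        return None
    if (isinstance(s, tuple) and isinstance(t, tuple)
            and s[0] == t[0] and len(s) == len(t)):
        for a, b in zip(s[1:], t[1:]):
            d = disagreement(a, b)
            if d is not None:
                return d
    return (s, t)

def mgu(l1, l2):
    """Return a most general unifier of two atoms as a dict, or None on failure."""
    theta = {}                                            # start with the empty substitution
    while True:
        ds = disagreement(substitute(l1, theta), substitute(l2, theta))
        if ds is None:
            return theta                                  # the literals now agree
        v, t = ds if is_var(ds[0]) else ds[::-1]
        if not is_var(v) or occurs(v, t):
            return None                                   # no usable variable: fail
        # Compose theta with {v/t}, then continue.
        theta = {x: substitute(u, {v: t}) for x, u in theta.items()}
        theta[v] = t

l1 = ("P", ("g", "?x"), ("f", "?x"), "?z")
l2 = ("P", "?y", ("f", "?w"), "a")
theta = mgu(l1, l2)
```

On the running example this returns {y/g(w), x/w, z/a}, which is the MGU called θ4 earlier; applying it to either literal gives P(g(w), f(w), a).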
61. Other refinements
- Clause elimination
- Ordering strategies
- Paramodulation (special treatment of equality)
- Sorted logic: we would refuse to unify P(s) with P(t) if the sorts of s and t are incompatible
- Connection graph
- Directional connectives