Title: A First Order Extension of Stålmarck's Method
1A First Order Extension of Stålmarck's Method
- Magnus Björk
- Chalmers University of Technology
- Gothenburg, Sweden
2Overview
- Stålmarck's method (in propositional logic)
- Features of FOL version
- Soundness and completeness
- Implementation and benchmarks
3Stålmarck's Method
- Tableau-like
- Operates on propositional formulae
- Successful in industrial applications
- Used in Prover Plug-In
- Non-branching rules: KE, KI
- Dilemma rule: branch and merge
4Example in Propositional Logic
P → Q, Q → P, (P ↔ Q) → R
Input formulae
5Example in Propositional Logic
P → Q, Q → P, (P ↔ Q) → R
6Example in Propositional Logic
P → Q, Q → P, (P ↔ Q) → R
P
¬P
7Example in Propositional Logic
P → Q, Q → P, (P ↔ Q) → R
P, Q, P ↔ Q, R
¬P
8Example in Propositional Logic
P → Q, Q → P, (P ↔ Q) → R
P, Q, P ↔ Q, R
¬P, ¬Q, P ↔ Q, R
9Example in Propositional Logic
P → Q, Q → P, (P ↔ Q) → R
P, Q, P ↔ Q, R
¬P, ¬Q, P ↔ Q, R
P ↔ Q, R
The intersection of the two branches
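The branch-and-merge step of this example can be sketched in Python. This is a minimal illustration, not Stålmarck's actual rule set: the formula encoding (strings for atoms, tuples `("imp", a, b)` and `("iff", a, b)`), the helper names `subformulas`, `propagate` and `dilemma`, and the reading of the input formulae as P → Q, Q → P and (P ↔ Q) → R are assumptions made for the sketch.

```python
def subformulas(f):
    """Yield f and all of its subformulas."""
    yield f
    if isinstance(f, tuple):
        for g in f[1:]:
            yield from subformulas(g)


def propagate(facts, universe):
    """Apply simple non-branching rules to a fixpoint.

    `facts` maps formulas to truth values. Returns the extended mapping,
    or None if two rules disagree (a contradiction).
    """
    facts = dict(facts)
    changed = True
    while changed:
        changed = False
        for f in universe:
            if isinstance(f, str):
                continue  # atoms have no rules of their own
            op, a, b = f
            vf, va, vb = facts.get(f), facts.get(a), facts.get(b)
            new = []
            if op == "imp":
                if vf is True and va is True:
                    new.append((b, True))
                if vf is True and vb is False:
                    new.append((a, False))
                if va is False or vb is True:
                    new.append((f, True))
                if va is True and vb is False:
                    new.append((f, False))
                if vf is False:
                    new += [(a, True), (b, False)]
            else:  # "iff"
                if va is not None and vb is not None:
                    new.append((f, va == vb))
                if vf is not None and va is not None:
                    new.append((b, va == vf))
                if vf is not None and vb is not None:
                    new.append((a, vb == vf))
            for g, value in new:
                if g not in facts:
                    facts[g] = value
                    changed = True
                elif facts[g] != value:
                    return None
    return facts


def dilemma(facts, universe, atom):
    """Branch on `atom`, propagate in both branches, keep what both agree on."""
    pos = propagate({**facts, atom: True}, universe)
    neg = propagate({**facts, atom: False}, universe)
    if pos is None:
        return neg  # only the other branch survives (None if both close)
    if neg is None:
        return pos
    return {f: v for f, v in pos.items() if neg.get(f) is v}


P_IMP_Q = ("imp", "P", "Q")
Q_IMP_P = ("imp", "Q", "P")
GOAL = ("imp", ("iff", "P", "Q"), "R")

universe = set()
for formula in (P_IMP_Q, Q_IMP_P, GOAL):
    universe.update(subformulas(formula))

inputs = {P_IMP_Q: True, Q_IMP_P: True, GOAL: True}
merged = dilemma(inputs, universe, "P")
new = {f: v for f, v in merged.items() if f not in inputs}
print(new)  # P <-> Q and R are true; P and Q stay undecided
```

The values of P and Q differ between the branches, so the merge drops them and keeps only P ↔ Q and R.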
10Structure of a proof
0-hard problem: no dilemma rule applications
11Structure of a proof
1-hard problem: no nested dilemmas
12Structure of a proof
2-hard problem: maximal nesting level of 2
13Our Version of First Order Logic
- Rigid and universal variables
- Universal variables: implicitly universally quantified; can be instantiated with anything during the proof
- Rigid variables: similar to constants; may not be instantiated
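The difference between the two kinds of variables shows up most clearly in unification. The sketch below is one possible reading, not the implementation described in the talk: the `Var` class, the tuple encoding of terms, and the idea of recording a would-be binding of a rigid variable in a `suggestions` list are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str
    rigid: bool = False  # rigid variables behave like constants


def walk(t, subst):
    """Follow bindings of universal variables."""
    while isinstance(t, Var) and t in subst:
        t = subst[t]
    return t


def occurs(v, t, subst):
    """True if variable v occurs in term t under subst."""
    t = walk(t, subst)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, subst) for a in t[1:])


def unify(s, t, subst, suggestions):
    """Unify s and t, binding universal variables only.

    Returns the extended substitution, or None on failure. If the failure
    is caused by a rigid variable, the binding that would have been needed
    is appended to `suggestions`.
    """
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return subst
    for a, b in ((s, t), (t, s)):
        if isinstance(a, Var) and not a.rigid:
            return None if occurs(a, b, subst) else {**subst, a: b}
    for a, b in ((s, t), (t, s)):
        if isinstance(a, Var):  # rigid: may not be instantiated
            suggestions.append((a, b))
            return None
    if isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0] and len(s) == len(t):
        for a, b in zip(s[1:], t[1:]):
            subst = unify(a, b, subst, suggestions)
            if subst is None:
                return None
        return subst
    return None


x, y = Var("x"), Var("y")
U, V = Var("U", rigid=True), Var("V", rigid=True)
suggestions = []

ok = unify(("P", x, y), ("P", U, V), {}, suggestions)        # binds x and y
failed = unify(("Q", x, "c"), ("Q", U, V), {}, suggestions)  # V is rigid
print(ok, failed, suggestions)
```

Universal variables `x` and `y` are bound freely, even to rigid variables, while the attempt to match the constant `c` against the rigid variable `V` fails and leaves a suggestion behind.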
14Example in FOL
∀x,y.(P(x,y) → Q(x,y)), ∀x,y.(Q(x,c) → P(x,c)), ∀x,y.((P(x,y) ↔ Q(x,y)) → R(x,y))
Input formulae
15Example in FOL
∀x,y.(P(x,y) → Q(x,y)), ∀x,y.(Q(x,c) → P(x,c)), ∀x,y.((P(x,y) ↔ Q(x,y)) → R(x,y))
P(x,y) → Q(x,y), Q(x,c) → P(x,c), (P(x,y) ↔ Q(x,y)) → R(x,y)
Input formulae
16Example in FOL
Input formulae
... P(x,y) → Q(x,y), Q(x,c) → P(x,c), (P(x,y) ↔ Q(x,y)) → R(x,y)
17Example in FOL
... P(x,y) → Q(x,y), Q(x,c) → P(x,c), (P(x,y) ↔ Q(x,y)) → R(x,y)
Future dilemma formulae: P(U,V), ...
18Example in FOL
... P(x,y) → Q(x,y), Q(x,c) → P(x,c), (P(x,y) ↔ Q(x,y)) → R(x,y)
Future dilemma formulae
¬P(U,V)
P(U,V)
19Example in FOL
... P(x,y) → Q(x,y), Q(x,c) → P(x,c), (P(x,y) ↔ Q(x,y)) → R(x,y)
Future dilemma formulae
¬P(U,V)
P(U,V), Q(U,V), P(U,V) ↔ Q(U,V), R(U,V)
20Example in FOL
... P(x,y) → Q(x,y), Q(x,c) → P(x,c), (P(x,y) ↔ Q(x,y)) → R(x,y)
Future dilemma formulae: ..., P(U,c)
¬P(U,V)
Suggestion: V ↦ c
P(U,V), Q(U,V), P(U,V) ↔ Q(U,V), R(U,V)
21Example in FOL
... P(x,y) → Q(x,y), Q(x,c) → P(x,c), (P(x,y) ↔ Q(x,y)) → R(x,y)
Future dilemma formulae: ..., P(U,c)
¬P(U,V)
Suggestion: V ↦ c
P(U,V), Q(U,V), P(U,V) ↔ Q(U,V), R(U,V)
No common consequences
22Example in FOL
... P(x,y) → Q(x,y), Q(x,c) → P(x,c), (P(x,y) ↔ Q(x,y)) → R(x,y)
Future dilemma formulae: ..., P(U,c)
23Example in FOL
... P(x,y) → Q(x,y), Q(x,c) → P(x,c), (P(x,y) ↔ Q(x,y)) → R(x,y)
Future dilemma formulae: P(U,c)
24Example in FOL
... P(x,y) → Q(x,y), Q(x,c) → P(x,c), (P(x,y) ↔ Q(x,y)) → R(x,y)
Future dilemma formulae
¬P(U,c)
P(U,c)
25Example in FOL
... P(x,y) → Q(x,y), Q(x,c) → P(x,c), (P(x,y) ↔ Q(x,y)) → R(x,y)
Future dilemma formulae
¬P(U,c)
P(U,c), Q(U,c), P(U,c) ↔ Q(U,c), R(U,c)
26Example in FOL
... P(x,y) → Q(x,y), Q(x,c) → P(x,c), (P(x,y) ↔ Q(x,y)) → R(x,y)
Future dilemma formulae
¬P(U,c), ¬Q(U,c), P(U,c) ↔ Q(U,c), R(U,c)
P(U,c), Q(U,c), P(U,c) ↔ Q(U,c), R(U,c)
27Example in FOL
... P(x,y) → Q(x,y), Q(x,c) → P(x,c), (P(x,y) ↔ Q(x,y)) → R(x,y)
Future dilemma formulae
¬P(U,c), ¬Q(U,c), P(U,c) ↔ Q(U,c), R(U,c)
P(U,c), Q(U,c), P(U,c) ↔ Q(U,c), R(U,c)
P(x,c) ↔ Q(x,c), R(x,c)
28Finding Dilemma Formulae
- Begin with general formulae (containing rigid variables)
- Find instances when unifications fail due to rigid variables
- Redo dilemmas with instantiated dilemma formulae
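The bookkeeping behind these three steps might look like the following sketch. It assumes a simple queue of future dilemma formulae and suggestions given as (rigid variable, term) pairs; the names `instantiate` and `schedule_instances` and the string encoding of rigid variables are hypothetical, chosen only for this illustration.

```python
from collections import deque


def instantiate(term, var, value):
    """Replace every occurrence of `var` in `term` by `value`."""
    if term == var:
        return value
    if isinstance(term, tuple):
        # term[0] is the predicate or function symbol and is kept as is
        return (term[0],) + tuple(instantiate(a, var, value) for a in term[1:])
    return term


def schedule_instances(queue, formula, suggestions):
    """Queue the instance of `formula` for each suggestion, skipping duplicates."""
    for var, value in suggestions:
        instance = instantiate(formula, var, value)
        if instance != formula and instance not in queue:
            queue.append(instance)


# The dilemma on the general formula P(U,V) produced the suggestion V -> c,
# so the instance P(U,c) is queued as a future dilemma formula.
queue = deque()
general = ("P", "U", "V")
schedule_instances(queue, general, [("V", "c")])
print(list(queue))  # [('P', 'U', 'c')]
```

The general formula itself is never modified, which matches the idea of redoing the dilemma with an instantiated formula rather than updating the existing one.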
29Why no Destructive Updates?
- Find general consequences, since rigid variables become universal after merge
- Don't have to worry about fairness: all instances will be tried out eventually
30Logical Intersections
- Ordinary set intersection is wasteful on sets of first order formulae
- E.g. {P(x)} ∩ {P(c)} = ∅, but P(c) is a consequence of both sets
- Logical intersection unifies formulae from the two sets
P(f(y)), ¬P(c), Q(U)
P(x), Q(c)
P(f(y))
Suggestion: U ↦ c
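A logical intersection along these lines can be sketched as pairwise unification of the formulae in the two branch sets. This is a simplified reading under stated assumptions: terms are tuples, negation is an ordinary unary symbol `"not"`, universal variables in the two sets are already named apart, and the helper names (`Var`, `unify`, `substitute`, `logical_intersection`) are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str
    rigid: bool = False


def walk(t, subst):
    while isinstance(t, Var) and t in subst:
        t = subst[t]
    return t


def occurs(v, t, subst):
    t = walk(t, subst)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, subst) for a in t[1:])


def unify(s, t, subst, suggestions):
    """Bind universal variables only; record blocked rigid bindings."""
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return subst
    for a, b in ((s, t), (t, s)):
        if isinstance(a, Var) and not a.rigid:
            return None if occurs(a, b, subst) else {**subst, a: b}
    for a, b in ((s, t), (t, s)):
        if isinstance(a, Var):
            suggestions.append((a, b))
            return None
    if isinstance(s, tuple) and isinstance(t, tuple) and s[0] == t[0] and len(s) == len(t):
        for a, b in zip(s[1:], t[1:]):
            subst = unify(a, b, subst, suggestions)
            if subst is None:
                return None
        return subst
    return None


def substitute(t, subst):
    """Apply subst to every variable in t."""
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, subst) for a in t[1:])
    return t


def logical_intersection(left, right):
    """Return the unified instances common to both sets, plus suggestions."""
    common, suggestions = [], []
    for f in left:
        for g in right:
            subst = unify(f, g, {}, suggestions)
            if subst is not None:
                instance = substitute(f, subst)
                if instance not in common:
                    common.append(instance)
    return common, suggestions


x, y = Var("x"), Var("y")
U = Var("U", rigid=True)
left = [("P", ("f", y)), ("not", ("P", "c")), ("Q", U)]
right = [("P", x), ("Q", "c")]
common, suggestions = logical_intersection(left, right)
print(common, suggestions)
```

On the sets from the slide this keeps P(f(y)) and reports the suggestion U ↦ c, whereas a plain set intersection of the two lists would be empty.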
31Proof procedure
- Begin with all non-branching rules until a certain formula complexity (0-saturation)
- Perform all dilemmas with 0-saturation in branches (1-saturation)
- Increase dilemma nesting level (2-, 3-, ... saturation)
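The nesting of saturation levels can be illustrated on propositional clauses. The sketch below swaps in unit propagation on CNF as the non-branching rule set, which is a simplification and not the rule set of the talk; it also ignores the first order formula-complexity bound. Clauses are lists of nonzero integers (negative means negated), and the function names are assumptions.

```python
def unit_propagate(clauses, lits):
    """0-saturation stand-in: unit propagation. Returns None on conflict."""
    lits = set(lits)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(l in lits for l in clause):
                continue  # clause already satisfied
            open_lits = [l for l in clause if -l not in lits]
            if not open_lits:
                return None  # every literal is false: contradiction
            if len(open_lits) == 1:
                lits.add(open_lits[0])
                changed = True
    return lits


def saturate(clauses, lits, n):
    """n-saturation: dilemmas whose branches are (n-1)-saturated, then merged."""
    lits = unit_propagate(clauses, lits)
    if lits is None or n == 0:
        return lits
    variables = sorted({abs(l) for clause in clauses for l in clause})
    changed = True
    while changed:
        changed = False
        for v in variables:
            if v in lits or -v in lits:
                continue
            pos = saturate(clauses, lits | {v}, n - 1)
            neg = saturate(clauses, lits | {-v}, n - 1)
            if pos is None and neg is None:
                return None  # both branches closed
            if pos is None:
                merged = neg
            elif neg is None:
                merged = pos
            else:
                merged = pos & neg  # keep common consequences only
            if merged != lits:
                lits, changed = merged, True
    return lits


def refute(clauses, max_n):
    """Increase the nesting level until a refutation is found."""
    for n in range(max_n + 1):
        if saturate(clauses, set(), n) is None:
            return n
    return None


# (p or q), (p or not q), (not p or q), (not p or not q)
hardness = refute([[1, 2], [1, -2], [-1, 2], [-1, -2]], 3)
print(hardness)  # 1
```

The example has no unit clauses, so level 0 finds nothing, while a single dilemma on the first variable closes both branches; in the terminology of the earlier slides it is a 1-hard problem.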
32Completeness
- Say that a set of formulae is n-saturated if all consequences that can be derived with a dilemma nesting level of at most n are in the set
- Note that the proof system simulates KE, and therefore any unsatisfiable formula set has a finite dilemma refutation
- Show that the proof procedure produces increasingly saturated sets
33Soundness of the Dilemma Rule
Assume A0s A1s A
P(U)   ¬P(U)
A0   A1
A[U ↦ x]
34Soundness of the Dilemma Rule
Assume A0s A1s A
P(U)   ¬P(U)
A0   A1
A[U ↦ x]
35Soundness of the Dilemma Rule
Assume A0s A1s A
P(U) or ¬P(U), for all U
A0   A1
A[U ↦ x]
36Soundness of the Dilemma Rule
Assume A0s A1s A
P(U) or ¬P(U), for all U
A0 or A1, for all U
A[U ↦ x]
37Soundness of the Dilemma Rule
Assume A0s A1s A
P(U) or ¬P(U), for all U
A0 or A1, for all U
A0 ∨ A1, for all U
A[U ↦ x]
38Soundness of the Dilemma Rule
Assume A0s A1s A
P(U) or ¬P(U), for all U
A0 or A1, for all U
A0 ∨ A1, for all U
A ∨ A, for all U
A[U ↦ x]
39Soundness of the Dilemma Rule
Assume A0s A1s A
P(U) or ¬P(U), for all U
A0 or A1, for all U
A0 ∨ A1, for all U
A ∨ A, for all U
A, for all U
A[U ↦ x]
40Implementation: Dilemma
- Made during 2004
- No equality support, lots of room for improvement
- Participated in CASC-J2 (2004)
41Benchmarks
- TPTP 2.7.0
- Formulae w/o equality: 192/305 (63%)
- Formulae with equality: 120/974 (12%)
- CNF w/o equality: 733/1290 (57%)
- In total 204 successes of nonzero rating
- Highest rated: 4 problems of rating 0.78
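The success rates on this slide follow directly from the solved/total counts. A short check, assuming the last number on each line is a percentage rounded to the nearest integer:

```python
# Solved and total problem counts as given on the slide.
results = {
    "Formulae w/o equality": (192, 305),
    "Formulae with equality": (120, 974),
    "CNF w/o equality": (733, 1290),
}

rates = {
    name: round(100 * solved / total)
    for name, (solved, total) in results.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate}%")
```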
42Conclusions
- Sound and complete fully automated first order theorem proving method
- Some encouraging results
- Offers a way to derive formulae with universal variables
43Future work
- Equality reasoning
- Equivalence relation approximation
- Model generation
- More optimizations