A brief Introduction to Automated Theorem Proving

1
A brief Introduction to Automated Theorem Proving

Theoretical Foundations, History and the
Resolution Calculus for classical First-order
Logic
Uwe Keller
based on material by B. Beckert, R. Hähnle, A.
Voronkov, A. Leitsch and T. Tammet

2
Content

Introduction
Motivation & History
Theorem Proving, ATP and Calculi
Foundations
FOL, Normal Forms & Preprocessing, Metaresults
Resolution
Basic calculus, Unification
Refinements, Redundancy
Decision procedures
Chain Resolution
A Variant of Resolution for the Semantic Web
Demo

3
Part I: Introduction

Motivation & History
Theorem Proving, ATP and Calculi

4
Logic and Theorem Proving
Real-world description in natural language:
mathematical problems, program specifications
Formalization
Syntax (formal language): First-order Logic,
Dynamic Logic, ...
Semantics (truth function)
Calculus (derivation / proof)
Correctness: every provable formula is valid
Completeness: every valid formula is provable
5
How did it start?

Results in mathematical logic from the first half
of the 20th century showed:
we can do logical reasoning with a limited set of
simple (computable) rules in restricted formal
languages like First-order Logic (FOL)
That means computers can do reasoning!
Implementation of ATP
First, computers were needed :-)
AI as a prominent field: reasoning as a basic
skill!
Mid-1950s: first attempts to implement an ATP
Today
(A)TP is no longer only a part of mainstream AI
Central shared problem: how to represent and
search extremely large search spaces!

6
A rough timeline in ATP

before 1950: Proof-theoretic work by Skolem, Herbrand, Gentzen and Schütte
1954: First machine-generated proof (Davis)
1955ff: Semantic Tableaux (Beth, Hintikka)
1957: First machine-generated proof in Logic Calculus (Newell, Simon)
1957: Lazy substitution by free (dummy) variables (Kanger, Prawitz)
1958: First prover for Predicate Logic (Prawitz)
1959: More provers (Gilmore, Wang)
1960: Davis-Putnam Procedure (Davis, Putnam, Logemann)
1963: Unification (J.A. Robinson)
1963ff: Resolution (J.A. Robinson), Inverse Method (Maslov)
1963ff: Modern Tableau Method (Smullyan, Lis), without Unification
1968: Model Elimination (Loveland), with Unification
1970ff: PROLOG (Colmerauer, Kowalski), Refinements of Resolution
1971: Connection Method (Bibel), Matings (Andrews), with Unification
1985: ATP in non-classical logics, renaissance of Tableau Methods
1987: Tableaux with Unification
1993ff: Renewed interest in Instance-based Methods: DPLL, Model Evolution

7
Theorem Proving

Given
a formal language (or logic) L
a calculus C for this language (= set of rules)
a conjecture S and a set of assumptions or axioms
A in the language L
Determine
Can we construct a proof for S (from A) in
calculus C?
Logic = Syntax + Semantics + Calculus
TP = proof search in C (a huge search problem)
Correctness and completeness are essential
properties of calculi
Calculus = non-deterministic algorithm
Central problem in ATP: how to implement a
non-deterministic algorithm efficiently on a
deterministic machine :-)

8
Theorem Proving (II)

Research areas
Interactive / tactic TP vs. Automated TP
Classical logic vs. non-classical logics
Calculi for
ATP - General principle: refutation approach
Resolution, Tableaux, Inverse Method,
Instance-based Methods
ITP - General principle: show the proof
situation/context
Sequent Calculi
Others - General principle: generation of complex
formulae based on very simple axioms
Hilbert-style Calculi
Central difference
What are the elements of a proof? What is a
proof?

9
Main TP Applications

Main Applications
Software & Hardware Verification
Theorem proving in Mathematics
Query answering in rich knowledge bases
(Ontologies)
Verification of cryptographic protocols
Retrieval of Software Components
Reasoning in non-classical Logics
Program synthesis
Many systems have been implemented
ATP: Vampire, Otter, Spass, E-SETHEO, Darwin,
Epilog, SNARK, Gandalf
ITP: Isabelle/HOL, Coq, Theorema, KeY-Prover

10
Why is FOL of special interest in the ATP
community?

There are less and more expressive logics than FOL
Less expressive: Classical Propositional Logic, Modal
Propositional Logic, Description Logics, Temporal
Propositional Logic
More expressive: Higher-order Predicate Logics, Dynamic Predicate
Logics, Type Theory
Research in ATP has mainly focused on FOL
FOL is very expressive: many real-world problems
can be formalized in FOL
FOL turned out to be the most expressive logic
that one can adequately approach with ATP
techniques

11
Example

Theorem in (elementary) Calculus
Nullstellensatz: Every function f which is
continuous over a closed interval I = [a,b] must
take the value 0 somewhere in I if f(a) < 0 and
f(b) > 0
Proof idea: Consider the supremum l of the set
M = { x | f(x) < 0, a < x < b } and show that f(l) = 0

12
Example (II)

Formalization
Compact (only LEQ)
Redundancy-free
Specific definitions
Continuous functions
Main idea of the proof is already encoded
Use the supremum
Can be done by an ATP system
but without a proper formalization ?!?
ATP better than a human prover? Robbins Problem in
Algebra
Intelligent proving vs. combinatorial proving

13
Part II: Foundations

FOL, Normal Forms & Preprocessing, Metaresults

14
Classical First-order Logic (FOL)

Syntax
Signature Σ
Function symbols, predicate symbols, arity,
logical connectives, quantifiers
Terms (over Σ), atomic formulae (over Σ),
formulae (over Σ)
Definition relative to the signature of the
predicate logic
Semantics
First-order structure / interpretation S = (U,I)
Universe U, signature interpretation I
Constants: I(c) is an element of U
Function symbols: I(f) is a total function on U
Relation symbols: I(R) is a relation on U
Logical connectives and quantifiers in the usual way
Definition relative to the signature of the
predicate logic

15
Classical FOL (II)

Model of a statement
An interpretation S = (U,I) is called a model of
a statement s iff val_S(s) = t
What does it mean to infer a statement from given
premises?
Informally: whenever our premises P hold, it is
the case that the statement holds as well
Formally: Logical Entailment
For every interpretation S which is a model of P
it holds that S is a model of s as well
Special case: Validity (the set of premises is
empty)
Logical entailment in a logic L is the (semantic)
relation that a calculus C aims at formalizing
syntactically (by means of a derivability
relation)!
Logical entailment considers semantics
(interpretations) relative to a set of premises
or axioms!

16
Normal Forms

What is a normal form?
Why are they interesting?
Relation to ATP?
Conversion of the input to a specific NF may be
required by a calculus (e.g. Resolution)
⇒ Preprocessing step
ATP can in a sense be seen as a conversion into a
NF itself; the borderline is fuzzy
Normal forms in FOL
Negation Normal Form
Standard Form
Prenex Normal Form
Clause Normal Form (in a sense a logic-free
form)
There are logics where certain NFs do not exist,
like CNF in a Dynamic First-order Logic
Certain calculi then cannot be applied in these
logics!

17
Negation Normal Form

A formula is in Negation NF (NNF) iff it
contains no implication and no bi-implication
symbols and all negation symbols occur only as
part of a literal (directly in front of atomic
formulae)
How to achieve this NF?
Replace implication and bi-implication by their
definition (in terms of ∧, ∨ and ¬)
Move negation symbols inside to atomic formulae
De Morgan laws
Dualize quantifiers when moving negation symbols
over a quantifier
Eliminate multiple negations
All these syntactical transformations generate
semantically equivalent formulae
Example
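The NNF steps listed on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions that are not part of the slides: formulae are nested tuples such as ("imp", A, B) or ("forall", "x", A), atoms are ("atom", name), and the function name nnf is hypothetical.

```python
def nnf(f):
    """Return a formula in negation normal form that is equivalent to f."""
    op = f[0]
    if op == "atom":
        return f
    if op in ("and", "or"):
        return (op, nnf(f[1]), nnf(f[2]))
    if op in ("forall", "exists"):
        return (op, f[1], nnf(f[2]))
    if op == "imp":                      # A -> B   becomes  ~A | B
        return ("or", nnf(("not", f[1])), nnf(f[2]))
    if op == "iff":                      # A <-> B  becomes  (A -> B) & (B -> A)
        return ("and", nnf(("imp", f[1], f[2])), nnf(("imp", f[2], f[1])))
    # op == "not": push the negation towards the atoms
    g = f[1]
    gop = g[0]
    if gop == "atom":
        return f                         # a literal, nothing to do
    if gop == "not":
        return nnf(g[1])                 # eliminate double negation
    if gop == "and":                     # De Morgan
        return ("or", nnf(("not", g[1])), nnf(("not", g[2])))
    if gop == "or":                      # De Morgan
        return ("and", nnf(("not", g[1])), nnf(("not", g[2])))
    if gop == "forall":                  # dualize the quantifier
        return ("exists", g[1], nnf(("not", g[2])))
    if gop == "exists":
        return ("forall", g[1], nnf(("not", g[2])))
    # negated (bi-)implication: expand it first, then push the negation
    return nnf(("not", nnf(g)))


P, Q = ("atom", "P(x)"), ("atom", "Q(x)")
print(nnf(("not", ("forall", "x", ("imp", P, Q)))))
```

For the formula ¬∀x (P(x) → Q(x)) the sketch yields ∃x (P(x) ∧ ¬Q(x)). Note that expanding bi-implications this way duplicates subformulae, which is one source of the blowup mentioned later for the CNF transformation.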

18
Standard Form

A formula A is in Standard Form if no variable x
in A occurs both bound and free and no bound
variable is used as a quantifier variable for
multiple subformulae
How to generate this NF?
Bound renaming of quantifier variables and their
respective occurrences
The transformed formula is semantically equivalent
to the original one
Example
(∀x P(x) ∧ Q(z)) → (∃x R(x) ∨ ∃z (P(z) ∧ Q(z)))
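Bound renaming can be sketched as follows. The tuple encoding, the function name standardize and the fresh names v1, v2, ... are assumptions made for this illustration, and the sketch presumes that no free variable is already called v1, v2, and so on. The example formula is read with ∀x scoping over P(x) only.

```python
import itertools


def standardize(f, env=None, counter=None):
    """Rename every bound variable to a fresh name v1, v2, ..."""
    env = {} if env is None else env
    counter = itertools.count(1) if counter is None else counter
    op = f[0]
    if op == "atom":                     # ("atom", predicate, argument names)
        return ("atom", f[1], tuple(env.get(a, a) for a in f[2]))
    if op == "not":
        return ("not", standardize(f[1], env, counter))
    if op in ("and", "or", "imp", "iff"):
        return (op, standardize(f[1], env, counter),
                standardize(f[2], env, counter))
    # op is "forall" or "exists": bind a fresh name inside the quantifier scope
    fresh = f"v{next(counter)}"
    return (op, fresh, standardize(f[2], {**env, f[1]: fresh}, counter))


def atom(p, *args):
    return ("atom", p, args)


example = ("imp",
           ("and", ("forall", "x", atom("P", "x")), atom("Q", "z")),
           ("or", ("exists", "x", atom("R", "x")),
                  ("exists", "z", ("and", atom("P", "z"), atom("Q", "z")))))
print(standardize(example))
```

The free occurrence of z in Q(z) is left untouched, while the three quantifiers receive the distinct names v1, v2 and v3.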

19
Prenex Normal Form

A formula A is in Prenex NF iff it is of the
form A = Q1 x1 … Qn xn B where each Qk is a universal
or existential quantifier and B contains no
quantifiers. B is called the matrix of A
How to construct this NF?
Transform A into NNF and Standard Form
Iteratively move the outermost quantifier to the outside
until it reaches another quantifier. Quantifiers may
not cross quantifiers of a different sort (the in-scope
relation between quantifier occurrences may not be
changed)
This transformation generates a formula which is
logically equivalent to the original one.
Example

20
Clause Normal Form

A formula A is in Clause NF iff it is in PNF,
closed, the prefix only contains universal
quantifiers and the matrix is in conjunctive normal
form.
In other words: A = ∀x1 … ∀xn ( (L1,1 ∨ … ∨
L1,m1) ∧ … ∧ (Lk,1 ∨ … ∨ Lk,mk) ) where each Li,j is a
literal (negated or positive atomic formula)
How to construct this NF?
Transform A into NNF and Standard Form
Transform the result into PNF
Remove existential quantifiers by Skolemization
(function terms)
Apply the distributivity laws to convert the matrix of
the result into conjunctive normal form (a conjunction
of disjunctions of literals)
This transformation results in a formula which is
not logically equivalent, but it is
satisfiability-preserving (which is enough for
the ATP methods later)
Example
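The last step, applying the distributivity laws to the matrix, can be sketched for a quantifier-free formula that is already in NNF. The tuple encoding, the "~" prefix for negative literals and the function name cnf are assumptions of this illustration. Skolemization is not covered here.

```python
def cnf(f):
    """Clause set (set of frozensets of literals) of a quantifier-free NNF formula."""
    op = f[0]
    if op == "atom":
        return {frozenset([f[1]])}
    if op == "not":                      # in NNF a negation sits in front of an atom
        return {frozenset(["~" + f[1][1]])}
    left, right = cnf(f[1]), cnf(f[2])
    if op == "and":                      # conjunction: union of the clause sets
        return left | right
    # op == "or": distribute, pairing every left clause with every right clause
    return {c | d for c in left for d in right}


P, Q, R, S = (("atom", n) for n in "PQRS")
print(cnf(("or", ("and", P, Q), ("and", R, S))))
```

For (P ∧ Q) ∨ (R ∧ S) this gives the four clauses P ∨ R, P ∨ S, Q ∨ R and Q ∨ S. A disjunction of n such conjunctions produces 2^n clauses, which is the kind of exponential blowup that a definitional transformation avoids.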

21
Clause Normal Form (II)

A formula A in Clause NF can be written as A =
∀x1 … ∀xn ( (L1,1 ∨ … ∨ L1,m1) ∧ … ∧ (Lk,1 ∨ …
∨ Lk,mk) ) where each Li,j is a literal (negated or
positive atomic formula)
Since every formula can be transformed into CNF,
the CNF can be seen as a logic-free
representation of a formula
All quantifiers are universal, no free variables are
allowed ⇒ drop the quantifiers
The matrix is in CNF (conjunction of disjunctions of
literals) ⇒ model it as a set of sets of literals
Example
The sketched transformation to CNF is not optimal
Exponential blowup possible (already for NNF)
Syntactical structure of the original formula
gets lost
Skolem symbols have unnecessarily many parameters
Unnecessarily many new Skolem symbols are
introduced
One can improve all these aspects of a
transformation to CNF!
Skolemization before PNF transformation,
definitional CNF for the matrix, reuse of Skolem
functions

22
Metaresults

Metaresult = property of a logic L
Most famous example: Gödel's Incompleteness
Theorems!
Here are some metaresults for FOL which form the
theoretical foundation of ATP; they
carry over to many other logics as well
Deduction Theorem
If M ∪ {s} ⊨ s' then M ⊨ s → s'
Logical entailment can be reduced to validity
Proof by contradiction
If M is a set of closed formulae then M ⊨ s iff
M ∪ {¬s} is unsatisfiable (i.e. has no model)
Logical entailment can be reduced to
unsatisfiability checking
Refutation can be used as a universal principle
for inference in FOL
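The refutation principle can be tried out directly in propositional logic, where unsatisfiability is decidable by enumerating truth assignments. The following is a minimal sketch under assumed conventions (tuple-encoded formulae, hypothetical function names); it illustrates the reduction only and says nothing about how first-order provers search.

```python
from itertools import product


def atoms(f):
    """Set of atom names occurring in f."""
    if f[0] == "atom":
        return {f[1]}
    return set().union(*(atoms(g) for g in f[1:]))


def holds(f, v):
    """Truth value of f under the assignment v (dict: atom name -> bool)."""
    op = f[0]
    if op == "atom":
        return v[f[1]]
    if op == "not":
        return not holds(f[1], v)
    if op == "and":
        return holds(f[1], v) and holds(f[2], v)
    if op == "or":
        return holds(f[1], v) or holds(f[2], v)
    if op == "imp":
        return (not holds(f[1], v)) or holds(f[2], v)
    raise ValueError(f"unknown connective: {op}")


def satisfiable(formulas):
    names = sorted(set().union(*(atoms(f) for f in formulas)))
    return any(all(holds(f, dict(zip(names, bits))) for f in formulas)
               for bits in product([False, True], repeat=len(names)))


def entails(premises, s):
    """M entails s iff M together with the negation of s is unsatisfiable."""
    return not satisfiable(list(premises) + [("not", s)])


P, Q = ("atom", "P"), ("atom", "Q")
print(entails([P, ("imp", P, Q)], Q))    # modus ponens as an entailment check
```

With an empty premise set, entails checks validity, which is the special case noted on the earlier slide about logical entailment.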

23
Metaresults (II)

Complexity of logical entailment, validity and
satisfiability
Propositional Logic
Logical entailment (⊨-relation) is decidable,
Satisfiability too
Set of valid formulae is co-NP-complete
Set of satisfiable formulae is NP-complete
First-order Predicate Logic
Logical entailment / validity / satisfiability is
undecidable
Set of valid formulae is semi-decidable
(recursively enumerable)
Set of satisfiable formulae is not recursively
enumerable

24
Metaresults (III)

Term Interpretations and Herbrand Theorem
S = (U,I) is a term interpretation if U = Term⁰_Σ,
the set of ground terms over Σ
Let Term⁰_Σ be non-empty. An interpretation S =
(U,I) is called a Herbrand interpretation if
S is a term interpretation and
I(f)(t1,…,tn) = f(t1,…,tn) for all n-ary function
symbols f ∈ Σ and ground terms t1,…,tn
A Herbrand model of s is a Herbrand interpretation I
with I ⊨ s
Herbrand-Interpretations are special because they
have a simple universe (syntactical) and Terms
are basically uninterpreted. Quantifiers then
have ground terms as their range!
Computers can deal with such special
(syntactical) interpretations, but not with
interpretations in general!

25
Metaresults (IV)

Term Interpretations and Herbrand Theorem
Let M be a set of closed formulae s in
Prenex Normal Form that contain no existential
quantifiers (for instance s in CNF)
Let T be a set of terms (over signature Σ)
T(M) = set of T-instances of M, i.e. replace
every occurrence of a (universal) variable in any
formula in M with any term in T
Herbrand Theorem
Let Term⁰_Σ be non-empty and M a set of formulae
in Prenex NF without existential quantifiers.
Then the following statements are equivalent
M has a model
M has a Herbrand model
Term⁰_Σ(M) has a model
The last set is a set of formulae in
propositional logic

26
Metaresults (V)

Compactness of FOL
A (possibly infinite) set M of formulae has a
model iff every finite subset M' ⊂ M has a model
(i.e. is satisfiable)
Combining Compactness with Herbrand's Theorem
Let Term⁰_Σ be non-empty and M a set of formulae
in Prenex NF without existential quantifiers.
Then M is unsatisfiable iff T(M) is
unsatisfiable for a finite set of ground terms T
⊂ Term⁰_Σ
Note that T is a finite set of ground terms over
the signature Σ of the formula set M
No external function symbols have to be
considered!
Allows for using guided substitutions
(Unification!)

27
Metaresults (VI)

That means logical entailment / validity can be
checked
by reduction to unsatisfiability of a set of
formulae M
which can be done by finding suitable finite
(counter)examples for the quantified variables
such that a contradiction arises
One only needs the signature Σ of the given set
M to find the counterexamples
Basically this is what all ATP procedures do:
find a finite set of counterexamples (objects)
such that a respective instance of the original
formula set is determined as being inconsistent
(unsatisfiable)
The theorem immediately gives an algorithm for
ATP!
Problem: how to construct / find T in the theorem
in a clever way?

28
Herbrand's Theorem: From Clause Logic to
Propositional Logic
[Diagram: clauses (Clause Logic) are mapped by (ground) substitutions σ to ground clauses (Propositional Logic), which form an inconsistent set]
29
Part III: The Resolution Calculus

Pre-resolution phase
Gilmore's Method, Davis-Putnam Procedure
Unification
Basic Resolution Calculus
Refinements, Redundancy

30
Pre-Resolution period: Gilmore's Method

First ATP procedure for First-order Logic
Directly based on Herbrand's Theorem
Reduction of FOL entailment to satisfiability in
Prop. Logic
How to generate candidates C' for propositional
satisfiability checking from a FOL clause set C?
Saturation by ground instances from H_n(C) (= set
of ground terms of depth up to n)
More precisely: successively generate the sets
C_n = { cσ | c ∈ C and rg(σ) ⊆ H_n(C) } of ground
clauses
Since the sets H_n(C) grow exponentially, it is very
important to have a good algorithm for checking
satisfiability
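The generation of ground terms and ground instances can be sketched as follows. The encoding is an assumption of this illustration: constants and variables are strings, compound terms and atoms are tuples such as ("f", "a"), a literal is a pair (sign, atom), and the names ground_terms, substitute and ground_instances are hypothetical. The sketch takes the term sets to be cumulative, so level n contains all ground terms of depth up to n.

```python
from itertools import product


def ground_terms(constants, functions, depth):
    """All ground terms of nesting depth <= depth.

    functions maps each function symbol to its arity."""
    terms = set(constants)
    for _ in range(depth):
        new = set(terms)
        for f, arity in functions.items():
            for args in product(terms, repeat=arity):
                new.add((f,) + args)
        terms = new
    return terms


def substitute(t, sigma):
    """Apply the substitution sigma (dict: variable -> term) to a term or atom."""
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(a, sigma) for a in t[1:])
    return sigma.get(t, t)


def ground_instances(clause, variables, terms):
    """Yield every instance of clause with its variables replaced by terms."""
    for values in product(sorted(terms, key=repr), repeat=len(variables)):
        sigma = dict(zip(variables, values))
        yield frozenset((sign, substitute(atom, sigma)) for sign, atom in clause)


# P(x) or not P(f(x)), instantiated over the terms {a, f(a)}
clause = frozenset({(True, ("P", "x")), (False, ("P", ("f", "x")))})
for inst in ground_instances(clause, ["x"], ground_terms(["a"], {"f": 1}, 1)):
    print(sorted(inst, key=repr))
```

With a single binary function symbol the number of terms roughly squares at every level, which illustrates why the candidate sets become large so quickly.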

31
Pre-Resolution period: Gilmore's Method

Easy test of satisfiability of the generated set C'
of ground clauses
Transform C' into Disjunctive Normal Form
D = DNF(C') is unsatisfiable iff every
constituent of D contains a contradiction L ∧ ¬L
for some literal L
Can be done in deterministic time O(n log(n))
Problem: conversion from CNF into DNF is (almost
always) exponential (inherently complex, since
otherwise P = NP; not known at that time!)
Pseudocode

begin
  n := 0
  contr := false
  while not contr do
    D := DNF(C_n)
    contr := all constituents of D contain complementary literals
    n := n + 1
  end while
end
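The propositional test inside the loop can be sketched like this. It is an illustration under assumptions, not Gilmore's original program: ground literals are strings with a "~" prefix for negation, and the function name gilmore_unsat is hypothetical. Choosing one literal from every clause yields exactly one constituent of the DNF.

```python
from itertools import product


def gilmore_unsat(clauses):
    """True iff every DNF constituent of the ground clause set is contradictory."""
    def complement(lit):
        return lit[1:] if lit.startswith("~") else "~" + lit

    # One literal per clause = one conjunction (constituent) of the DNF.
    for choice in product(*clauses):
        lits = set(choice)
        if not any(complement(lit) in lits for lit in lits):
            return False                 # consistent constituent: satisfiable
    return True


print(gilmore_unsat([["P", "Q"], ["~P"], ["~Q"]]))
print(gilmore_unsat([["P", "Q"], ["~P"]]))
```

The number of constituents is the product of the clause lengths, so the loop makes the exponential cost of the CNF to DNF conversion directly visible.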
32
Pre-Resolution period: Gilmore's Method

Weak points of Gilmore's approach
The generation of the candidate ground clause
sets C_n to be checked
The disjunctive normal form transformation
The first weakness is inherent to all procedures
directly applying Herbrand's theorem
The second problem concerns propositional logic
only
Gilmore's pioneering implementation did not yield
actual proofs for quite simple predicate logic
formulas
A possible improvement
Avoid the transformation to DNF and try to find
good decision methods for satisfiability on
CNFs
This is basically what was achieved by Davis and
Putnam [DP, 1960] shortly after Gilmore's
implementation

33
Pre-Resolution period: Davis-Putnam Procedure

Like Gilmore's method, based on successive
production of ground clause sets C_n and testing
of their unsatisfiability
(Still) very efficient decision method for
satisfiability. Requires CNF for ground clauses.
Invented originally for FOL, it became the most
powerful SAT decision procedure for Propositional
Logic. Many very powerful SAT solvers are still
refining DPP today.
Davis-Logemann-Loveland Rules [DLL, 1962]
Preliminary step: reduce all clauses in C, i.e.
eliminate multiple occurrences of the same
literal (leave only one). This generates a clause set
C'
Then apply the following rules non-deterministically
to C'
Tautology-Rule
One-Literal-Rule
Pure-Literal-Rule
Splitting-Rule

34
Pre-Resolution period: Davis-Putnam Procedure

Davis-Logemann-Loveland Rules [DLL, 1962]
Tautology-Rule: Delete all clauses in C
containing complementary literals
One-Literal-Rule: If there is a clause c = {l}
with only one literal l, remove all clauses d
from C which contain l, and remove the dual
literal l^d from all other clauses
Pure-Literal-Rule: Let D ⊆ C with the following
property: there exists a literal l appearing in
all clauses of D, but l^d does not appear in C.
Then delete D from C
Splitting-Rule: Let C = {A1,…,An,B1,…,Bm} ∪ R
such that R contains neither l nor l^d, all Ai contain l
but not l^d and all Bj contain l^d but not l. Let
Ai' = Ai after deletion of l and let Bj' = Bj
after deletion of l^d. Then split C into C1 =
{A1',…,An'} ∪ R and C2 = {B1',…,Bm'} ∪ R
Properties of the DLL procedure
The rules are essentially reductive (atoms are
deleted in each step)
The rules are correct: they preserve
satisfiability (in case of a split, C is satisfiable
iff at least one of the new clause sets is)
The procedure generates sets that contain the
empty clause for all cases (of the applied
splits) iff C is unsatisfiable (decision
criterion: correctness, completeness and
termination)

Example: C = { P ∨ Q, R ∨ S ∨ S, R ∨ S, R ∨ S,
R ∨ S, P ∨ Q ∨ P }
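The four rules can be combined into a small recursive satisfiability test for ground clause sets. This is a sketch under assumptions, not the historical implementation: literals are strings with a "~" prefix for negation, clauses are collections of literals, and the names neg, assign and dll_sat are hypothetical. The two branches of the splitting rule correspond to the clause sets C1 and C2 described above.

```python
def neg(lit):
    """Dual literal."""
    return lit[1:] if lit.startswith("~") else "~" + lit


def assign(clauses, lit):
    """Make lit true: drop clauses containing lit, delete neg(lit) elsewhere."""
    return {c - {neg(lit)} for c in clauses if lit not in c}


def dll_sat(clauses):
    """True iff the ground clause set is satisfiable."""
    # Preliminary step: sets remove repeated literals and repeated clauses.
    clauses = {frozenset(c) for c in clauses}
    # Tautology rule: delete clauses with complementary literals.
    clauses = {c for c in clauses if not any(neg(l) in c for l in c)}
    while True:
        if not clauses:
            return True                  # nothing left to satisfy
        if frozenset() in clauses:
            return False                 # empty clause derived
        units = [next(iter(c)) for c in clauses if len(c) == 1]
        if units:                        # one-literal rule
            clauses = assign(clauses, units[0])
            continue
        lits = {l for c in clauses for l in c}
        pure = [l for l in lits if neg(l) not in lits]
        if pure:                         # pure-literal rule
            clauses = {c for c in clauses if pure[0] not in c}
            continue
        lit = next(iter(lits))           # splitting rule
        return dll_sat(assign(clauses, lit)) or dll_sat(assign(clauses, neg(lit)))


print(dll_sat([["P", "Q"], ["~P", "Q"], ["P", "~Q"], ["~P", "~Q"]]))
print(dll_sat([["P", "Q"], ["~P"]]))
```

The choice of the split literal is arbitrary here. Practical solvers put most of their effort into exactly this choice and into efficient data structures for the one-literal rule.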
35
Pre-Resolution period: Davis-Putnam Procedure

Pseudocode of the First-order ATP procedure by
Davis & Putnam

begin
  C := finite set of clauses
  if C does not contain (real) function symbols then
    apply DP1-DP3 to C_0
    check the DP decision tree for unsatisfiability
  else
    begin
      n := 0
      contr := false
      while not contr do
        perform DP1-DP3 on C_n
        if the DP decision tree proves unsatisfiability
          then contr := true
          else contr := false
        n := n + 1
      end while
    end
end

DP1: Reduce all clauses
DP2: Delete all tautologies
DP3: Construct a DP decision tree according to the given rules

Nondeterministic (DP3)
If C does not contain function symbols (with
arity > 0) then the procedure always terminates
(= decision procedure for FOL clause sets)
If C is satisfiable and C contains function
symbols then the algorithm does not terminate
Yields a decision procedure for validity of the
Bernays-Schönfinkel class in FOL (∀*∃*)
36
Interlude: Inferences & Inference systems

An inference I has the form

  F1  F2  …  Fn    (premises)
  ---------------
         G         (conclusion)

where n ≥ 0 and F1,…,Fn, G are formulae
An inference rule R is a set of inferences,
more precisely a decidable (usually efficiently
computable) (n+1)-ary relation over formulae
Usually one uses schematic variables for
representing formulae in inference rules and
attaches some (most often syntactic) conditions to
these variables
Every I ∈ R is called an instance of R
An inference system is a (finite) set of
inference rules
A proof of G from a set N of formulae in an
inference system is a finite sequence of
formulae F1,…,Fn such that
Fn = G and
for all Fi (i ≤ n) it holds that either Fi ∈ N or
there is an inference I such that Fi is the
conclusion of I and all the premises P1,…,Pj of
I are contained in the prefix F1,…,F(i-1)
Here we mainly consider inference systems on
clauses, for instance Resolution
37
A Revolution in ATP: Robinson's Resolution
Principle

In some sense the simplest possible calculus for
FOL (without equality)
In principle only a single inference rule which
combines substitution and atomic cut
Possible since it requires the set of input formulae
to be in CNF (very simple and uniform syntactic form)
Binary substitution rule computing a minimal
substitution which makes two atoms equal
A quote from Robinson's landmark paper
[Robinson, 1965]
"Theorem-proving on the computer, using procedures
based on the fundamental theorem of Herbrand
concerning the first-order predicate calculus, is
examined with a view towards improving the
efficiency and widening the range of practical
applicability of these procedures. A close
analysis of the process of substitution (of terms
for variables) and the process of
truth-functional analysis of the results of such
substitutions reveals that both processes can be
combined into a single new iterating process
(called resolution) which is vastly more
efficient than the older cyclic procedures
consisting of substitution stages alternating
with truth-functional analysis stages."

38
A Revolution in ATP: Robinson's Resolution
Principle

The basic Resolution Calculus (BRC)
Ground case
Binary Resolution: from L ∨ C and ¬L ∨ D derive C ∨ D
Factoring: from C ∨ L ∨ L derive C ∨ L
General case
Binary Resolution: from L ∨ C and ¬L' ∨ D derive (C ∨ D)σ
Factoring: from C ∨ L ∨ L' derive (C ∨ L)σ
where σ is the most general unifier of L and L'
Fundamental aspects
Iterative grounding of the clause set
Guided guessing of interesting instances
(Unification) built into the calculus
Resolving upon an atom L does not require L to be
ground (unnecessary grounding avoided)
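The general-case rules can be sketched with a textbook unification routine. Everything about the encoding is an assumption of this illustration: variables are strings starting with an upper-case letter (Prolog style), constants are lower-case strings, compound terms and atoms are tuples such as ("p", "X"), a literal is a pair (sign, atom), and the function names are hypothetical. The two clauses passed to resolvents are assumed to share no variables.

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()


def walk(t, s):
    """Follow variable bindings in the substitution s."""
    while is_var(t) and t in s:
        t = s[t]
    return t


def occurs(v, t, s):
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])


def unify(a, b, s=None):
    """Most general unifier of a and b extending s, or None if there is none."""
    s = {} if s is None else s
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b, s) else {**s, a: b}
    if is_var(b):
        return unify(b, a, s)
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None


def substitute(t, s):
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0],) + tuple(substitute(x, s) for x in t[1:])
    return t


def resolvents(c1, c2):
    """Binary resolvents of two variable-disjoint clauses."""
    out = set()
    for s1, a1 in c1:
        for s2, a2 in c2:
            mgu = unify(a1, a2) if s1 != s2 else None
            if mgu is not None:
                rest = (c1 - {(s1, a1)}) | (c2 - {(s2, a2)})
                out.add(frozenset((sg, substitute(at, mgu)) for sg, at in rest))
    return out


def factors(c):
    """Factors of a clause: unify two literals of the same sign."""
    out, lits = set(), list(c)
    for i, (s1, a1) in enumerate(lits):
        for s2, a2 in lits[i + 1:]:
            mgu = unify(a1, a2) if s1 == s2 else None
            if mgu is not None:
                out.add(frozenset((sg, substitute(at, mgu)) for sg, at in c))
    return out


c1 = frozenset({(True, ("p", "X")), (True, ("q", "X"))})   # p(X) or q(X)
c2 = frozenset({(False, ("p", ("f", "a")))})               # not p(f(a))
print(resolvents(c1, c2))
```

Resolving the two example clauses binds X to f(a) and leaves the single clause q(f(a)), without ever enumerating other ground instances of the first clause.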
39
Basic Resolution Calculus: Properties

Properties of the basic Resolution Calculus
Given any two clauses, there are only finitely
many resolvents using the Resolution Inference
Rule.
The Resolution Calculus is sound
If c is provable from C in BRC then C ⊨ c
This means in particular: if we can derive the
empty clause then C is unsatisfiable
The Resolution Calculus is refutationally
complete
If a set C of clauses is unsatisfiable then the
empty clause can be proven (derived) from C
Altogether
A set C of clauses is unsatisfiable iff there is
a proof for the empty clause from C in BRC
Remark: soundness of the inference system can be
relaxed to satisfiability-preserving!
How to find a contradiction (empty clause)
starting with an initial (unsatisfiable) formula
set?
Saturation approach (wrt. the inference system
BRC)

40
Resolution: Proof search by Saturation

Saturated sets
A set of clauses C is called saturated (wrt. an
inference system) if every inference of the system with
premises in C gives a clause in C
Completeness reformulated (in terms of saturated
sets)
A set C of clauses is unsatisfiable iff every
saturated set S of clauses with C ⊆ S also
contains the empty clause
That means: simply construct a(ny) saturated set
S of clauses (wrt. BRC) with C ⊆ S (saturation algorithm)
Simple algorithm

S := set of input clauses
while not finished do
  Repeatedly apply all inferences to clauses in S,
  adding to S the conclusions of these inferences
  If the empty clause is proved, terminate with
  success. If no inference rule is applicable,
  terminate with failure
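For ground clauses the simple algorithm can be sketched directly. The encoding (string literals with a "~" prefix, clauses as frozensets) and the names neg, resolvents and saturate are assumptions of this illustration, and the limit parameter is a hypothetical stand-in for running out of resources. Because clauses are sets, factoring is implicit.

```python
def neg(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit


def resolvents(c, d):
    """All ground binary resolvents of c and d, resolving on a literal of c."""
    return {(c - {lit}) | (d - {neg(lit)}) for lit in c if neg(lit) in d}


def saturate(clauses, limit=10_000):
    """Return 'unsatisfiable', 'satisfiable' or 'unknown'."""
    s = {frozenset(c) for c in clauses}
    while True:
        if frozenset() in s:
            return "unsatisfiable"       # empty clause derived
        if len(s) > limit:
            return "unknown"             # resources exhausted
        new = {r for c in s for d in s for r in resolvents(c, d)} - s
        if not new:
            return "satisfiable"         # saturated without the empty clause
        s |= new


print(saturate([["P", "Q"], ["~P", "Q"], ["P", "~Q"], ["~P", "~Q"]]))
print(saturate([["P", "Q"], ["~P"]]))
```

Each round applies every possible inference to the current set, so no inference is postponed forever. In the ground case the loop always terminates, while for first-order clauses the set of resolvents can keep growing.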

41
Resolution: Proof search by Saturation
Conclusions
42
Resolution: Proof search by Saturation

Most likely scenario ...

43
Resolution: Proof search by Saturation

Possible theoretical scenarios
At some moment the empty clause is generated, in
this case the input set of clauses is
unsatisfiable
Saturation will terminate without ever generating
the empty clause, in this case the input set of
clauses is satisfiable
Saturation will run forever, but without
generating the empty clause. In this case the
input set of clauses is satisfiable
Possible practical scenarios
At some moment the empty clause is generated, in
this case the input set of clauses is
unsatisfiable
Saturation will terminate without ever generating
the empty clause, in this case the input set of
clauses is satisfiable
Saturation will run until we run out of
resources, but without generating the empty
clause. In this case it is unknown whether the
input set of clauses is (un)satisfiable

44
Resolution: How to saturate in a clever way?

The simple saturation algorithm is highly
inefficient
Do not apply inferences in an arbitrary way, but
in some sensible / useful order.
Generate the empty clause as early as possible in
the saturation process
Prefer some inferences over others (in a
sense), for instance goal-directedness
What we then need in order to have
completeness guaranteed is fairness
A saturation algorithm is fair iff every possible
inference is eventually selected
Completeness Theorem reformulated (for Saturation
Algorithms)
Let A be a fair saturation algorithm. A set C of
clauses is unsatisfiable iff A eventually
produces the empty clause
Central problem: how to find good saturation
algorithms!

45
How to guess suitable instances? Unification
46
Example: Basic Resolution Calculus
47
Enhancing Efficiency: Refinements of Resolution
48
Resolution Refinements: Hyperresolution
49
Resolution Refinements: Ordered Resolution
50
Enhancing Efficiency: Redundancy Criteria in
Resolution
51
Part IV: Chain Resolution