Title: Automated Reasoning
1Automated Reasoning
- Reasoning with Natural Language
- Lecture 5 02-05-05
- David Ahn
2Logical reasoning and natural language
- How can we use logic in reasoning with natural language?
- As a representation language for information or knowledge extracted from language
- As a way to represent and reason about linguistic structure
- In this lecture
- Reasoning logically about temporal information extracted from language (1)
- Generating logical representations of the semantic content of natural language (1 and 2)
3Outline
- Reasoning with temporal annotations
- Computational semantics, part I
4Temporal reasoning: Primitives
- Primitive entities
- Times
- Points
- Intervals
- Events
- Metric vs. qualitative information
- Appropriate for natural language?
- Metric information about times is expressible
- But qualitative relationships b/t events and
actions are grammaticalized (reflected in syntax
and morphology)
5Temporal reasoning: Representations
- There is a wide range of formalisms for reasoning about time
- Tense logics
- Temporal algebras
- Event- and action-based languages (mostly designed for AI planning but also applied to natural language semantics)
- Situation calculus can be cast as hybrid FOL
- STRIPS can be cast as dynamic linear logic
- Event calculus
6Reasoning with temporal annotations
- How can we reason with temporal information extracted from natural language?
- TimeML
- Guidelines for annotating all expressions referring to times and events in text
- Mechanisms for annotating temporal relations between times and events
- Relational annotation is underspecified: annotators annotate what is salient to them
- Good for machine learning, perhaps
- But annotations are difficult to compare
- And for applications, we want all the relations the text supports
7Evaluating annotations
- Annotations need to be compared semantically, not syntactically
- The following annotations are equivalent
8Evaluating annotations
Before she arrived John met the girl who won the race.
(The slide showed two alternative annotations of this sentence; the annotation markup is not preserved.)
9Temporal closure
- One approach to filling out the temporal relations in an annotation
- Use an appropriate set of inference rules
- To produce a maximal representation of the temporal information contained in that annotation
- Such a representation is termed the temporal closure of that annotation
- Temporal closure can be used
- For comparison/evaluation of temporal annotation
- To facilitate annotation through mixed-initiative tagging
- To provide a full set of temporal relations for downstream applications
10Temporal closure
- The identifiers for annotated event and time expressions form two sets, E and T, respectively
- Temporal relations are binary relations between events and times, and so the denotation of each is a subset of (E ∪ T) × (E ∪ T)
- Relations are further specified either via axioms or via a composition table which shows how they combine
- Some rules concern only one relation, and follow logically from the formal properties of the relation. E.g., S(imultaneous) is an equivalence relation, while B(efore) is transitive, as is I(ncludes)
- (x,y) ∈ S → (y,x) ∈ S
- (x,y) ∈ B ∧ (y,z) ∈ B → (x,z) ∈ B
- Other inference rules capture interactions between relations that follow naturally from their intended meaning. Examples:
- (x,y) ∈ B ∧ (y,z) ∈ I → (x,z) ∈ B
- (x,y) ∈ I ∧ (y,z) ∈ S → (x,z) ∈ I
11Temporal closure
- Let S_t denote the simultaneity pairs explicitly specified by an annotated text t, and likewise for B_t and I_t
- These combine to give the overall temporal model of the text, M_t = ⟨S_t, B_t, I_t⟩
- The inference rules can be applied to this model to generate the deductive closure M_t*
- Two alternative annotations t and t′ are equivalent just in case the deductive closures of their models are equivalent, i.e. M_t* ≡ M_t′*
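The closure-based comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `compose`, `closure` and `equivalent`, the fixpoint loop, and the event identifiers `win`, `meet`, `arrive` are hypothetical, and only the rules listed on the previous slide are implemented (symmetry and transitivity of S, transitivity of B and I, and the two interaction rules).

```python
def compose(r1, r2):
    """Relational composition: pairs (x, z) with (x, y) in r1 and (y, z) in r2."""
    return {(x, z) for (x, y) in r1 for (y2, z) in r2 if y == y2}


def closure(model):
    """Deductive closure of a model (S, B, I) under the rules on the slide."""
    s, b, i = (set(r) for r in model)
    while True:
        size = len(s) + len(b) + len(i)
        s |= {(y, x) for (x, y) in s}   # S is symmetric
        s |= compose(s, s)              # S is transitive
        b |= compose(b, b)              # B is transitive
        i |= compose(i, i)              # I is transitive
        b |= compose(b, i)              # (x,y) in B and (y,z) in I -> (x,z) in B
        i |= compose(i, s)              # (x,y) in I and (y,z) in S -> (x,z) in I
        if len(s) + len(b) + len(i) == size:   # nothing new: fixpoint reached
            return s, b, i


def equivalent(m1, m2):
    """Two annotations are equivalent iff their closures coincide."""
    return closure(m1) == closure(m2)


# Two hypothetical annotations of the same text: the second annotator also
# marked the relation that the first left implicit.
t1 = (set(), {("win", "meet"), ("meet", "arrive")}, set())
t2 = (set(), {("win", "meet"), ("meet", "arrive"), ("win", "arrive")}, set())
print(equivalent(t1, t2))
```

The two annotations differ syntactically (two links versus three), but the transitivity of B adds the missing pair to the closure of the first, so a comparison on closures treats them as the same.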
12Choices for temporal inference
- What candidates are there for
- Logics of temporal relations
- Closure algorithms
- At least the following
- Interval algebra of Allen (1983)
- Implementation of reduced algebra of Setzer (2001)
- Point algebra of Vilain and Kautz (1986)
- Conceptual neighborhoods of Freksa (1992)
- Implementation of Verhagen (2004)
13Interval algebra: Allen (1983)
- Intervals (not points) are primitive temporal entities
- 13 primitive relations b/t temporal entities
- Actual relation b/t two intervals is a disjunction of primitive relations
14Interval algebra: Allen (1983)
- Interval relations are transitive
- If we know a and b are in relation R1 and b and c are in R2, then we can restrict the set of possible relations for a and c.
- Given s(a, b) and o(b, c)
- Infer <(a, c) or m(a, c) or o(a, c)
- Allen provided a 13 × 13 transitivity table specifying ways of composing temporal relations, i.e., specifying what inferences are allowed in his logic
- Allen provided a constraint propagation algorithm for computing the closure according to the composition algebra
15Interval algebra: Allen (1983)
- A subset of the transitivity table for 3 (core) relations
16Temporal constraint networks
- Constraints: Given intervals i and j, a temporal constraint (i, j) : R (where R is a set of Allen relations) says that i and j are supposed to stand in one of the relations in R
- e.g., (i, j) : {before, meets, overlaps}
- TCNs: A temporal constraint network over a set of intervals I is a set of constraints involving intervals in I
- Consistency: A TCN over a set of intervals I is consistent iff we can map the left and right endpoints of each interval in I to a (real) number such that all constraints are satisfied (and no interval has length 0)
17Computing closure: Normalizing TCNs
- We can normalize a given TCN
- Add inverse constraints: for (i, j) : R, add (j, i) : R⁻¹
- If there are two constraints (i, j) : R1 and (i, j) : R2, replace them with (i, j) : R1 ∩ R2
- Add (i, i) : {equals} for every i
- Add the full constraint (i, j) : {before, meets, ...} if there is no other constraint for (i, j)
- A normalized TCN containing an empty constraint is inconsistent
18Computing closure: Constraint propagation
- Transitivity again: Let tr(r1, r2) denote the entry in the transitivity table for the interval relations r1 and r2
- e.g., tr(starts, overlaps) = {before, meets, overlaps}
- We generalize this to sets of relations
- constraints(R1, R2) = {r | r ∈ tr(r1, r2), r1 ∈ R1, r2 ∈ R2}
- Constraint propagation: Whenever we find (i, j) : R1 and (j, k) : R2 in a TCN, add (i, k) : constraints(R1, R2)
- To show that a given TCN is inconsistent, we apply constraint propagation, normalize, and look for an empty constraint
- Constraint propagation and normalization are sound
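Normalization and propagation can be sketched together in Python. The sketch makes several assumptions that are not from the lecture: the transitivity table `TR` is computed by enumerating small integer intervals instead of being typed in from Allen's paper, and the names `BASIC`, `INVERSE`, `relation` and `propagate` are hypothetical. It is a simple fixpoint loop, not Allen's queue-based algorithm.

```python
from itertools import product

# Seven basic relations defined on endpoint pairs (start, end), start < end.
BASIC = {
    "before":   lambda a, b: a[1] < b[0],
    "meets":    lambda a, b: a[1] == b[0],
    "overlaps": lambda a, b: a[0] < b[0] < a[1] < b[1],
    "starts":   lambda a, b: a[0] == b[0] and a[1] < b[1],
    "during":   lambda a, b: b[0] < a[0] and a[1] < b[1],
    "finishes": lambda a, b: b[0] < a[0] and a[1] == b[1],
    "equals":   lambda a, b: a == b,
}
INVERSE = {"before": "after", "meets": "met-by", "overlaps": "overlapped-by",
           "starts": "started-by", "during": "contains", "finishes": "finished-by"}
for name, inv in list(INVERSE.items()):
    BASIC[inv] = lambda a, b, f=BASIC[name]: f(b, a)   # the six inverse relations
    INVERSE[inv] = name
INVERSE["equals"] = "equals"

# Three intervals have six endpoints, so endpoints drawn from 0..5 realise
# every possible configuration.
INTERVALS = [(s, e) for s in range(6) for e in range(s + 1, 6)]


def relation(a, b):
    """The unique basic relation holding between intervals a and b."""
    return next(name for name, holds in BASIC.items() if holds(a, b))


# tr(r1, r2): the transitivity table, obtained by enumeration.
TR = {}
for a, b, c in product(INTERVALS, repeat=3):
    TR.setdefault((relation(a, b), relation(b, c)), set()).add(relation(a, c))


def constraints(rs1, rs2):
    """Generalise tr to sets of relations."""
    return set().union(*(TR[r1, r2] for r1 in rs1 for r2 in rs2))


def propagate(tcn, nodes):
    """Normalise, then propagate to a fixpoint; None signals an empty constraint."""
    net = {(i, j): ({"equals"} if i == j else set(BASIC))
           for i in nodes for j in nodes}
    for (i, j), rels in tcn.items():
        net[i, j] &= set(rels)
        net[j, i] &= {INVERSE[r] for r in rels}      # inverse constraint
    changed = True
    while changed:
        changed = False
        for i, j, k in product(nodes, repeat=3):
            new = net[i, k] & constraints(net[i, j], net[j, k])
            if not new:
                return None
            if new != net[i, k]:
                net[i, k], changed = new, True
    return net


net = propagate({("a", "b"): {"starts"}, ("b", "c"): {"overlaps"}}, ["a", "b", "c"])
print(sorted(net["a", "c"]))
```

The example network is the one from the interval-algebra slide: from starts(a, b) and overlaps(b, c), propagation narrows the relation between a and c to before, meets or overlaps.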
19Constraint propagation is incomplete
- Constraint propagation is polynomial time, but
- Constraint propagation is not complete: it allows inconsistent networks, e.g.
- (a, b) : {during, contains}, (a, c) : {finishes, finished-by},
- (a, d) : {met-by, started-by}, (b, c) : {during, contains},
- (b, d) : {overlapped-by}, (c, d) : {met-by, started-by}
- Establishing consistency in the full interval algebra is NP-hard
20Interval algebra and TimeML
- TimeML TLINK relations are based on Allen's IA
- overlaps and is_overlapped_by collapsed into DURING
- But, there is an important difference
- Allen permits relations between intervals to be indefinite disjunctions of his 13 primitive relations
- The relation b/t any 2 entities in a TimeML annotation must be a single, definite relation from the 12 primitive relations
- TimeML relations are not closed under composition
21Reduced implementation (Setzer 2001)
- Simple temporal closure algorithm for TimeML (and TimeML-like markup)
- Takes 3 of the Allen relations (=, <, contains) as primitive
- All TimeML relations are down-mapped onto these
- No disjunctive relations are permitted
- Axioms relating the temporal relations are given and include those captured by the following transitivity table
- Note that the cell that was disjunctive for Allen is empty: no conclusion may be drawn in this case
22Point algebra: Vilain and Kautz (1986)
- Address computational intractability of Allen's approach
- Define a point algebra (PA) for intervals as follows
- Intervals represented by their endpoints
- Intervals i, j expressed as (i-, i+), (j-, j+)
- Relations between intervals expressed by conjunctions of relations <, =, > holding between interval endpoints
- i d j ⇔ i- > j- ∧ i+ < j+ ∧ i- < i+ ∧ j- < j+
- Allen's 13 basic relations may be encoded in PA
- However, not all 2^13 disjunctive relations
- Only 82, but they are closed under composition
- Point algebra is sound, complete and tractable
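A small Python sketch shows how reasoning moves from intervals to endpoints. It encodes the slide's definition of "i d j" as four endpoint constraints and propagates them with the composition table for <, = and >. The function names (`compose_basic`, `close`, `during`) and the string encoding of endpoints as "i-" and "i+" are assumptions made for this illustration; the loop is a plain fixpoint computation, not the authors' algorithm.

```python
from itertools import product

ALL = frozenset("<=>")                 # the three basic point relations
INV = {"<": ">", "=": "=", ">": "<"}


def compose_basic(r1, r2):
    """Composition of two basic point relations."""
    if r1 == "=":
        return {r2}
    if r2 == "=":
        return {r1}
    return {r1} if r1 == r2 else set(ALL)   # e.g. '<' then '>' tells us nothing


def compose(rs1, rs2):
    return set().union(*(compose_basic(a, b) for a in rs1 for b in rs2))


def close(points, known):
    """Propagate point constraints until no label can be narrowed further."""
    net = {(p, q): ({"="} if p == q else set(ALL)) for p in points for q in points}
    for (p, q), rels in known.items():
        net[p, q] &= set(rels)
        net[q, p] &= {INV[r] for r in rels}
    changed = True
    while changed:
        changed = False
        for p, q, r in product(points, repeat=3):
            new = net[p, r] & compose(net[p, q], net[q, r])
            if new != net[p, r]:
                net[p, r], changed = new, True
    return net


def during(i, j):
    """Endpoint encoding of 'i d j' as given on the slide."""
    return {(i + "-", j + "-"): ">", (i + "+", j + "+"): "<",
            (i + "-", i + "+"): "<", (j + "-", j + "+"): "<"}


points = ["i-", "i+", "j-", "j+", "k-", "k+"]
net = close(points, {**during("i", "j"), **during("j", "k")})
print(net["i-", "k-"], net["i+", "k+"])
```

From i d j and j d k the sketch derives i- > k- and i+ < k+, which is exactly the endpoint encoding of i d k.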
23Point algebra: Vilain and Kautz (1986)
- Expressing the 3 core Allen relations in PA
24Conceptual neighborhoods: Freksa (1992)
- Freksa notes that in Allen's approach
- Easy to reason with complete, fine-grained knowledge of temporal relationships
- Hard to reason with incomplete, coarse-grained knowledge
- Since the latter is more common, consider developing a framework which reverses this
- Evangelina watched the brick walls of the terrace across the park ignite in the evening sun, as Ben trudged wearily home beneath them.
25Semi-intervals
- In Allen's framework, such vagueness can only be captured using disjunctions of basic relations
- Freksa's suggestion: adopt uncertain relations (disjunctions of the basic Allen relations) as primitives and require more complex representation/reasoning for the precise cases
- Like PA, events are represented by their beginnings and endings
- But these beginnings and endings (semi-intervals) are not points, but intervals themselves
- Like PA, these beginnings and endings may be related by <, =, >
26Semi-intervals and conceptual neighbors
- Interval 1: (α, ω); Interval 2: (Α, Ω)
- Adjoining relations are conceptual neighbors
- Groups of conceptual neighbors form conceptual neighborhoods
27Conceptual neighbors
- Definition: Two relations between a pair of intervals are conceptual neighbours if they can be transformed into one another by continuous deformation (shortening, lengthening, moving) of the intervals, without passing through another relation
- Definition: A conceptual neighbourhood is any set of relations path-connected through the conceptual neighbour relation
28Conceptual neighborhoods
- Conceptual neighbourhoods provide a way of constructing complex disjunctions of relations that are likely to correspond to vague or uncertain knowledge about relational situations
29Reasoning with conceptual neighborhoods
- Freksa observes that all of the disjunctive relations occurring as a composition of two of the basic Allen relations form conceptual neighbourhoods
- A 29 × 29 composition table may be formed consisting of the 13 basic Allen relations plus all (16) of the conceptual neighbourhoods occurring in the cells of the Allen transitivity table
- This table has the properties that
- It is closed under composition
- The relations in it are a subset of the 82 in the point algebra
- Thus, this particular conceptual neighbourhood algebra (CNA) inherits the tractability of the point algebra
- Implications for TimeML
- Adopting something like CNA would allow the introduction of a mild form of disjunction which might be useful for annotating vague relations
- An algorithm based on Freksa's table would be more efficient than the original Allen procedure and is complete wrt the underlying algebra
30Sputlink: An implementation for TimeML (Verhagen 2004)
- Sputlink: A temporal closure system to support TimeML annotation and evaluation
- Supports closure over all of the TimeML relations
- Works with endpoint (semi-interval) representation of events and times
- Temporal relations supported are the 29 relations proposed by Freksa
- Basic Allen relations plus disjunctive relations found in the cells of the Allen transitivity table
- Complete and polynomial time
- Embedded in a mixed-initiative temporal annotation environment that supports text segmented closure
31Break
32Outline
- Reasoning with temporal annotations
- Computational semantics, part I
33What is computational semantics?
- Two fundamental questions
- Semantic construction: How can we automate the process of associating semantic representations with natural language expressions?
- Inference: How can we use semantic representations of natural language to automate the process of drawing inferences?
- Key idea
- Use the lambda calculus to build
- Logical semantic representations
34Why logic as semantic representation?
- Logics have precise model-theoretic semantics (at least the ones we consider)
- Translation of a NL sentence into a logical formula gives us a precise grasp on (part of) the meaning of the sentence
- Logics also have inference procedures, often with computational implementations (again, at least the ones we consider)
- Inference is vital for natural language
35Why the lambda calculus?
- We use the lambda calculus as a glue language
- Connects natural language syntax to logical semantic representations
- The lambda calculus itself is well-understood
- Has a logical interpretation
- Is the foundation for functional programming languages
- The lambda calculus is flexible
- Experimenting with different semantic representation languages is straightforward
36Semantic construction
- Given a sentence of English, is there a systematic way of constructing its semantic representation?
- Let's start simple
- Take first-order logic as semantic representation language.
- Is there a systematic way of translating simple sentences like these into FOL?
- Vincent likes Mia
- Every woman snorts
- Every boxer loves a woman
37Meaning flows from the lexicon
- What is the appropriate representation for Vincent likes Mia?
- like(vincent, mia)
- Where does this come from?
- Proper name Vincent introduces constant vincent
- Proper name Mia introduces constant mia
- Verb likes introduces constant like
- Ultimately, meaning flows from the lexicon
38What do other words contribute?
- Meaning flows from lexicon is a simple slogan but raises nontrivial questions
- Every woman snorts is represented as
- ∀x.(woman(x) → snort(x))
- What is the contribution of the determiner Every?
- The ∀?
- The →?
- Both together?
- How can we make the contribution precise?
39What does syntax contribute?
- Why is the representation of Vincent likes Mia like(Vincent, Mia) and not like(Mia, Vincent)?
- How do the pieces supplied by the lexicon get glued together in the right way?
- Basic principle: Syntactic structure should guide semantic construction
40Syntactic structure guides semantic construction
S Vincent likes Mia like(Vincent, Mia)
NP Vincent Vincent
VP likes Mia like(?, Mia)
TV likes like(?, ?)
NP Mia Mia
41Compositionality
- Methodological principles in semantic construction
- Meaning (representation) ultimately flows from the lexicon
- Meanings (representations) are combined using syntactic information
- Compositionality: the meaning of the whole is a function of the meaning of the parts (Frege)
- where the parts are the substructure given by the syntax
42Tasks in semantic construction
- Task 1: Specify a reasonable syntax for a fragment of natural language
- Task 2: Specify semantic representations for the lexical items
- Task 3: Specify the translation compositionally: specify the translation of each expression in terms of translations of its parts
43Task 1: Context-free grammar
- s --> np, vp.
- np --> pn.
- np --> det, noun.
- pn --> [vincent].
- pn --> [mia].
- det --> [a].
- det --> [every].
- noun --> [woman].
- noun --> [foot, massage].
- vp --> iv.
- vp --> tv, np.
- iv --> [walks].
- tv --> [loves].
- tv --> [likes].
- This grammar accepts sentences like
- Vincent likes Mia
- Every woman walks
44DCGs: parsing as deduction
- Axiomatizing CFGs
- Take nonterminals to be binary relations on positions in an expression
- A position divides an expression into two subexpressions which concatenated together form the original expression
- Every CFG rule
- N0 → V1 … Vn
- can be axiomatized as
- V1(p0, p1) ∧ … ∧ Vn(pn-1, p) → N0(p0, p)
- which is in Horn clause (or definite clause) form
- Thus, CFGs can be stated as Prolog programs: Definite Clause Grammars (DCGs)
- In Prolog, an expression is represented as a list of words, and a position in an expression, as the sublist beginning at the position
- Prolog allows DCGs to be written directly as a notational convenience
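The idea of nonterminals as relations on positions can be illustrated outside Prolog as well. The Python sketch below is an assumption-laden illustration, not a DCG implementation: positions are integer indices into the word list (rather than Prolog's sublists), and the names `GRAMMAR`, `spans` and `accepts` are hypothetical. `spans(N, words, p0)` enumerates every p for which N(p0, p) holds.

```python
# The context-free grammar from the previous slide, as a dictionary from
# nonterminals to lists of rule bodies. Anything not a key is a terminal.
GRAMMAR = {
    "s":    [["np", "vp"]],
    "np":   [["pn"], ["det", "noun"]],
    "vp":   [["iv"], ["tv", "np"]],
    "pn":   [["vincent"], ["mia"]],
    "det":  [["a"], ["every"]],
    "noun": [["woman"], ["foot", "massage"]],
    "iv":   [["walks"]],
    "tv":   [["loves"], ["likes"]],
}


def spans(symbol, words, p0):
    """Yield every position p such that symbol(p0, p) holds for `words`."""
    if symbol not in GRAMMAR:                      # terminal: consume one word
        if p0 < len(words) and words[p0] == symbol:
            yield p0 + 1
        return
    for body in GRAMMAR[symbol]:                   # rule N0 -> V1 ... Vn
        ends = [p0]
        for part in body:                          # chain V1(p0, p1), ..., Vn(pn-1, p)
            ends = [p for start in ends for p in spans(part, words, start)]
        yield from ends


def accepts(sentence):
    """A sentence is accepted iff s(0, n) holds, n being the number of words."""
    words = sentence.lower().split()
    return len(words) in spans("s", words, 0)


print(accepts("Vincent likes Mia"), accepts("Every woman walks"))
```

Like Prolog's proof procedure, this recognizer works top-down and left to right, so it would loop on a left-recursive grammar; the grammar here has no left recursion.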
45DCGs: parsing as deduction
- A first example of logical reasoning about language
- Prolog's left-to-right, depth-first, backtracking proof procedure yields a top-down, depth-first, left-to-right parsing mechanism
- DCGs also allow nonterminals to have extra arguments, which can be used to propagate information up the parse tree
- Extra arguments correspond to features in Generalized Phrase Structure Grammar (GPSG), where they are used for agreement and long-distance dependencies (in addition to semantics)
- We will use extra arguments to pass around pieces of semantic representations
46Building representations
- Specify meanings of the lexical entries
- Typically parts of formulas
- Indicate where the information needed has to come from
- Using syntactic information
- One idea: use features
- i.e., extra arguments in DCGs
47DCG with semantics: Lexical entries
- noun(X, woman(X)) --> [woman].
- pn(jules) --> [jules].
- iv(Y, snort(Y)) --> [snorts].
- tv(Y, Z, love(Y, Z)) --> [loves].
- det(X, Restriction, Scope, exists(X, and(Restriction, Scope))) --> [a].
- det(X, Restriction, Scope, forall(X, implies(Restriction, Scope))) --> [every].
48DCG with semantics: Production rules
- s(Sem) --> np(X,SemVP,Sem), vp(X,SemVP).
- vp(X,Sem) --> tv(X,Y,SemTV), np(Y,SemTV,Sem).
- vp(X,Sem) --> iv(X,Sem).
- np(X,Scope,Sm) --> det(X,Restr,Scope,Sm), noun(X,Restr).
- np(SemPN,Sem,Sem) --> pn(SemPN).
49Every woman snorts
- ?- s(Sem, [every,woman,snorts], []).
- (7) s(Sem, [every,woman,snorts], [])
- (8) np(X, Sc, Sem, [every,woman,snorts], _G1)
- (9) det(X, Re, Sc, Sem, [every,woman,snorts], _G2)
- (9) det(X, Re, Sc, forall(X, implies(Re, Sc)), [every,woman,snorts], [woman,snorts])
- (9) noun(X, Re, [woman,snorts], _G3)
- (9) noun(X, woman(X), [woman,snorts], [snorts])
- (8) np(X, Sc, forall(X, implies(woman(X), Sc)), [every,woman,snorts], [snorts])
- (8) vp(X, Sc, [snorts], [])
- (9) iv(X, Sc, [snorts], [])
- (9) iv(X, snort(X), [snorts], [])
- (8) vp(X, snort(X), [snorts], [])
- (7) s(forall(X, implies(woman(X), snort(X))), [every,woman,snorts], [])
- Sem = forall(_G353, implies(woman(_G353), snort(_G353)))
50How does it work?
- Explicitly marking missing information provides good control
- Much of the work is done by rules, which rely on Prolog's treatment of variables
- Something is missing: a more disciplined approach to missing information could reduce (or eliminate) rule-specific combination methods
- The lambda calculus provides this discipline
51The lambda calculus
- The lambda calculus introduces a λ operator that binds variables
- λ-bound variables are placeholders for missing information
- Functional application: placing a lambda expression in front of another expression (its argument) is an instruction to substitute the argument for the λ-bound variables
- β-conversion: the operation that carries out the substitution
- α-conversion: renames variables apart
- The lambda calculus is a glue language, with the dedicated task of gluing together items needed to build semantic representations
52The lambda operator
- The lambda operator marks missing information by binding variables
- A simple lambda expression
- λx.man(x)
- The prefix λx binds the occurrence of x in man(x)
- This expression can be read as
- I am a 1-place predicate man, and I am looking for a term to fill my argument slot.
53Functional application
- We will (non-standardly) use @ as an operator to indicate functional application
- Continuing our simple example
- λx.man(x) @ vincent
- λx.man(x) is called the functor
- vincent is called the argument
- This expression can be read as
- Fill each placeholder in the functor by an occurrence of the argument vincent
54β-conversion
- The required substitution is performed by β-conversion
- From
- λx.man(x) @ vincent
- β-conversion produces
- man(vincent)
- Basically, β-conversion involves throwing away the λx and substituting the argument for all occurrences of x that were in the scope of λx
55Lambda-abstraction over predicates
- Since our representation of Every woman snorts is
- ∀x.(woman(x) → snort(x))
- Our representation of Every woman is
- λQ.∀x.(woman(x) → Q(x))
- And our representation of Every is
- λP.λQ.∀x.(P(x) → Q(x))
56Every boxer growls
- Step 1: Assign lambda expressions to the basic lexical items
- boxer: λy.boxer(y)
- growls: λx.growl(x)
- every: λP.λQ.∀x.(P(x) → Q(x))
57Comparison to initial approach
- In the first attempt, every was represented by
- det(X, Restriction, Scope, forall(X, implies(Restriction, Scope)))
- Substituting P and Q for Restriction and Scope
- det(X, P, Q, forall(X, implies(P, Q)))
- This is clearly analogous to
- λP.λQ.∀x.(P(x) → Q(x))
- What's the big deal?
58The big deal: Reasoning about semantic construction
- We are no longer considering the process of combining expressions simply as a programming exercise
- We have isolated a representational format (lambda calculus) that lets us deal with missing information once and for all: important data abstraction
- We have isolated the key ideas needed to work with these representations
- functional application, β-conversion, α-conversion
- Another example of reasoning about language
- The lambda calculus provides a logical representation of the dependencies involved in semantic construction
- The lambda calculus has simple inference mechanisms
59Every boxer growls
- Step 2: Associate the NP node with the application of the DET representation (functor) to the NOUN representation (argument)
60β-conversion
- Applications are instructions to carry out β-conversion
- Performing the required substitution yields
- every boxer: λQ.∀x(λy.boxer(y)@x → Q@x)
- This contains a subexpression of the form λy.boxer(y)@x
- Another instruction to carry out β-conversion
- Performing the required substitution yields
- every boxer: λQ.∀x(boxer(x) → Q@x)
61Every boxer growls
S every boxer growls ∀x(boxer(x) → growl(x))
NP every boxer λP.λQ.∀x(P@x → Q@x)@λy.boxer(y)
VP growls λx.growl(x)
DET every λP.λQ.∀x(P@x → Q@x)
Noun boxer λy.boxer(y)
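The derivation of Every boxer growls can be mimicked with Python functions, whose calls play the role of β-conversion. This is a sketch under assumptions: formulas are built as plain strings in the forall/implies notation of the earlier DCG, the helper `app` stands in for the slides' @ operator, and the bound variable is fixed as `x`, which is safe only because nothing else in this small example uses that name.

```python
def app(functor, argument):
    """The slides' @ operator: apply the functor to the argument."""
    return functor(argument)


# Step 1: lexical entries.
boxer = lambda y: f"boxer({y})"                      # λy.boxer(y)
growls = lambda x: f"growl({x})"                     # λx.growl(x)
every = lambda P: lambda Q: (                        # λP.λQ.∀x(P@x → Q@x)
    f"forall(x, implies({app(P, 'x')}, {app(Q, 'x')}))"
)

# Step 2: NP = DET @ Noun, then S = NP @ VP.
every_boxer = app(every, boxer)                      # λQ.∀x(boxer(x) → Q@x)
sentence = app(every_boxer, growls)

print(sentence)
```

Every combination step is the same operation, `app`; all sentence-specific content sits in the three lexical entries.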
62A moment of reflection
- In two important respects, our approach to semantic construction is getting simpler
- The process of combining two representations is now uniform (functional application)
- Most of the real work of semantic analysis is done in the lexicon
- This is a sign that we are doing something right
- But let's look more carefully
63Proper names
- Quantifying NPs can clearly be considered functors
- But what about NPs like Vincent?
- Vincent: λP.P @ vincent
- Mia: λP.P @ mia
- These representations can be used as functors
- This is the most basic example of type-raising
64Vincent loves Mia
S Vincent loves Mia love(vincent, mia)
NP Vincent λP.P@vincent
VP loves Mia λx.love(x, mia)
TV loves λy.λx.love(x, y)
NP Mia λP.P@mia
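Type-raising can be tried out with Python functions in the same way. In this sketch the lexical entries are those on the slide; which daughter acts as functor at the VP node is not stated there, so applying the object NP to the verb is an assumption, as are the variable names `vp` and `sentence`.

```python
# Type-raised proper names: λP.P@vincent and λP.P@mia.
vincent = lambda P: P("vincent")
mia = lambda P: P("mia")

# Transitive verb from the slide: λy.λx.love(x, y).
loves = lambda y: lambda x: f"love({x}, {y})"

# VP: the object NP is the functor and the verb its argument.
# (λP.P@mia) @ loves  reduces to  loves@mia  =  λx.love(x, mia)
vp = mia(loves)

# S: the subject NP is the functor and the VP its argument.
sentence = vincent(vp)

print(sentence)
```

Because the names are raised, proper-name NPs combine in the same functor-argument pattern as quantifying NPs such as every boxer.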
65Is β-conversion always safe?
- The representations
- λx.λy.bloogle(x, y)
- and
- λz.λw.bloogle(z, w)
- are intended to have the same meaning. The x, y, z, w are just placeholders: they have no intrinsic meaning
- Mostly, things work out fine
- If we apply either expression first to fee and then to boo, we get, after β-conversion, the same result
- bloogle(fee, boo)
66But not always
- Things can go wrong if we apply a lambda expression to a variable that occurs bound in the functor.
- For example, if we take the application
- λx.λy.bloogle(x, y) @ w
- we get λy.bloogle(w, y), which is what we want
- But, if we take the application
- λz.λw.bloogle(z, w) @ w
- we get λw.bloogle(w, w), which is not at all what we want
- The variable w has become accidentally bound
67α-conversion
- α-conversion is the process of renaming bound variables
- For example, we obtain
- λx.λy.bloogle(x, y)
- from
- λz.λw.bloogle(z, w)
- by α-conversion, replacing z by x and w by y
- When working with the lambda calculus, we always α-convert before carrying out β-conversion
- All bound variables in the functor are renamed to be different from those in the argument
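The accidental capture and its repair can be reproduced with an explicit term representation. The Python sketch below is illustrative only: the classes `Var`, `Lam`, `App`, `Pred` and the functions `free_vars`, `subst`, `beta` are hypothetical names, fresh variables are assumed to be called v1, v2, ... and not to occur in the input, and renaming is done on demand during substitution rather than for every bound variable up front.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Lam:
    var: str
    body: object

@dataclass(frozen=True)
class App:
    fun: object
    arg: object

@dataclass(frozen=True)
class Pred:
    name: str
    args: tuple


fresh = (f"v{n}" for n in count(1))    # supply of new variable names


def free_vars(t):
    if isinstance(t, Var):
        return {t.name}
    if isinstance(t, Lam):
        return free_vars(t.body) - {t.var}
    if isinstance(t, App):
        return free_vars(t.fun) | free_vars(t.arg)
    return set().union(*(free_vars(a) for a in t.args))


def subst(t, name, value, rename=True):
    """Replace the free occurrences of `name` in t by `value`."""
    if isinstance(t, Var):
        return value if t.name == name else t
    if isinstance(t, App):
        return App(subst(t.fun, name, value, rename), subst(t.arg, name, value, rename))
    if isinstance(t, Pred):
        return Pred(t.name, tuple(subst(a, name, value, rename) for a in t.args))
    if t.var == name:                  # `name` is rebound here: leave untouched
        return t
    if rename and t.var in free_vars(value):
        new = next(fresh)              # α-conversion: rename the bound variable
        t = Lam(new, subst(t.body, t.var, Var(new), rename))
    return Lam(t.var, subst(t.body, name, value, rename))


def beta(t, rename=True):
    """One β-conversion step on an application whose functor is a lambda."""
    return subst(t.fun.body, t.fun.var, t.arg, rename)


# λz.λw.bloogle(z, w) @ w
bad = App(Lam("z", Lam("w", Pred("bloogle", (Var("z"), Var("w"))))), Var("w"))
captured = beta(bad, rename=False)     # λw.bloogle(w, w): w is captured
safe = beta(bad)                       # bound w renamed first
print(captured)
print(safe)
```

With renaming switched off the free w of the argument ends up bound by the inner λw; with renaming on, the bound variable receives a fresh name before substitution, so the argument's w stays free.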
68The three tasks revisited
- Task 1: Specify a reasonable syntax for a fragment of natural language. We can do this using DCGs
- Task 2: Specify semantic representations for lexical items using the lambda calculus
- Task 3: Specify the translation of an item R with parts F and A in terms of functional application
- Specify which part is the functor (say F)
- Specify which part is the argument (say A)
- Use β-conversion (with α-conversion) to compute F@A
69Looking ahead: More computational semantics
- Quantifier scope ambiguity and underspecified representations
- Another example of using logic to reason about language
- Inference for pragmatics using semantic representations