Title: Farrar on Ontologies for NLP
1. Farrar on Ontologies for NLP
- Fourth workshop on multimodal semantic representation, Tilburg, 10-11 Jan 2005
2. How I understand the paper
- Activity 4, sometimes titled:
  - Logico-semantic relations / entities
  - Semantic Data Categories
- Farrar's paper discusses:
  - Ontologies for NLP (cf. referential descriptors)
  - Defense of Bateman 1992, "The theoretical status of ontologies in NLP"
- Main claim: ontologies for NLP need to be shaped by linguistic concerns
3. The ontology debate (Hobbs 1985, quoted by Bateman)
- "Semantics is the attempted specification of the relation between language and the world. (..) There is a spectrum of choices one can make (..). At one end of the spectrum (..), one can adopt the correct theory of the world, the one given by quantum mechanics and the other sciences. (..) At the (other) end, one can assume a theory of the world that is isomorphic to the way we talk about it."
4. The standard response to Hobbs' question
- Have your cake and eat it: multilevel semantics (e.g., various systems at Philips and BBN; early theories of underspecification)
  - a deep semantics that's as close to the scientific facts as you like
  - a shallow semantics that's as close to linguistic structure as you like
  - any finite number of levels in between
  - a computable mapping between adjacent levels (formalised below)
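One way to state this setup (a standard formalisation, not taken from the paper): let $L_0$ be the deepest level and $L_n$ the shallowest, and require a computable translation between each pair of adjacent levels; composing the translations then links linguistic structure to the scientific facts:

$$\tau_i : L_{i+1} \to L_i \quad (0 \le i < n), \qquad \tau_{\text{shallow}\to\text{deep}} \;=\; \tau_0 \circ \tau_1 \circ \cdots \circ \tau_{n-1}.$$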
5. Farrar & Bateman
- Defend the Generalised Upper Model (Bateman et al.)
- Appear to use multilevel semantics: a separation between linguistic and nonlinguistic levels of knowledge (the conceptual/semantic distinction)
- Appear to argue against classic denotational (e.g., Montague-style) semantics
  - (e.g., this seemed to be the gist of the example school ⇒ λx(purpose(x) = learning), where x is either an institution or a building)
6.
- To find out how the Upper Model compares with denotational multilevel semantics, let's look at some examples
7. Example 1: "American" in TENDUM (Bunt et al., Philips/IPO)
- Shallow semantics:
  - American company: λx(company(x) ∧ AM(x))
  - American passenger: λx(passenger(x) ∧ AM(x))
  - American airplane: λx(airplane(x) ∧ AM(x))
- Deep semantics: replace AM by one of
  - λx(country(headq(x)) = USA)            [company]
  - λx(nationality(x) = USA)               [passenger]
  - λx(country(headq(carrier(x))) = USA)   [airplane]
  - λx(country(headq(builder(x))) = USA)   [airplane]
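Making the substitution step explicit for "American passenger": the shallow-to-deep move is just replacement of AM followed by beta reduction,

$$\lambda x(\mathit{passenger}(x) \wedge \mathrm{AM}(x)) \;\leadsto\; \lambda x\bigl(\mathit{passenger}(x) \wedge \lambda y(\mathit{nationality}(y) = \mathrm{USA})(x)\bigr) \;=_\beta\; \lambda x(\mathit{passenger}(x) \wedge \mathit{nationality}(x) = \mathrm{USA})$$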
8. The idea behind this trick
- Shallow semantics: one shallow constant AM used for many different properties, matching English usage
- Deep semantics: AM replaced by an expression that has a straightforward denotation in the world (see the sketch below)
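A minimal Python sketch of the trick, under the obvious assumptions (formula trees as nested tuples; the names AM_READINGS and to_deep are hypothetical illustrations, not from TENDUM):

```python
# Minimal sketch: shallow formulas are nested tuples, and a computable
# mapping replaces the shallow constant AM by the deep sub-formula
# appropriate to the head noun. All names here are hypothetical.

# Deep readings of AM, keyed on the head noun of the phrase
# (for "airplane", the carrier-based reading is the other option above).
AM_READINGS = {
    "company":   lambda v: ("eq", ("country", ("headq", v)), "USA"),
    "passenger": lambda v: ("eq", ("nationality", v), "USA"),
    "airplane":  lambda v: ("eq", ("country", ("headq", ("builder", v))), "USA"),
}

def to_deep(formula, head_noun):
    """Recursively replace AM by its deep definition for this head noun."""
    if not isinstance(formula, tuple):
        return formula
    op, *args = formula
    if op == "AM":
        return AM_READINGS[head_noun](*args)
    return (op,) + tuple(to_deep(a, head_noun) for a in args)

# Shallow semantics of "American passenger": λx(passenger(x) ∧ AM(x))
shallow = ("and", ("passenger", "x"), ("AM", "x"))
print(to_deep(shallow, "passenger"))
# -> ('and', ('passenger', 'x'), ('eq', ('nationality', 'x'), 'USA'))
```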
9. Generalised Upper Model (GUM)
- GUM appears to opt for shallow constants
- Often, these appear to cover cases that are semantically very different
- It is not always clear how they are linked with deep (denotational) expressions (cf. Bunt & Romary 2004, "Do concepts have model-theoretic semantics?")
- This may result in formulas whose meaning we don't really understand
10. Example 2: Generalised possession (after Bateman 1992)
- The handle of the door, the door's handle, ..
  - part-of relation
- The desk of John, John's desk, ..
  - ownership relation
- The son of Abraham, Abraham's son, ..
  - son-of relation
- The father of Isaac, Isaac's father, ..
  - father-of relation
11. Shallow relation constant POS (part of, owned by, ...)
- The handle of the door: λx(POS(x, door) ∧ handle(x))
- The desk of John: λx(POS(x, John) ∧ desk(x))
- But:
- Abraham's son: λx(POS(x, Abraham) ∧ son(x)) ???
- Isaac's father: λx(POS(x, Isaac) ∧ father(x)) ???
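A minimal sketch of what a per-head-noun deep resolution of the shallow POS constant could look like (all names hypothetical, not from GUM or Bateman); the point is that son/father resist the move, because there the "possessive" relation just is the relational noun itself:

```python
# Hypothetical sketch: resolving the shallow constant POS to a deep
# relation, keyed on the head noun. Relational nouns like "son" and
# "father" have no independent possession relation to resolve to:
# the relation *is* the noun's own meaning.
DEEP_POS = {
    "handle": "part_of",    # the handle of the door
    "desk":   "owned_by",   # the desk of John
    "son":    None,         # POS(x, Abraham) adds nothing beyond son(x, Abraham)
    "father": None,         # likewise for father(x, Isaac)
}

def resolve_pos(head_noun, possessor):
    relation = DEEP_POS.get(head_noun)
    if relation is None:
        # Relational noun: the noun itself already encodes the relation.
        return f"{head_noun}(x, {possessor})"
    return f"{relation}(x, {possessor}) & {head_noun}(x)"

print(resolve_pos("handle", "door"))   # part_of(x, door) & handle(x)
print(resolve_pos("son", "Abraham"))   # son(x, Abraham)
```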
12. The issue illustrated by possession
- Is a shared form sufficient for postulating a shared (shallow) representation (e.g., POS in the case of generalised possession)?
- This issue can also be illustrated by focussing on the definite determiner
13. Example 3: Identifiability
- Generalised Upper Model works with abstract notions like definiteness (identifiability):
  - The pope lives in Italy
  - He is the son of a rich banker
  - A man collapsed. (..) The man died
  - He is the best left-footer in Scotland
- All these could be generated from something like ..IDENT pope.., ..IDENT son.., etc. (a sketch follows below)
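A minimal sketch (hypothetical, not GUM's actual machinery) of what this generation step involves: three different deep licensing conditions, taken from the examples above, all collapsed into the single shallow constant IDENT before surface realisation:

```python
# Hypothetical sketch: three deep licensing conditions that an NLG
# pipeline might collapse into one shallow constant IDENT.

def license_ident(referent, context):
    """Decide whether a definite description is licensed for referent."""
    if referent in context["mentioned"]:       # anaphoric: "A man ... The man"
        return True
    if referent in context["unique_by_role"]:  # role uniqueness: "the pope"
        return True
    if referent in context["superlative"]:     # superlative: "the best left-footer"
        return True
    return False

def realise(noun, ident):
    """Trivial surface rule: IDENT -> definite article."""
    return ("the " if ident else "a ") + noun

ctx = {"mentioned": {"man"}, "unique_by_role": {"pope"}, "superlative": set()}
print(realise("man", license_ident("man", ctx)))        # the man
print(realise("pope", license_ident("pope", ctx)))      # the pope
print(realise("banker", license_ident("banker", ctx)))  # a banker
```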
14. Identifiability (ctd.)
- Using one constant IDENT does not tell us what the different usages of the definite article have in common.
- In NLG, it could put an unreasonable burden on previous modules (which decide whether to generate IDENT or not)
- Perhaps the right question is not "How close to NL should an ontology be?" but "How do we link the different levels of meaning?"
15. Example 4: The weather
- The deepest question in this area: "How deep should deep representations be?" (Quantum mechanics, cf. Hobbs??)
- "The wind blows (fiercely), and it's snowing too"
- What we want:
  - Generate this from numerical weather data (a generation sketch follows after slide 17)
  - Interpret it and draw inferences
- Note: "it" ≠ "the wind"
16. The wind blows (fiercely)
- Suppose shallow rep.: blow(w); deep rep.: speed(w) > 50mph
- Mapping from shallow to deep: blow ⇒ λx(speed(x) > 50mph)
- "The wind blows": shallow blow(w) ⇒ deep λx(speed(x) > 50mph)(w) = speed(w) > 50mph
17. It is snowing
- What's a suitable shallow representation?
  - "it" does not refer
  - Maybe just an atomic proposition Snow
- Possible mapping from shallow to deep:
  - Snow ⇒ ∃x(precipitation(x) ∧ type(x) = snow ∧ quantity(x) > 10mm p/h)
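Putting slides 15-17 together, a minimal generation sketch (the field names and the data record are hypothetical; the 50 mph and 10 mm p/h thresholds are the ones used above): shallow propositions are computed from numerical weather data via the stated deep conditions, ready for surface realisation.

```python
# Minimal sketch: derive the shallow propositions blow(w) and Snow
# from numerical weather data, using the deep conditions above.
# Field names and the example record are hypothetical illustrations.

def shallow_weather(data):
    """Map numerical weather data to shallow propositions."""
    props = []
    # blow(w)  <=>  speed(w) > 50 mph
    if data["wind_speed_mph"] > 50:
        props.append("blow(w)")
    # Snow  <=>  ∃x(precipitation(x) ∧ type(x) = snow ∧ quantity(x) > 10 mm p/h)
    if data["precip_type"] == "snow" and data["precip_mm_per_h"] > 10:
        props.append("Snow")
    return props

data = {"wind_speed_mph": 55, "precip_type": "snow", "precip_mm_per_h": 12}
print(shallow_weather(data))  # ['blow(w)', 'Snow']
# From here an NLG module could realise: "The wind blows, and it's snowing too"
```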
18. Questions
- Are these the kinds of mismatches between NL and reality that you see as the main challenges for building ontologies that are useful to NLP?
- Does the proposed (classical multilevel semantics) approach look reasonable to you?
- How does this approach compare to the Generalised Upper Model?