1
Lexical Semantics for the Semantic Web

Patrick Hanks
Masaryk University, Brno
Czech Republic
hanks_at_fi.muni.cz
UFAL, Mathematics Faculty, Charles University in
Prague

2
Outline of the talk

A neglected aspect of Tim Berners-Lee's vision
Introducing semantics to the semantic web
Computing meaning and inferences in free text
Patterns in text and how to use them
Building a resource that encodes patterns
linking meanings (implicatures) to patterns (not
to words)
A pattern dictionary
What does the pattern dictionary look like?
The role of an ontology in a pattern dictionary

3
Aims of the Semantic Web

To enable computers to manipulate data
meaningfully
"Most of the Web's content today is designed for
humans to read, not for computer programs to
manipulate meaningfully."
(Berners-Lee et al., Scientific American, 2001)

4
A neglected aspect of Berners-Lee's vision

"Web technology must not discriminate between the
scribbled draft and the polished performance."
(T. Berners-Lee et al., Scientific American, 2001)
The vision includes being able to process the
meaning and implicatures of free text,
not just pre-processed tagged texts: wikis,
names, addresses, appointments, and suchlike.

5
A paradox

"Traditional KR systems typically have been
centralized, requiring everyone to share exactly
the same definition of common concepts such as
'parent' or 'vehicle'."
(Berners-Lee et al., 2001)
Implying that the SW is more tolerant?
Apparently not:
"Human languages thrive when using the same term
to mean somewhat different things, but automation
does not." (Ibid.)

6
The root of the problem

Scientists from Leibniz to the present have
wanted word meaning to be precise and certain.
But it isn't. Meaning in natural language is
vague and probabilistic
Some theoretical linguists (and CL researchers),
not liking fuzziness in data, have preferred to
disregard data in order to preserve theory
Do not allow SW research to fall into this trap
To fulfil Berners-Lee's dream, we need to be able
to compute the meaning of un-pre-processed
documents

7
What NOT to do for the SW

The meaning of the English noun 'second' is vague:
'a short unit of time' or '1/60 of a minute'.
Wait a second.
He looked at her for a second.
It is also a very precisely defined technical
term in certain scientific contexts: the basic
SI unit of time,
the duration of 9,192,631,770 cycles of
radiation corresponding to the transition between
two hyperfine levels of the ground state of an
atom of caesium 133.
If we try to stipulate a precise meaning for all
terms in advance of using them, we'll never be
able to fulfil the dream, and we will invent an
unusable language

8
Precision and vagueness

Stipulating a precise definition for an ordinary
word such as 'second' removes it from ordinary
language.
When it is given a precise, stipulative
definition, an ordinary word becomes a technical
term.
"An adequate definition of a vague concept must
aim not at precision but at vagueness: it must
aim at precisely that level of vagueness which
characterizes the concept itself."
(Wierzbicka 1985, pp. 12-13)

9
The paradox of natural language

Word meaning may be vague and fuzzy, but people
use words to make very precise statements
This can be done because text meaning is
holistic, e.g.
'fire' in isolation is very ambiguous
But 'He fired the bullet that was recovered from
the girl's body' is not at all ambiguous
'Ithaca' is ambiguous
But 'Ithaca, NY' is much less ambiguous.
Even the tiniest bit of (relevant) context helps.

10
What is to be done?

Process only the (strictly defined) mark-up of
documents, not their linguistic content?
And so abandon the dream of enabling computers to
manipulate linguistic content?
Force humans to conform to formal requirements
when writing documents?
Not a serious practical possibility
Teach computers to deal with natural language in
all its fearful fuzziness?
Maybe this is what we need to do

11
Hypertext and relevance

"The power of hypertext is that anything can link
to anything."
(Berners-Lee et al., 2001)
Yes, but we need procedures for determining
(automatically) what counts as a relevant link,
e.g.
Firing a person is relevant to employment law.
Firing a gun is relevant to warfare and armed
robbery.

12
How do we know who is doing what to whom?

Through context (a standard, uncontroversial
answer)
But teasing out relevant context is tricky
Firing a person: Person MUST be mentioned
Whereas firing a gun occurs in patterns where
neither Firearm nor Projectile is
mentioned, e.g.
The police fired into the crowd/over their
heads/wide.
Negative evidence can be important
'He fired' cannot mean 'he dismissed someone from
employment'
Relevant context is cumulative
So correlations among arguments are often needed
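
The contrast between the two uses of 'fire' can be sketched as a small rule check. This is only an illustration, not the CPA procedure itself: the `Clause` record and the `senses_of_fire` function are hypothetical names, and the semantic type of the direct object is assumed to have been assigned by some earlier analysis step.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Clause:
    """A pre-analysed clause headed by 'fire' (hypothetical structure)."""
    object_type: Optional[str] = None  # semantic type of the direct object, if any


def senses_of_fire(clause: Clause) -> set:
    """Return the senses of 'fire' that remain possible for this clause."""
    senses = set()
    # 'Dismiss' needs an explicit Person object. A missing object is
    # negative evidence that rules this sense out.
    if clause.object_type == "Person":
        senses.add("dismiss")
    # 'Discharge' tolerates an unmentioned Firearm or Projectile.
    if clause.object_type in (None, "Firearm", "Projectile"):
        senses.add("discharge")
    return senses


print(senses_of_fire(Clause()))          # "He fired."
print(senses_of_fire(Clause("Person")))  # "She fired the clerk."
```

A real system would weigh several arguments together, since relevant context is cumulative; this sketch looks at one slot only.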

13
How to compute meaning for the Semantic Web

STEP 1. Identify all the normal patterns of
normal utterances by data analysis
STEP 2. Develop a resource that says precisely
what the basic implicatures of each pattern are,
e.g.
Pattern: Human fire AdvDirection
Implicature: Human causes Firearm to discharge
Projectile
STEP 3. Populate the semantic types in an
ontology
STEP 4. Develop a linguistic theory that
distinguishes norms from exploitations
Abandon the received theories of speculative
linguists
STEP 5. Develop procedures for finding best
matches between a free text statement and a
pattern.
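
Steps 2 and 5 can be sketched together: a pattern entry that carries its implicature, and a naive best-match procedure. Everything here is an assumption made for illustration: the `Pattern` class, the slot names, the agreement-based score, and the wording of the second implicature are not taken from the Pattern Dictionary.

```python
from dataclasses import dataclass


@dataclass
class Pattern:
    verb: str
    slots: dict        # slot name -> expected semantic type
    implicature: str   # meaning attached to the pattern, not to the word


PATTERNS = [
    Pattern("fire", {"subject": "Human", "adverbial": "Direction"},
            "Human causes Firearm to discharge Projectile"),
    # Hypothetical wording for the 'dismiss' pattern.
    Pattern("fire", {"subject": "Human", "object": "Human"},
            "Human dismisses Human from employment"),
]


def score(pattern: Pattern, observed: dict) -> float:
    """Fraction of slots (expected or observed) on which both sides agree."""
    slots = set(pattern.slots) | set(observed)
    if not slots:
        return 0.0
    agree = sum(1 for s in slots if pattern.slots.get(s) == observed.get(s))
    return agree / len(slots)


def best_match(verb: str, observed: dict, patterns=PATTERNS):
    """Return the highest-scoring pattern for the verb, or None."""
    candidates = [p for p in patterns if p.verb == verb]
    if not candidates:
        return None
    return max(candidates, key=lambda p: score(p, observed))


# "The police fired into the crowd."
clause = {"subject": "Human", "adverbial": "Direction"}
print(best_match("fire", clause).implicature)
```

The score is deliberately crude. A usable matcher would need probabilities estimated from corpus data rather than an exact slot comparison.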

14
The double helix of language: norms and
exploitations

A natural language consists of TWO kinds of
rule-governed behaviour
Using words normally
Exploiting the norms
We don't even know what the norms of any language
are, still less the exploitation rules
People have assumed that norms of usage are
obvious
But only some of the things that are obvious are
true
We need to identify the norms by painstaking
empirical analysis of evidence
There is not a sharp dividing line between norm
and exploitation
Today's norm is tomorrow's exploitation

15
Corpus Pattern Analysis (CPA)

Identifies normal usage patterns for each word
Each pattern includes a verb, its valencies, and
the semantic type(s) of each argument (valence)
Associates a meaning (implicature) with each
pattern (NOT with each word)
Provides a basis for matching occurrences of
target words in unseen texts to their nearest
pattern (norm)
CPA is the basis for a Pattern Dictionary
(demo)
http://nlp.fi.muni.cz/projekty/cpa/
Click on 'web access' in line 1

16
Focusing arguments by semantic-type alternation

You can calm a person, calm a horse, calm
someone's nerves, fears, or anxiety.
These all activate the same meaning of the verb
'calm'. Anxiety does not have the required
semantic type (anxiety is not Animate)
However, the expected animate argument is present
but only as a possessive. And even if there is
no possessive, being an attribute of Animate
is part of the meaning of nerves, fear, anxiety,
etc.
Regular alternations such as these have a
focusing function. They do not activate different
senses.
Other examples:
Repair a car, repair the engine (of a car),
repair the damage
Treat a person, treat her injuries, treat her
injured arm
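
One way to picture a semantic-type alternation is as a second lookup that applies when the direct type check fails. The toy lexicon, the `Attribute` label, and the `satisfies` function below are hypothetical. The sketch assumes that each noun's type, and the type it is an attribute of, are already known.

```python
# Hypothetical toy lexicon: noun -> semantic type
NOUN_TYPE = {
    "person": "Animate",
    "horse": "Animate",
    "nerves": "Attribute",
    "fears": "Attribute",
    "anxiety": "Attribute",
}

# Nouns that denote an attribute of some other semantic type
ATTRIBUTE_OF = {
    "nerves": "Animate",
    "fears": "Animate",
    "anxiety": "Animate",
}


def satisfies(noun: str, required_type: str):
    """Return how the noun fills a slot: 'direct', 'alternation', or None."""
    if NOUN_TYPE.get(noun) == required_type:
        return "direct"
    # The noun is not of the required type itself, but it is an attribute
    # of that type, so the same sense of the verb is activated.
    if ATTRIBUTE_OF.get(noun) == required_type:
        return "alternation"
    return None


for noun in ("horse", "anxiety", "car"):
    print(noun, satisfies(noun, "Animate"))
```

Both 'direct' and 'alternation' point to the same pattern for 'calm'. The distinction records which part of the argument is in focus, not a different sense.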

17
Ontologies

The arguments of CPA patterns are expressed as
semantic types, related to a shallow semantic
ontology.
The term 'ontology' has become highly
ambiguous
SW ontologies are, typically, interlinked
networks of things like address lists, dates,
events, and websites, with HTML mark-up showing
attributes and values
They differ from philosophical ontologies, which
are theories about the nature of all the things
in the universe that exist
They also differ from lexical ontologies such as
WordNet, which are networks of words with
supposed conceptual relations
The CPA shallow ontology is a device for grouping
semantically similar words together to facilitate
meaning processing

18
The CPA Shallow Ontology

The CPA Shallow Ontology is a bag of bags of
words
Developed, bottom-up, by cluster analysis of
corpora
The nouns that NORMALLY occur in the same
syntagmatic slot in relation to a given verb are
grouped into a cluster
A cluster of different nouns activates the same
meaning of the verb
The cluster is named with a semantic type, e.g.
Human, Event, Abstract, Artefact, etc.
Each cluster is compared with similar clusters
occurring with other verbs. Each combination of
clusters constitutes a lexical set.
Identically named clusters contain slightly
different members (lexical items)
Therefore, lexical sets shimmer.
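
The "bag of bags of words" idea can be illustrated by grouping nouns by the verb slot they fill. This is a simplification: plain grouping stands in for real cluster analysis, the observations are invented for the example, and `build_clusters` is a hypothetical helper.

```python
from collections import defaultdict

# Invented corpus observations: (verb, slot, noun)
OBSERVATIONS = [
    ("arrange", "object", "meeting"),
    ("arrange", "object", "conference"),
    ("arrange", "object", "wedding"),
    ("attend", "object", "meeting"),
    ("attend", "object", "conference"),
    ("attend", "object", "lecture"),
]


def build_clusters(observations):
    """Group nouns by the (verb, slot) position in which they occur."""
    clusters = defaultdict(set)
    for verb, slot, noun in observations:
        clusters[(verb, slot)].add(noun)
    return dict(clusters)


clusters = build_clusters(OBSERVATIONS)
arrange_events = clusters[("arrange", "object")]
attend_events = clusters[("attend", "object")]

# Both clusters might be named Event, yet their members differ slightly.
print(sorted(arrange_events & attend_events))  # members shared by both
print(sorted(arrange_events ^ attend_events))  # members found in only one
```

The second printed list is the "shimmer": the same type name covers slightly different lexical items at each verb.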

19
The Predictive Power of Lexical Sets

EXAMPLE: The noun 'meeting' has been classified
with semantic type Event at both 'arrange' and
'attend'
Suppose 'meeting' is found in the direct object
slot after 'leave' or 'run', but not frequently
enough to have been included in a cluster for
those verbs in the Ontology
However, the patterns 'Human leave Event'
and 'Human run Event' will be found in
the Pattern Dictionary
Then there is a high probability that 'meeting'
belongs there (even though not listed as
typical), activating probable implicatures:
leave: "go away from"
run: "organize and cause to function
efficiently"
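
The inference in this example can be written down as a lookup across verbs. The data structures and the `predict` function are hypothetical. The sketch returns a candidate reading without estimating the probability the slide refers to.

```python
# Hypothetical ontology fragment:
# (verb, slot) -> {semantic type: nouns listed in that cluster}
ONTOLOGY = {
    ("arrange", "object"): {"Event": {"meeting", "conference"}},
    ("attend", "object"): {"Event": {"meeting", "lecture"}},
}

# Hypothetical pattern dictionary fragment:
# (verb, object type) -> implicature
PATTERN_IMPLICATURES = {
    ("leave", "Event"): "go away from",
    ("run", "Event"): "organize and cause to function efficiently",
}


def known_types(noun: str) -> set:
    """Semantic types the noun has been given at any verb in the ontology."""
    types = set()
    for clusters in ONTOLOGY.values():
        for sem_type, nouns in clusters.items():
            if noun in nouns:
                types.add(sem_type)
    return types


def predict(verb: str, noun: str):
    """Guess (type, implicature) for a noun not listed at this verb."""
    for sem_type in sorted(known_types(noun)):
        implicature = PATTERN_IMPLICATURES.get((verb, sem_type))
        if implicature is not None:
            return sem_type, implicature
    return None


print(predict("leave", "meeting"))
print(predict("run", "meeting"))
```

'meeting' is not listed at 'leave' or 'run', but its Event type at 'arrange' and 'attend' is enough to select a pattern for each.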

20
Phraseology in Computational Linguistics

Computational linguists are turning away from
word-by-word analysis (the Lego bricks method,
inherited from Frege) to phraseological analysis.
E.g.
Marine Carpuat and Dekai Wu. 2007. How phrase
sense disambiguation outperforms word sense
disambiguation for statistical machine
translation. In Proceedings, Conference on
Theoretical and Methodological Issues in Machine
Translation (TMI 2007). Skövde, Sweden
The Pattern Dictionary provides an inventory of
patterns
A benchmark for NLP researchers using patterns
A benchmark for introducing semantics to the
Semantic Web

21
The English Pattern Dictionary: current status

Focuses on verbs
Specifically, the correlations among the lexical
and semantic values of the arguments of each
sense of each verb
700 verbs analysed so far
400 verbs complete, finalized, checked and
released
300 more are work in progress, awaiting checking
There are approximately 6000 verbs in English, so
we have done about 10%
Shallow ontology in development
New lexically driven theory of language, which is
precise about the vague phenomenon of language
Hanks (forthcoming). Analysing the Lexicon: Norms
and Exploitations. MIT Press

22
The English Pattern Dictionary: the future

5,400 more verbs to analyse (then the adjectives)
Develop a different procedure for nouns (noun-y
nouns)
Finalize the CPA shallow ontology and populate it
Pattern dictionaries for other languages
Czech
German (A. Geyken, Berlin)
Italian (E. Jezek, U. of Pavia)
Theoretical work
Typology of exploitations
Implications of CPA for parsing theory
Alternation of semantic types in arguments
Relationship between semantic types and semantic
roles
Links between the Pattern Dictionary and FrameNet

23
Conclusions

To enable computers to manipulate data
meaningfully (the raw data itself, not just tags
added to the data), we need:
an inventory of patterns of normal usage for each
word
a pattern dictionary
a theory that distinguishes normal usage from
exploitations of norms for rhetorical, poetic,
and other purposes
pattern-matching procedures: text <-> pattern
dictionary
a statistical, probabilistic approach to
identifying meaning.
Only then will computers be able to compute the
meaning of texts, understand the implicatures,
translate them, retrieve data from them, and
manipulate them in other ways
At that point, we shall be a little closer to
realizing Berners-Lee's 2001 dream