Lexical Semantics for the Semantic Web - PowerPoint PPT Presentation

1
Lexical Semantics for the Semantic Web
  • Patrick Hanks
  • Masaryk University, Brno
  • Czech Republic
  • hanks_at_fi.muni.cz
  • UFAL, Mathematics Faculty, Charles University in
    Prague

2
Outline of the talk
  • A neglected aspect of Tim Berners-Lee's vision
  • Introducing semantics to the semantic web
  • Computing meaning and inferences in free text
  • Patterns in text and how to use them
  • Building a resource that encodes patterns
  • Linking meanings (implicatures) to patterns (not
    to words)
  • A pattern dictionary
  • What does the pattern dictionary look like?
  • The role of an ontology in a pattern dictionary

3
Aims of the Semantic Web
  • To enable computers to manipulate data
    meaningfully
  • "Most of the Web's content today is designed for
    humans to read, not for computer programs to
    manipulate meaningfully."
  • Berners-Lee et al., Scientific American, 2001

4
A neglected aspect of Berners-Lee's vision
  • "Web technology must not discriminate between the
    scribbled draft and the polished performance."
  • T. Berners-Lee et al.,
    Scientific American, 2001
  • The vision includes being able to process the
    meaning and implicatures of free text
  • not just pre-processed tagged texts: wikis,
    names, addresses, appointments, and suchlike.

5
A paradox
  • Traditional KR systems typically have been
    centralized, requiring everyone to share exactly
    the same definition of common concepts such as
    'parent' or 'vehicle'.
  • Berners-Lee et al. 2001.
  • Implying that SW is more tolerant?
  • Apparently not
  • "Human languages thrive when using the same term
    to mean somewhat different things, but automation
    does not." --Ibid.

6
The root of the problem
  • Scientists from Leibniz to the present have
    wanted word meaning to be precise and certain.
  • But it isn't. Meaning in natural language is
    vague and probabilistic
  • Some theoretical linguists (and CL researchers),
    not liking fuzziness in data, have preferred to
    disregard data in order to preserve theory
  • Do not allow SW research to fall into this trap
  • To fulfil Berners-Lee's dream, we need to be able
    to compute the meaning of un-pre-processed
    documents

7
What NOT to do for the SW
  • The meaning of the English noun 'second' is vague:
    'a short unit of time' or '1/60 of a minute'.
  • Wait a second.
  • He looked at her for a second.
  • It is also a very precisely defined technical
    term in certain scientific contexts: the basic
    SI unit of time
  • "the duration of 9,192,631,770 cycles of
    radiation corresponding to the transition between
    two hyperfine levels of the ground state of an
    atom of caesium-133."
  • If we try to stipulate a precise meaning for all
    terms in advance of using them, we'll never be
    able to fulfil the dream, and we will invent an
    unusable language

8
Precision and vagueness
  • Stipulating a precise definition for an ordinary
    word such as 'second' removes it from ordinary
    language.
  • When it is given a precise, stipulative
    definition, an ordinary word becomes a technical
    term.
  • "An adequate definition of a vague concept must
    aim not at precision but at vagueness: it must
    aim at precisely that level of vagueness which
    characterizes the concept itself."
  • Wierzbicka 1985, pp. 12-13

9
The paradox of natural language
  • Word meaning may be vague and fuzzy, but people
    use words to make very precise statements
  • This can be done because text meaning is
    holistic, e.g.
  • 'fire' in isolation is very ambiguous
  • But 'He fired the bullet that was recovered from
    the girl's body' is not at all ambiguous
  • 'Ithaca' is ambiguous
  • But 'Ithaca, NY' is much less ambiguous.
  • Even the tiniest bit of (relevant) context helps.

10
What is to be done?
  • Process only the (strictly defined) mark-up of
    documents, not their linguistic content?
  • And so abandon the dream of enabling computers to
    manipulate linguistic content?
  • Force humans to conform to formal requirements
    when writing documents?
  • Not a serious practical possibility
  • Teach computers to deal with natural language in
    all its fearful fuzziness?
  • Maybe this is what we need to do

11
Hypertext and relevance
  • The power of hypertext is that anything can link
    to anything.
  • Berners-Lee et al., 2001
  • Yes, but we need procedures for determining
    (automatically) what counts as a relevant link,
    e.g.
  • Firing a person is relevant to employment law.
  • Firing a gun is relevant to warfare and armed
    robbery.

12
How do we know who is doing what to whom?
  • Through context (a standard, uncontroversial
    answer)
  • But teasing out relevant context is tricky
  • Firing a person: [[Person]] MUST be mentioned
  • Whereas firing a gun occurs in patterns where
    neither [[Firearm]] nor [[Projectile]] is
    mentioned, e.g.
  • The police fired into the crowd/over their
    heads/wide.
  • Negative evidence can be important
  • 'He fired' cannot mean he dismissed someone from
    employment
  • Relevant context is cumulative
  • So correlations among arguments are often needed

13
How to compute meaning for the Semantic Web
  • STEP 1. Identify all the normal patterns of
    normal utterances by data analysis
  • STEP 2. Develop a resource that says precisely
    what the basic implicatures of each pattern are,
    e.g.
  • [[Human]] fire [AdvDirection]
  • = [[Human]] causes [[Firearm]] to discharge
    [[Projectile]]
  • STEP 3. Populate the semantic types in an
    ontology
  • STEP 4. Develop a linguistic theory that
    distinguishes norms from exploitations
  • Abandon the received theories of speculative
    linguists
  • STEP 5. Develop procedures for finding best
    matches between a free text statement and a
    pattern.
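The five steps above can be sketched in miniature. The data below is hypothetical (toy semantic types and implicatures, not actual Pattern Dictionary entries), and the function name is invented; it only illustrates STEP 5, matching a clause against stored patterns:

```python
# Toy illustration of matching a clause to a stored pattern (STEP 5).
# All names and data here are hypothetical, not the real CPA resources.

# Each pattern pairs (subject type, object type) with its implicature.
PATTERNS = {
    "fire": [
        (("Human", "Human"), "dismiss from employment"),
        (("Human", "Firearm"), "cause firearm to discharge projectile"),
        (("Human", None), "cause firearm to discharge projectile"),  # 'He fired.'
    ],
}

# Toy ontology population (STEP 3): which semantic type a noun belongs to.
TYPE_OF = {"manager": "Human", "clerk": "Human", "pistol": "Firearm"}

def best_match(verb, subj, obj):
    """Return the implicature of the stored pattern that fits the clause."""
    key = (TYPE_OF.get(subj), TYPE_OF.get(obj) if obj else None)
    for types, implicature in PATTERNS.get(verb, []):
        if types == key:
            return implicature
    return None  # no norm matched: possibly an exploitation (STEP 4)

print(best_match("fire", "manager", "clerk"))   # dismiss from employment
print(best_match("fire", "manager", "pistol"))  # firearm reading
print(best_match("fire", "manager", None))      # intransitive: firearm reading
```

Note how the intransitive case falls out of the same table: negative evidence (no direct object) still selects a determinate reading.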

14
The double helix of language norms and
exploitations
  • A natural language consists of TWO kinds of
    rule-governed behaviour
  • Using words normally
  • Exploiting the norms
  • We don't even know what the norms of any language
    are, still less the exploitation rules
  • People have assumed that norms of usage are
    obvious
  • But only some of the things that are obvious are
    true
  • We need to identify the norms by painstaking
    empirical analysis of evidence
  • There is no sharp dividing line between norm
    and exploitation
  • Today's norm is tomorrow's exploitation

15
Corpus Pattern Analysis (CPA)
  • Identifies normal usage patterns for each word
  • Each pattern includes a verb, its valencies, and
    the semantic type(s) of each argument (valency)
  • Associates a meaning (implicature) with each
    pattern (NOT with each word)
  • Provides a basis for matching occurrences of
    target words in unseen texts to their nearest
    pattern (norm)
  • CPA is the basis for a Pattern Dictionary
    (demo)
  • http://nlp.fi.muni.cz/projekty/cpa/
  • Click on 'web access' in line 1

16
Focusing arguments by semantic-type alternation
  • You can calm a person, calm a horse, calm
    someone's nerves, fears, or anxiety.
  • These all activate the same meaning of the verb
    'calm', even though 'anxiety' does not have the
    required semantic type (anxiety is not [[Animate]])
  • However, the expected [[Animate]] argument is
    present, but only as a possessive. And even if
    there is no possessive, being an attribute of
    [[Animate]] is part of the meaning of 'nerves',
    'fear', 'anxiety', etc.
  • Regular alternations such as these have a
    focusing function. They do not activate different
    senses.
  • Other examples
  • Repair a car, repair the engine (of a car),
    repair the damage
  • Treat a person, treat her injuries, treat her
    injured arm
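The focusing alternation can be mimicked in a few lines of code. This is a hedged sketch, not CPA's implementation; the word lists and the helper name are invented for illustration:

```python
# Sketch of the semantic-type alternation for 'calm' (invented word lists).
ANIMATE = {"person", "horse", "child"}
# Nouns whose meaning includes being an attribute of an [[Animate]]:
ATTRIBUTE_OF_ANIMATE = {"nerves", "fears", "anxiety"}

def activates_calm_sense(direct_object):
    """The same sense of 'calm' fires for an Animate or one of its attributes."""
    return direct_object in ANIMATE or direct_object in ATTRIBUTE_OF_ANIMATE

print(activates_calm_sense("horse"))    # True
print(activates_calm_sense("anxiety"))  # True: attribute of [[Animate]]
print(activates_calm_sense("engine"))   # False: would need a different sense
```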

17
Ontologies
  • The arguments of CPA patterns are expressed as
    semantic types, related to a shallow semantic
    ontology.
  • The term 'ontology' has become highly
    ambiguous
  • SW ontologies are, typically, interlinked
    networks of things like address lists, dates,
    events, and websites, with HTML mark-up showing
    attributes and values
  • They differ from philosophical ontologies, which
    are theories about the nature of all the things
    in the universe that exist
  • They also differ from lexical ontologies such as
    WordNet, which are networks of words with
    supposed conceptual relations
  • The CPA shallow ontology is a device for grouping
    semantically similar words together to facilitate
    meaning processing

18
The CPA Shallow Ontology
  • The CPA Shallow Ontology is a 'bag of bags of
    words'
  • Developed, bottom-up, by cluster analysis of
    corpora
  • The nouns that NORMALLY occur in the same
    syntagmatic slot in relation to a given verb are
    grouped into a cluster
  • A cluster of different nouns activates the same
    meaning of the verb
  • The cluster is named with a semantic type, e.g.
    [[Human]], [[Event]], [[Abstract]], [[Artefact]],
    etc.
  • Each cluster is compared with similar clusters
    occurring with other verbs. Each combination of
    clusters constitutes a lexical set.
  • Identically named clusters contain slightly
    different members (lexical items)
  • Therefore, lexical sets shimmer.
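A 'bag of bags of words' can be sketched as follows. The corpus triples and type labels are invented stand-ins; in CPA the clusters emerge bottom-up from large-scale corpus analysis, not from a hand-written lexicon:

```python
from collections import defaultdict

# Invented parsed corpus triples (subject, verb, object) standing in for
# real corpus lines.
corpus = [
    ("police", "fire", "gun"), ("soldier", "fire", "rifle"),
    ("boss", "fire", "clerk"), ("company", "fire", "manager"),
    ("chair", "attend", "meeting"), ("guest", "attend", "party"),
]

# Toy type labels; in CPA these come from cluster analysis of corpora.
TYPE_OF = {"gun": "Firearm", "rifle": "Firearm", "clerk": "Human",
           "manager": "Human", "meeting": "Event", "party": "Event"}

# Group the object-slot nouns of each verb into named bags of words:
# the result is a bag (per verb) of bags (per semantic type) of words.
clusters = defaultdict(lambda: defaultdict(set))
for subj, verb, obj in corpus:
    clusters[verb][TYPE_OF[obj]].add(obj)

print({t: sorted(ws) for t, ws in clusters["fire"].items()})
# {'Firearm': ['gun', 'rifle'], 'Human': ['clerk', 'manager']}
```

Because identically named bags at different verbs contain slightly different members, the lexical sets built from them 'shimmer' rather than coincide exactly.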

19
The Predictive Power of Lexical Sets
  • EXAMPLE: the noun 'meeting' has been classified
    with semantic type [[Event]] at both 'arrange' and
    'attend'
  • Suppose 'meeting' is found in the direct object
    slot after 'leave' or 'run', but not frequently
    enough to have been included in a cluster for
    those verbs in the Ontology
  • However, the patterns [[Human]] leave [[Event]]
    and [[Human]] run [[Event]] will be found in
    the Pattern Dictionary
  • Then there is a high probability that 'meeting'
    belongs there (even though not listed as
    typical), activating probable implicatures
  • leave = "go away from"
  • run = "organize and cause to function
    efficiently"
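The fallback inference on this slide can be sketched directly. The tables below are hypothetical stand-ins for the Pattern Dictionary and the Ontology, and the function name is invented:

```python
# 'meeting' is typed [[Event]] at 'arrange' and 'attend' (toy ontology data).
TYPE_AT = {("arrange", "meeting"): "Event", ("attend", "meeting"): "Event"}

# Pattern Dictionary stand-in: (verb, semantic type) -> implicature.
VERB_PATTERNS = {
    ("leave", "Event"): "go away from",
    ("run", "Event"): "organize and cause to function efficiently",
}

def infer_implicature(verb, noun):
    """Borrow the noun's semantic type from verbs where it was classified,
    then look for a matching pattern of the new verb."""
    types = {t for (v, n), t in TYPE_AT.items() if n == noun}
    for t in types:
        if (verb, t) in VERB_PATTERNS:
            return VERB_PATTERNS[(verb, t)]
    return None

print(infer_implicature("leave", "meeting"))  # go away from
print(infer_implicature("run", "meeting"))    # organize and cause to ...
```

The design point is that the probability comes for free: a noun never seen with a verb can still activate that verb's pattern, provided some other verb has already licensed its semantic type.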

20
Phraseology in Computational Linguistics
  • Computational linguists are turning away from
    word-by-word analysis (the Lego bricks method,
    inherited from Frege) to phraseological analysis.
    E.g.
  • Marine Carpuat and Dekai Wu. 2007. How phrase
    sense disambiguation outperforms word sense
    disambiguation for statistical machine
    translation. In Proceedings of the Conference on
    Theoretical and Methodological Issues in Machine
    Translation (TMI 2007), Skövde, Sweden.
  • The Pattern Dictionary provides an inventory of
    patterns
  • A benchmark for NLP researchers using patterns
  • A benchmark for introducing semantics to the
    Semantic Web

21
The English Pattern Dictionary: current status
  • Focuses on verbs
  • Specifically, the correlations among the lexical
    and semantic values of the arguments of each
    sense of each verb
  • 700 verbs analysed so far
  • 400 verbs complete, finalized, checked and
    released
  • 300 more are work in progress, awaiting checking
  • There are approximately 6,000 verbs in English, so
    we have done about 10%
  • Shallow ontology in development
  • New lexically driven theory of language, which is
    precise about the vague phenomenon of language
  • Hanks (forthcoming): Analysing the Lexicon: Norms
    and Exploitations. MIT Press

22
The English Pattern Dictionary: the future
  • 5,400 more verbs to analyse (then the adjectives)
  • Develop a different procedure for nouns (noun-y
    nouns)
  • Finalize the CPA shallow ontology and populate it
  • Pattern dictionaries for other languages
  • Czech
  • German (A. Geyken, Berlin)
  • Italian (E. Jezek, U. of Pavia)
  • Theoretical work
  • Typology of exploitations
  • Implications of CPA for parsing theory
  • Alternation of semantic types in arguments
  • Relationship between semantic types and semantic
    roles
  • Links between the Pattern Dictionary and FrameNet

23
Conclusions
  • To enable computers to manipulate data
    meaningfully (the raw data itself, not just tags
    added to the data), we need
  • an inventory of patterns of normal usage for each
    word
  • a pattern dictionary
  • a theory that distinguishes normal usage from
    exploitations of norms for rhetorical, poetic,
    and other purposes
  • pattern-matching procedures: text <-> pattern
    dictionary
  • a statistical, probabilistic approach to
    identifying meaning.
  • Only then will computers be able to compute the
    meaning of texts, understand the implicatures,
    translate them, retrieve data from them, and
    manipulate them in other ways
  • At that point, we shall be a little closer to
    realizing Berners-Lee's 2001 dream