Lecture 5: Lexical Relations

1
Lecture 5: Lexical Relations & WordNet
SIMS 202: Information Organization and Retrieval
  • Prof. Ray Larson & Prof. Marc Davis
  • UC Berkeley SIMS
  • Tuesday and Thursday 10:30 am - 12:00 pm
  • Fall 2003
  • http://www.sims.berkeley.edu/academics/courses/is202/f03/

2
Lecture Overview
  • Review
  • Lexical Relations
  • WordNet
  • Demo
  • Discussion Questions
  • Action Items for Next Time

Credit for some of the slides in this lecture
goes to Marti Hearst and Warren Sack
4
Definition of AI
  • "... artificial intelligence [AI] is the science
    of making machines do things that would require
    intelligence if done by humans" (Minsky, 1963)

5
The Goals of AI Are Not New
  • Ancient Greece
  • Daedalus' automata
  • Judaism's myth of the Golem
  • 18th century automata
  • Singing, dancing, playing chess?
  • Mechanical metaphors for mind
  • Clock
  • Telegraph/telephone network
  • Computer

6
Some Areas of AI
  • Knowledge representation
  • Programming languages
  • Natural language understanding
  • Speech understanding
  • Vision
  • Robotics
  • Planning
  • Machine learning
  • Expert systems
  • Qualitative simulation

7
AI or IA?
  • Artificial Intelligence (AI)
  • Make machines as smart as (or smarter than)
    people
  • Intelligence Amplification (IA)
  • Use machines to make people smarter

8
Furnas: The Vocabulary Problem
  • People use different words to describe the same
    things
  • "If one person assigns the name of an item, other
    untutored people will fail to access it on 80 to
    90 percent of their attempts."
  • "Simply stated, the data tell us there is no one
    good access term for most objects."

9
The Vocabulary Problem
  • How is it that we come to understand each other?
  • Shared context
  • Dialogue
  • How can machines come to understand what we say?
  • Shared context?
  • Dialogue?

10
Vocabulary Problem Solutions?
  • Furnas et al.
  • Make the user memorize precise system meanings
  • Have the user and system interact to identify the
    precise referent
  • Provide infinite aliases to objects
  • Minsky and Lenat
  • Give the system common sense so it can
    understand what the user's words can mean

11
CYC
  • Decades-long effort to build a commonsense
    knowledge base
  • Storied past
  • 100,000 basic concepts
  • 1,000,000 assertions about the world
  • The validity of Cyc's assertions is
    context-dependent (default reasoning)

12
Cyc Examples
  • Cyc can find the match between a user's query for
    "pictures of strong, adventurous people" and an
    image whose caption reads simply "a man climbing
    a cliff"
  • Cyc can notice if an annual salary and an hourly
    salary are inadvertently being added together in
    a spreadsheet
  • Cyc can combine information from multiple
    databases to guess which physicians in practice
    together had been classmates in medical school
  • When someone searches for "Bolivia" on the Web,
    Cyc knows not to offer a follow-up question like
    "Where can I get free Bolivia online?"

13
Cyc Applications
  • Applications currently available or in
    development
  • Integration of Heterogeneous Databases
  • Knowledge-Enhanced Retrieval of Captioned
    Information
  • Guided Integration of Structured Terminology
    (GIST)
  • Distributed AI
  • WWW Information Retrieval
  • Potential applications
  • Online brokering of goods and services
  • "Smart" interfaces
  • Intelligent character simulation for games
  • Enhanced virtual reality
  • Improved machine translation
  • Improved speech recognition
  • Sophisticated user modeling
  • Semantic data mining

14
Cyc's Top-Level Ontology
  • Fundamentals
  • Top Level
  • Time and Dates
  • Types of Predicates
  • Spatial Relations
  • Quantities
  • Mathematics
  • Contexts
  • Groups
  • "Doing"
  • Transformations
  • Changes Of State
  • Transfer Of Possession
  • Movement
  • Parts of Objects
  • Composition of Substances
  • Agents
  • Organizations
  • Actors
  • Roles
  • Professions
  • Emotion
  • Propositional Attitudes
  • Social
  • Biology
  • Chemistry
  • Physiology
  • General Medicine
  • Materials
  • Waves
  • Devices
  • Construction
  • Financial
  • Food
  • Clothing
  • Weather
  • Geography
  • Transportation
  • Information
  • Perception
  • Agreements
  • Linguistic Terms
  • Documentation

http://www.cyc.com/cyc-2-1/toc.html
16
Syntax
  • The syntax of a language is to be understood as a
    set of rules which accounts for the distribution
    of word forms throughout the sentences of a
    language
  • These rules codify permissible combinations of
    classes of word forms

17
Semantics
  • Semantics is the study of linguistic meaning
  • Two standard approaches to lexical semantics
    (cf. sentential semantics and logical
    semantics)
  • (1) compositional
  • (2) relational

18
Lexical Semantics: Compositional Approach
  • "Compositional lexical semantics, introduced by
    Katz & Fodor (1963), analyzes the meaning of a
    word in much the same way a sentence is analyzed
    into semantic components. The semantic components
    of a word are not themselves considered to be
    words, but are abstract elements (semantic atoms)
    postulated in order to describe word meanings
    (semantic molecules) and to explain the semantic
    relations between words. For example, the
    representation of bachelor might be ANIMATE and
    HUMAN and MALE and ADULT and NEVER MARRIED. The
    representation of man might be ANIMATE and HUMAN
    and MALE and ADULT; because all the semantic
    components of man are included in the semantic
    components of bachelor, it can be inferred that
    bachelor → man. In addition, there are
    implicational rules between semantic components,
    e.g. HUMAN → ANIMATE, which also look very much
    like meaning postulates."
  • George Miller, "On Knowing a Word," 1999
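
The inference described on this slide can be sketched in a few lines of Python. This is only an illustration: the dictionary name COMPONENTS and the function entails are hypothetical, and the feature sets are copied from the bachelor/man example above rather than from any real lexicon.

```python
# Semantic components per word, taken from the slide's example.
# COMPONENTS and entails() are illustrative names, not part of any library.
COMPONENTS = {
    "bachelor": {"ANIMATE", "HUMAN", "MALE", "ADULT", "NEVER_MARRIED"},
    "man": {"ANIMATE", "HUMAN", "MALE", "ADULT"},
}


def entails(word, other):
    """True if every semantic component of `other` is also a component of `word`."""
    return COMPONENTS[other] <= COMPONENTS[word]


bachelor_implies_man = entails("bachelor", "man")  # all of man's components are in bachelor
man_implies_bachelor = entails("man", "bachelor")  # NEVER_MARRIED is missing from man
```

Under these assumptions, bachelor entails man but not the reverse, because the subset test only holds in one direction.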

19
Lexical Semantics: Relational Approach
  • "Relational lexical semantics was first introduced
    by Carnap (1956) in the form of meaning
    postulates, where each postulate stated a
    semantic relation between words. A meaning
    postulate might look something like dog → animal
    (if x is a dog then x is an animal) or, adding
    logical constants, bachelor → man and never
    married [if x is a bachelor then x is a man and
    not(x has married)] or tall → not short [if x is
    tall then not(x is short)]. The meaning of a
    word was given, roughly, by the set of all
    meaning postulates in which it occurs."
  • George Miller, "On Knowing a Word," 1999
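
As a rough sketch of the relational view, meaning postulates can be stored as (antecedent, consequent) pairs, and the "meaning" of a word read off as every postulate that mentions it. The names POSTULATES and meaning are hypothetical, and splitting "man and never married" into two separate postulates is a simplifying assumption.

```python
# Meaning postulates from the slide, stored as (antecedent, consequent) pairs.
# Assumption: the conjunction "man and never married" is split into two pairs.
POSTULATES = [
    ("dog", "animal"),
    ("bachelor", "man"),
    ("bachelor", "never married"),
    ("tall", "not short"),
]


def meaning(word):
    """Return every postulate in which `word` occurs on either side."""
    return [pair for pair in POSTULATES if word in pair]


bachelor_meaning = meaning("bachelor")
man_meaning = meaning("man")
```

Note that "man" picks up the bachelor postulate too, since a word's meaning here is defined by all the postulates it occurs in, not only those where it is the antecedent.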

20
Pragmatics
  • Deals with the relation between signs or
    linguistic expressions and their users
  • Deixis (literally "pointing out")
  • E.g., "I'll be back in an hour" depends upon the
    time of the utterance
  • Conversational implicature
  • A: "Can you tell me the time?"
  • B: "Well, the milkman has come." [I don't know
    exactly, but perhaps you can deduce it from some
    extra information I give you.]
  • Presupposition
  • "Are you still such a bad driver?"
  • Speech acts
  • Constatives vs. performatives
  • E.g., "I second the motion."
  • Conversational structure
  • E.g., turn-taking rules

21
Language
  • Language only hints at meaning
  • Most meaning of text lies within our minds and
    common understanding
  • "How much is that doggy in the window?"
  • "How much": social system of barter and trade (not
    the size of the dog)
  • "doggy" implies childlike, plaintive, probably
    cannot do the purchasing on their own
  • "in the window" implies behind a store window,
    not really inside a window, requires notion of
    window shopping

22
Semantics: The Meaning of Symbols
  • Semantics versus Syntax
  • add(3,4)
  • 3 + 4
  • (different syntax, same meaning)
  • Meaning versus Representation
  • What a person's name is versus who they are
  • A rose by any other name...
  • What the computer program looks like versus
    what it actually does
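
The add(3,4) versus 3 + 4 contrast can be made concrete with Python's standard ast module. This is only a sketch: it uses Python's own grammar as a stand-in for "syntax" and operator.add as a stand-in for the slide's add.

```python
import ast
from operator import add

# Parse both spellings and look at the kind of syntax node each produces.
prefix_node = ast.parse("add(3, 4)", mode="eval").body
infix_node = ast.parse("3 + 4", mode="eval").body

prefix_kind = type(prefix_node).__name__  # a function-call node
infix_kind = type(infix_node).__name__    # a binary-operator node

# Evaluate both forms: the representations differ, the value does not.
prefix_value = add(3, 4)
infix_value = 3 + 4
```

The two expressions parse to different node types (different syntax) yet evaluate to the same number (same meaning).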

23
Semantics
  • Semantics: assigning meanings to symbols and
    expressions
  • Usually involves defining
  • Objects
  • Properties of objects
  • Relations between objects
  • More detailed versions include
  • Events
  • Time
  • Places
  • Measurements (quantities)

24
The Role of Context
  • The concept associated with the symbol "21" means
    different things in different contexts
  • Examples?
  • The question "Is there any salt?"
  • Asked of a waiter at a restaurant
  • Asked of an environmental scientist at work

25
What's in a Sentence?
  • "A sentence is not a verbal snapshot or movie
    of an event. In framing an utterance, you have to
    abstract away from everything you know, or can
    picture, about a situation, and present a
    schematic version which conveys the essentials.
    In terms of grammatical marking, there is not
    enough time in the speech situation for any
    language to allow for the marking of everything
    which could possibly be significant to the
    message."
  • Dan Slobin, in Language Acquisition: The State of
    the Art, 1982

26
Lexical Relations
  • Conceptual relations link concepts
  • Goal of Artificial Intelligence
  • Lexical relations link words
  • Goal of Linguistics

27
Major Lexical Relations
  • Synonymy
  • Polysemy
  • Metonymy
  • Hyponymy/Hypernymy
  • Meronymy/Holonymy
  • Antonymy

28
Synonymy
  • Different ways of expressing related concepts
  • Examples
  • cat, feline, Siamese cat
  • Overlaps with basic and subordinate levels
  • Synonyms are almost never truly substitutable
  • Used in different contexts
  • Have different implications
  • This is a point of contention

29
Polysemy
  • Most words have more than one sense
  • Homonym: same sound and/or spelling, different
    meaning (http://www.wikipedia.org/wiki/Homonym)
  • bank (river)
  • bank (financial)
  • Polysemy: different senses of the same word
    (http://www.wikipedia.org/wiki/Polysemy)
  • That dog has floppy ears.
  • She has a good ear for jazz.
  • bank (financial) has several related senses
  • the building, the institution, the notion of
    where money is stored

30
Metonymy
  • Use one aspect of something to stand for the
    whole
  • The building stands for the institution of the
    bank.
  • Newscast: "The White House released new figures
    today."
  • Waitperson: "The ham sandwich spilled his drink."

31
Hyponymy/Hypernymy
  • IS-A relation
  • Related to Superordinate and Subordinate level
    categories
  • hyponym(robin,bird)
  • hyponym(emu,bird)
  • hyponym(bird,animal)
  • hypernym(animal,bird)
  • A is a hypernym of B if B is a type of A
  • A is a hyponym of B if A is a type of B
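
The two definitions above can be sketched as a small IS-A lookup. The names HYPERNYM_OF, hypernyms, is_hyponym and is_hypernym are illustrative, the facts come from the slide, and following the chain upward (so that robin also counts as a hyponym of animal) is an assumption about how the relation is meant to compose.

```python
# Direct IS-A links from the slide: each word maps to its immediate hypernym.
HYPERNYM_OF = {"robin": "bird", "emu": "bird", "bird": "animal"}


def hypernyms(word):
    """Walk up the IS-A chain and return every ancestor, nearest first."""
    chain = []
    while word in HYPERNYM_OF:
        word = HYPERNYM_OF[word]
        chain.append(word)
    return chain


def is_hyponym(a, b):
    """A is a hyponym of B if A is a type of B (directly or via the chain)."""
    return b in hypernyms(a)


def is_hypernym(a, b):
    """A is a hypernym of B if B is a type of A."""
    return is_hyponym(b, a)


robin_chain = hypernyms("robin")
```

Defining is_hypernym as is_hyponym with its arguments swapped mirrors the slide: the two relations are inverses of each other.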

32
Basic-Level Categories (Review)
  • Brown 1958, 1965; Berlin et al. 1972, 1973
  • Folk biology
  • Unique beginner: plant, animal
  • Life form: tree, bush, flower
  • Generic name: pine, oak, maple, elm
  • Specific name: Ponderosa pine, white pine
  • Varietal name: Western Ponderosa pine
  • No overlap between levels
  • Level 3 is basic
  • Corresponds to genus
  • Folk biological categories correspond accurately
    to scientific biological categories only at the
    basic level

33
Psychologically Primary Levels
  • SUPERORDINATE: animal, furniture
  • BASIC LEVEL: dog, chair
  • SUBORDINATE: terrier, rocker
  • Children take longer to learn superordinate
  • Superordinate not associated with mental images
    or motor actions

34
Meronymy/Holonymy
  • Part/Whole relation
  • meronym(beak,bird)
  • meronym(bark,tree)
  • holonym(tree,bark)
  • Transitive conceptually but not lexically
  • The knob is a part of the door.
  • The door is a part of the house.
  • ? The knob is a part of the house ?
  • Holonyms are (approximately) the inverse of
    meronyms
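
A minimal sketch of the part/whole relation, assuming the pairs from this slide are the only facts. The names MERONYMS, parts_of, wholes_of and all_parts_of are hypothetical. Direct lookup models the lexical relation; the closure models the conceptual, transitive reading.

```python
# (part, whole) pairs from the slide.
MERONYMS = [
    ("beak", "bird"),
    ("bark", "tree"),
    ("knob", "door"),
    ("door", "house"),
]


def parts_of(whole):
    """Direct meronyms: parts explicitly listed for `whole`."""
    return {part for part, w in MERONYMS if w == whole}


def wholes_of(part):
    """Direct holonyms: the inverse lookup over the same pairs."""
    return {whole for p, whole in MERONYMS if p == part}


def all_parts_of(whole):
    """Transitive closure: parts, parts of parts, and so on."""
    found = set()
    frontier = [whole]
    while frontier:
        current = frontier.pop()
        for part in parts_of(current):
            if part not in found:
                found.add(part)
                frontier.append(part)
    return found


direct_house_parts = parts_of("house")
conceptual_house_parts = all_parts_of("house")
```

The knob only shows up as part of the house in the transitive version, which is the distinction the slide draws between conceptual and lexical transitivity.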

35
Antonymy
  • Lexical opposites
  • antonym(large, small)
  • antonym(big, small)
  • antonym(big, little)
  • but not antonym(large, little)
  • Many antonymous relations can be reliably
    detected by looking for statistical correlations
    in large text collections. (Justeson & Katz 91)

36
Thesauri and Lexical Relations
  • Polysemy: same word, different senses of meaning
  • Slightly different concepts expressed similarly
  • Synonyms: different words, related senses of
    meaning
  • Different ways to express similar concepts
  • Thesauri help draw all these together
  • Thesauri also commonly define a set of relations
    between terms that is similar to lexical
    relations
  • BT, NT, RT
  • More on Thesauri next week

37
What is an Ontology?
  • From Merriam-Webster's Collegiate
  • A branch of metaphysics concerned with the nature
    and relations of being
  • A particular theory about the nature of being or
    the kinds of existence
  • More prosaically
  • A carving up of the world's meanings
  • Determine what things exist, but not how they
    inter-relate
  • Related terms
  • Taxonomy, dictionary, category structure
  • Commonly used now in CS literature to describe
    structures that function as Thesauri

39
WordNet
  • Started in 1985 by George Miller, students, and
    colleagues at the Cognitive Science Laboratory,
    Princeton University
  • Miller is also known as the author of the paper "The
    Magical Number Seven, Plus or Minus Two: Some
    Limits on our Capacity for Processing
    Information" (1956)
  • Can be downloaded for free
  • www.cogsci.princeton.edu/wn/

40
Miller on WordNet
  • "In terms of coverage, WordNet's goals differ
    little from those of a good standard
    college-level dictionary, and the semantics of
    WordNet is based on the notion of word sense that
    lexicographers have traditionally used in writing
    dictionaries. It is in the organization of that
    information that WordNet aspires to innovation."
  • (Miller, 1998, Chapter 1)

41
Presuppositions of WordNet Project
  • Separability hypothesis
  • The lexical component of language can be
    separated and studied in its own right
  • Patterning hypothesis
  • People have knowledge of the systematic patterns
    and relations between word meanings
  • Comprehensiveness hypothesis
  • Computational linguistics programs need a store
    of lexical knowledge that is as extensive as that
    which people have

42
WordNet Size
WordNet uses "synsets": sets of synonymous terms
  POS         Unique Strings   Synsets
  Noun                107930     74488
  Verb                 10806     12754
  Adjective            21365     18523
  Adverb                4583      3612
  Totals              144684    109377
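
To see why the table counts strings and synsets separately, here is a toy sketch of the synset idea. It does not use the WordNet database or any WordNet API. SYNSETS and senses are hypothetical names, and the entries are borrowed from earlier slides (cat/feline, and the two senses of bank).

```python
# Each entry pairs a set of synonymous strings with a short sense label.
# Toy data borrowed from the lecture's own examples, not from WordNet.
SYNSETS = [
    ({"cat", "feline"}, "animal"),
    ({"bank"}, "financial"),
    ({"bank"}, "river"),
]


def senses(word):
    """Sense labels of every synset that contains `word`."""
    return [label for words, label in SYNSETS if word in words]


# One string can sit in several synsets, and one synset can hold several strings,
# so the two counts are tallied independently.
unique_strings = set().union(*(words for words, _ in SYNSETS))
synset_count = len(SYNSETS)
bank_senses = senses("bank")
```

Here "bank" is one string with two senses, while "cat" and "feline" are two strings sharing one synset.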

43
Structure of WordNet
44
Structure of WordNet
45
Structure of WordNet
46
Unique Beginners
  • Entity, something
  • (anything having existence (living or nonliving))
  • Psychological_feature
  • (a feature of the mental life of a living
    organism)
  • Abstraction
  • (a general concept formed by extracting common
    features from specific examples)
  • State
  • (the way something is with respect to its main
    attributes; "the current state of knowledge";
    "his state of health"; "in a weak financial
    state")
  • Event
  • (something that happens at a given place and
    time)

47
Unique Beginners
  • Act, human_action, human_activity
  • (something that people do or cause to happen)
  • Group, grouping
  • (any number of entities (members) considered as a
    unit)
  • Possession
  • (anything owned or possessed)
  • Phenomenon
  • (any state or process known through the senses
    rather than by intuition or reasoning)

49
WordNet Demo
  • Available online (from Unix) if you wish to try
    it
  • Log in to irony and type "wn word" for any word
    you are interested in
  • Demo

51
Discussion Questions
  • Joe Hall on Lexical Relations and WordNet
  • Which method of linguistic analysis do you think
    will be more fruitful... the painstaking process
    involved with building WordNet or the relatively
    easy output afforded by Church et al.'s
    computational method that, however, requires much
    work to decipher the results?

52
Discussion Questions
  • Joe Hall on Lexical Relations and WordNet
  • What are the problems/advantages of using the
    World Wide Web itself as a "corpus"? (If you were
    to incorporate the current digital copies of all
    newspapers, journals, etc. wouldn't you very
    quickly exceed the 15 million words of the
    largest corpus in the Church article?)

53
Discussion Questions
  • Joe Hall on Lexical Relations and WordNet
  • With the diversity of dialects of the English
    language, how much does this type of
    computational analysis get confused by phrases
    such as "What up?" (i.e., slang)? Aren't these
    some of the more interesting parts of language
    (i.e., how language evolves)?

55
Homework
  • Read Chapters 3 and 5 of The Organization of
    Information (Textbook)
  • Discussion Question volunteers?
  • Tu Tran
  • Hong Qu

56
Next Time
  • Introduction to Metadata