Title: Putting Meaning Into Your Trees
Slide 1: Putting Meaning Into Your Trees
- Martha Palmer
- University of Pennsylvania
- Institute for Research in Cognitive Science
- October 17, 2003
Slide 2: Elusive nature of meaning
- Natural Language Understanding → Natural Language Processing/Engineering
- Empirical techniques rule!
Slide 3: Outline
- Introduction
- Background: WordNet, Levin classes, VerbNet
- Proposition Bank: capturing shallow semantics
- Mapping PropBank to VerbNet
- Mapping PropBank to WordNet
Slide 4: Ask Jeeves (a Q/A, IR example)
- Question: What do you call a successful movie?
- Retrieved hits:
  - "Tips on Being a Successful Movie Vampire ... I shall call the police."
  - "Successful Casting Call Shoot for 'Clash of Empires' ... thank everyone for their participation in the making of yesterday's movie."
  - "Demme's casting is also highly entertaining, although I wouldn't go so far as to call it successful. This movie's resemblance to its predecessor is pretty vague..."
  - "VHS Movies: Successful Cold Call Selling: Over 100 New Ideas, Scripts, and Examples from the Nation's Foremost Sales Trainer."
- Target answer: a blockbuster
Slide 5: Ask Jeeves, filtering with POS tags
- Question: What do you call a successful movie?
- Retrieved hits: the same list as on slide 4.
Slide 6: Filtering out "call the police"
- Syntax: in the question, "call" takes a sentential (small-clause) complement, while in "call the police" it takes a plain noun phrase object, so syntax can filter out the latter.
Slide 7: Filtering out "call the police"
- Sense tags: the two hits use different senses of "call".
Slide 8: Ask Jeeves (a Q/A, IR example)
- Question: Whom do you call for a successful movie?
- Retrieved hits: the same list as on slide 4.
Slide 9: Distinguishing the questions
- [Diagram: call(you, whom, for a successful movie), with the predicate's sense unresolved. Syntax? The parse alone does not say which sense of "call" is meant.]
Slide 10: Distinguishing the questions
- [Diagram: call3(you, whom, for a successful movie). Sense tags resolve the predicate to sense 3 of "call".]
Slide 11: An English lexical resource is required
- that provides sets of possible syntactic frames for verbs,
- and provides clear, replicable sense distinctions.
- Ask Jeeves: "Who do you call for a good electronic lexical database for English?"
Slide 12: WordNet "call", 28 senses
1. name, call -- (assign a specified, proper name to; "They named their son David") -> LABEL
2. call, telephone, call up, phone, ring -- (get or try to get into communication (with someone) by telephone; "I tried to call you all night") -> TELECOMMUNICATE
3. call -- (ascribe a quality to or give a name of a common noun that reflects a quality; "He called me a bastard") -> LABEL
4. call, send for -- (order, request, or command to come; "She was called into the director's office"; "Call the police!") -> ORDER
- (These senses can be browsed with the NLTK sketch below.)
Slide 13: WordNet, Princeton (Miller 1985, Fellbaum 1998)
- On-line lexical reference (dictionary)
- Nouns, verbs, adjectives, and adverbs grouped into synonym sets (synsets)
- Other relations include hypernyms (ISA), antonyms, meronyms
- Limitations as a computational lexicon:
  - Contains little syntactic information
  - No explicit predicate argument structures
  - No systematic extension of basic senses
  - Sense distinctions are very fine-grained (ITA 73%)
  - No hierarchical entries
Slide 14: Levin classes (Levin, 1993)
- 3100 verbs; 47 top-level classes, 193 second- and third-level classes
- Each class has a syntactic signature based on alternations:
  - John broke the jar. / The jar broke. / Jars break easily.
  - John cut the bread. / *The bread cut. / Bread cuts easily.
  - John hit the wall. / *The wall hit. / *Walls hit easily.
Slide 15: Levin classes (Levin, 1993)
- Verb class hierarchy: 3100 verbs; 47 top-level classes, 193 below
- Each class has a syntactic signature based on alternations:
  - John broke the jar. / The jar broke. / Jars break easily. (change-of-state)
  - John cut the bread. / *The bread cut. / Bread cuts easily. (change-of-state, recognizable action, sharp instrument)
  - John hit the wall. / *The wall hit. / *Walls hit easily. (contact, exertion of force)
Slide 17: Confusions in Levin classes?
- Not semantically homogeneous
  - braid, clip, file, powder, pluck, etc.
- Multiple class listings
  - homonymy or polysemy?
- Conflicting alternations?
  - Carry verbs disallow the conative (*she carried at the ball), yet the class includes push, pull, shove, kick, draw, yank, tug
  - kick is also in the push/pull class, which does take the conative (she kicked at the ball)
Slide 18: Intersective Levin Classes
- [Diagram: verbs shared among Levin classes, with complements signalling the semantic component: "apart" marks CH-STATE; "across the room" marks CH-LOC; "at" marks CH-LOC.]
- Dang, Kipper & Palmer, ACL98
Slide 19: Intersective Levin Classes
- More syntactically and semantically coherent:
  - sets of syntactic patterns
  - explicit semantic components
  - relations between senses
- VERBNET: www.cis.upenn.edu/verbnet
- Dang, Kipper & Palmer, IJCAI00, Coling00
Slide 20: VerbNet (Karin Kipper)
- Class entries:
  - Capture generalizations about verb behavior
  - Organized hierarchically
  - Members have common semantic elements, thematic roles, and syntactic frames
- Verb entries:
  - Refer to a set of classes (different senses)
  - Each class member is linked to WN synset(s) (not all WN senses are covered)
- (Class entries can be browsed with the NLTK sketch below.)
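NLTK bundles a later release of VerbNet, which makes these class entries easy to explore. A minimal sketch, assuming the verbnet corpus is installed; class IDs in current VerbNet may differ from the 2003-era entries shown on the next slides.

```python
# Look up VerbNet classes for a lemma and print one full class entry.
from nltk.corpus import verbnet

print(verbnet.classids('escape'))       # class IDs whose members include "escape"
entry = verbnet.vnclass('escape-51.1')  # assumes this ID exists in the bundled version
print(verbnet.pprint(entry))            # members, thematic roles, frames, semantics
```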
Slide 21: Levin class escape-51.1-1
- WordNet senses: WN 1, 5, 8
- Thematic roles: Location[+concrete], Theme[+concrete]
- Frames with semantics:
  - Basic intransitive: "The convict escaped" -- motion(during(E), Theme), direction(during(E), Prep, Theme, Location)
  - Intransitive (+ path PP): "The convict escaped from the prison"
  - Locative preposition drop: "The convict escaped the prison"
Slide 22: Levin class future_having-13.3
- WordNet senses: WN 2, 10, 13
- Thematic roles: Agent[+animate OR +organization], Recipient[+animate OR +organization], Theme
- Frames with semantics (rendered as a data structure in the sketch below):
  - Dative: "I promised somebody my time" -- Agent V Recipient Theme; has_possession(start(E), Agent, Theme), future_possession(end(E), Recipient, Theme), cause(Agent, E)
  - Transitive (+ Recipient PP): "We offered our paycheck to her" -- Agent V Theme Prep(to) Recipient
  - Transitive (Theme object): "I promised my house (to somebody)" -- Agent V Theme
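To make the frame/semantics pairing concrete, here is a hypothetical in-memory rendering of the entry above. This is illustrative only, not VerbNet's actual file format, and the class/field names are invented.

```python
from dataclasses import dataclass, field

@dataclass
class Frame:
    name: str        # e.g. "Dative"
    example: str     # e.g. "I promised somebody my time"
    syntax: list     # constituent order, e.g. ["Agent", "V", "Recipient", "Theme"]
    semantics: list  # predicates over the event variable E

@dataclass
class VerbClass:
    classid: str
    wn_senses: list
    thematic_roles: dict  # role name -> selectional restriction (or None)
    frames: list = field(default_factory=list)

future_having = VerbClass(
    classid='future_having-13.3',
    wn_senses=[2, 10, 13],
    thematic_roles={'Agent': '+animate OR +organization',
                    'Recipient': '+animate OR +organization',
                    'Theme': None},
    frames=[Frame('Dative', 'I promised somebody my time',
                  ['Agent', 'V', 'Recipient', 'Theme'],
                  ['has_possession(start(E), Agent, Theme)',
                   'future_possession(end(E), Recipient, Theme)',
                   'cause(Agent, E)'])])
```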
Slide 23: Hand-built resources vs. real data
- VerbNet is based on linguistic theory: how useful is it?
- How well does it correspond to syntactic variations found in naturally occurring text?
Slide 24: Proposition Bank: from sentences to propositions
- meet(Somebody1, Somebody2) . . .
- "When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane."
  meet(Powell, Zhu)
  discuss(Powell, Zhu, return(X, plane))
- (A sketch of this nested structure follows below.)
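Propositions nest: one argument of discuss is itself a proposition. A minimal sketch (a hypothetical representation, not PropBank's own format) of that recursive structure:

```python
from typing import NamedTuple

class Prop(NamedTuple):
    predicate: str
    args: tuple  # each element is a string constant or a nested Prop

meet = Prop('meet', ('Powell', 'Zhu'))
discuss = Prop('discuss', ('Powell', 'Zhu', Prop('return', ('X', 'plane'))))
print(discuss)
```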
Slide 25: Capturing semantic roles
- [SUBJ Mark] broke [ARG1 the LCD Projector].
- [ARG1, SUBJ The windows] were broken by the hurricane.
- [ARG1, SUBJ The vase] broke into pieces when it toppled over.
- The same semantic role, Arg1 (the thing broken), surfaces in different syntactic positions.
- See also FrameNet: http://www.icsi.berkeley.edu/framenet/
Slide 26: An English lexical resource is required
- that provides sets of possible syntactic frames for verbs with semantic role labels,
- and provides clear, replicable sense distinctions.
Slide 27: A TreeBanked phrase
- "a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company"
- [Parse tree: the NP "a GM-Jaguar pact" is modified by an SBAR whose WHNP-1 "that" binds the trace *T*-1 in NP-SBJ position of the S "would give ...", where "give" heads a VP with NP and PP-LOC constituents.]
Slide 28: The same phrase, PropBanked
- "a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company"
- [Diagram: give: Arg0 = *T*-1 ("that", coindexed with "a GM-Jaguar pact"); Arg2 = "the US car maker"; Arg1 = "an eventual 30% stake in the British company".]
Slide 29: The full sentence, PropBanked
- "Analysts have been expecting a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company."
- [Diagram: expect: Arg0 = "Analysts"; Arg1 = "a GM-Jaguar pact that would give ...". Nested within Arg1: give: Arg0 = *T*-1 (the pact); Arg2 = "the US car maker"; Arg1 = "an eventual 30% stake in the British company".]
Slide 30: A treebanked sentence [figure only]
Slide 31: The same sentence, PropBanked [figure only]
Slide 32: Frames file example: expect
- Roles:
  - Arg0: expecter
  - Arg1: thing expected
- Example (transitive, active): "Portfolio managers expect further declines in interest rates."
  - Arg0: Portfolio managers; REL: expect; Arg1: further declines in interest rates
Slide 33: Frames file example: give
- Roles:
  - Arg0: giver
  - Arg1: thing given
  - Arg2: entity given to
- Example (double object): "The executives gave the chefs a standing ovation."
  - Arg0: The executives; REL: gave; Arg2: the chefs; Arg1: a standing ovation
- (A roleset sanity check is sketched below.)
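A small sketch of how such rolesets might be held in memory and used to sanity-check an annotated instance. This is a hypothetical representation, not the PropBank frames-file format (which is XML).

```python
# Rolesets from the two frames files above, plus one annotated instance.
ROLESETS = {
    'expect.01': {'Arg0': 'expecter', 'Arg1': 'thing expected'},
    'give.01':   {'Arg0': 'giver', 'Arg1': 'thing given',
                  'Arg2': 'entity given to'},
}

instance = {'roleset': 'give.01', 'rel': 'gave',
            'args': {'Arg0': 'The executives',
                     'Arg2': 'the chefs',
                     'Arg1': 'a standing ovation'}}

# Every labeled argument must be defined by the instance's roleset.
assert all(arg in ROLESETS[instance['roleset']] for arg in instance['args'])
```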
Slide 34: Annotation procedure
- PTB II: extraction of all sentences with a given verb
- Create frames file for that verb (Paul Kingsbury)
  - 3100 lemmas, 4400 framesets, 118K predicates
  - Over 300 created automatically via VerbNet
- First pass: automatic tagging (Joseph Rosenzweig)
  - http://www.cis.upenn.edu/josephr/TIDES/index.html (lexicon)
- Second pass: double-blind hand correction (Paul Kingsbury)
  - Tagging tool highlights discrepancies (Scott Cotton)
- Third pass: Solomonization (adjudication) (Betsy Klipple, Olga Babko-Malaya)
Slide 35: Trends in Argument Numbering
- Arg0: agent
- Arg1: direct object / theme / patient
- Arg2: indirect object / benefactive / instrument / attribute / end state
- Arg3: start point / benefactive / instrument / attribute
- Arg4: end point
- Per-word vs. frame-level numbering: which is more general?
Slide 36: Additional tags (arguments or adjuncts?)
- Variety of ArgMs (Arg > 4):
  - TMP: when?
  - LOC: where at?
  - DIR: where to?
  - MNR: how?
  - PRP: why?
  - REC: himself, themselves, each other
  - PRD: this argument refers to or modifies another
  - ADV: others
Slide 37: Inflection
- Verbs are also marked for tense/aspect:
  - Passive/active
  - Perfect/progressive
  - Third singular (is, has, does, was)
  - Present/past/future
  - Infinitives/participles/gerunds/finites
- Modals and negations are marked as ArgMs
Slide 38: Complex mapping: PropBank/FrameNet
- Buy: Arg0 buyer, Arg1 commodity, Arg2 seller, Arg3 price, Arg4 beneficiary
- Sell: Arg0 seller, Arg1 commodity, Arg2 buyer, Arg3 price, Arg4 beneficiary
- Trade, exchange: Arg0 one party, Arg1 commodity, Arg2 other party, Arg3 price, Arg4 beneficiary
Slide 39: Ergative/Unaccusative Verbs
- Roles (no Arg0 for unaccusative verbs):
  - Arg1: logical subject, patient, thing rising
  - Arg2: EXT, amount risen
  - Arg3: start point
  - Arg4: end point
- "Sales rose 4% to $3.28 billion from $3.16 billion."
- "The Nasdaq composite index added 1.01 to 456.6 on paltry volume."
Slide 40: Annotator accuracy: ITA 84%
Slide 41: Actual data for "leave"
- http://www.cs.rochester.edu/gildea/PropBank/Sort/
- leave.01 "move away from": Arg0 rel Arg1 Arg3
- leave.02 "give": Arg0 rel Arg1 Arg2
- Observed argument patterns (tallied as in the sketch below):
  - sub-ARG0 obj-ARG1: 44%
  - sub-ARG0: 20%
  - sub-ARG0 NP-ARG1-with obj-ARG2: 17%
  - sub-ARG0 sub-ARG2 ADJP-ARG3-PRD: 10%
  - sub-ARG0 sub-ARG1 ADJP-ARG3-PRD: 6%
  - sub-ARG0 sub-ARG1 VP-ARG3-PRD: 5%
  - NP-ARG1-with obj-ARG2: 4%
  - obj-ARG1: 3%
  - sub-ARG0 sub-ARG2 VP-ARG3-PRD: 3%
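Frequency tables like this are straightforward to produce once instances are labeled. A minimal sketch, with toy data standing in for the real annotations:

```python
from collections import Counter

# One tuple of role-bearing positions per annotated occurrence of the verb.
instances = [('sub-ARG0', 'obj-ARG1'), ('sub-ARG0',), ('sub-ARG0', 'obj-ARG1')]

counts = Counter(instances)
total = sum(counts.values())
for pattern, n in counts.most_common():
    print(f"{' '.join(pattern):35s} {100 * n / total:.0f}%")
```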
Slide 42: Automatic classification
- Merlo & Stevenson automatically classified 59 verbs with 69.8% accuracy:
  - Classes: 1. unergative, 2. unaccusative, 3. object-drop
  - 100M words, automatically parsed
  - C5.0, using the features transitivity, causativity, animacy, voice, POS (see the sketch below)
- EM clustering: 61%, 2669 instances, 1M words
  - Using gold-standard semantic role labels
  - 1. float hop/hope jump march leap
  - 2. change clear collapse cool crack open flood
  - 3. borrow clean inherit reap organize study
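A rough sketch of the classification setup, with sklearn's DecisionTreeClassifier standing in for C5.0; the feature values below are invented for illustration.

```python
from sklearn.tree import DecisionTreeClassifier

# One row per verb: relative frequencies of transitive use, causative use,
# animate subjects, and passive voice, estimated over the verb's occurrences.
X = [[0.9, 0.1, 0.8, 0.2],
     [0.2, 0.7, 0.3, 0.1],
     [0.1, 0.0, 0.9, 0.0]]
y = ['object-drop', 'unaccusative', 'unergative']

clf = DecisionTreeClassifier().fit(X, y)
print(clf.predict([[0.8, 0.2, 0.7, 0.15]]))
```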
Slide 43: An English lexical resource is required
- that provides sets of possible syntactic frames for verbs with semantic role labels?
- And provides clear, replicable sense distinctions.
Slide 44: An English lexical resource is required
- that provides sets of possible syntactic frames for verbs with semantic role labels that can be automatically assigned accurately to new text?
- And provides clear, replicable sense distinctions.
Slide 45: Automatic Labelling of Semantic Relations
- Stochastic model
- Features:
  - Predicate
  - Phrase type
  - Parse tree path (see the sketch below)
  - Position (before/after predicate)
  - Voice (active/passive)
  - Head word
- Gildea & Jurafsky, CL02; Gildea & Palmer, ACL02
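The parse tree path is the most distinctive of these features. A sketch of it (assumed details, not the paper's code), using nltk tree positions: the path is the chain of node labels from the candidate constituent up to the lowest common ancestor, then down to the predicate.

```python
from nltk import Tree

def path_feature(tree, arg_pos, pred_pos):
    # arg_pos / pred_pos are nltk tree positions (tuples of child indices)
    # for the argument constituent and the predicate's preterminal.
    i = 0
    while i < min(len(arg_pos), len(pred_pos)) and arg_pos[i] == pred_pos[i]:
        i += 1  # length of the shared prefix, i.e. the lowest common ancestor
    up = [tree[arg_pos[:j]].label() for j in range(len(arg_pos), i - 1, -1)]
    down = [tree[pred_pos[:j]].label() for j in range(i + 1, len(pred_pos) + 1)]
    return '↑'.join(up) + ('↓' + '↓'.join(down) if down else '')

t = Tree.fromstring(
    "(S (NP (NNP Mark)) (VP (VBD broke) (NP (DT the) (NN projector))))")
print(path_feature(t, (0,), (1, 0)))  # NP↑S↓VP↓VBD
```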
Slide 46: Semantic Role Labelling Accuracy (Known Boundaries)
- Accuracy of semantic role prediction for known boundaries: the system is given the constituents to classify.
- FrameNet examples (training/test) are handpicked to be unambiguous.
- Lower performance with unknown boundaries.
- Higher performance with traces.
- Almost evens out.
Slide 47: Additional Automatic Role Labelers
- Performance improved from 77% to 88% (gold-standard parses, < 10 instances)
- Same features, plus:
  - Named entity tags
  - Head word POS
  - For unseen verbs, backoff to automatic verb clusters
- SVMs (see the sketch below):
  - Role or not role
  - For each likely role, for each Arg: Arg or not
  - No overlapping role labels allowed
- Pradhan et al., ICDM03; Surdeanu et al., ACL03; Chen & Rambow, EMNLP03
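A schematic of the two-stage SVM setup, with sklearn's LinearSVC as a stand-in and invented toy features (see slide 45 for the real feature set): one classifier decides role vs. not-role, a second assigns the label.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

feats = [{'phrase_type': 'NP', 'position': 'before', 'voice': 'active'},
         {'phrase_type': 'PP', 'position': 'after',  'voice': 'active'},
         {'phrase_type': 'NP', 'position': 'after',  'voice': 'active'}]
is_role = [1, 0, 1]         # stage 1: role vs. not role
labels = ['ARG0', 'ARG1']   # stage 2: labels for the role-bearing items

identify = make_pipeline(DictVectorizer(), LinearSVC()).fit(feats, is_role)
classify = make_pipeline(DictVectorizer(), LinearSVC()).fit(
    [f for f, r in zip(feats, is_role) if r], labels)
```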
Slide 48: EM clustering (based on Gildea, Coling02)
- Szuting Yi
- Uses a hidden variable to predict Arg0 and Arg1 labels
- P(r | v, s, n), supervised
- Score of 92.79% vs. 91.11% (MLE baseline)
Slide 49: Iterative Clustering
- Combining supervised (PB) and unsupervised (BNC) data
Slide 50: A Chinese Treebank Sentence
- 国会/Congress 最近/recently 通过/pass 了/ASP 银行法/banking law
- "The Congress passed the banking law recently."
- (IP (NP-SBJ (NN 国会/Congress))
      (VP (ADVP (ADV 最近/recently))
          (VP (VV 通过/pass)
              (AS 了/ASP)
              (NP-OBJ (NN 银行法/banking law)))))
Slide 51: The Same Sentence, PropBanked
- (IP (NP-SBJ arg0 (NN 国会))
      (VP argM (ADVP (ADV 最近))
          (VP f2 (VV 通过)
              (AS 了)
              arg1 (NP-OBJ (NN 银行法)))))
- 通过(f2) (pass): arg0 国会 (Congress), argM 最近 (recently), arg1 银行法 (banking law)
Slide 52: A Korean Treebank Sentence
- ?? ??? 3 ???? ???? ??? ?? ??? ????. [Hangul not preserved in this transcript]
- "He added that Renault has a deadline until the end of March for a merger proposal."
- (S (NP-SBJ ?/NPN+?/PAU)
     (VP (S-COMP (NP-SBJ ??/NPR+?/PCA)
                 (VP (VP (NP-ADV 3/NNU+?/NNX+?/NNX+??/PAU)
                         (VP (NP-OBJ ??/NNC+??/NNC+??/NNC+?/PCA)
                             ?/VV+?/ECS))
                     ?/VX+?/EFN+?/PAD)
         ???/VV+?/EPF+?/EFN)
     ./SFN)
Slide 53: The same sentence, PropBanked
- (S Arg0 (NP-SBJ ?/NPN+?/PAU)
     (VP Arg2 (S-COMP (Arg0 NP-SBJ ??/NPR+?/PCA)
                      (VP (VP (ArgM NP-ADV 3/NNU+?/NNX+?/NNX+??/PAU)
                              (VP (Arg1 NP-OBJ ??/NNC+??/NNC+??/NNC+?/PCA)
                                  ?/VV+?/ECS))
                          ?/VX+?/EFN+?/PAD)
         ???/VV+?/EPF+?/EFN)
     ./SFN)
- [Diagram: add(Arg0: "he", Arg2: "[that] Renault has a deadline until the end of March for a merger proposal"); nested within Arg2: has(Arg0: "Renault", ArgM: "until the end of March", Arg1: "a deadline for a merger proposal").]
Slide 54: Summary
- Shallow semantic annotation that captures critical dependencies and semantic role labels
- Supports training of supervised automatic taggers
- Methodology ports readily to other languages
- English PropBank release: spring 2004
- Chinese PropBank release: fall 2004
- Korean PropBank release: summer 2005
Slide 55: An English lexical resource is required
- that provides sets of possible syntactic frames for verbs with semantic role labels that can be automatically assigned accurately to new text.
- And provides clear, replicable sense distinctions?
Slide 56: Word Senses in PropBank
- Orders to "ignore word sense" proved not feasible for 700 verbs:
  - Mary left the room.
  - Mary left her daughter-in-law her pearls in her will.
- Frameset leave.01 "move away from": Arg0: entity leaving, Arg1: place left
- Frameset leave.02 "give": Arg0: giver, Arg1: thing given, Arg2: beneficiary
- How do these relate to traditional word senses in VerbNet and WordNet?
Slide 57: Mapping from PropBank to VerbNet [table only]
Slide 58: Mapping from PropBank to VerbNet [table only]
Slide 59: Mapping from PropBank to VerbNet
- http://www.cs.rochester.edu/gildea/VerbNet/
- Overlap with PropBank framesets: < 50% of VN entries, > 85% of VN classes
- 50,000 PropBank instances
- Results: MATCH 78.63% (80.90% relaxed)
- Benefits:
  - Thematic role labels and semantic predicates
  - WordNet sense tags
- Kingsbury & Kipper, NAACL03 Text Meaning Workshop
Slide 60: Filtering out "call the police"
- Sense tags
Slide 61: WordNet as a WSD sense inventory
- Senses unnecessarily fine-grained?
- No consensus on criteria for sense distinctions
- Senseval1 and Senseval2: word sense disambiguation bakeoffs
  - Senseval1: Hector, ITA 95.5%
  - Senseval2: WordNet 1.7, ITA 73%
- Good news:
  - Verbs tagged here, with added groupings
  - The best-performing WSD system is ours
Slide 62: Results averaged over 28 verbs
- MX: maximum entropy WSD, p(sense | context) (see the sketch below)
- Features: topic, syntactic constituents, semantic classes
- [Chart not transcribed.]
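A bare-bones sketch of maximum-entropy WSD, p(sense | context), using sklearn's LogisticRegression (a maxent classifier) over bag-of-words contexts; the training examples below are invented for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

contexts = ["I tried to call you all night",
            "they named their son David, so we call him Dave",
            "call the police right now"]
senses = ['telephone', 'label', 'order']

wsd = make_pipeline(CountVectorizer(), LogisticRegression())
wsd.fit(contexts, senses)
print(wsd.predict(["please call me tomorrow"]))
```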
Slide 63: Grouping improved ITA and MaxEnt WSD
- call: 31% of errors were due to confusion between senses within the same group 1:
  - name, call -- (assign a specified, proper name to; "They named their son David")
  - call -- (ascribe a quality to or give a name of a common noun that reflects a quality; "He called me a bastard")
  - call -- (consider or regard as being; "I would not call her beautiful")
- 75% accuracy with training and testing on grouped senses vs. 43% with training and testing on fine-grained senses
Slide 64: Results averaged over 28 verbs [chart only]
Slide 65: Results averaged over 28 verbs [chart only]
Slide 66: Overlap between Groups and Framesets: 95%
- [Diagram for "develop": the WordNet senses partition across two framesets, one covering WN1, WN2, WN3, WN4, WN6, WN7, WN8 and the other covering WN5, WN9, WN10, WN11, WN12, WN13, WN14, WN19, WN20.]
- Palmer, Dang & Fellbaum, NLE 2004
Slide 67: Sense Hierarchy
- Framesets: coarse-grained distinctions
- Sense groups (Senseval-2): intermediate level (includes Levin classes); 95% overlap
- WordNet: fine-grained distinctions
Slide 68: Role labels and framesets as features for WSD
- Jinying Chen: decision tree (C5.0), coarse-grained senses
- 5 verbs, gold-standard PropBank annotation
- Features (see the sketch below):
  - VOICE: PAS, ACT
  - FRAMESET: 01, 02, ...
  - ARGn (n = 0, 1, 2, ...): 0 (does not occur), 1 (occurs)
  - CoreFrame: 01-ARG0-ARG1, 02-ARG0-ARG2, ...
  - ARGM: 0 (no ARGM), 1 (has ARGM)
  - ARGM-f (f = DIS, ADV, ...): i (occurs i times)
- Comparable results to MaxEnt with PropBank features
- Predicate-argument structure and sense distinctions are inextricably linked
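A small sketch (a hypothetical helper, not the experiment's code) of how one PropBank-annotated instance could be turned into the feature dictionary described above:

```python
def wsd_features(instance):
    feats = {'VOICE': instance['voice'],        # 'ACT' or 'PAS'
             'FRAMESET': instance['frameset']}  # '01', '02', ...
    args = instance['args']                     # e.g. {'ARG0': ..., 'ARG1': ...}
    for n in range(3):
        feats[f'ARG{n}'] = int(f'ARG{n}' in args)       # occurrence flags
    core = [a for a in sorted(args) if not a.startswith('ARGM')]
    feats['CoreFrame'] = '-'.join([instance['frameset']] + core)
    argms = [a for a in args if a.startswith('ARGM')]
    feats['ARGM'] = int(bool(argms))
    for a in argms:
        feats[a] = feats.get(a, 0) + 1                  # ARGM-f counts
    return feats

print(wsd_features({'voice': 'ACT', 'frameset': '01',
                    'args': {'ARG0': 'Mary', 'ARG1': 'the room'}}))
```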
Slide 69: Lexical resources provide concrete criteria for sense distinctions
- PropBank: coarse-grained sense distinctions determined by different subcategorization frames (framesets)
- Intersective Levin classes: regular sense extensions through differing syntactic constructions
- VerbNet: distinct semantic predicates for each sense (verb class)
- Are these the right distinctions?
Slide 70: An English lexical resource is available
- that provides sets of possible syntactic frames for verbs with semantic role labels that can be automatically assigned accurately to new text,
- and provides clear, replicable sense distinctions.
Slide 71: WSD in Machine Translation
- Different syntactic frames:
  - John left the room. / Juan saiu do quarto. (Portuguese)
  - John left the book on the table. / Juan deixou o livro na mesa.
- Same syntactic frame?
  - John left a fortune. / Juan deixou uma fortuna.