Automatic Methods to Detect the Compositionality of Multiwords

Slides: 32

Provided by: dianamcc

1
Automatic Methods to Detect the Compositionality
of Multiwords
2
Outline

What we want to cover
Why we do it
A survey of current methods
Approaches to evaluation
Comparison of some of the results
Conclusions
Directions for the future

3
Compositionality, non-compositionality and
decomposability

Compositionality: the meaning of the phrase is
a function of the meaning of the parts
Non-compositionality: the meaning of the phrase
is not a function of the meaning of the parts
Decomposability: the meaning of the phrase can
be ascribed to its parts
Idiosyncratic: spill the beans, let the cat out
of the bag
Simple: traffic light, car park

4
Correlation (or confusion) of compositionality

with productivity
with statistical frequency of occurrence

5
Motivation

Any requirement for semantic interpretation will
require handling of non-compositional multiwords
in order to arrive at the correct interpretation
e.g. She kicked the bucket
Associated syntactic behaviour is needed for
parsing
e.g. blow up the houses of parliament
Important for lexical acquisition
e.g. eat hot dog
Associated non-productive and syntactic
behaviour is important for generation
e.g. wine and dine

6
Methods: the main categories

Statistical: p(see, red) / (p(see) p(red))
Translations: see red <-> aberrear
Dictionaries: listings, semantic codes and semantic relationships
Substitutions: see red, see yellow, see blue
Distributional: see: look, perceive, gaze; red: yellow, orange, blue
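
The statistical ratio above is the quantity inside pointwise mutual information (PMI). A minimal sketch of how it could be computed from pair counts follows; the toy pairs and the function name `pmi_table` are made up for illustration and are not from the presentation.

```python
import math
from collections import Counter

def pmi_table(pairs):
    """Return a function that computes PMI (base 2) over a list of (w1, w2) pairs."""
    pair_counts = Counter(pairs)
    left_counts = Counter(w1 for w1, _ in pairs)
    right_counts = Counter(w2 for _, w2 in pairs)
    total = len(pairs)

    def pmi(w1, w2):
        joint = pair_counts[(w1, w2)] / total
        if joint == 0:
            return float("-inf")  # pair never observed together
        p1 = left_counts[w1] / total
        p2 = right_counts[w2] / total
        return math.log2(joint / (p1 * p2))

    return pmi

# Toy data (hypothetical counts, chosen only to keep the arithmetic simple).
toy_pairs = ([("see", "red")] * 4 + [("paint", "red")] * 4 +
             [("see", "film")] * 4 + [("paint", "wall")] * 4)
pmi = pmi_table(toy_pairs)
print(pmi("see", "red"), pmi("paint", "wall"))
```

A PMI near zero means the two words co-occur about as often as chance predicts; higher values indicate a stronger association.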

7
Statistical Methods

Statistical measures
e.g. pointwise mutual information
Venkatapathy and Joshi (2006): useful for
alignment
Syntactic flexibility
Fazly and Stevenson (2006) (verb+noun compounds):
idiomatic nature reflected in syntactic behaviour
(passivization, determiner type and
pluralization)

8
Translations

Melamed (1997) "non-compositional compounds":
statistical comparison of translation models i)
with concatenated words ii) with separate words
Mukerjee et al. (2006): Hindi-English parallel
corpora used for detecting Hindi complex
predicates.
Venkatapathy and Joshi (2006): compositionality
(PMI) used for alignment.
Translations from one → many are not necessarily
non-compositional
e.g. swimming pool (piscine), video tape (video)
Nevertheless, very useful for finding collocations
for a language pair
Villada Moirón and Tiedemann (2006): diversity of
translations for an expression. Overlap of
meaning of expression from translation and those
of its component words.

9
Substitution Methods

Pearce (2001): anti-collocations using WordNet
synonyms
e.g. emotional baggage vs emotional luggage
Lin (1999): PMI, 95% significant difference
between phrase and phrase with close substitute.
Close substitutes found from an automatically
generated thesaurus (Lin, 1998)
e.g. see: gaze, look, perceive
Lexical fixedness: Fazly and Stevenson (2006)
(verb+noun compounds), as Lin (1999) but using
difference in PMI between target and average of
the PMI of the set of substitutes
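
The lexical fixedness idea above (PMI of the target compared with the average PMI of its substitute phrases) might be sketched as follows. This follows the slide's wording only; any further normalisation used in the published measure is not reproduced. The PMI values and the substitute "suitcase" are invented for illustration.

```python
from statistics import mean

def fixedness(pmi_scores, target, substitutes):
    """PMI of the target phrase minus the mean PMI of its substitute phrases."""
    variants = [pmi_scores[s] for s in substitutes if s in pmi_scores]
    if not variants:
        raise ValueError("no scored substitutes for this target")
    return pmi_scores[target] - mean(variants)

# Hypothetical PMI values; real ones would come from corpus counts.
pmi_scores = {
    ("emotional", "baggage"): 6.0,
    ("emotional", "luggage"): 1.0,
    ("emotional", "suitcase"): 2.0,
}
score = fixedness(
    pmi_scores,
    ("emotional", "baggage"),
    [("emotional", "luggage"), ("emotional", "suitcase")],
)
print(score)
```

A large positive score suggests the target resists substitution, which the slides treat as evidence of lexical fixedness rather than direct evidence of non-compositionality.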

10
Dictionary methods

Recognition of idiomatic tokens in a Japanese
corpus using syntactic evidence and information
in an idiom dictionary (Hashimoto et al., 2006)
Using hierarchical information in WordNet to
model decomposability for evaluation (Baldwin et
al., 2003)
Piao et al. (2006): lexical resource (Lancaster
Semantic Lexicon) to compare meaning of listed
multiword to that of its component words. Measure
semantic distance using semantic tags given in
lexicon

11
Substitution Methods (contd.)

What is being captured?
Bannard et al (2003) and Baldwin et al (2003)
argue that these methods capture
non-productivity, (simple decomposable
collocations)
NB Pearce (2001) is explicitly targeting
collocations rather than compositionality
Fazly and Stevenson (2006) acknowledge the
partial relationship (compositionality and
lexical fixedness) but the relationship exists
nevertheless

12
Selectional Preference Models

Bannard (2002): verb-particle data, eat up
<object> vs eat <object>
Models (Li and Abe, 1995) acquired using corpus
data and WordNet
Current work (McCarthy): prototypical
selectional preference models acquired using
corpus data and an automatically generated
thesaurus
(Lin, 1998; see later)
e.g. drink <object> vs drink tea
e.g. throw <object> vs throw light

13
Distributional Approaches: Latent Semantic
Analysis
Contexts of dog
context   frequency
bark      50
animal    30
food      10
water     5
drink     3
bath      1
14
Distributional Approaches: Latent Semantic
Analysis
15
Distributional Approaches: Thesaurus creation

Example: dog, hot and hot dog
feed the dog, keep dogs, keep cats, stroke cats,
feed the horse,
---------------------------------
hot water, cold water, hot milk, warm milk,
boiling milk, hot weather
------------------------------
eat the sandwich, eat the hot dog, cook the hot
dog, serve the burger

dog: cat, animal, pet, horse
---------------------------------
hot: cold, warm, boiling, mild
---------------------------------
hot dog: hamburger, sandwich, pizza

16
Distributional Approaches

Schone and Jurafsky (2001): LSA, weighted sum of
vectors for component words compared to MWE
candidate
Baldwin et al. (2003): decomposability (simple vs
non or idiosyncratic)
of noun-noun compounds and verb-particle
constructions. Compared vectors of constituent
words in isolation
Bannard et al. (2003): compare LSA with Lin (1999)
on verb-particle constructions
Katz and Giesbrecht (2006): token analysis
for 1 example, "ins Wasser fallen". Compare
literal and compositional vectors for this
example. Type-based experiment with composed
vectors where constituent words have occurred in
isolation.
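
The vector comparisons above can be illustrated with a small sketch: add the context vectors of the component words and measure cosine similarity against the vector of the phrase. This uses raw co-occurrence counts with no dimensionality reduction, so it is not LSA proper. Only the "dog" counts come from the slides; every other vector is invented for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in set(u) | set(v))
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def compose(u, v, weight_u=1.0, weight_v=1.0):
    """Weighted sum of two sparse vectors."""
    return {k: weight_u * u.get(k, 0) + weight_v * v.get(k, 0)
            for k in set(u) | set(v)}

# "dog" counts are from the slide; the remaining vectors are hypothetical.
dog = {"bark": 50, "animal": 30, "food": 10, "water": 5, "drink": 3, "bath": 1}
hot = {"water": 40, "weather": 20, "food": 10, "bath": 10}
water = {"drink": 40, "bath": 20, "food": 5, "weather": 5}
hot_dog = {"eat": 30, "food": 20, "mustard": 10}
hot_water = {"bath": 30, "drink": 15, "weather": 5}

sim_hot_dog = cosine(compose(hot, dog), hot_dog)
sim_hot_water = cosine(compose(hot, water), hot_water)
print(sim_hot_dog, sim_hot_water)
```

With these toy counts the composed vector for "hot water" sits much closer to the phrase vector than the one for "hot dog" does, which is the pattern a compositionality detector of this kind looks for.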

17
Distributional Methods

McCarthy et al. (2003) look at overlap of similar
words (neighbours) in a distributional thesaurus
for a verb, e.g. climb, compared to the verb and
particle construction, e.g. climb down

Various other measures, including the number of
neighbours in the phrasal set with the same
particle (minus the number having the same
particle in the simplex verb neighbours)
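
A rough sketch of the two neighbour-based measures just described: overlap of the top X neighbours, and the same-particle count minus the simplex count. The neighbour lists are invented, and the function names are labels chosen here, not the authors' code.

```python
def overlap(simplex_neighbours, phrasal_neighbours, x):
    """Number of neighbours shared by the top-x lists of verb and phrasal verb."""
    return len(set(simplex_neighbours[:x]) & set(phrasal_neighbours[:x]))

def has_particle(neighbour, particle):
    """True if the neighbour is a multiword entry ending in the given particle."""
    parts = neighbour.split()
    return len(parts) > 1 and parts[-1] == particle

def same_particle_minus_simplex(simplex_neighbours, phrasal_neighbours, particle, x):
    """Same-particle neighbours of the phrasal verb, minus those of the simplex verb."""
    phrasal = sum(has_particle(n, particle) for n in phrasal_neighbours[:x])
    simplex = sum(has_particle(n, particle) for n in simplex_neighbours[:x])
    return phrasal - simplex

# Hypothetical thesaurus neighbours, ranked by similarity.
climb = ["walk", "scramble", "ascend", "jump", "climb up"]
climb_down = ["scramble down", "walk", "jump down", "descend", "back down"]

print(overlap(climb, climb_down, 5))
print(same_particle_minus_simplex(climb, climb_down, "down", 5))
```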
18
Combining approaches

Venkatapathy and Joshi (2005)
frequency
PMI
substitution based on Lin (1999)
distributed frequency of object,
distributed frequency of object with dissimilar
verbs
LSA similarity of V-O with verbal form of O
LSA dissimilarity of V-O with V
All combined with SVM ranking

19
Method: Selectional Preferences using
distributional thesaurus (McCarthy)

Is the argument prototypical for this predicate
and argument relationship?
e.g. eat my hat
Like substitution methods, but not explicitly
looking for a substitute
Verb direct objects
e.g. eat: meal 5, dinner 5, tea 6, lunch 10, food 6,
sandwich 3, duck 1, cheese 2, hat 3
food: sandwich, cheese, meat, duck
---------------------------------
meal: dinner, lunch, tea, supper
---------------------------------
clothing: shirt, belt, hat, trousers
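
The slides do not spell out how the prototypical preference score is computed. As a simplified stand-in (an assumption, not McCarthy's model), the sketch below scores an object by the share of the verb's object frequency that falls in the thesaurus class containing it. The counts and class members are those shown above; treating each thesaurus entry (headword plus neighbours) as one class is also an assumption.

```python
# Direct-object counts for "eat", as listed on the slide.
eat_objects = {"meal": 5, "dinner": 5, "tea": 6, "lunch": 10, "food": 6,
               "sandwich": 3, "duck": 1, "cheese": 2, "hat": 3}

# Thesaurus entries from the slide, each treated as one class (assumption).
classes = {
    "food": {"food", "sandwich", "cheese", "meat", "duck"},
    "meal": {"meal", "dinner", "lunch", "tea", "supper"},
    "clothing": {"clothing", "shirt", "belt", "hat", "trousers"},
}

def preference(obj, object_counts, classes):
    """Share of the verb's object frequency that falls in the class containing obj."""
    total = sum(object_counts.values())
    for members in classes.values():
        if obj in members:
            return sum(object_counts.get(w, 0) for w in members) / total
    return 0.0  # object not covered by any class

for obj in ("lunch", "sandwich", "hat"):
    print(obj, round(preference(obj, eat_objects, classes), 3))
```

Under this stand-in, "hat" receives a much lower score than "lunch" or "sandwich", matching the intuition that "eat my hat" has a non-prototypical object.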

20
Methods for evaluation: token-based

Hashimoto et al. (2006): 300 example sentences of
100 idioms; information from dictionary for
discrimination
Katz and Giesbrecht (2006): 67 occurrences of 1
idiom (ins Wasser fallen)
literal and idiomatic readings have orthogonal
LSA vectors
Compare individual token vectors to these

21
Methods for evaluation: type-based

Dictionary
Schone and Jurafsky (2001), Fazly and Stevenson
(2006)
Using is-a links (hyponymy)
Baldwin et al. (2003), WordNet
Manual verification
Lin (1999)
Web as validation
Villavicencio (2005)
Hayes et al. (2005)
Compositionality judgements
Contribution from constituents (Bannard, 2002),
(Bannard et al., 2003)
Along a continuum (McCarthy et al., 2003),
(Venkatapathy and Joshi, 2005)

22
Some results: Compositionality Judgements on a
Continuum

McCarthy et al. (2003): 111 phrasal verb versus
verb constructions
(0-10)
3 native English speakers, highly significant
Kendall Coefficient of Concordance
Venkatapathy and Joshi (2005): 765 verb-object
pairs (1-6)
2 fluent English speakers, Spearman's Rank
Correlation Coefficient
Good level of agreement

Example items: carry out, cloud over, climb up;
change hands, take interest, announce plan
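
Spearman's rank correlation, used above for annotator agreement and in the results that follow, can be sketched as a Pearson correlation over average ranks. The scores below are invented; this is a plain illustration of the statistic, not the evaluation code used in the cited work.

```python
import math

def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical human judgements (0-10) and system scores for five phrases.
human = [9, 7, 2, 5, 1]
system = [0.9, 0.6, 0.3, 0.4, 0.1]
print(spearman(human, system))
```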
23
Results: McCarthy et al. datasets
Overlap    rs      Z score   p under H0
X = 30     0.166   1.74      0.04
X = 50     0.136   1.43      0.08
OverlapS
X = 30     0.306   3.21      <0.0007
X = 50     0.303   3.18      <0.0007
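
The Z scores in this table appear consistent with the large-sample approximation Z = rs * sqrt(n - 1) with n = 111 items, and the p values with a one-tailed normal test. That reading is inferred from the numbers and is not stated in the slides. A short check under that assumption:

```python
import math

def spearman_z(rs, n):
    """Large-sample Z score for Spearman's rs under H0 (assumed formula)."""
    return rs * math.sqrt(n - 1)

def one_tailed_p(z):
    """Upper-tail probability of a standard normal at |z|."""
    return 0.5 * math.erfc(abs(z) / math.sqrt(2))

n_items = 111  # number of phrasal verbs judged in McCarthy et al. (2003)
for rs in (0.166, 0.136, 0.306, 0.303):
    z = spearman_z(rs, n_items)
    print(rs, round(z, 2), one_tailed_p(z))
```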
24
Results: McCarthy et al. datasets
X = 500
measure                statistic      Z score   p under H0
sameparticle           rs = 0.414     4.34      <0.00003
sameparticle-simplex   rs = 0.49      5.17      <0.00003
simplexasneighbour     Mann-Whitney   0.950     0.171
simplexrank            rs = -0.115    -1.21     0.113
simplexscore           rs = 0.052     0.54      0.295
Piao et al. (2006) semantic lexicon (79/116)
                       rs = 0.354               0.001357
25
Correlation of McCarthy et al. (2003) human
rankings with statistics and dictionaries
measure         statistic      Z score   p under H0
LLR             rs = -0.168    -1.76     0.0392
χ²              rs = -0.213    -2.22     0.0139
MI              rs = -0.248    -2.60     0.0047
Phrasal freq    rs = -0.096    -1.01     0.156
Simplex freq    rs = 0.092     0.96      0.169
WordNet         Mann-Whitney   2.39      0.0084
ANLT phrasals   Mann-Whitney   3.03      0.0012
26
Correlation of measures with man-made resources
(Mann-Whitney Z scores)
measure                In WordNet   In ANLT phrasals
PMI                    -2.61        -4.53
sameparticle-simplex   3.71         4.59
27
Results with Venkatapathy and Joshi (2005) dataset
feature                                                    correlation
1) Frequency (BNC)                                         .129
2) PMI                                                     .203
3) Distributed frequency of object                         .111
4) Distributed frequency of object with dissimilar verbs   .139
5) LSA dissimilarity of V-O with V                         .139
6) LSA similarity of V-O with verbal form of O             .300
7) Lin (1999) substitution                                 .210
Ranking SVM function (using 1-7)                           .448
McCarthy 1/pref score (638/765)                            -.403
28
Conclusions

Purpose of task should match method and
evaluation
Evaluation is tricky
Decisions are not clear cut
Statistical measures and substitution methods
may be useful, though they capture behaviour that
correlates with compositionality rather than
compositionality itself
Distributional approaches promising for
languages without resources
Selectional preferences may add useful
information, alongside other measures

29
Future

Address tokens as well as types
Tokens on a continuum
Error analysis
Separating non-decomposable from
idiosyncratically decomposable
Detecting what multiwords mean, distributional
approaches might be promising in this respect
kick the bucket --- die
share datasets!!!

30
References

Baldwin, Timothy, Colin Bannard, Takaaki Tanaka
and Dominic Widdows (2003) An Empirical Model of
Multiword Expression Decomposability. In
Proceedings of the ACL Workshop on Multiword
Expressions: Analysis, Acquisition and Treatment,
Sapporo, Japan, pp. 89-96.
Bannard, Colin (2002) Statistical Techniques for
Automatically Inferring the Semantics of
Verb-Particle Constructions. LinGO Working Paper
No. 2002-06. http://lingo.stanford.edu/pubs/WP-2002-06.pdf
Bannard, Colin, Timothy Baldwin and Alex
Lascarides (2003) A Statistical Approach to the
Semantics of Verb-Particles. In Proceedings of
the ACL Workshop on Multiword Expressions:
Analysis, Acquisition and Treatment, Sapporo,
Japan, pp. 65-72.
Fazly, Afsaneh and Suzanne Stevenson (2006)
Automatically constructing a lexicon of verb
phrase idiomatic combinations. In Proceedings of
the 11th Conference of the European Chapter of
the Association for Computational Linguistics
(EACL), pp. 337-344, Trento, Italy.
Hayes, Jer, Nuno Seco and Tony Veale (2005)
Creative discovery in the lexical validation gap.
Computer Speech and Language, 19(4):513-523.
Hashimoto, Chikara, Satoshi Sato and Takehito
Utsuro (2006) Japanese Idiom Recognition:
Drawing a Line between Literal and Idiomatic
Meanings. In Proceedings of the COLING/ACL 2006
Main Conference Poster Sessions, pp. 353-360,
Sydney, Australia.
Katz, Graham and Eugenie Giesbrecht (2006)
Automatic Identification of Non-Compositional
Multi-Word Expressions using Latent Semantic
Analysis. In Proceedings of the ACL Workshop on
Multiword Expressions: Identifying and Exploiting
Underlying Properties, Sydney, Australia.
Lin, Dekang (1998) Automatic Retrieval and
Clustering of Similar Words. In
Proceedings of the 17th International Conference on
Computational Linguistics and the 36th Annual
Meeting of the Association for Computational
Linguistics, Montreal, Canada.
Lin, Dekang (1999) Automatic Identification of
Non-Compositional Phrases. In Proceedings of
ACL-99, pp. 317-324. University of Maryland,
College Park, Maryland.
Melamed, I. Dan (1997) Automatic Discovery of
Non-Compositional Compounds in Parallel Data. In
Proceedings of the 2nd Conference on Empirical
Methods in Natural Language Processing (EMNLP),
Providence, RI.

31
References continued

McCarthy, Diana, Bill Keller and John Carroll
(2003) Detecting a Continuum of Compositionality
in Phrasal Verbs. In Proceedings of the
ACL-SIGLEX Workshop on Multiword Expressions:
Analysis, Acquisition and Treatment, Sapporo,
Japan.
Mukerjee, Amitabha, Ankit Soni and Achla M. Raina
(2006) Detecting Complex Predicates in Hindi
using POS Projection across Parallel Corpora. In
Proceedings of the ACL Workshop on Multiword
Expressions: Identifying and Exploiting
Underlying Properties, pp. 28-35, Sydney, Australia.
Pearce, Darren (2001) Synonymy in Collocation
Extraction. In WordNet and Other Lexical
Resources: Applications, Extensions and
Customizations (NAACL 2001 Workshop), pp. 41-46.
June 2001. Carnegie Mellon University,
Pittsburgh.
Piao, Scott S.L., Paul Rayson, Olga Mudraya,
Andrew Wilson and Roger Garside (2006) Measuring
MWE Compositionality Using Semantic Annotation. In
Proceedings of the ACL Workshop on Multiword
Expressions: Identifying and Exploiting
Underlying Properties, Sydney, Australia, pp. 28-35.
Schone, Patrick and Daniel Jurafsky (2001) Is
Knowledge-Free Induction of Multiword Unit
Dictionary Headwords a Solved Problem? In
Proceedings of Empirical Methods in Natural
Language Processing, Pittsburgh, PA.
Venkatapathy, Sriram and Aravind K. Joshi (2005)
Measuring the relative compositionality of
verb-noun (V-N) collocations by integrating
features. In Proceedings of HLT/EMNLP, Vancouver.
Villada Moirón, Begoña and Joerg Tiedemann
(2006) Identifying idiomatic expressions using
automatic word-alignment. In Proceedings of the
EACL Workshop on Multiword Expressions in a
Multilingual Context, Trento, Italy.
Villavicencio, A. (2005) The availability of
verb-particle constructions in lexical resources:
How much is enough? Computer Speech and
Language, 19(4).
