Title: An Approach to Catalan Adjective Classes by Clustering
1An Approach to Catalan Adjective Classes by
Clustering
- Laura Alonso Alemany
- Universitat de Barcelona
- lalonso_at_fil.ub.es
- Gemma Boleda Torrent
- Universitat Pompeu Fabra
- gemma.boleda_at_trad.upf.es
2motivation
- to search for empirical (corpus-based) support for theories of adjective classification via data-driven methods
- to enhance a lexicon with information on adjective classes in an inexpensive and reliable way
3contents
- introduction
- previous theoretical work
- a preliminary hypothesis
- experiments on clustering adjectives
- results and discussion
4introduction
- hypothesis 0: a single class of adjectives
- BUT: heterogeneous behaviour of adjectives
- La noia és (molt) alta 'the girl is (very) tall'
- La bandera és nacional 'the flag is national'
- L'assassí és presumpte 'the murderer is alleged'
5why clustering
- clustering has been used for inferring knowledge in not-so-well-known domains
- verbal subcategorization and selectional restrictions (Schulte im Walde & Brew 2002)
- inference of POS tags for unknown languages
- it introduces little bias into the final results
- there are no pre-defined classes (as opposed to classification methods; see Bohnet et al. 2002)
- ... but there is bias in modelling the data
6problems with clustering
- it is a data-driven technique, but the appropriate degree of abstraction must be chosen
- completely data-driven approaches are possible, but
- the search space becomes far too big
- they are very sensitive to data sparseness
8two traditions
- two main scholarly traditions regarding the study of adjectives
- descriptive grammar
- morphology (derivational processes) and syntax (ordering among adjectives and with respect to the head)
- denotational semantics
- formal semantics
- semantic type (modifier or predicate)
9classifications
10qualitative / ⟨e,t⟩
- predicative (syntactic version: Levi 1978)
- red house / this house is red
- national flag / this flag is national
- alleged murderer / this murderer is alleged
- gradable / comparable
- very red / redder, reddish
- scalar (Raskin & Nirenburg 1995)
- red/green/blue, big/small
- in Catalan, typically following the head noun
11adverbial / ⟨⟨e,t⟩,⟨e,t⟩⟩
- nonpredicative
- alleged murderer / *this murderer is alleged
- nongradable, noncomparable
- *very/more alleged murderer
- nonscalar
- and no antonym
- in Catalan, only preceding the head noun
these parameters seem to be relevant; we'll use them in the experiments
12on adjective position
- the position of the adjective in Catalan and in other Romance languages is related to reference restriction (GCC, GDLE)
- prenominal → nonrestricting
- postnominal → restricting
- very few strict nonpredicative adjectives
- usual case: mixed behaviour, with shift in meaning (potential problem!)
- antic president 'former president'
- nonpredicative reading
- armari antic 'antique wardrobe'
- qualitative reading
13a gap: relational
- among others: Bally 1944, GDLE, GCC, Engel 1988, Levi 1978
- la màquina és agrícola THE MACHINE IS AGRICULTURAL
- una màquina gran agrícola vs. una màquina agrícola gran A MACHINE AGRICULTURAL BIG
- una màquina agrícola i gran A MACHINE AGRICULTURAL AND BIG
14a gap: relational
- predicativity: mixed behaviour
- El congrés és internacional → ⟨e,t⟩
- THE CONFERENCE IS INTERNATIONAL
- La Joana és corresponsal internacional
- THE JOANA IS INTERNATIONAL CORRESPONDENT
- La Joana és internacional ↛ ⟨e,t⟩
- La corresponsal és internacional ↛ ⟨e,t⟩
- ambiguity / class shift or property?
15a gap: relational
- gradability and comparativity
- said to be nongradable and noncomparable, but qualitativization is very easy
- un tractor molt agrícola
- A TRACTOR VERY AGRICULTURAL
- una noia molt internacional
- A GIRL VERY INTERNATIONAL
- (has travelled a lot, knows many people from abroad)
- could reflect diachronic processes
- these facts could explain the results, at least in part
17adjective classes
- hypothesis: three classes of adjectives
- qualitative: vermell 'red', alt 'tall'
- non-predicative: presumpte 'alleged'
- relational: agrícola 'agricultural'
18challenges
- does this classification have empirical (corpus-based) support?
- can adjectives be automatically classified using the features reviewed?
- which are the most relevant features for adjective classification?
20modelling adjectives
- find a textual correlate of the theoretical parameters that describe semantic classes
- in terms of morphosyntactic data
- retrievable from an annotated corpus
- this is not always possible
- and be careful with redundant features!
- values are difficult to set adequately
21the set of attributes
- follows a verb
- cooccurs with molt 'very' and the like
- form inflected by size morphemes
- cooccurs with més/menys 'more/less'
- form inflected by the superlative morpheme -íssim
- precedes or follows a noun
- precedes or follows an adjective
- comparativity, scalability
- reference restriction / relative ordering
- distributional properties
- POS of surrounding words (five-word window)
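A minimal sketch of how attributes of this kind might be counted from a POS-tagged corpus. The token format `(form, lemma, tag)`, the coarse tag names, the word lists and the attribute names are all assumptions made for illustration; the corpus itself uses EAGLES tags, and the actual extraction procedure is not given in these slides.

```python
from collections import Counter, defaultdict

# Assumed word lists, tag names and attribute names (hypothetical).
DEGREE_WORDS = {"molt"}               # 'very' and the like
COMPARISON_WORDS = {"més", "menys"}   # 'more' / 'less'
ATTRIBUTES = ("follows_verb", "gradability", "comparativity",
              "after_noun", "before_noun")

def adjective_attributes(sentences):
    """Map each adjective lemma to the proportion of its occurrences
    that show each attribute. A sentence is a list of (form, lemma, tag)."""
    counts = defaultdict(Counter)
    for sent in sentences:
        for i, (_form, lemma, tag) in enumerate(sent):
            if tag != "ADJ":
                continue
            prev = sent[i - 1] if i > 0 else None
            nxt = sent[i + 1] if i + 1 < len(sent) else None
            c = counts[lemma]
            c["total"] += 1
            if prev is not None:
                # Booleans count as 0 or 1 when added to a Counter entry.
                c["follows_verb"] += prev[2] == "VERB"
                c["gradability"] += prev[1] in DEGREE_WORDS
                c["comparativity"] += prev[1] in COMPARISON_WORDS
                c["after_noun"] += prev[2] == "NOUN"
            if nxt is not None:
                c["before_noun"] += nxt[2] == "NOUN"
    return {lemma: {a: c[a] / c["total"] for a in ATTRIBUTES}
            for lemma, c in counts.items()}

# Two toy sentences built from the examples in the introduction.
corpus = [
    [("la", "el", "DET"), ("noia", "noia", "NOUN"), ("és", "ser", "VERB"),
     ("molt", "molt", "ADV"), ("alta", "alt", "ADJ")],
    [("la", "el", "DET"), ("bandera", "bandera", "NOUN"),
     ("nacional", "nacional", "ADJ")],
]
attributes = adjective_attributes(corpus)
```

Each value is a proportion of the adjective's occurrences, so it falls between 0 and 1 by construction.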
22corpus: fragment of CTILC
- collected by the Institute for Catalan Studies (IEC)
- 8.5 million words
- Catalan texts from 1970 onwards
- only written, quite formal register
- manually revised tagging (but there are errors!)
- lemma, part of speech, morphological info (EAGLES standard)
- no syntactic information
- 571365 adjective occurrences (tokens)
- 17325 adjective lemmata (types)
23data and tools
- each adjective is described as a vector
- where each dimension is one of the features relevant for characterising the adjective
- each feature value is a real number between 0 and 1
- a matrix is built with all the vectors
- the clustering is performed with CLUTO (Karypis 2002)
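As an illustration of the vector and matrix step, here is a small sketch that turns per-adjective feature dictionaries into a matrix with a fixed column order. The function name, the feature names and the values for alt are hypothetical; the values for verd are rounded from the sample vector shown later in these slides. The file format expected by the clustering tool is not covered here.

```python
def build_matrix(vectors):
    """vectors: {lemma: {feature: value}} -> (lemmas, features, rows).

    Missing features are filled with 0.0 so that every row has the same
    dimensions, and values are checked to lie between 0 and 1.
    """
    lemmas = sorted(vectors)
    features = sorted({f for vec in vectors.values() for f in vec})
    rows = []
    for lemma in lemmas:
        row = [float(vectors[lemma].get(f, 0.0)) for f in features]
        if any(not 0.0 <= x <= 1.0 for x in row):
            raise ValueError(f"value outside [0, 1] for {lemma!r}")
        rows.append(row)
    return lemmas, features, rows

vectors = {
    # 'verd': rounded from the sample vector; 'alt': invented values.
    "verd": {"predicativity": 0.039, "gradability": 0.017, "comparativity": 0.0},
    "alt": {"predicativity": 0.20, "gradability": 0.30},
}
lemmas, features, rows = build_matrix(vectors)
```

Keeping the row labels (`lemmas`) and column labels (`features`) alongside the numbers makes it possible to map cluster assignments back to adjectives and attributes afterwards.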
24experiment setting
- set of objects: only frequent adjectives (4859 objects, 10 occurrences)
- set of attributes
- 1. only textual correlates of semantic properties
- 2. only context of occurrence
- combination of 1 and 2 / with customized values
- attribute values: true percentages
- number of clusters: 2, 3, 4, 5, 6, 7
- clustering parameters
- combination of external/internal (E/I) criteria, partitional algorithm
25gold standard
- annotated by human judges
- 76 adjectives chosen randomly from the corpus
- classified by human judges into 4 + 1 classes
- qualitative: calent 'hot', actiu 'active/lively'
- relational: científic, digital
- qualitative/non-predicative: antic
- non-predicative: presumpte 'alleged', mer 'mere'
- errors: artista
- nonpredicative: very few, not represented → added manually
- costly process, only a small number of adjectives can be considered
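One way to compare a clustering solution with a gold standard of this kind is a cluster-by-class contingency table, summarised by purity. Purity is a swapped-in, generic measure and not necessarily the one used in these experiments; the cluster assignments below are invented for illustration, while the gold labels come from the examples on this slide.

```python
from collections import Counter, defaultdict

def contingency(clusters, gold):
    """Count gold classes per cluster for adjectives present in both."""
    table = defaultdict(Counter)
    for lemma, label in gold.items():
        if lemma in clusters:
            table[clusters[lemma]][label] += 1
    return table

def purity(table):
    """Share of adjectives that belong to the majority class of their cluster."""
    total = sum(sum(row.values()) for row in table.values())
    return sum(max(row.values()) for row in table.values()) / total

gold = {
    "calent": "qualitative", "actiu": "qualitative",
    "científic": "relational", "digital": "relational",
    "presumpte": "non-predicative", "mer": "non-predicative",
}
# Hypothetical cluster ids, not results from the experiments.
clusters = {"calent": 0, "actiu": 0, "mer": 0,
            "científic": 1, "digital": 1, "presumpte": 2}

table = contingency(clusters, gold)
score = purity(table)
```

Purity rewards solutions with many small clusters, so it is only meaningful when solutions with the same number of clusters are compared.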
27semantic parameters vs. gold standard
[figure: values 467, 3040, 229, 593, 787]
28semantic parameters vs. gold standard
gradability 0.07, comparativity 0: millor 'best', eixerit 'nice, lively'
preceding common noun 0.06, after common noun 0.49: presumpte 'alleged', antic 'former/old/antique'
after common noun 0.49, comparativity 0: important, subversiu 'subversive'
after noun 0.54, comparativity 0: alemany 'German', internacional
predicativity 0.1, comparativity 0, after Adj 0.03: possible, necessari
[figure: values 467, 3040, 229, 593, 787]
29contextual vs. semantic attributes
[figure: contextual 2107, 697, 290, 593, 1172; semantic 467, 3040, 229, 336, 787]
30contextual vs. semantic attributes
-1 common noun 0.5, -2 determiner 0.34: general, negre 'black', alemany 'German', internacional
+1 punctuation 0, -1 noun 0.5: preescolar, subversiu
+1 punctuation, -1 adv: possible, hot
+1 noun, -1 determiner: mer 'mere', antic
+1 prep 0.3, +2 determiner 0.25: important, necessari, diagonal
[figure: contextual 2107, 697, 290, 593, 1172; semantic 467, 3040, 229, 336, 787]
31agreement between solutions
32homogeneity of adjective classes vs. gold standard
- semantic parameters and context
- contextual attributes
- semantic parameters
- customized values
33questions
- which is the best clustering solution?
- which attributes are actually descriptive of adjective behaviour?
- which are noisy?
- which classes receive empirical support?
34discussion
- contextual and semantic features yield quite similar results, although
- semantic features seem to be more adequate
- contextual are stronger!
- the most discriminating attribute is the position of the adjective with respect to the noun
- why are some others not discriminating? (modelling)
- noisy
- preposition follows
- punctuation follows
35discussion
- clustering is a useful technique for inductive investigation of adjective classes
- which hadn't been done before
- theoretically biased results are supported by distributional properties
36discussion
- the following classes of adjectives emerge from the results
- nonpredicative (with few elements)
- relational
- consistent behaviour
- similar to a part of the qualitative adjectives
- could reflect a diachronic process or class shift
- or a bad modelling of the adjectives
37discussion
- qualitative adjectives as described in the literature are not homogeneous
- predicativity, gradability and comparativity are not distributed uniformly in these adjectives
- distributional properties are not uniform either
- unexpected?
38future work
- further linguistic investigation of results
- other clustering solutions
- evaluation
39references
- Bally, C. (1944) Linguistique générale et linguistique française
- Bohnet, B., S. Klatt and L. Wanner (2002) An Approach to Automatic Annotation of Functional Information to Adjectives with an Application to German
- GDLE: Bosque, I. and V. Demonte, eds. (1999) Gramática Descriptiva de la Lengua Española
- Engel, U. (1988) Deutsche Grammatik, Heidelberg: Julius Groos Verlag
- Levi, J. N. (1978) The Syntax and Semantics of Complex Nominals
- Montague, R. (1974) Formal Philosophy. Selected Papers of Richard Montague
- Raskin, V. and S. Nirenburg (1995) Lexical Semantics of Adjectives. A Microtheory of Adjectival Meaning
- Schulte im Walde, S. and C. Brew (2002) Inducing German Semantic Verb Classes from Purely Syntactic Subcategorisation Information
- GCC: Solà, J. et al., eds. (2002) Gramàtica del Català Contemporani
40a vector
- verd 181
- serestarsemblarpredicatiu (predicativity) 0.0386740331491713
- comparativitat (comparativity) 0
- gradabilitat (gradability) 0.0165745856353591
- modificador_dreta (right modifier) 0.0220994475138122
- modificador_esquerra (left modifier) 0.895027624309392
- menys2_Adj 0.0497237569060773
- menys2_Adv 0.00828729281767956
- menys2_Conj 0.00552486187845304
- menys2_Det 0.281767955801105
- menys2_Esp 0.0441988950276243
- menys2_Nom 0.0911602209944751
- menys2_Num 0
- menys2_PT 0.0607734806629834
- menys2_Prep 0.187845303867403
- menys2_Pron 0.0110497237569061
- menys2_Verb 0.25414364640884
- menys2_no 0.00552486187845304
- menys1_Adj 0.00276243093922652
- menys1_Adv 0.0110497237569061
- menys1_Conj 0.0276243093922652
- menys1_Det 0.0276243093922652
- menys1_Esp 0
- menys1_Nom 0.81767955801105
- menys1_Num 0
- menys1_PT 0.0110497237569061
- menys1_Prep 0.00552486187845304
- menys1_Verb 0.0524861878453039
- menys1_no 0.0331491712707182
- mes1_Adj 0.00828729281767956
- mes1_Adv 0.0441988950276243
- mes1_Conj 0.0718232044198895
- mes1_Det 0.0386740331491713
- mes1_Esp 0.0331491712707182
- mes1_Nom 0.0267034990791897
- mes1_PT 0.320441988950276
- mes1_Prep 0.366482504604052
- mes1_Pron 0.0220994475138122
- mes1_Verb 0.0460405156537753
- mes1_no 0.0220994475138122
- mes2_Adj 0.0607734806629834
- mes2_Adv 0.00552486187845304
- mes2_Conj 0.0552486187845304
- mes2_Det 0.276243093922652
- mes2_Esp 0.069060773480663
- mes2_Nom 0.160220994475138
- mes2_PT 0.0718232044198895
- mes2_Prep 0.124309392265193
- mes2_Pron 0.0303867403314917
- mes2_Verb 0.140883977900552
41the matrix
- 0 0 0 0 0 0.0833333333333333 0 0 0.75 0 0 0.166666666666667 0 0 0 0 0 0 0 0 0 1
- 0.2 0 0.0666666666666667 0.0666666666666667 0 0 0.0666666666666667 0 0.466666666
- 0.384615384615385 0 0.153846153846154 0 0 0 0 0.0769230769230769 0.2307692307692
- 0.11 0 0.075 0.015 0.005 0.12 0.04 0.04 0.33 0.13 0.01 0.15 0.01 0.06 0 0.01 0.1
- 0 0 0 0 0 0 0.0133333333333333 0 0.88 0 0 0.0933333333333333 0 0.013333333333333
- 0.192307692307692 0.0384615384615385 0.0384615384615385 0 0 0.0769230769230769 0
- 0.117647058823529 0 0.0784313725490196 0 0 0.196078431372549 0.0196078431372549
- 0 0 0 0.0789473684210526 0.0263157894736842 0.105263157894737 0 0 0.368421052631
- 0 0 0 0 1 0.0588235294117647 0 0.294117647058824 0 0 0 0 0 0.588235294117647 0.0
- 0.0952380952380952 0.0476190476190476 0.0476190476190476 0.0476190476190476 0.04
- 0 0 0.0681818181818182 0 0 0.204545454545455 0.0681818181818182 0 0.272727272727
- 0.0769230769230769 0 0 0 0 0.230769230769231 0 0 0.461538461538462 0 0 0.2307692
- 0.04 0 0.08 0 0 0.28 0 0 0.4 0.04 0 0.04 0 0.08 0 0 0.16 0 0.12 0.2 0.2 0.36 0 0
- 0.293333333333333 0 0.04 0.0133333333333333 0 0.08 0.0133333333333333 0.02666666
- 0.133333333333333 0 0 0 0 0.133333333333333 0 0.0666666666666667 0.4666666666666
- 0 0 0 0.0909090909090909 0.0909090909090909 0.181818181818182 0 0.09090909090909
- 0.0434782608695652 0 0.130434782608696 0.0434782608695652 0 0.130434782608696 0
- 0.104166666666667 0 0.0625 0 0.0208333333333333 0.0208333333333333 0.0625 0.1041
- 0.0526315789473684 0 0.105263157894737 0.0526315789473684 0 0 0.105263157894737
42CLUTO (v. 1.5.1, Karypis 2002)
- high dimensional datasets
- analysis of cluster features
- partitional or agglomerative algorithms
- various criterion functions, taking into account similarity within the objects in a cluster (internal criterion) and/or the differences between objects of different clusters (external criterion)
- used here: partitional algorithm, combination of internal and external criteria
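To make the internal/external distinction concrete, the sketch below scores a partition with one internal and one external criterion based on cosine similarity. These are generic formulations written for illustration; whether they coincide with the criterion functions used in the experiments is an assumption, and the function names are hypothetical.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def cosine(u, v):
    nu, nv = norm(u), norm(v)
    return sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0

def composite(vectors):
    """Component-wise sum of a set of vectors."""
    return [sum(col) for col in zip(*vectors)]

def internal_criterion(clusters):
    """Sum over clusters of the length of the composite vector.
    With unit-length objects, higher means more similar objects per cluster."""
    return sum(norm(composite(c)) for c in clusters)

def external_criterion(clusters):
    """Size-weighted similarity of each cluster to the whole collection.
    Lower means clusters are better separated from the rest."""
    whole = composite([v for c in clusters for v in c])
    return sum(len(c) * cosine(composite(c), whole) for c in clusters)

# Three unit vectors grouped in two different ways (toy data).
good = [[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0]]]
bad = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0]]]
```

A combined criterion would favour partitions with a high internal score and a low external score, as is the case for `good` compared with `bad` in this toy example.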
43human gold standard: inter-judge agreement
44contextual attributes vs. gold standard
agreement with semantic attributes
45contextual attributes vs. gold standard
following common noun (50), following specifier (34)
preceding common noun (7), following specifier (7)
preceding preposition (30), preceding specifier (25)
preceding punctuation (40), following adverb or verb
not preceding punctuation, following common noun (50)
human
agreement with semantic attributes
46customized values: gradability and comparativity normalized to binary
gradability (61), after common noun (10), followed by common noun (2)
comparativity (12), followed by common noun (2), after common noun (11)
comparativity (14), gradability (61)
after common noun (11), comparativity (13), gradability (61)
gradability (65), comparativity (12)
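Normalizing an attribute to binary can be sketched as follows. The attribute names and the cut-off (any value above 0 counts as 1) are assumptions, since the slides do not state the threshold; the gradability value is taken from the sample vector for verd.

```python
def binarize(vector, attributes=("gradability", "comparativity"), threshold=0.0):
    """Return a copy of the vector in which the selected attributes are
    mapped to 1.0 if above the (assumed) threshold and to 0.0 otherwise."""
    return {
        name: (1.0 if value > threshold else 0.0) if name in attributes else value
        for name, value in vector.items()
    }

verd = {"gradability": 0.0165745856353591, "comparativity": 0.0,
        "predicativity": 0.0386740331491713}
verd_binary = binarize(verd)
```

With this cut-off a single graded occurrence is enough to mark an adjective as gradable, which makes the attribute sensitive to tagging errors; a higher threshold would be a reasonable variation.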
47customized values: gradability and comparativity normalized to binary
48interpretation of results: quality of cluster solution
- tightness of obtained clusters
- objects within a cluster are very similar to each other
- objects are very dissimilar to objects in different clusters
- attribute distribution: different values across clusters are evidence of the discriminating function of attributes
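Both notions can be illustrated with a short sketch: average pairwise cosine similarity within and between clusters for tightness, and the spread of per-cluster attribute means as a rough indicator of how discriminating an attribute is. The function names and the toy vectors are hypothetical, and these are generic measures rather than the exact statistics reported by the clustering tool.

```python
import math
from itertools import combinations

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def tightness(cluster):
    """Average similarity between all pairs of objects in one cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs) if pairs else 1.0

def separation(a, b):
    """Average similarity between objects of two different clusters."""
    return sum(cosine(u, v) for u in a for v in b) / (len(a) * len(b))

def attribute_means(cluster):
    return [sum(col) / len(cluster) for col in zip(*cluster)]

def attribute_spread(clusters):
    """Range of the per-cluster means of each attribute; a larger range
    suggests the attribute helps to tell the clusters apart."""
    means = [attribute_means(c) for c in clusters]
    return [max(col) - min(col) for col in zip(*means)]

# Toy clusters over two attributes (invented values).
cluster_a = [[0.9, 0.1], [0.8, 0.1]]
cluster_b = [[0.1, 0.1], [0.2, 0.1]]
spread = attribute_spread([cluster_a, cluster_b])
```

In the toy data the first attribute has very different means in the two clusters while the second has the same mean in both, so only the first would count as discriminating.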
49tightness of clustering solutions
50attribute distribution across clusters
51attribute distribution across clusters
52attribute distribution across clusters
53decision list
- a gold standard annotated by human judges
- a gold standard built with a decision list
- deductive classification using some of the attributes in the vectors for classifying adjectives into pre-defined classes
- predicativity
- position with respect to the head noun
- gradability and comparativity
- fully automatic: inexpensive but unsupervised
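A decision list over these attributes could look like the sketch below. The rule order, the thresholds and the attribute names are all hypothetical, chosen only to show the shape of a deductive classifier; they are not the rules used to build the automatic gold standard.

```python
def classify(vec, pre_min=0.5, pred_max=0.05, grade_min=0.05):
    """Apply the rules in order; the first rule that fires decides.
    All thresholds are assumed values for illustration."""
    # Mostly prenominal and almost never predicative.
    if vec["prenominal"] >= pre_min and vec["predicativity"] <= pred_max:
        return "non-predicative"
    # Gradable or comparable adjectives.
    if vec["gradability"] >= grade_min or vec["comparativity"] >= grade_min:
        return "qualitative"
    # Frequently used as predicates.
    if vec["predicativity"] > pred_max:
        return "qualitative"
    return "relational"

# Invented attribute values for three adjectives from the slides.
examples = {
    "presumpte": {"prenominal": 0.9, "predicativity": 0.0,
                  "gradability": 0.0, "comparativity": 0.0},
    "alt": {"prenominal": 0.1, "predicativity": 0.2,
            "gradability": 0.3, "comparativity": 0.1},
    "agrícola": {"prenominal": 0.0, "predicativity": 0.02,
                 "gradability": 0.01, "comparativity": 0.0},
}
labels = {lemma: classify(vec) for lemma, vec in examples.items()}
```

Because every adjective must fall into one of the pre-defined classes, mixed cases such as antic are forced into a single class, which is one reason a purely deductive approach can fit the data poorly.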
54decision list vs. human gold standard
a deductive approach does not provide a good
solution