Semi-automatic methods for WordNet construction

Provided by: lsi19
Learn more at: https://www.cs.upc.edu
1
Semi-automatic methods for WordNet construction
  • German Rigau i Claramunt
  • http://www.lsi.upc.es/rigau
  • TALP Research Center
  • Universitat Politècnica de Catalunya
  • Eneko Agirre
  • http://www.ji.si.upc.es/users/eneko
  • IxA NLP Group
  • University of the Basque Country

2002 International WordNet Conference
2
Setting
  • NLP and the Lexicon
  • Theoretical: WG, GPSG, HPSG.
  • Practical: realistic complexity and coverage
  • Lexical bottleneck (Briscoe 91)
  • Even worse for languages other than English

3
Setting
  • Which LK is needed by a concrete NLP system?
  • Where is this LK located?
  • Which procedures can be applied?

4
Setting
  • Which LK is needed by a concrete NLP system?
  • Phonology: phonemes, stress, etc.
  • Morphology: POS, etc.
  • Syntactic: category, subcat., etc.
  • Semantic: class, SRs, etc.
  • Pragmatic: usage, registers, TDs, etc.
  • Translations: translation links

5
Setting
  • Where is this LK located?
  • Human brain
  • Structured Lexical Resources
  • Monolingual and bilingual MRDs
  • Thesauri
  • Unstructured Lexical Resources
  • Monolingual and bilingual Corpora
  • Mixing resources

6
Setting
  • Which procedures can be applied?
  • Prescriptive approach
  • Machine-aided manual construction
  • Descriptive approach
  • Automatic acquisition from pre-existing Lexical
    Resources
  • Mixed approach

7
Outline
  • Setting
  • Words and Works
  • Merge approach
  • Taxonomy construction: monolingual MRDs
  • Mapping taxonomies: bilingual MRDs
  • Expand approach
  • Translation of synsets: bilingual MRDs
  • Interface for manual revision
  • Conclusions

8
Words and Works
Where is this Lexical Knowledge located?
  • Human brain
  • Linguistic String Project (Fox et al. 88)
  • Lexical Information for 10,000 entries
  • WordNet (Miller et al. 90)
  • Semantic Information: v1.6 with 99,642 synsets
  • Comlex (Grishman et al. 94)
  • Syntactic information: 38,000 English words
  • CYC Ontology (Lenat 95)
  • a person-century of effort to produce 100,000
    terms
  • LDOCE3-NLP
  • dictionary with 80,000 senses

9
Words and Works
Where is this Lexical Knowledge located?
  • Structured Lexical Resources
  • Monolingual MRDs
  • LDOCE
  • learner's dictionary
  • 35,956 entries and 76,059 definitions
  • 86 semantic and 44 pragmatic codes
  • controlled vocabulary of 2,000 words
  • (Boguraev & Briscoe 89)
  • (Vossen & Serail 90)
  • (Bruce & Guthrie 92), (Wilks et al. 93)
  • (Dolan et al. 93), (Richardson 97)

10
Words and Works
Where is this Lexical Knowledge located?
  • Structured Lexical Resources
  • Other Monolingual MRDs
  • Webster's (Jensen & Ravin 87)
  • LPPL (Artola 93)
  • DGILE (Castellón 93), (Taulé 95), (Rigau 98)
  • CIDE (Harley & Glennon 97)
  • AHD (Richardson 97)
  • WordNet (Harabagiu 98)
  • Bilingual MRDs
  • Collins Spanish/English (Knight & Luk 94)
  • Vox/Harraps Spanish/English (Rigau 98)

11
Words and Works
Where is this Lexical Knowledge located?
  • Structured Lexical Resources
  • Thesauri
  • Roget's Thesaurus
  • 60,071 words in 1,000 categories
  • (Yarowsky 92), (Grefenstette 93), (Resnik 95)
  • Roget's II and The New Collins Thesaurus
  • (Byrd 89)
  • Macquarie's thesaurus
  • (Grefenstette 93)
  • Bunrui Goi Hyou Japanese thesaurus
  • (Utsuro et al. 93)

12
Words and Works
Where is this Lexical Knowledge located?
  • Structured Lexical Resources
  • Encyclopaedia
  • Grolier's Encyclopaedia (Yarowsky 92)
  • Encarta (Richardson et al. 98)
  • Others
  • Telephonic Guides
  • Mixing structured lexical resources
  • Roget's Thesaurus and Grolier's (Yarowsky 92)
  • LDOCE, WN, Collins, ONTOS, UM (Knight & Luk 94)
  • Japanese MRD to WN (Okumura & Hovy 94)
  • LLOCE, LDOCE (Chen & Chang 98)

13
Words and Works
Where is this Lexical Knowledge located?
  • Unstructured Lexical Resources
  • Corpora
  • WSJ, Brown Corpus (SemCor), Hansard
  • Proper Nouns (Hearst & Schütze 95)
  • Idiosyncratic Collocations (Church et al. 91)
  • Preposition preferences (Resnik and Hearst 93)
  • Subcategorization structures (Briscoe and Carroll
    97)
  • Selectional restrictions (Resnik 93), (Ribas 95)
  • Thematic structure (Basili et al. 92)
  • Word semantic classes (Dagan et al. 94)
  • Bilingual Lexicons for MT (Fung 95)

14
Words and Works
Where is this Lexical Knowledge located?
  • Using both structured and non-structured Lexical
    Resources
  • MRDs and Corpora
  • (Liddy & Paik 92)
  • (Klavans & Tzoukermann 96)
  • WordNet and Corpora
  • (Resnik 93), (Ribas 95), (Li & Abe 95), (McCarthy
    01)

15
Words and Works
International Projects on Lexical Acquisition
  • Japanese Projects
  • EDR (Yokoi 95)
  • Nine-year project oriented to MT
  • Bilingual Corpora with 250,000 words
  • Monolingual, bilingual and cooccurrence
    dictionaries
  • 200,000 general vocabulary
  • 100,000 technical terminology
  • 400,000 concepts

16
Words and Works
International Projects on Lexical Acquisition
  • American Projects
  • Comlex (Grishman et al. 94)
  • Syntactic information for 38,000 words
  • WordNet (Miller 90)
  • Semantic Information
  • more than 123,000 words organised in 99,000
    synsets
  • more than 116,000 relations between synsets
  • Pangloss (Knight & Luk 94)
  • PUM, ONTOS, LDOCE semantic categories, WordNet
  • Cyc (Lenat 95)
  • common-sense knowledge
  • 100,000 concepts and 1,000,000 axioms

17
Words and Works
International Projects on Lexical Acquisition
  • European Projects
  • Acquilex I and II
  • LA from monolingual and bilingual MRDs and
    corpora
  • LE-Parole
  • Large-scale harmonised set of corpora and
    lexicons for all the EU languages
  • EuroWordNet
  • Multilingual WordNet for several European
    Languages
  • Meaning
  • Large-scale acquisition of LK from the web
  • Large-scale WSD

18
Words and Works
Lexical Acquisition from MRDs
  • Syntactic Disambiguation (Dolan et al. 93)
  • Semantic Processing (Vanderwende 95)
  • WSD (Lesk 86), (Wilks & Stevenson 97), (Rigau 98)
  • IR (Krovetz & Croft 92)
  • MT (Knight and Luk 94), (Tanaka & Umemura 94)
  • Semantically enriching MRDs
  • (Yarowsky 92), (Knight 93), (Chen & Chang 98)
  • Building LKBs
  • (Bruce & Guthrie 92)
  • (Dolan et al. 93)
  • (Artola 93)
  • (Castellón 93), (Taulé 95), (Rigau 98)

19
Words and Works
Acquisition of LK from MRDs
  • This tutorial focuses on
  • the massive acquisition of LK
  • from MRDs (conventional, in any language)
  • using (semi) automatic methodologies
  • Why MRDs?

"The conventional dictionaries for human use
usually contain spelling, pronunciation,
hyphenation, capitalization, usage notes for
semantic domains, geographic regions, and
propriety; etymological, syntactic and semantic
information about the most basic units of the
language" (Amsler 81)
20
Words and Works
Main Problems of MRDs
  • Conventional dictionaries are not systematic
  • Dictionaries are built for human use
  • Implicit Knowledge
  • words are described/translated in terms of words

21
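Words and Works
MRDs and Semantic Knowledge
  • jardín_1_1: Terreno donde se cultivan plantas y
    flores ornamentales. (garden: land where
    ornamental plants and flowers are grown)
  • florero_1_4: Maceta con flores. (flowerpot with
    flowers)
  • ramo_1_3: Conjunto natural o artificial de
    flores, ramas o hierbas. (bouquet: natural or
    artificial bunch of flowers, branches or herbs)
  • pétalo_1_1: Hoja que forma la corola de la flor.
    (petal: leaf that forms the corolla of the flower)
  • tálamo_1_3: Receptáculo de la flor. (thalamus:
    receptacle of the flower)
  • miel_1_1: Substancia viscosa y muy dulce que
    elaboran las abejas, en una distensión del
    esófago, con el jugo de las flores y luego
    depositan en las celdillas de sus panales.
    (honey: viscous, very sweet substance that bees
    produce, in a distension of the oesophagus, from
    the juice of flowers and then deposit in the
    cells of their honeycombs)
  • florería_1_1: Floristería; tienda o puesto donde
    se venden flores. (florist's: shop or stall where
    flowers are sold)
  • florista_1_1: Persona que tiene por oficio hacer
    o vender flores. (florist: person whose trade is
    making or selling flowers)
  • camelia_1_1: Arbusto cameliáceo de jardín,
    originario de Oriente, de hojas perennes y
    lustrosas, y flores grandes, blancas, rojas o
    rosadas (Camellia japonica). (camellia: garden
    shrub of the camellia family, native to the East,
    with glossy evergreen leaves and large white, red
    or pink flowers)
  • camelia_1_2: Flor de este arbusto. (flower of
    this shrub)
  • rosa_1_1: Flor del rosal. (rose: flower of the
    rose bush)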

22
Outline
  • Setting
  • Words and Works
  • Merge approach
  • Taxonomy construction: monolingual MRDs
  • Mapping taxonomies: bilingual MRDs
  • Expand approach
  • Translation of synsets: bilingual MRDs
  • Interface for manual revision
  • Conclusions

23
Merge approach
Main Methodology
24
Merge approach
Main Methodology
  • Taxonomy construction (Rigau et al. 98, 97)
  • monolingual MRDs
  • Step 1: Selection of the main top beginners
    for a semantic primitive
  • Step 2: Exploiting genus, construction of
    taxonomies for each semantic primitive
  • Mapping taxonomies (Daudé et al. 99)
  • bilingual MRDs
  • Step 3: Creation of translation links

25
Merge approach Taxonomy Construction
Methodology
  • Problems following a pure descriptive approach
  • Circularity
  • Errors and inconsistencies
  • Definitions with omitted genus
  • Top dictionary senses do not usually represent
    useful knowledge for the LKB
  • Too general
  • Too specific

26
Merge approach Taxonomy Construction
Methodology
Mixed Methodology
Prescriptive approach: Manual construction of
the Top Structure
27
Merge approach Taxonomy Construction
Methodology
Mixed Methodology
Prescriptive approach: Manual construction of
the Top Structure
Descriptive approach: Acquiring implicit
information from MRDs
29
Merge approach Taxonomy Construction Step 1
Selection of the main top beginners
Word sense: zumo_1_1
Attached-to: c_art_subst type.
Definition: líquido que se extrae de las flores, hierbas, frutos, etc.
(liquid extracted from flowers, herbs, fruits, etc.)
30
Merge approach Taxonomy Construction Step 1
Selection of the main top beginners
  • A) Attaching DGILE senses to semantic primitives
  • 1) First labelling
  • Conceptual Distance (Rigau 94)
  • 2) Second labelling
  • Salient Words (Yarowsky 92)
  • B) Filtering Process

31
Merge approach Taxonomy Construction Step 1
Selection of the main top beginners
  • A.1) First labelling
  • Conceptual Distance (Agirre et al. 94)
  • length of the shortest path
  • specificity of the concepts
  • using WordNet
  • Bilingual dictionary
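
The two ingredients listed above (path length and concept specificity) can be sketched as follows. This is only an illustration over a hand-built fragment of a hierarchy, not WordNet itself. Weighting each node on the path by 1/depth is an assumption about how specificity enters the measure, and the names HYPERNYM, ancestors, depth, path and conceptual_distance are hypothetical.

```python
# Toy hypernym links (child -> parent); a hand-built fragment, not real WordNet data.
HYPERNYM = {
    "object": "entity",
    "artifact": "object",
    "structure": "artifact",
    "building": "structure",
    "place_of_worship": "building",
    "church": "place_of_worship",
    "house": "building",
    "religious_residence": "house",
    "monastery": "religious_residence",
    "convent": "religious_residence",
}

def ancestors(concept):
    """Return the chain [concept, parent, ..., root]."""
    chain = [concept]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain

def depth(concept):
    """Depth in the hierarchy; the root has depth 1."""
    return len(ancestors(concept))

def path(a, b):
    """Nodes on the shortest path from a to b through their lowest common ancestor."""
    up_a, up_b = ancestors(a), ancestors(b)
    common = next(c for c in up_a if c in up_b)
    return up_a[: up_a.index(common) + 1] + up_b[: up_b.index(common)][::-1]

def conceptual_distance(a, b):
    """Sum 1/depth over the path (assumed weighting): deep, specific nodes cost less."""
    return sum(1.0 / depth(c) for c in path(a, b))
```

Under these assumptions, two senses that meet low in the hierarchy (monastery and convent) come out closer than two that only meet at a more general node (church and monastery), both because the path is shorter and because its nodes are deeper.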

32
Merge approach Taxonomy Construction Step 1
Selection of the main top beginners
<entity>
<object, ...>
<artifact, artefact>
<structure, construction>
<house, lodging>
<building, edifice>
<place of worship, ...>
<religious residence, cloister>
<church, church building>
<convent>
<monastery>
<abbey>
<abbey>
<abbey>
abadía_1_2: Iglesia o monasterio regido por un
abad o abadesa (abbey, a church or a monastery
ruled by an abbot or an abbess)
33
Merge approach Taxonomy Construction Step 1
Selection of the main top beginners
<entity>
<object, ...>
<artifact, artefact>
<structure, construction>
<house, lodging>
<building, edifice>
<place of worship, ...>
<religious residence, cloister>
<church, church building>
<convent>
<monastery>
<abbey> 06 ARTIFACT
<abbey>
<abbey>
abadía_1_2: Iglesia o monasterio regido por un
abad o abadesa (abbey, a church or a monastery
ruled by an abbot or an abbess)
34
Merge approach Taxonomy Construction Step 1
Selection of the main top beginners
  • A.1) First labelling (Results)
  • 29,205 labelled definitions (31%)
  • 61% accuracy at a sense level
  • 64% accuracy at a file level

35
Merge approach Taxonomy Construction Step 1
Selection of the main top beginners
  • A.2) Second labelling
  • Salient Words (Yarowsky 92)
  • Importance
  • local frequency
  • appears significantly more often in the
    corpus of a semantic category than at other
    points in the whole corpus
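
A minimal sketch of the salient-word idea, assuming the weight of a word for a category is log(P(word | category) / P(word)) and that a definition takes the category with the highest summed weight. The toy definitions, the absence of smoothing, and the names LABELLED, salience_table and label are hypothetical and are not taken from the DGILE experiments.

```python
import math
from collections import Counter

# Toy definitions already attached to a semantic category (hypothetical data).
LABELLED = {
    "FOOD": ["leche que contiene este frasco", "zumo de uvas fermentado"],
    "ARTIFACT": ["frasco de cristal", "vasija de cristal para flores"],
}

def salience_table(labelled):
    """Map (category, word) -> log( P(word | category) / P(word) )."""
    per_cat = {cat: Counter(w for d in defs for w in d.split())
               for cat, defs in labelled.items()}
    overall = Counter()
    for counts in per_cat.values():
        overall.update(counts)
    total = sum(overall.values())
    table = {}
    for cat, counts in per_cat.items():
        cat_total = sum(counts.values())
        for word, n in counts.items():
            table[cat, word] = math.log((n / cat_total) / (overall[word] / total))
    return table

def label(definition, table, categories):
    """Pick the category whose salient words best cover the definition."""
    scores = {cat: sum(table.get((cat, w), 0.0) for w in definition.split())
              for cat in categories}
    return max(scores, key=scores.get)

TABLE = salience_table(LABELLED)
```

A word seen only in FOOD definitions gets a positive weight for FOOD and contributes nothing to the other categories, which is the "local frequency" effect described above.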

36
Merge approach Taxonomy Construction Step 1
Selection of the main top beginners
  • A.2) Second labelling (Results)
  • 86,759 labelled definitions (93%)
  • 80% accuracy at a file level

biberón_1_1 ARTIFACT 4.8399 Frasco de cristal ... (glass flask ...)
biberón_1_2 FOOD 7.4443 Leche que contiene este frasco ... (milk contained in that flask ...)
37
Merge approach Taxonomy Construction Step 1
Selection of the main top beginners
  • B) Filtering process (FOODs)
  • removes all genus terms that are
  • FILTER 1: not FOODs by the bilingual mapping
  • FILTER 2: more frequent as genus in another
    Semantic Primitive
  • FILTER 3: of low frequency
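
The three filters can be read as successive tests on each candidate genus term. The sketch below assumes that reading; the counts, the bilingual set, the frequency threshold and the names GENUS_COUNTS, FOOD_BY_BILINGUAL and keep_genus are all invented for illustration.

```python
from collections import Counter

# Hypothetical counts: how often each genus term heads a definition
# labelled with each semantic primitive.
GENUS_COUNTS = {
    "zumo": Counter({"FOOD": 12, "SUBSTANCE": 3}),
    "frasco": Counter({"FOOD": 2, "ARTIFACT": 15}),
    "pan": Counter({"FOOD": 30}),
    "cosa": Counter({"FOOD": 1}),
}
# Hypothetical outcome of the bilingual mapping: genus terms whose
# translations fall under FOOD.
FOOD_BY_BILINGUAL = {"zumo", "pan", "frasco"}

def keep_genus(term, primitive="FOOD", min_freq=5):
    """Return True if the genus term survives all three filters."""
    counts = GENUS_COUNTS[term]
    if term not in FOOD_BY_BILINGUAL:             # FILTER 1
        return False
    if counts.most_common(1)[0][0] != primitive:  # FILTER 2
        return False
    if counts[primitive] < min_freq:              # FILTER 3 (threshold is an assumption)
        return False
    return True

top_beginners = sorted(t for t in GENUS_COUNTS if keep_genus(t))
print(top_beginners)  # ['pan', 'zumo']
```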

38
Merge approach Taxonomy Construction Step 1
Selection of the main top beginners
  • B) Filtering process (FOOD Results)

39
Merge approach Taxonomy Construction Step 2
Exploiting Genus
Word sense: vino_1_1
Hypernym: zumo_1_1.
Definition: zumo de uvas fermentado. (fermented juice of grapes).
Word sense: rueda_2_1
Hypernym: vino_1_1.
Definition: vino procedente de la región de Rueda (Valladolid).
(wine from the region of Rueda).
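
Once each definition has a disambiguated genus, taxonomy chains follow by walking the hypernym links upwards. A small sketch using only the two example senses above; the function name chain and the circularity guard are illustrative additions.

```python
# Hypernym (genus) links taken from the two example definitions above.
HYPERNYM = {"vino_1_1": "zumo_1_1", "rueda_2_1": "vino_1_1"}

def chain(sense, hypernym=HYPERNYM):
    """Return the path from the top beginner down to `sense`."""
    path = [sense]
    seen = {sense}
    while path[-1] in hypernym:
        parent = hypernym[path[-1]]
        if parent in seen:  # guard against circular definitions
            break
        path.append(parent)
        seen.add(parent)
    return path[::-1]

print(" ".join(chain("rueda_2_1")))  # zumo_1_1 vino_1_1 rueda_2_1
```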
40
Merge approach Taxonomy Construction Step 2
Exploiting Genus
  • Genus Sense Identification
  • 97% accuracy for nouns
  • Genus Sense Disambiguation
  • Unrestricted WSD (coverage 100%)
  • Knowledge-based WSD (not supervised)
  • Eight Heuristics (McRoy 92)
  • Combining several lexical resources
  • Combining several methods

41
Merge approach Taxonomy Construction Step 2
Exploiting Genus
Results
42
Merge approach Taxonomy Construction Step 2
Exploiting Genus
Knowledge provided by each heuristic
43
Merge approach Taxonomy Construction Step 2
Exploiting Genus
F2F3 > 9: 35,099 definitions
F2F3 > 4: 40,754 definitions
No filters: 111,624 definitions
44
Merge approach Taxonomy Construction Step 2
Exploiting Genus
...
zumo_1_1 vino_1_1 quianti_1_1
zumo_1_1 vino_1_1 raya_1_8
zumo_1_1 vino_1_1 requena_1_1
zumo_1_1 vino_1_1 reserva_1_12
zumo_1_1 vino_1_1 ribeiro_1_1
zumo_1_1 vino_1_1 rioja_1_1
zumo_1_1 vino_1_1 roete_1_1
zumo_1_1 vino_1_1 rosado_1_3
zumo_1_1 vino_1_1 rueda_2_1
zumo_1_1 vino_1_1 sherry_1_1
zumo_1_1 vino_1_1 tarragona_1_1
zumo_1_1 vino_1_1 tintilla_1_1
zumo_1_1 vino_1_1 tintorro_1_1
zumo_1_1 vino_1_1 toro_3_1
...
45
Merge approach Mapping Taxonomies Step 3
Creation of translation links
[Figure: diagram with nodes C1 to C6]
46
Merge approach Mapping Taxonomies Step 3
Creation of translation links
[Figure: diagram with nodes C1 to C6]
47
Merge approach Mapping Taxonomies Step 3
Creation of translation links
  • Connecting already existing Hierarchies
  • Relaxation labelling Algorithm
  • Constraints
  • Between
  • Spanish taxonomy automatically derived from an
    MRD (Rigau et al. 98)
  • WordNet
  • using a bilingual MRD

48
Merge approach Mapping Taxonomies Step 3
Creation of translation links
49
Merge approach Mapping Taxonomies Step 3
Relaxation Labelling algorithm
  • Iterative algorithm for function optimisation
    based on local information
  • it can deal with any kind of constraints
  • variables (senses of the taxonomy)
  • labels (synsets)
  • Finds a weight assignment for each possible label
    for each variable
  • weights for the labels of the same variable add
    up to one
  • the weight assignment satisfies, to the maximum
    possible extent, the set of constraints

50
Merge approach Mapping Taxonomies Step 3
Relaxation Labelling algorithm
  • 1) Start with a random weight assignment
  • 2) Compute the support value for each label of
    each variable (according to the constraints)
  • 3) Increase the weights of the labels more
    compatible with the context and decrease those
    of the less compatible labels.
  • 4) If a stopping/convergence criterion is
    satisfied, stop; otherwise go to step 2.
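
The four steps above can be sketched as a small loop. This is an illustration under stated assumptions, not the implementation used for the mapping: the update rule w * (1 + support) followed by renormalisation is one commonly used form, the support values are assumed to lie in [-1, 1], and the toy senses, synset labels and the single "immediate hypernym" constraint are hypothetical.

```python
import random

def relax(variables, support, iterations=50, seed=0):
    """variables: {var: [labels]}; support(var, label, weights) -> value in [-1, 1]."""
    rng = random.Random(seed)
    # 1) random weight assignment, normalised so each variable's weights sum to one
    weights = {}
    for var, labels in variables.items():
        raw = [rng.random() + 1e-6 for _ in labels]
        weights[var] = {l: r / sum(raw) for l, r in zip(labels, raw)}
    for _ in range(iterations):
        new = {}
        for var, labels in variables.items():
            # 2) support of each label given the current weights
            s = {l: support(var, l, weights) for l in labels}
            # 3) reward compatible labels, penalise the rest, renormalise
            scaled = {l: weights[var][l] * (1.0 + s[l]) for l in labels}
            z = sum(scaled.values())
            new[var] = {l: (scaled[l] / z if z else weights[var][l]) for l in labels}
        # 4) stop on convergence, otherwise iterate again
        done = all(abs(new[v][l] - weights[v][l]) < 1e-6 for v in new for l in new[v])
        weights = new
        if done:
            break
    return weights

# Toy problem: a Spanish sense prefers the synset whose English hypernym
# is a candidate label of its own Spanish hypernym.
VARIABLES = {"piel": ["fur", "peel"], "marta": ["sable_fur", "sable_animal"]}
SPANISH_PARENT = {"marta": "piel"}
ENGLISH_PARENT = {"sable_fur": "fur", "sable_animal": "mammal"}

def support(var, label, weights):
    parent = SPANISH_PARENT.get(var)
    if parent is None:
        return 0.0
    return weights[parent].get(ENGLISH_PARENT.get(label), 0.0)

result = relax(VARIABLES, support)
```

In this toy setting only marta receives support, so the weight of its label sable_fur grows at every iteration while the weights of piel keep their initial values.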

51
Merge approach Mapping Taxonomies Step 3
Constraints
  • Rely on the taxonomy structure
  • Coded with three characters
  • X: Spanish Taxonomy, I (immediate) or A (ancestor)
  • Y: English Taxonomy, I (immediate) or A (ancestor)
  • Z: Relation, E (hypernym), O (hyponym), B (both)
  • Examples: IIE, AAB

52
Merge approach Mapping Taxonomies Step 3
Results
  Poly            TOK, FOK     TOK, FNOK    total
  animal          279 (90%)    30 (91%)     209 (90%)
  food            166 (94%)    3 (100%)     169 (94%)
  cognition       198 (67%)    27 (90%)     225 (69%)
  communication   533 (77%)    40 (97%)     573 (78%)

  All             TOK, FOK     TOK, FNOK    total
  animal          424 (93%)    62 (95%)     486 (90%)
  food            166 (94%)    83 (100%)    249 (96%)
  cognition       200 (67%)    245 (90%)    445 (82%)
  communication   536 (77%)    234 (97%)    760 (81%)

53
Merge approach Mapping Taxonomies Step 3
Example
piel
(substance <skin, fur, peel>)
marta
(substance <sable, marte, coal_back>)
visón
(substance <mink, mink_coat>)
54
Outline
  • Setting
  • Words and Works
  • Merge approach
  • Taxonomy construction: monolingual MRDs
  • Mapping taxonomies: bilingual MRDs
  • Expand approach
  • Translation of synsets: bilingual MRDs
  • Interface for manual revision
  • Conclusions

55
Expand approach
  • Take one WordNet as starting point
  • Translate synsets
  • English: <car, automobile>
  • Basque: <auto, berebil>
  • We obtain a structurally similar WordNet in
    another language, but some of the synsets will be
    missing
  • Use bilingual dictionary
  • maintien n.m. (attitude) bearing (conservation)
    maintenance
  • 1. Keep bilingual senses (Agirre & Rigau 95)
  • maintien1 (attitude) bearing; maintien2
    (conservation) maintenance
  • 2. Produce all translation pairs (Atserias et al.
    97)
  • maintien - bearing
  • maintien - maintenance
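
The difference between the two options can be shown on the maintien entry. The data structure ENTRY and the two function names are hypothetical; only the entry content comes from the example above.

```python
# The bilingual entry from the example above, with its sense structure kept.
ENTRY = {"maintien": [("attitude", ["bearing"]), ("conservation", ["maintenance"])]}

def keep_bilingual_senses(entry):
    """Option 1: one numbered bilingual sense per sub-entry, cue kept."""
    return [(f"{head}{i}", cue, words)
            for head, senses in entry.items()
            for i, (cue, words) in enumerate(senses, start=1)]

def all_translation_pairs(entry):
    """Option 2: flatten to (source word, English word) pairs, dropping the senses."""
    return sorted({(head, w) for head, senses in entry.items()
                   for _cue, words in senses for w in words})

print(keep_bilingual_senses(ENTRY))
# [('maintien1', 'attitude', ['bearing']), ('maintien2', 'conservation', ['maintenance'])]
print(all_translation_pairs(ENTRY))
# [('maintien', 'bearing'), ('maintien', 'maintenance')]
```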

56
Expand approach - produce all pairings
  • Used to produce the first version of the nominal
    part of the Spanish WordNet
  • Based on WN 1.5
  • Both directions in bilingual dictionary merged
  • Spanish/English: 19,443 translation pairs
  • English/Spanish: 16,324 translation pairs
  • Harmonized bilingual: 28,131 translation pairs
  • Overlap with WordNet: 12,665 nouns (14%)
  • Two methods
  • class methods: consider only pairings
  • conceptual distance methods: consider similarity
    of synsets

57
Expand approach - produce all pairings
  • Ten class methods
  • Four monosemic criteria
  • Four polysemic criteria
  • Two hybrid criteria
  • Three conceptual distance methods
  • CD1: using pairwise word cooccurrences
  • CD2: using headword and genus
  • CD3: using bilingual Spanish entries with
    multiple translations

58
Expand approach - produce all pairings
  • Class methods
  • Four possible configurations for pairs which
    either share an English word or a Spanish word:
    connected graph.
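
One way to picture the connected-graph view is to group translation pairs into connected components and name the shape of each component. The sketch assumes that the four configurations are 1:1, 1:N, N:1 and M:N (Spanish words : English words); the slide does not spell them out, and the sample pairs and function names are hypothetical.

```python
from collections import defaultdict

def components(pairs):
    """Group (spanish, english) pairs into connected components of the bipartite graph."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for sw, ew in pairs:
        parent[find(("S", sw))] = find(("E", ew))
    groups = defaultdict(lambda: (set(), set()))
    for sw, ew in pairs:
        sws, ews = groups[find(("S", sw))]
        sws.add(sw)
        ews.add(ew)
    return list(groups.values())

def configuration(sws, ews):
    """Name the shape of one component: 1:1, 1:N, N:1 or M:N (assumed labels)."""
    s = "1" if len(sws) == 1 else ("N" if len(ews) == 1 else "M")
    e = "1" if len(ews) == 1 else "N"
    return f"{s}:{e}"

# Hypothetical translation pairs (Spanish word, English word).
PAIRS = [("abadía", "abbey"), ("zumo", "juice"), ("jugo", "juice"),
         ("piel", "skin"), ("piel", "fur")]
shapes = sorted(configuration(s, e) for s, e in components(PAIRS))
print(shapes)  # ['1:1', '1:N', 'N:1']
```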

59
Expand approach - produce all pairings
  • 4 monosemous class methods
  • All English words involved are monosemous in WN

60
Expand approach - produce all pairings
  • 4 polysemous class methods
  • At least 1 English word involved is polysemous

61
Expand approach - produce all pairings
  • 2 other class methods
  • Variant criterion: two synonyms share a single
    SW
  • Field criterion: use field indicators in
    bilingual entry when available

VC: <..., EW, ..., EW, ...> SW
FC: <..., headword-EW, ..., Ind-EW, ...> SW
62
Expand approach - produce all pairings
  • Ten class methods (results)

63
Expand approach - produce all pairings
  • Conceptual Distance Methods (Agirre et al. 94)
  • length of the shortest path
  • specificity of the concepts
  • Using WordNet
  • Bilingual dictionary

64
Expand approach - produce all pairings
  • Three conceptual distance methods
  • CD1: using pairwise word cooccurrences from
    monolingual dict.
  • CD2: using headword and genus from monolingual
    def.
  • CD3: using bilingual Spanish entries with
    multiple translations

65
Expand approach - produce all pairings
CD2
<entity>
<object, ...>
<artifact, artefact>
<house, lodging>
<religious residence, cloister>
abadía_1_2: Iglesia o monasterio regido por un
abad o abadesa (abbey, a church or a monastery
ruled by an abbot or an abbess)
66
Expand approach - produce all pairings
  • Three conceptual distance methods

67
Expand approach - produce all pairings
  • Keep SW-synset pairs produced by methods with
    precision above 85%
  • mono1
  • mono2
  • mono3
  • mono4
  • variant
  • But, if two different methods propose the same
    SW-synset pair, it could get better confidence
  • try pairwise combinations of methods
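
Trying pairwise combinations amounts to intersecting the SW-synset pairs proposed by two methods. In the sketch below the method names, the pairs and the synset identifiers are hypothetical placeholders; the precision of each intersection would still have to be measured by hand.

```python
from itertools import combinations

# Hypothetical output of three methods: sets of (Spanish word, synset id) pairs.
PROPOSALS = {
    "poly1": {("vino", "wine.n.01"), ("piel", "fur.n.01"), ("piel", "peel.n.02")},
    "cd1": {("vino", "wine.n.01"), ("piel", "fur.n.01")},
    "cd2": {("piel", "fur.n.01"), ("abadía", "abbey.n.01")},
}

def pairwise_intersections(proposals):
    """Pairs proposed independently by two methods, keyed by the method pair."""
    return {(a, b): proposals[a] & proposals[b]
            for a, b in combinations(sorted(proposals), 2)}

agreed = pairwise_intersections(PROPOSALS)
print(agreed[("cd1", "cd2")])  # {('piel', 'fur.n.01')}
```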

68
Expand approach - produce all pairings
  • Combinations of methods: higher precision in some
    cases

69
Expand approach - produce all pairings
  • Results
  • SpWN v 0.1
  • BasqueWN v 0.1
  • 2 bilingual dictionaries
  • apply first 8 class methods only

70
Expand approach - bilingual senses
  • Smaller experiment with French bilingual
    dictionary
  • Based on WN 1.5
  • Keep structure of bilingual dictionary: bilingual
    senses
  • 21,322 entries, 31,502 subentries (senses)
  • 16,917 nominal subentries
  • Disambiguation is possible when
  • 1) one of the translation words is monosemous in
    WordNet.
  • 2) the translation is given by a list of words.
  • 3) a cue in French is provided alongside the
    translation.
  • 4) a semantic field is provided.
  • folie 1 n.f. madness
  • provision 1 n.f. supply, store
  • trésor 2 n.m. (ressources) (comm.) finances

71
Expand approach - bilingual senses
  • Possible disambiguation case by case

72
Expand approach - bilingual senses
  • Disambiguation: Conceptual Density (Agirre &
    Rigau 95)
  • The relatedness of a certain word-sense to the
    words in the context (cue, other translations
    and/or semantic field) allows us to select that
    sense over the others
  • Bilingual dictionary English WordNet

73
Expand approach - summary
  • all pairings
  • coverage and precision
  • produce a good starting point for manual revision
  • bilingual senses
  • keeping bilingual senses might help precision
  • very low coverage

74
Outline
  • Setting
  • Words and Works
  • Merge approach
  • Taxonomy construction: monolingual MRDs
  • Mapping taxonomies: bilingual MRDs
  • Expand approach
  • Translation of synsets: bilingual MRDs
  • Interface for manual revision
  • Conclusions

75
Interface for manual revision
76
Interface for manual revision
77
Interface for manual revision
  • Client/Server architecture
  • Database
  • EWN design implemented on SQL tables
  • English, Spanish, Catalan and Basque
  • Interface
  • Perl CGIs that access the databases

78
Outline
  • Setting
  • Words and Works
  • Merge approach
  • Taxonomy construction: monolingual MRDs
  • Mapping taxonomies: bilingual MRDs
  • Expand approach
  • Translation of synsets: bilingual MRDs
  • Interface for manual revision
  • Conclusions

79
Conclusions
  • methods to automatically produce preliminary
    versions
  • methods mainly for nouns
  • need to manually revise
  • merge approach
  • method to produce native hierarchies and word
    senses
  • trusts lexicographers' hierarchies
  • need to map to ILI in independent process
  • expand approach
  • method to translate English WN's synsets
  • trusts WN's hierarchies, sense distinctions
  • mapping to ILI for free

80
Conclusions
  • merge approach
  • manual work
  • revising and re-organizing the automatic
    hierarchies (hard)
  • revising automatic mapping (very hard)
  • allows for integration of data from monolingual
    dictionary
  • definition text itself
  • lexico-semantic relations from definitions
  • expand approach
  • manual work
  • revise proposed translations (fast)
  • review the rest of the synsets (many)
  • include glosses

81
Conclusions
  • Interface to speed up manual work
  • Downloadable soon
  • WN 1.5 in database format
  • Interface
  • WordNets can be checked at
  • http://www.lsi.upc.es/nlp
  • http://ixa.si.ehu.es/wei3.html
  • These slides will (shortly) be available at
  • http:// ...
  • http://www.ji.si.ehu.es/users/eneko

82
Bibliography