German Rigau i Claramunt - PowerPoint PPT Presentation

German Rigau i Claramunt, http://www.lsi.upc.es/~rigau, TALP Research Center, Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya


1
Ontologies
  • German Rigau i Claramunt
  • http://www.lsi.upc.es/~rigau
  • TALP Research Center
  • Departament de Llenguatges i Sistemes Informàtics
  • Universitat Politècnica de Catalunya

2
Ontologies
Outline
  • WordNet (Miller et al. 90, Fellbaum 98)
  • EuroWordNet (Vossen et al. 98)
  • Spanish WordNet
  • Combining Methods (Atserias et al. 97)
  • Mapping hierarchies (Daudé et al. 01)
  • Mikrokosmos (Viegas et al. 96)
  • Cyc (Mahesh et al. 96)
  • WordNet 2 (Harabagiu 98)
  • MindNet (Richardson et al. 97)
  • ThoughtTreasure (Mueller 00)
  • Meaning ...

3
WordNet EuroWordNet

4
WordNet EuroWordNet: WordNet
  • Princeton University (Miller et al. 1990)
  • Lexicalized concepts (words, lexical units)
  • Related to each other by semantic relations
  • synonymy
  • antonymy
  • hypernymy-hyponymy
  • meronymy
  • entailment
  • cause
  • ...

5
WordNet EuroWordNet: Semantic Relations of WN1.5
  • Synonymy
  • Lexicalized concepts (SYNSETS)
  • Weak notion of synonymy: synonymy in context
  • Synset: a set of words or lexical units that, in a
    given context, express a concept
  • Hypernymy / Hyponymy
  • Class-to-subclass relation
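
As a rough illustration of the two notions above, a synset can be modelled as a set of words plus a pointer to its hypernym. This is only a sketch: the `Synset` class, the `hypernym_chain` helper and the toy data are assumptions made for the example, and the chain is simplified with respect to the real WordNet hierarchy.

```python
from dataclasses import dataclass

@dataclass
class Synset:
    """A lexicalized concept: words that are synonymous in some context."""
    words: tuple
    hypernym: "Synset | None" = None  # the class this synset is a subclass of

def hypernym_chain(synset):
    """Return the synsets from `synset` up to the top of the hierarchy."""
    chain = []
    while synset is not None:
        chain.append(synset)
        synset = synset.hypernym
    return chain

# Toy data only (hypothetical, with intermediate levels left out).
conveyance = Synset(("conveyance",))
vehicle = Synset(("vehicle",), hypernym=conveyance)
motor_vehicle = Synset(("motor vehicle", "automobile"), hypernym=vehicle)
cab = Synset(("cab", "taxi", "hack"), hypernym=motor_vehicle)

chain_words = [s.words[0] for s in hypernym_chain(cab)]
```

Here `chain_words` lists the first word of each synset met while climbing from the taxi synset to the top.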

6
WordNet EuroWordNet: Semantic Relations of WN1.5
  • Meronymy
  • Component part
  • hand → arm
  • Member of a collection
  • person → people
  • Substance
  • newspaper → paper

7
WordNet EuroWordNet: Semantic Relations of WN1.5
  • Antonymy
  • big ↔ small
  • Cause
  • kill → die
  • Entailment
  • divorce → marry
  • Derivation
  • presidential → president
  • Similarity
  • good ↔ positive

8
WordNet EuroWordNet: WordNet Example
<conveyance>
<vehicle>
<doorlock>
<car door>
<motor vehicle, automobile, ...>
<cruiser, squad car, patrol car, ...>
<cruiser, squad car, patrol car, ...>
<cab, taxi, hack, ...>
9
WordNet EuroWordNet: EuroWordNet
  • Project LE-2 4003
  • Telematics Applications Programme of the EU
  • Semantic networks for several languages
  • Integrated and interconnected
  • English: University of Sheffield
  • Dutch: University of Amsterdam
  • Italian: I.L.C., Pisa
  • Spanish: UB, UPC, UNED
  • Computers and the Humanities
  • (special issue, 1998)
  • http://www.hum.uva.nl/~ewn/

10
WordNet EuroWordNet: EuroWordNet Extensions
  • EWN2
  • German, French, Czech, Swedish, Estonian
  • ITEM project
  • Spanish, Catalan, Basque
  • CREL (Centre de Referència d'Enginyeria
    Lingüística)
  • Catalan (UB, UPC)

11
WordNet EuroWordNet: Applications
  • Development of basic resources
  • Cross-lingual information processing
  • - Multilingual information retrieval systems
    (e.g., Internet)
  • - Lexical-semantic module of language
    engineering systems
  • - Information extraction
  • - Machine translation

12
WordNet EuroWordNet: Design Requirements
  • Preservation of the semantic relations
    specific to each language
  • Maximum compatibility among the different
    resources
  • Relative independence of the WordNets
  • in the construction process
  • in the final result

13
(No Transcript)
14
WordNet EuroWordNet: Components of EuroWordNet
  • Core
  • The Interlingual Index (ILI)
  • The Top Concept Ontology (TCO)
  • Domain Ontology (DO)
  • Periphery
  • Language-specific WordNets

15
WordNet EuroWordNet: Interlingual Index of
EuroWordNet
  • Unstructured collection of elements
  • Linked to
  • at least one synset of an EWN wordnet
  • one element of the TCO or DO
  • Associated with WN 1.5 synsets
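
A minimal sketch of how an unstructured index of this kind lets synsets of different languages find each other. The record identifiers, the sense labels and the `equivalents` function are hypothetical and only serve the example; this is not the EuroWordNet database format.

```python
# ILI record id -> synsets linked to it, per language (toy data, made-up ids).
ili_links = {
    "ili-0001": {"es": ["actuar-1"], "en": ["behave-1"]},
    "ili-0002": {"es": ["perro-1"], "en": ["dog-1"], "ca": ["gos-1"]},
}

def equivalents(synset, source_lang, target_lang, links=ili_links):
    """Synsets of target_lang that share an ILI record with `synset`."""
    found = []
    for record in links.values():
        if synset in record.get(source_lang, []):
            found.extend(record.get(target_lang, []))
    return found

english_for_actuar = equivalents("actuar-1", "es", "en")
```

Because the index itself has no internal structure, each wordnet keeps its own hierarchy and only the links to the shared records are needed for a cross-lingual lookup.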

16
WordNet EuroWordNet: Top Concept Ontology of
EuroWordNet
  • Hierarchy of language-independent concepts
  • semantic distinctions: object, place, dynamic, ...
  • abstract (not lexical)
  • Superimposed on the ILI
  • Three types of entities
  • First order: concrete entities
  • Second order: static or dynamic situations
  • Third order: abstract propositions

17
WordNet EuroWordNet: Top Concept Ontology of
EuroWordNet
18
WordNet EuroWordNet: Domain Ontology of
EuroWordNet
  • Hierarchy of domain labels
  • Reduction of polysemy
  • Domains
  • Traffic
  • Road traffic, air traffic
  • International information
  • Mycology
  • Medicine

19
WordNet EuroWordNet: EuroWordNet Relations
  • Richer than those of WN
  • Between
  • synsets (monolingual modules)
  • ILI records (multilingual)
  • actuar-1 EQ-SYNONYM behave in a certain
    manner
  • ILI records and the TCO or DO

20
WordNet EuroWordNet: Interlingual Relations
of EuroWordNet
21
WordNet EuroWordNet: EuroWordNet Relations
22
Spanish WordNet: Building Process

23
Spanish WordNet: General Methodology
  • 1) Mapping to WN1.5
  • manual work
  • automatic derivation of equivalents, using
    bi-lingual dictionaries
  • 2) Manual correction
  • 3) Re-structuring

24
Spanish WordNet: Main Steps, First Core (Manual
Translation)
  • Nouns
  • A) WN1.5's Tops file plus the first level of hyponyms
    (about 800 synsets).
  • B) The rest of EWN's Common Base Concepts (those
    not already in our set).
  • C) Manual translation of the synsets intermediate
    between (A) and (B), following the WN1.5 hierarchy
    (thus building a compact taxonomy equivalent to
    WN1.5, without gaps)
  • Verbs
  • Manual translation of EWN's Base Concepts (about
    150 synsets)

25
Spanish WordNet: Main Steps, Subset 1
(Semi-automatic)
  • Nouns
  • Applying automatic methods using bilingual
    dictionaries
  • Manual validation of several subsets to check whether
    each link is correct
  • Deriving a Confidence Score (CS) for every
    automatic method (heuristic)
  • Selecting synset-word pairs with a CS above 85%
  • Some manual correction of this Subset 1 (mainly
    filling gaps)
  • Verbs
  • 3600 English verbs connected to WN1.5 senses and
    ambiguously translated into Spanish are manually
    inspected and disambiguated
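
The noun procedure above can be sketched as follows. The sketch assumes that the Confidence Score of a method is the percentage of its manually validated links that were judged correct, and that every pair produced by a method above the threshold is kept; the method names, synset identifiers and validation samples are invented for the example.

```python
def confidence_score(validated):
    """Percentage of manually validated links judged correct for one method."""
    return 100.0 * sum(validated) / len(validated)

def select_pairs(pairs_by_method, validation_by_method, threshold=85.0):
    """Keep the synset-word pairs proposed by methods whose CS exceeds threshold."""
    selected = set()
    for method, pairs in pairs_by_method.items():
        if confidence_score(validation_by_method[method]) > threshold:
            selected.update(pairs)
    return selected

# Hypothetical methods, pairs and validation samples.
pairs_by_method = {
    "method_a": [("00001-n", "casa"), ("00002-n", "perro")],
    "method_b": [("00003-n", "banco")],
}
validation_by_method = {
    "method_a": [True] * 9 + [False],      # 9 of 10 sampled links correct
    "method_b": [True] * 6 + [False] * 4,  # 6 of 10 sampled links correct
}
selected = select_pairs(pairs_by_method, validation_by_method)
```

Only the pairs proposed by `method_a` survive, since `method_b` falls below the 85 threshold.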

26
Spanish WordNet: Main Steps, Subset 1 (Results 1)
27
Spanish WordNet: Main Steps, Subset 1 (Results 2)
28
Spanish WordNet: Main Steps, Subset 2
  • Main goals
  • enhance the quality of the Subset 1 by manual
    revision
  • extend it by manual building of synsets
  • 4 Sub-tasks

29
Spanish WordNet: Main Steps, Subset 2
  • 1) Manually covering those gaps in the hyponymy
    chains that are covered by other languages
  • 2) Manual cleaning of some automatically generated
    variants.
  • (a) pairs of synsets which are adjacent in the
    hyponymy chain and share at least one variant.
  • deleting redundant variants
  • relocating them to either pre-existing or newly
    created synsets
  • (b) multi-word expressions present in synsets.
  • deleting the non-lexicalized ones

30
Spanish WordNet: Main Steps, Subset 2
  • 3) Manual addition of new vocabulary which has
    been considered relevant.
  • It mainly comes from the Catalan WordNet: since
    we are building both wordnets in parallel, we
    detected those synsets which were built for
    Catalan and not for Spanish
  • 4) Manual addition of cross-part-of-speech
    relations between nominal and verbal synsets.
  • This work has been based mainly on noun-verb
    pairs obtained by means of morphological
    criteria. (Work carried out by UNED, Madrid.)

31
Spanish WordNet: Main Steps, Subset 2 (Results)
32
Spanish WordNet: Main Steps, Subset 2 (Results)
33
Spanish WordNet: Main Steps, Beyond Subset 2
  • Massive manual checking (from Nov. 98)
  • Using WEI
  • Variants automatically generated
  • Filling gaps in the hierarchy
  • New vocabulary
  • New adjectives

34
(No Transcript)
35
Spanish WordNet: Main Steps, Beyond Subset 2
36
Spanish WordNet: Main Steps, Beyond Subset 2
37
Spanish WordNet: Main Steps, Parole Coverage
38
Spanish WordNet: Current Figures
  • Spanish, Catalan, Basque, (English)
  • http://nipadio.lsi.upc.es/wei2.html

39
Combining Multiple Methods for the Automatic
Construction of Multilingual WordNets

40
Combining Multiple Methods ...: Outline
  • Ten class methods
  • Four monosemic criteria
  • Four polysemic criteria
  • Two hybrid criteria
  • Three conceptual distance methods
  • CD1 using pairwise word co-occurrences
  • CD2 using headword and genus
  • CD3 using bilingual Spanish entries with
    multiple translations

41
Combining Multiple Methods ...: Ten class methods
  • Four Classes

42
Combining Multiple Methods ...: Ten class methods
  • Four monosemic criteria

[Figure: link patterns between Spanish words (SW), English words (EW) and synsets]
43
Combining Multiple Methods ...: Ten class methods
  • Four polysemic criteria

[Figure: link patterns between Spanish words (SW), English words (EW) and synsets]
44
Combining Multiple Methods ...: Ten class methods
  • Variant criterion
  • Field criterion

<..., EW, ..., EW, ...>
SW
<..., headword-EW, ..., Ind-EW, ...>
SW
45
Combining Multiple Methods ...: Ten class methods
  • Results

46
Combining Multiple Methods ...: Conceptual
Distance methods
  • Conceptual Distance (Agirre et al. 94)
  • length of the shortest path
  • specificity of the concepts
  • using WordNet
  • Bilingual dictionary
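
One way to combine the two ingredients listed above (path length and concept specificity) is to sum, along the shortest path between two concepts, the inverse of each concept's depth, so that deep (specific) concepts count as closer than shallow ones. The sketch below assumes that formulation; the exact definition in Agirre et al. 94 may differ, and the toy hierarchy and function names are made up for the example.

```python
parent = {  # toy is-a hierarchy (child -> parent), not real WordNet data
    "entity": None,
    "object": "entity",
    "artifact": "object",
    "structure": "artifact",
    "instrument": "artifact",
    "building": "structure",
    "house": "building",
    "church": "building",
}

def ancestors(concept):
    """The concept followed by all its ancestors, nearest first."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = parent[concept]
    return chain

def depth(concept):
    return len(ancestors(concept))  # the root has depth 1

def shortest_path(c1, c2):
    """Path from c1 up to the lowest common ancestor and down to c2."""
    up1, up2 = ancestors(c1), ancestors(c2)
    common = next(c for c in up1 if c in up2)
    return up1[:up1.index(common) + 1] + up2[:up2.index(common)][::-1]

def conceptual_distance(c1, c2):
    """Assumed formulation: sum of 1/depth over the concepts on the path."""
    return sum(1.0 / depth(c) for c in shortest_path(c1, c2))

deep_siblings = conceptual_distance("house", "church")
shallow_siblings = conceptual_distance("structure", "instrument")
```

Both pairs are siblings with a path of three concepts, but the deeper pair gets the smaller distance.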

47
Combining Multiple Methods ...: Conceptual
Distance methods
  • Three conceptual distance methods
  • CD1 using pairwise word co-occurrences
  • CD2 using headword and genus
  • CD3 using bilingual Spanish entries with
    multiple translations

48
Combining Multiple Methods ...: Conceptual
Distance methods (Example CD2)
<entity>
<object, ...>
<artifact, artefact>
<house, lodging>
<religious residence, cloister>
abadía_1_2: Iglesia o monasterio regido por un
abad o abadesa (abbey, a church or a monastery
ruled by an abbot or an abbess)
49
Combining Multiple Methods ...: Conceptual
Distance methods (Example CD2)
<entity>
<object, ...>
<artifact, artefact>
<structure, construction>
<house, lodging>
<building, edifice>
<place of worship, ...>
<religious residence, cloister>
<church, church building>
<abbey> 06 ARTIFACT
abadía_1_2: Iglesia o monasterio regido por un
abad o abadesa (abbey, a church or a monastery
ruled by an abbot or an abbess)
50
Combining Multiple Methods ...: Three CD methods
  • Results

51
Combining Multiple Methods ...: Combining methods
  • Results

52
Combining Multiple Methods ...: Resulting Spanish
WordNets
53
Mapping Conceptual Hierarchies Using Relaxation
Labelling

54
Mapping Conceptual Hierarchies using Relaxation
Labelling: Outline
  • Setting
  • Relaxation Labelling Algorithm
  • Constraints
  • Experiments and Results I (multilingual)
  • Experiments and Results II (monolingual)
  • Further work

55
Mapping Conceptual Hierarchies using Relaxation
Labelling: Setting
56
Mapping Conceptual Hierarchies using Relaxation
Labelling: Setting
[Figure: hierarchy with nodes C1, C2, C3, C4, C5, C6]
57
Mapping Conceptual Hierarchies using Relaxation
Labelling: Setting
  • Connecting already existing hierarchies
  • Relaxation labelling algorithm
  • Constraints
  • Between
  • a Spanish taxonomy automatically derived from an
    MRD (Rigau et al. 98)
  • WordNet
  • using a bilingual MRD

58
Mapping Conceptual Hierarchies using Relaxation
Labelling: Setting
animal
(Tops <animal, animate_being, ...>)
(person <beast, brute, ...>)
(person <dunce, blockhead, ...>)
ave
(animal <bird>)
(artifact <bird, shuttle, ...>)
(food <fowl, poultry, ...>)
(person <dame, doll, ...>)
faisán
(animal <pheasant>)
(food <pheasant>)
rapaz
(animal <bird>)
(artifact <bird, shuttle, ...>)
(food <fowl, poultry, ...>)
(person <dame, doll, ...>)
59
Mapping Conceptual Hierarchies using Relaxation
Labelling: Outline
  • Setting
  • Relaxation Labelling Algorithm
  • Constraints
  • Experiments and Results I (multilingual)
  • Experiments and Results II (monolingual)
  • Further work

60
Mapping Conceptual Hierarchies using Relaxation
Labelling: Relaxation Labelling Algorithm
  • Iterative algorithm for function optimization
    based on local information
  • it can deal with any kind of constraints
  • variables (senses of the taxonomy)
  • labels (synsets)
  • Finds a weight assignment for each possible label
    for each variable
  • weights for the labels of the same variable add
    up to one
  • the weight assignment satisfies, to the maximum
    possible extent, the set of constraints

61
Mapping Conceptual Hierarchies using Relaxation
Labelling: Relaxation Labelling Algorithm
  • 1) Start with a random weight assignment
  • 2) Compute the support value for each label of
    each variable (according to the constraints)
  • 3) Increase the weights of the labels more
    compatible with the context and decrease those
    of the less compatible labels.
  • 4) If a stopping/convergence criterion is satisfied,
    stop; otherwise go to step 2.
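
The four steps can be sketched in a few lines. This is a simplified illustration, not the implementation used in the experiments: it assumes binary constraints with non-negative compatibility values, a support computed as the weighted sum of the compatible context labels, the multiplicative update w(1 + support) followed by normalization, and a uniform (rather than random) start so that the run is reproducible. The toy candidate labels loosely follow the ave / faisán example of the Setting slide.

```python
def relaxation_labelling(labels, constraints, iterations=50, eps=1e-6):
    """labels: {variable: [label, ...]}
    constraints: list of ((var, label), (other_var, other_label), compatibility)
    Returns {variable: {label: weight}}; weights of each variable sum to 1."""
    # Step 1: initial weight assignment (uniform here, for reproducibility).
    w = {v: {l: 1.0 / len(ls) for l in ls} for v, ls in labels.items()}
    for _ in range(iterations):
        # Step 2: support of each label given the current weights of its context.
        support = {v: {l: 0.0 for l in ls} for v, ls in labels.items()}
        for (v, l), (ov, ol), compat in constraints:
            support[v][l] += compat * w[ov][ol]
        # Step 3: reward supported labels, then renormalize per variable.
        new_w = {}
        for v, ls in labels.items():
            raw = {l: w[v][l] * (1.0 + support[v][l]) for l in ls}
            total = sum(raw.values())
            new_w[v] = {l: raw[l] / total for l in ls}
        # Step 4: stop when no weight moved by more than eps.
        change = max(abs(new_w[v][l] - w[v][l]) for v in labels for l in labels[v])
        w = new_w
        if change < eps:
            break
    return w

# Toy problem: Spanish senses (variables) and candidate synsets (labels).
labels = {
    "animal": ["Tops:animal"],
    "ave": ["animal:bird", "food:fowl"],
    "faisán": ["animal:pheasant", "food:pheasant"],
}
constraints = [
    (("ave", "animal:bird"), ("animal", "Tops:animal"), 1.0),
    (("ave", "animal:bird"), ("faisán", "animal:pheasant"), 1.0),
    (("faisán", "animal:pheasant"), ("ave", "animal:bird"), 1.0),
    (("ave", "food:fowl"), ("faisán", "food:pheasant"), 1.0),
    (("faisán", "food:pheasant"), ("ave", "food:fowl"), 1.0),
]
weights = relaxation_labelling(labels, constraints)
best = {v: max(ws, key=ws.get) for v, ws in weights.items()}
```

The only asymmetry in the toy constraints is the extra support that the unambiguous `animal` sense gives to the bird reading of `ave`; that local preference then propagates to `faisán`, which is the "local information, global effects" behaviour the slides describe.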

62
Mapping Conceptual Hierarchies using Relaxation
Labelling: Outline
  • Setting
  • Relaxation Labelling Algorithm
  • Constraints
  • Experiments and Results I (multilingual)
  • Experiments and Results II (monolingual)
  • Further work

63
Mapping Conceptual Hierarchies using Relaxation
Labelling: Constraints
  • Rely on the taxonomy structure
  • Coded with three characters (XYZ)
  • X: Spanish taxonomy, I (immediate) or A (ancestor)
  • Y: English taxonomy, I (immediate) or A (ancestor)
  • Z: relation, E (hypernym), O (hyponym), B (both)
  • Examples
  • Examples

IIE
AAB
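
Reading the codes as (scope in the Spanish taxonomy, scope in the English taxonomy, relation), which is an interpretation of the slide rather than something it states in full, a small decoder makes the examples explicit. The function and dictionary names are hypothetical.

```python
SCOPE = {"I": "immediate", "A": "ancestor"}
RELATION = {"E": "hypernym", "O": "hyponym", "B": "both"}

def decode_constraint(code):
    """Expand a three-character constraint code such as 'IIE' or 'AAB'."""
    if len(code) != 3:
        raise ValueError(f"expected three characters, got {code!r}")
    spanish, english, relation = code
    return {
        "spanish_taxonomy": SCOPE[spanish],
        "english_taxonomy": SCOPE[english],
        "relation": RELATION[relation],
    }

iie = decode_constraint("IIE")
aab = decode_constraint("AAB")
```

Under this reading, IIE looks only at immediate hypernyms on both sides, while AAB considers any ancestor or descendant in both taxonomies.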




64
Mapping Conceptual Hierarchies using Relaxation
Labelling: Hierarchical Constraints
  • II Constraints

NAACL2001
65
Mapping Conceptual Hierarchies using Relaxation
Labelling: Hierarchical Constraints
  • AI Constraints

AIE
AIB
AIO
66
Mapping Conceptual Hierarchies using Relaxation
Labelling: Hierarchical Constraints
  • IA Constraints

IAE
IAB
IAO
67
Mapping Conceptual Hierarchies using Relaxation
Labelling: Hierarchical Constraints
  • AA Constraints

AAE
AAB
AAO
68
Mapping Conceptual Hierarchies using Relaxation
Labelling: Outline
  • Setting
  • Relaxation Labelling Algorithm
  • Constraints
  • Experiments and Results I (multilingual)
  • Experiments and Results II (monolingual)
  • Further work

69
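Combining Multiple Methods ... (RANLP97): Eight
class methods
  • Four monosemic criteria

[Figure: link patterns between Spanish words (SW), English words (EW) and synsets, annotated with Prec. / Cov. values: 92 / 5, 89 / 1, 89 / 2; one criterion carries no values]
70
Combining Multiple Methods ... (RANLP97): Eight
class methods
  • Four polysemic criteria

[Figure: link patterns between Spanish words (SW), English words (EW) and synsets, annotated with Prec. / Cov. values: 80 / 8, 75 / 2, 58 / 17, 61 / 60]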
71
Combining Multiple Methods ... (RANLP97):
Experiments and Results
  Poly             TOK, FOK    TOK, FNOK   total
  animal           279 (90)    30 (91)     209 (90)
  food             166 (94)    3 (100)     169 (94)
  cognition        198 (67)    27 (90)     225 (69)
  communication    533 (77)    40 (97)     573 (78)

  all              TOK, FOK    TOK, FNOK   total
  animal           424 (93)    62 (95)     486 (90)
  food             166 (94)    83 (100)    249 (96)
  cognition        200 (67)    245 (90)    445 (82)
  communication    536 (77)    234 (97)    760 (81)

72
Combining Multiple Methods ... (RANLP97):
Experiments and Results
piel
(substance <skin, fur, peel>)
marta
(substance <sable, marte, coal_back>)
visón
(substance <mink, mink_coat>)
73
Mapping Conceptual Hierarchies using Relaxation
Labelling: Outline
  • Setting
  • Relaxation Labelling Algorithm
  • Constraints
  • Experiments and Results I (multilingual)
  • Experiments and Results II (monolingual)
  • Further work

74
A Complete WN1.5 to WN1.6 Mapping ... (ACL00,
NAACL01): Generalized Constraints
  • All Relationships
  • also-see, similar-to, attribute, antonym, etc.

R
R
75
A Complete WN1.5 to WN1.6 Mapping ... (ACL00,
NAACL01): Generalized Constraints
  • Non-structural constraints
  • W: number of word coincidences
  • G: word coincidences in glosses
  • F: number of frame coincidences (verbs)
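
The W and G measures are plain overlap counts. The sketch below assumes whitespace tokenization, lowercasing and a tiny stopword list for glosses; none of these details come from the slides, and the glosses in the example are invented.

```python
STOPWORDS = frozenset({"a", "an", "the", "of", "or", "to", "in"})

def word_coincidences(words_a, words_b):
    """W: number of words shared by two synsets."""
    return len(set(words_a) & set(words_b))

def gloss_coincidences(gloss_a, gloss_b, stopwords=STOPWORDS):
    """G: number of non-stopword tokens shared by two glosses."""
    tokens_a = {t for t in gloss_a.lower().split() if t not in stopwords}
    tokens_b = {t for t in gloss_b.lower().split() if t not in stopwords}
    return len(tokens_a & tokens_b)

w = word_coincidences(["evangelical", "evangelistic"],
                      ["evangelical", "evangelistic"])
g = gloss_coincidences("a church or a monastery ruled by an abbot",
                       "a monastery ruled by an abbess")
```

Counts like these can then be turned into additional compatibility values next to the structural constraints.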

76
A Complete WN1.5 to WN1.6 Mapping ... (ACL00,
NAACL01): POS mapping dependences
Nouns
Adjectives
Adverbs
Verbs
77
A Complete WN1.5 to WN1.6 Mapping ... (ACL00,
NAACL01): Constraints for Verbs
  • Structural constraints
  • hyper/hyponymy
  • antonymy
  • also-see
  • Non-structural constraints
  • W, G and F

78
A Complete WN1.5 to WN1.6 Mapping ... (ACL00,
NAACL01): Constraints for Adjectives
  • Structural constraints
  • Adj-to-Adj
  • antonymy, similar-to and also-see
  • Adj-to-Verb
  • participle-of
  • Adj-to-Noun
  • pertains and attribute
  • Non-structural constraints
  • W and G

79
A Complete WN1.5 to WN1.6 Mapping ... (ACL00,
NAACL01): Constraints for Adverbs
  • Structural constraints
  • Adv-to-Adv
  • antonymy
  • Adv-to-Adj
  • derived
  • Non-structural constraints
  • W and G

80
A Complete ... (ACL00, NAACL01): Example extra-POS
WN1.6
00843344a evangelical evangelistic
WN1.5
Similar to
02025107a evangelical evangelistic
00842521a enthusiastic
pertainym
02025107a evangelical
04237485n Gospel Gospels evangel
pertainym
04853575n Gospel Gospels evangel
81
A Complete WN1.5 to WN1.6 Mapping ... (ACL00,
NAACL01): Example extra-POS
82
A Complete WN1.5 to WN1.6 Mapping ... (ACL00,
NAACL01): Results
  • Basic constraint set: structural constraints
  • Nouns: AA hyper/hyponym
  • Verbs: AA hyper/hyponym, II also-see
  • Adjectives: II antonymy, similar-to, also-see
  • Adverbs: II antonymy

83
A Complete WN1.5 to WN1.6 Mapping ... (ACL00,
NAACL01): Results
  • Basic constraint set: structural constraints

Precision - recall
84
A Complete WN1.5 to WN1.6 Mapping ... (ACL00,
NAACL01): Results
  • Basic constraint set + W, G and F for verbs

Precision - recall
85
A Complete WN1.5 to WN1.6 Mapping ... (ACL00,
NAACL01): Results
  • Basic + extra-POS relationships

Precision - recall
86
A Complete WN1.5 to WN1.6 Mapping ... (ACL00,
NAACL01): Results
  • Basic + extra-POS relationships + WGF

Precision - recall
87
Mapping Conceptual Hierarchies using Relaxation
Labelling: Conclusions
  • First complete mapping between WordNet versions
  • Combining structural and non-structural
    information
  • Robust approach based on local information, but
    with global effects
  • Incremental POS approach
  • http://www.lsi.upc.es/~nlp
  • 90 downloads (since November 2000)

88
Mapping Conceptual Hierarchies using Relaxation
Labelling: Further Work
  • Mapping other structures
  • WN-EDR, WN-LDOCE, etc.
  • Other language taxonomies to EuroWordNet
  • Spanish EWN to WN1.6
  • A symmetrical philosophy rather than source-target

89
Mikrokosmos

90
Mikrokosmos
Outline
  • Introduction
  • Representational Issues
  • The Lexicon
  • The Ontology
  • Acquisition Process
  • Lexicon Acquisition
  • Guidelines
  • Ontology/Lexicon Trade-off
  • Semantics in Action

91
Mikrokosmos
Introduction
  • Knowledge-Based Machine Translation (KBMT)
  • CRL, NMSU
  • 5,000 concepts
  • Events
  • Objects
  • Properties
  • 7,000 Spanish word senses
  • 40,000 word senses
  • after expansion with productive Lexical Rules
  • comprar -> comprador, comprable, ...
  • Text Meaning Representation

92
Mikrokosmos
Representational Issues The Lexicon
  • Typed Feature Structures (Pollard and Sag 87)
  • language-dependent
  • 10 zones
  • phonology
  • orthography
  • morphology
  • syntactic (subcategorization)
  • semantic (Lexical Semantic Representation)
  • syntax-semantic linking
  • stylistics
  • paradigmatic
  • syntagmatic

93
Mikrokosmos
Representational Issues The Lexicon
  • Adquirir-V1
  • syn subj cat NP
  • obj cat NP
  • sem acquire
  • agent HUMAN
  • theme OBJECT
  • Adquirir-V2
  • syn subj cat NP
  • obj cat NP
  • sem acquire
  • agent HUMAN
  • theme INFORMATION

94
Mikrokosmos
Representational Issues The Ontology
  • Taxonomic multi-hierarchical
  • 14 local or inherited links on average
  • language-impartial
  • EVENTS, OBJECTS, PROPERTIES
  • Methodology Guidelines

95
Mikrokosmos
Representational Issues The Ontology
  • ACQUIRE
  • DEFINITION: the transfer-of-possession event in which
    the agent transfers an object to its possession
  • IS-A: TRANSFER-POSSESSION
  • SOURCE: HUMAN PLACE
  • THEME: OBJECT (NOT HUMAN)
  • AGENT: ANIMAL (DEFAULT HUMAN)
  • DESTINATION: ANIMAL PLACE (DEFAULT HUMAN)
  • INHERITED
  • BENEFICIARY: HUMAN

96
Mikrokosmos
Acquisition Process The Lexicon
  • Multi-lingual
  • French, English, Japanese, Russian, Spanish, etc.
  • Multi-media
  • Multi-process
  • Analysis
  • Generation (mono and multilingual)
  • MT
  • Summarization
  • IE
  • Speech Processing
  • Tools
  • corpus-search, lookup dictionary, ontology
    browser

97
Mikrokosmos
Acquisition Process The Ontology
  • Guidelines
  • 1) Do not add instances as concepts
  • Instances do not have their own instances
  • Concepts do not have fixed position in
    space/time
  • 2) Do not decompose concepts further
  • 3) Use close concepts
  • 4) Do not add EVENTs with particular arguments
  • 5) Do not add concepts with instance-specific
    aspects,
  • temporal relations
  • 6) Do not add language-specific concepts
  • 7) Do not add ontological concepts for collections

98
Mikrokosmos
Acquisition Process Ontology/Lexicon Trade-off
  • Daily negotiations
  • lexicon acquirers
  • ontology acquirers
  • Possibilities
  • one-to-one mapping
  • lexicon underspecification
  • lexicon-ontology balance

99
Mikrokosmos
Acquisition Process Ontology/Lexicon Trade-off
  • One-to-one mapping
  • Problems
  • Lexical: every word in a language is a concept
  • Conceptual: cuire in French is not ambiguous

PREPARE-FOOD INST COOKING-EQUIPMENT
COOK INST STOVE
BAKE INST OVEN
cook: cuire sur le feu
bake: cuire au four
100
Mikrokosmos
Acquisition Process Ontology/Lexicon Trade-off
  • Lexicon Underspecification
  • Problems
  • BAKE is not in the ontology

PREPARE-FOOD INST COOKING-EQUIPMENT
bake: cuire au four INST OVEN
cook: cuire sur le feu
101
Mikrokosmos
Acquisition Process Ontology/Lexicon Trade-off
  • Lexicon-Ontology Balance

PREPARE-FOOD INST COOKING-EQUIPMENT
BAKE INST OVEN
FRY INST STOVE INST FRYING-PAN
cook cuire
bake
102
Mikrokosmos
Semantics in Action
  • El grupo Roche, a través de su compañía en
    España, adquirió Doctor Andreu.
  • El grupo Roche adquirió Doctor Andreu a través
    de su compañía en España.
  • La adquisición de Doctor Andreu por el grupo
    Roche fue hecha a través de su compañía en
    España.
  • ACQUIRE-1 Agent ORGANIZATION-1
  • Theme ORGANIZATION-2
  • Instrument ORGANIZATION-3
  • ORGANIZATION-1 Object-Name Grupo Roche
  • ORGANIZATION-2 Object-Name Doctor Andreu
  • ORGANIZATION-3 Location España

103
Mikrokosmos
Semantics in Action
  • Onto-Search: ontological search mechanism to
    check constraints
  • check-onto(ACQUIRE, EVENT) = 1
  • since ACQUIRE is a type of EVENT
  • check-onto(ORGANIZATION, HUMAN) = 0.9
  • since ORGANIZATION HAS-MEMBER HUMAN
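
A toy version of this kind of constraint check, reproducing only the two scores quoted above. The `IS_A` and `HAS_MEMBER` tables are a made-up fragment (the direct TRANSFER-POSSESSION to EVENT link is an assumption), the 0.0 fallback is an assumption, and the real Onto-Search mechanism searches over many more relation types.

```python
# Hypothetical ontology fragment (child -> parent); intermediate links assumed.
IS_A = {"ACQUIRE": "TRANSFER-POSSESSION", "TRANSFER-POSSESSION": "EVENT"}
HAS_MEMBER = {"ORGANIZATION": "HUMAN"}

def is_a(concept, target):
    """True if `target` is `concept` itself or one of its IS-A ancestors."""
    while concept is not None:
        if concept == target:
            return True
        concept = IS_A.get(concept)
    return False

def check_onto(concept, constraint):
    """Score how well `concept` satisfies a selectional constraint."""
    if is_a(concept, constraint):
        return 1.0   # taxonomic match
    member = HAS_MEMBER.get(concept)
    if member is not None and is_a(member, constraint):
        return 0.9   # match through HAS-MEMBER (score taken from the slide)
    return 0.0       # assumed fallback when no link is found

score_event = check_onto("ACQUIRE", "EVENT")
score_human = check_onto("ORGANIZATION", "HUMAN")
```

A graded score of this kind lets the analyzer prefer, rather than strictly require, fillers that match a slot's constraint.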

104
Mikrokosmos
Semantics in Action
  • 1) a-través-de: INSTRUMENT, LOCATION
  • adquirir requires PHYSICAL-OBJECT
  • 2) en: LOCATION, TEMPORAL
  • España is not a TEMPORAL-OBJECT
  • 3) adquirir: ACQUIRE, LEARN
  • Doctor Andreu is not INFORMATION
  • 4) Doctor Andreu: ORGANIZATION, HUMAN
  • the Theme of ACQUIRE is not HUMAN
  • 5) compañía: CORPORATION, SOCIAL-EVENT
  • ORGANIZATIONs typically fill the INSTRUMENT slot
    of ACQUIRE acts

105
Mikrokosmos
Experiment WSD
  Text               1      2      3      4      Mean
  words              347    385    370    353    364
  words/sentence     16.5   24.0   26.4   20.8   21.4
  open-class words   183    167    177    177    176
  ambiguous words    57     42     57     35     48
  syntax             21     19     20     12     18
  correct            51     41     45     34     43
  %                  97     99     93     99     97

106
Mikrokosmos
Experiment WSD
  Text               Mean   Mean Unseen
  words              364    390
  words/sentence     21.4   26
  open-class words   176    104
  ambiguous words    48     26
  syntax             18     9
  correct            43     23
  %                  97     97

107
WordNet2

108
WordNet2
Outline
  • Introduction
  • Text Inferences
  • Defining Features
  • Plausible inferences
  • Inference Rules
  • Semantic Paths
  • What WordNet cannot do

109
WordNet2
Introduction
  • (Harabagiu 98)
  • Commonsense reasoning requires extensive knowledge
  • 100 million concepts and relations
  • WordNet
  • represents almost all English words
  • 100,000 synsets
  • linked by semantic relations
  • WordNet2
  • each synset has a gloss that, when disambiguated
    may increase the number of relations
  • WordNet glosses into semantic networks
  • NEW RELATIONS

110
WordNet2
Text Inferences
  • German was hungry
  • He opened the refrigerator
  • hungry (feeling a need or desire to eat)
  • eat (take in solid food)
  • refrigerator (an appliance in which foods can be
    stored at low temperature)

111
WordNet2
Defining Features
  • Transform each concept's gloss into a graph
    where concepts are nodes and lexical relations
    are links
  • <culture> (all the knowledge shared by society)
  • <share> --AGENT--> <society>
  • <doctor> (licensed medical practitioner)
  • <medical practitioner> --ATTRIBUTE--> <licensed>

112
WordNet2
Defining Features
[Figure: defining-feature graph. Nodes: ship, guide, pilot, person, water, difficult, qualified. Link labels: OBJECT, PURPOSE, LOCATION, GLOSS, ATTRIBUTE (twice)]
113
WordNet2
Inference Rules
  • Rule 1: VC1 IS-A VC2, VC2 IS-A VC3 => VC1 IS-A VC3
  • Rule 2: VC1 IS-A VC2, VC2 ENTAIL VC3 => VC1 ENTAIL VC3
  • Rule 3: VC1 IS-A VC2, VC2 R_IS-A VC3 => VC1 PLAUSIBLE (not VC3)
  • Rule 4: VC1 IS-A VC2, VC2 R_ENTAIL VC3 => VC1 EXPLAINS VC3
  • 16 + 1 rules
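
Rules of this shape compose two relations into a third, so they fit naturally in a lookup table. The sketch below encodes only the four rules shown on this slide (reading the second "Rule 2" as Rule 4) and applies them once over a set of facts; the `PLAUSIBLE-NOT` label, the `infer` function and the toy verb facts are assumptions for illustration.

```python
# (relation VC1-VC2, relation VC2-VC3) -> inferred relation VC1-VC3
RULES = {
    ("IS-A", "IS-A"): "IS-A",               # Rule 1
    ("IS-A", "ENTAIL"): "ENTAIL",           # Rule 2
    ("IS-A", "R_IS-A"): "PLAUSIBLE-NOT",    # Rule 3: plausible (not VC3)
    ("IS-A", "R_ENTAIL"): "EXPLAINS",       # Rule 4
}

def infer(facts):
    """facts: set of (concept, relation, concept). One round of rule application."""
    derived = set()
    for (a, r1, b) in facts:
        for (b2, r2, c) in facts:
            if b == b2 and (r1, r2) in RULES:
                derived.add((a, RULES[(r1, r2)], c))
    return derived - facts

# Toy verb facts (illustrative only).
facts = {
    ("limp", "IS-A", "walk"),
    ("walk", "IS-A", "travel"),
    ("walk", "ENTAIL", "step"),
}
new_facts = infer(facts)
```

Repeating `infer` on the union of old and new facts until nothing changes would give the closure under these rules.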

114
WordNet2
Semantic Paths
  • 0) Create and load the KB
  • 1) Place markers on KB concepts
  • 2) Propagate markers
  • The algorithm avoids cycles
  • 3) Detect collisions
  • Each marker collision corresponds to a path
  • 4) Extract Inferences
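
Steps 1 to 3 can be sketched with a breadth-first propagation from two source concepts. This is a simplified illustration under stated assumptions: the knowledge base is a plain adjacency dictionary with unlabelled links, the toy entries are invented from the hungry / refrigerator example, and the real system also uses the relation types and inference rules when extracting inferences.

```python
from collections import deque

def propagate(kb, start):
    """Breadth-first marker propagation; returns {concept: path from start}."""
    paths = {start: [start]}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in kb.get(node, []):
            if neighbour not in paths:  # already marked: skipping avoids cycles
                paths[neighbour] = paths[node] + [neighbour]
                queue.append(neighbour)
    return paths

def collisions(kb, source_a, source_b):
    """Concepts reached by both markers, each with the path joining the sources."""
    a, b = propagate(kb, source_a), propagate(kb, source_b)
    return {c: a[c] + b[c][::-1][1:] for c in a if c in b}

# Toy knowledge base built from gloss words (hypothetical links).
kb = {
    "hungry": ["eat"],
    "eat": ["food"],
    "refrigerator": ["appliance", "food"],
    "appliance": ["refrigerator"],  # deliberate cycle
}
paths = collisions(kb, "hungry", "refrigerator")
```

The single collision, on `food`, yields the path that links the two sentences of the example.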

115
WordNet2
Semantic Paths
  • Inference sequence
  • German was hungry
  • German felt a desire to eat
  • German felt a desire to take in food
  • COLLISION: German (= he) felt a desire to take in
    food, stored in an appliance, which he opened
  • He opened an appliance where food is stored
  • He opened the refrigerator

116
WordNet2
What WordNet cannot do
  • Major WordNet limitations
  • 1) The lack of compound concepts
  • 2) The small number of causation and
    entailment relations
  • 3) the lack of preconditions for verbs
  • 4) the absence of case relations

117
ThoughtTreasure

118
ThoughtTreasure
Overview
  • a comprehensive platform for
  • NLP: English, French
  • commonsense reasoning
  • A hotel room has a bed, night table, ...
  • People have fingernails
  • soda is a drink
  • one hangs up at the end of a phone call
  • the sky is blue
  • dogs bark
  • someone who is 16 years old is a teenager

119
ThoughtTreasure
Overview
  • 25,000 concepts organized into a hierarchy
  • EVIAN -> FLAT-WATER -> DRINKING-WATER
  • 55,000 words (English, French)
  • food <-> aliment <-> FOOD
  • 50,000 assertions about concepts
  • green-pea is green
  • 100 scripts

120
ThoughtTreasure
Overview
  • Text Agents for recognizing names, phones, etc
  • mechanisms for learning new words
  • X-phile is someone who likes X
  • a syntactic parser
  • a NL generator
  • a semantic parser
  • an anaphoric parser
  • planning agents for achieving goals
  • understanding agents

121
ThoughtTreasure
Example
  • Who created Bugs Bunny?
  • 1.0 (create human-interrogative-pronoun
    Bugs-Bunny)
  • 0.9 (create rock-group-the-Who Bugs-Bunny)
  • 1.0 (create Tex-Avery Bugs-Bunny)
  • 0.1 (not (create rock-group-the-Who Bugs-Bunny))

122
Meaning

123
Meaning
Overview
  • Knowledge bases
  • Automatic enrichment of EWN (verbal models,
    etc.)
  • Mixed approach (KB + ML)
  • Q/A
  • Problem
  • structural and lexical ambiguity
  • Approach
  • automatically locating examples of senses
    (Leacock et al. 98, Mihalcea and Moldovan 99)
  • Large-scale WSD (Boosting, SVM, transductive
    methods)
  • Knowledge acquisition (Ribas 95, McCarthy 01)

124
Meaning: Exploiting EWN Semantic Relations
125
Meaning: Exploiting EWN Semantic Relations
  • partido 1 (political party)
  • Todos los partidos piden reformas legales para TV3.
  • La derecha planea agruparse en un partido.
  • El diputado reiteró que ni él ni UDC, como partido,
    han recibido dinero de Pellerols.
  • partido 2 (match)
  • Pero España puso al partido intensidad, ritmo y coraje.
  • El seleccionador cree que el partido de hoy contra
    Italia dará la medida de España.
  • El Racing no gana en su campo desde hace seis partidos.
126
Meaning: Exploiting EWN Semantic Relations
  • partido 1 (political party)
  • No negociaremos nunca con un partido político que sea
    partidario de la independencia de Taiwan.
  • Una vez más es noticia la desviación de fondos
    destinados a la formación ocupacional hacia la
    financiación de un partido político.
  • Estas leyes fueron votadas gracias a un consenso
    general de los partidos políticos.
  • partido 2 (match)
  • Rivera pide el soporte de la afición para encarrilar
    las semifinales.
  • Sólo el equipo de Valero Ribera puede sentenciar una
    semifinal como lo hizo ayer en un Palau Blaugrana
    completamente entregado.
  • El Racing ganó los cuartos de final en su campo.
127
Meaning
Architecture
[Figure: architecture diagram. Components: Italian, English, Basque, Spanish and Catalan Web Corpus; WSD; Italian, English, Basque, Spanish and Catalan EWN; ACQ; UPLOAD; Multilingual Central Repository; PORT]