Latin WordNet project - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Latin WordNet project
  • Stefano Minozzi
  • Laboratorio di Informatica Umanistica, Università
    degli Studi di Verona

2
Latin WordNet project
  • Laboratorio di Informatica Umanistica, Università
    degli Studi di Verona
  • http://www.cyllenius.net/labium/
  • The Cognitive and Communication Technologies
    (TCC) division, Fondazione Bruno Kessler,
    Trento
  • http://cit.fbk.eu/en/research

3
Historical credits: the Latin WordNet project owes to
  • Princeton WordNet: a lexical database for the
    English language, created and maintained at the
    Cognitive Science Laboratory of Princeton
    University under the direction of psychology
    professor George A. Miller. Development began
    in 1985.
  • MultiWordNet: a multilingual lexical database in
    which the Italian WordNet is strictly aligned
    with Princeton WordNet v. 1.6. Developed since
    1994 at Istituto Trentino di Cultura, now
    Fondazione Bruno Kessler.

4
MultiWordNet: multilingual lexical matrix
[Figure: lexical matrix with the dimensions language, meaning and lemma]
5
Latin WordNet represents
  • Semantic parts of speech: nouns, verbs,
    adjectives, adverbs
  • Lexical relations that connect words
  • Meanings are treated as constant across
    languages, while the lexicalization of a
    meaning is a language-specific variable

6
Structure of the database
7
The synset (group of synonyms) is the building
block of WordNet
v00682542: express an idea, etc. in words: "He
said that he wanted to marry her"; "tell me
what is bothering you"; "state your opinion"
Latin
synset     lemma
v00682542  adnuntio
v00682542  dico
v00682542  effor
v00682542  enuntio
v00682542  for
v00682542  inquam
v00682542  inseco
v00682542  loquor
v00682542  narro
English
synset     word
v00682542  state
v00682542  say
v00682542  tell
Italian
synset     word
v00682542  dire
v00682542  enunciare
v00682542  enunziare
v00682542  raccontare
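The idea that one synset identifier is shared by the lemmas of several languages can be sketched as a small data structure. This is only an illustration: the class name `Synset`, its field names and the language keys are assumptions, not the actual MultiWordNet schema. The identifier, gloss and lemmas are the ones listed on this slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Synset:
    """One meaning, identified by a language-independent ID (hypothetical layout)."""
    synset_id: str
    gloss: str
    # language name -> lemmas that lexicalize this meaning in that language
    lemmas: Dict[str, List[str]] = field(default_factory=dict)

    def lemmas_in(self, language: str) -> List[str]:
        """Return the lemmas of one language, or an empty list if none are recorded."""
        return self.lemmas.get(language, [])


# Data copied from the slide; only the container is invented.
say = Synset(
    synset_id="v00682542",
    gloss="express an idea, etc. in words",
    lemmas={
        "latin": ["adnuntio", "dico", "effor", "enuntio", "for",
                  "inquam", "inseco", "loquor", "narro"],
        "english": ["state", "say", "tell"],
        "italian": ["dire", "enunciare", "enunziare", "raccontare"],
    },
)

print(say.synset_id, say.lemmas_in("latin"))
```

Because the meaning is the fixed key and the lemmas are the per-language variable, looking up the Italian or English equivalents of a Latin lemma reduces to reading another entry of the same synset.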
8
The synsets are linked with relations
9
Relations for adjectives and adverbs
10
  • Moreover, the synsets are connected with semantic
    field labels in order to create domain-related
    dictionaries

11
Building the semantic network
12
  • Building a semantic network from scratch is very
    time-consuming
  • The available resources permit a different
    approach
  • Automatic assignment of synsets
  • Manual correction of the results

13
Building blocks
  • Latin-to-Italian machine-readable dictionary
    (MRD), mostly from G. B. Conte and
    E. Pianezzola
  • Latin-to-English MRD (mostly from OLD, via
    William Whitaker's Words)
  • Italian and English branches of MultiWordNet

14
We developed a number of assignment strategies
  • Multilingual intersection method: exploits the
    multilingual nature of MultiWordNet
  • Generic probability: for very specialized words,
    where polysemy is really limited
  • Gloss correspondence: exploits the glosses
    present in the MRD
  • Intersection of synsets: assigns a lemma to a
    synset when a number of its translation
    equivalents point to the same synset

15
Intersection method
amor, is
English translations: love, affection; the beloved; Cupid; affair;
desire, passion; sexual passion; illicit passion
Italian translations: amore; persona amata, amore; questioni amorose,
amorazzi; storie d'amore; amore, desiderio; Amore; gli Amori, gli Amorini
[Diagram: intersection of the synsets from Italian and the synsets from English]
amor, is
n04478900
n05567241
n05607724
n05608483
n07109169
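A minimal sketch of the multilingual intersection idea, assuming it works as the slide suggests: collect the synsets reachable through the Italian translations, collect those reachable through the English translations, and keep only the synsets found on both sides. The function names and the two toy indexes are hypothetical, and the IDs `S1` to `S6` are placeholders, not real MultiWordNet synsets.

```python
from typing import Dict, Iterable, Set


def synsets_for(words: Iterable[str], index: Dict[str, Set[str]]) -> Set[str]:
    """Union of the synsets of all given words in one language."""
    found: Set[str] = set()
    for word in words:
        found |= index.get(word, set())
    return found


def intersection_method(italian_words, english_words, italian_index, english_index):
    """Keep only the synsets reachable from both the Italian and the English side."""
    return (synsets_for(italian_words, italian_index)
            & synsets_for(english_words, english_index))


# Hypothetical toy indexes (word -> synset IDs); the IDs are placeholders.
italian_index = {"amore": {"S1", "S2", "S3"}, "desiderio": {"S4"}}
english_index = {"love": {"S1", "S2", "S5"}, "passion": {"S4", "S6"}}

candidates = intersection_method(
    ["amore", "desiderio"], ["love", "passion"], italian_index, english_index
)
print(sorted(candidates))
```

Synsets that only one language proposes (`S3`, `S5`, `S6` in the toy data) are discarded, which is what makes the second language useful as a filter.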
16
Generic probability
abactor, oris → rustler, cattle_thief;
one_who_drives_off
SYNSET: n07541894
17
Gloss correspondence
punctum, i → point, dot; point, spot; small_hole,
pin_prick; sting, small_puncture (of_insect);
vote, tick; tiny_amount; full-stop, period
(punctuation)
PERIOD
n05126526
n09715092
n10843624
n10868422
n10869183
n10954173
n10961157
n10982844
n10988653
n05126526: period, point, full_stop, stop,
full_point: a punctuation mark (.) placed at the
end of a declarative sentence to indicate a full
stop or after abbreviations
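One way to picture gloss correspondence is as word overlap between the hint given in the MRD entry, here "(punctuation)", and the glosses of the candidate synsets. The sketch below assumes that reading; the scoring by shared words and the function names are illustrative choices, not the project's documented procedure. The gloss of n05126526 is taken from the slide, while the second candidate and its gloss are invented purely for contrast.

```python
import re
from typing import Dict, Optional, Set


def tokens(text: str) -> Set[str]:
    """Lower-cased alphabetic words of a gloss."""
    return set(re.findall(r"[a-z]+", text.lower()))


def best_gloss_match(mrd_hint: str, candidate_glosses: Dict[str, str]) -> Optional[str]:
    """Return the candidate whose gloss shares the most words with the MRD hint."""
    hint = tokens(mrd_hint)
    scores = {sid: len(hint & tokens(gloss)) for sid, gloss in candidate_glosses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


candidates = {
    # Gloss as shown on the slide.
    "n05126526": ("a punctuation mark (.) placed at the end of a declarative "
                  "sentence to indicate a full stop or after abbreviations"),
    # Invented candidate, for illustration only.
    "hypothetical-1": "an amount of time",
}

chosen = best_gloss_match("punctuation", candidates)
print(chosen)
```

When no gloss shares a word with the hint, the function returns `None`, leaving the decision to another strategy or to manual correction.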
18
Intersection of synsets
punctum, i → point, dot; point, spot; small_hole,
pin_prick; sting, small_puncture (of_insect);
vote, tick; tiny_amount; full-stop, period
(punctuation)
POINT (24 synsets): n02582551 n03150523 n03150944
n03151033 n03719894 n03720036 n03958380 n04481751
n04514257 n04589546 n04867079 n04955967 n05110203
n05126526 n06351684 n06745866 n09780630 n09869507
n09933792 n09962048 n10018378 n10025218 n10044643
n10898122
DOT (2 synsets): n05096549 n10025218
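The POINT and DOT lists on this slide can be used to sketch the intersection of synsets: each translation equivalent "votes" for its synsets, and a synset is kept when enough equivalents agree. The threshold `min_votes=2` and the function name are assumptions; the slide only says "a number of" equivalents. The synset IDs are those listed above.

```python
from collections import Counter
from typing import Dict, Set

# Synsets of the English equivalents of "punctum", as listed on the slide.
POINT = {
    "n02582551", "n03150523", "n03150944", "n03151033", "n03719894", "n03720036",
    "n03958380", "n04481751", "n04514257", "n04589546", "n04867079", "n04955967",
    "n05110203", "n05126526", "n06351684", "n06745866", "n09780630", "n09869507",
    "n09933792", "n09962048", "n10018378", "n10025218", "n10044643", "n10898122",
}
DOT = {"n05096549", "n10025218"}


def shared_synsets(synsets_by_equivalent: Dict[str, Set[str]], min_votes: int = 2) -> Set[str]:
    """Synsets proposed by at least `min_votes` translation equivalents."""
    votes = Counter()
    for synsets in synsets_by_equivalent.values():
        votes.update(synsets)  # each equivalent contributes one vote per synset
    return {sid for sid, count in votes.items() if count >= min_votes}


assigned = shared_synsets({"point": POINT, "dot": DOT})
print(assigned)
```

With these two equivalents the only synset both lists share is n10025218, so 25 candidate senses collapse to a single assignment.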
19
Lexical Gaps
LEXICAL UNIT → FREE COMBINATION
abactor, is → gap Latin-to-Italian: ladro di
bestiame (cattle thief)
20
Size of the database
Latin        Noun   Verb    Adj   Adv  TOTAL
SYNSETS      5621   2283    775   294   8973
LEMMAS       4777   2609   1259   479   9124
WORD SENSES 13060  10062   2054   732  25908
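The counts in the table above can be checked and summarized with a few lines of code. The sketch recomputes the TOTAL column from the per-part-of-speech figures and derives the average number of word senses per lemma, a rough polysemy measure that is not stated on the slide and is computed here only as an illustration. The dictionary layout and key names are assumptions.

```python
# Figures copied from the table; keys are illustrative names.
COUNTS = {
    "synsets":     {"noun": 5621,  "verb": 2283,  "adj": 775,  "adv": 294},
    "lemmas":      {"noun": 4777,  "verb": 2609,  "adj": 1259, "adv": 479},
    "word_senses": {"noun": 13060, "verb": 10062, "adj": 2054, "adv": 732},
}

# Recompute the TOTAL column from the four parts of speech.
totals = {row: sum(by_pos.values()) for row, by_pos in COUNTS.items()}

# Derived figure: average number of senses per lemma, overall and per part of speech.
senses_per_lemma = totals["word_senses"] / totals["lemmas"]
senses_per_lemma_by_pos = {
    pos: COUNTS["word_senses"][pos] / COUNTS["lemmas"][pos]
    for pos in COUNTS["lemmas"]
}

print(totals)
print(round(senses_per_lemma, 2))
```

The recomputed totals match the TOTAL column of the table, so the per-part-of-speech figures are internally consistent.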
21
  • Latin WordNet can be browsed online
  • http://multiwordnet.itc.it/english/home.php
  • The Latin WordNet database will soon be
    available from the European Language Resources
    Association (ELRA)
  • http://www.elra.info/