The Collaborative Open Dictionary Development - PowerPoint PPT Presentation

Provided by: SOM19

1
The Collaborative Open Dictionary Development
  • Thatsanee Ch.
  • thatsanee_at_tcllab.org

2
The Collaborative Open Dictionary Development
  • TCL's Computational Lexicon
  • Asian WordNet
  • Asian Language Resources

3
Classical Semantic Computational Lexicons
  • Representing the meaning of a word (minimally)
    requires:
  • Distinguishing the senses of a word
  • We walked along the bank of the Chao Phra Ya river.
  • He has an account at this bank.
  • Indicating inferences
  • Being human > Being animate

4
Computational Lexicon
  • Explicit representation of word meaning
  • Word content accessible to computational agents
  • Word meaning linked to word syntax and morphology
  • Multilingual lexical links
  • Language resources for NLP systems
  • syntactic subcategorization frames for parsing
  • semantic selectional preferences for ambiguity
    reduction
  • semantic classes for WSD, semantic tagging, etc.

5
Computational Lexicons
  • Terminological based
  • EDR
  • Network based
  • WordNet, Cyc
  • FrameNet
  • Constraint based
  • UW
  • TCLLEX

6
EDR EJ Dictionary
Terminological based
<Record Number> EJB1054678
<Headword Information>
  <Headword> belabor
<Grammar Information>
  <Part of Speech> Verb
<Semantic Information>
  <Concept Identifier> 3cecd7
  <Headconcept>
    <English Headconcept> blister
    <Japanese Headconcept> ?????????
  <Concept Explication>
    <English Concept Explication> to attack with sharp words
    <Japanese Concept Explication> ???????
<Correspondence Information>
  <Correspondence Word Information>
    <Correspondence Word Category> 0
    <Correspondence Word Notation> (???)????????
    <Correspondence Word Category> 0
    <Correspondence Word Notation> (????)?????
    <Correspondence Word Category> 0
    <Correspondence Word Notation> ????
<Management Information>
  <Management History Record> DATE="95/3/10"
7
Wordnet (Princeton)
Network based Terminological based
  • A large lexical-semantic resource, organised as a
    semantic network.
  • To create a lexical thesaurus (not a dictionary)
    which models the lexical organization used by
    humans.
  • Words are arranged in clusters of synsets to help
    identify the meaning and differentiate it from
    other meanings.
  • The overall organizing principle of wordnet is in
    terms of semantic relations.
  • Where no synonyms are available to distinguish
    concepts, glosses are used.

8
WordNet
  • About 150,000 lexical items
  • http://wordnet.princeton.edu

9
Semantic relations in Wordnet
Network based Terminological based
  • Synonymy: (dog, canine)
  • Antonymy: (rich, poor)
  • Hyponymy: (maple, tree) = ISA relation
  • Meronymy: (body, limb) = HASA relation (part of)
  • Entailment: (snore, sleep) for verbs
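
A minimal sketch of how these relations could be stored as typed links and queried. The five relation instances come from the slide; the storage layout, the function names, and the extra link (tree, plant) are assumptions made for illustration and do not reflect how WordNet itself is implemented.

```python
from collections import defaultdict

# Relation instances copied from the slide.
RELATIONS = [
    ("synonymy", "dog", "canine"),
    ("antonymy", "rich", "poor"),
    ("hyponymy", "maple", "tree"),    # maple ISA tree
    ("meronymy", "body", "limb"),     # body HASA limb
    ("entailment", "snore", "sleep"),
]

# Hypothetical extra link, added only to show that ISA links chain.
EXTRA = [("hyponymy", "tree", "plant")]

SYMMETRIC = {"synonymy", "antonymy"}


def build_index(triples):
    """Index the links as relation -> source word -> set of target words."""
    index = defaultdict(lambda: defaultdict(set))
    for relation, source, target in triples:
        index[relation][source].add(target)
        if relation in SYMMETRIC:
            index[relation][target].add(source)
    return index


def isa_ancestors(index, word):
    """Collect every word reachable by following hyponymy (ISA) links upward."""
    seen = set()
    stack = [word]
    while stack:
        current = stack.pop()
        for parent in index["hyponymy"].get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


INDEX = build_index(RELATIONS + EXTRA)
```

Under these assumptions, `isa_ancestors(INDEX, "maple")` follows the ISA chain through `tree` to `plant`, while the symmetric relations can be looked up from either word.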

10
Wordnet
Network based Terminological based
11
CYC (Cycorp, Inc.)
Network based
  • Cyc KB (over 120,000 concepts and a million
    assertions)
  • Ontology and English lexicon
  • A concept is defined as a constant, which can
    represent:
  • A collection (e.g. the set of all people)
  • An individual object (e.g. a particular person)
  • A word (e.g. the English word)
  • A relation (e.g. a predicate, function, slot,
    attribute)
  • The entry for the predicate mother:
  • mother
  • (mother ANIM FEM)
  • isa: FamilyRelationSlot, BinaryPredicate
  • The predicate mother takes 2 arguments:
  • the 1st must be an element of the collection
    Animal,
  • the 2nd must be an element of the collection
    FemaleAnimal.
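
The argument constraints on mother can be illustrated with a small check. This is a sketch only: the individuals (Rex, Bella, Daisy), the dictionary layout, and the function `is_well_formed` are hypothetical and are not part of the Cyc API.

```python
# Toy collections; the members are hypothetical.
COLLECTIONS = {
    "Animal": {"Rex", "Bella", "Daisy"},
    "FemaleAnimal": {"Bella", "Daisy"},
}

# The entry for the predicate, following the slide:
# two arguments, the 1st from Animal, the 2nd from FemaleAnimal.
PREDICATES = {
    "mother": {
        "isa": ("FamilyRelationSlot", "BinaryPredicate"),
        "arg_types": ("Animal", "FemaleAnimal"),
    },
}


def is_well_formed(predicate, *args):
    """Check the arity and the collection membership of each argument."""
    arg_types = PREDICATES[predicate]["arg_types"]
    if len(args) != len(arg_types):
        return False
    return all(
        arg in COLLECTIONS[collection]
        for arg, collection in zip(args, arg_types)
    )
```

With this toy data, `is_well_formed("mother", "Rex", "Bella")` holds, while swapping the arguments fails because Rex is not in the hypothetical FemaleAnimal collection.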

12
FrameNet (Berkeley)
Frame based
  • To create a computational lexicon which describes
    the semantic frames and valencies of verbs,
    nouns, and adjectives.

13
Thematic Roles (Fillmore 1968)
  • Thematic roles (TRs) describe the conceptual
    participants in a situation in a generic way,
    independent of their grammatical realization.
  • Agent, Patient, Object, Recipient, Instrument,
    Source, Goal, Beneficiary, Experiencer, ...

14
Thematic Roles example annotated
  • The window broke.
  • A rock broke the window.
  • John broke the window with a rock.
  • [The window]pat broke
  • [A rock]inst broke [the window]pat
  • [John]ag broke [the window]pat with [a rock]inst
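
The annotated examples can be written down as data to make the point explicit: the window keeps the same role whether it appears as subject or object. The list-of-pairs encoding and the helper names below are assumptions for illustration, not a standard annotation format.

```python
# Each sentence is a list of (phrase, role) pairs; role is None for
# words that carry no thematic role in the slide's annotation.
ANNOTATED = [
    [("The window", "pat"), ("broke", None)],
    [("A rock", "inst"), ("broke", None), ("the window", "pat")],
    [("John", "ag"), ("broke", None), ("the window", "pat"),
     ("with", None), ("a rock", "inst")],
]


def roles(sentence):
    """Map each role label to the lower-cased phrase that fills it."""
    return {role: phrase.lower() for phrase, role in sentence if role is not None}


def subject_role(sentence):
    """Role of the first phrase, which is the subject in these three examples."""
    return sentence[0][1]
```

Here the subject carries a different role in each sentence (`pat`, `inst`, `ag`), yet `roles(s)["pat"]` is "the window" in all three.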

15
The Berkeley FrameNet (1996)
  • Frames: an inventory of conceptual structures,
    each modelling a prototypical situation such as
    COMMERCIAL_TRANSACTION, COMMUNICATION_REQUEST,
    SELF_MOTION
  • Semantic roles, called Frame Elements (FEs), are
    valid only locally, within their frame

16
The Berkeley FrameNet (1996)
  • FEs of the COMMUNICATION_REQUEST frame:
  • SPEAKER, ADDRESSEE, MESSAGE, ...
  • FEs of the COMMERCIAL_TRANSACTION frame:
  • BUYER, SELLER, GOODS, PRICE, ...

17
The Berkeley FrameNet (1996)
  • A set of target words is associated with each
    frame; for COMMERCIAL_TRANSACTION:
  • Buy, sell, pay, spend, cost, charge
  • Price, change, debt, credit, merchant, broker,
    shop
  • Tip, fee, ...

18
An example
  • Airbus sells five A380 superjumbo planes to China
    Southern for 220 million Euro.
  • China Southern buys five A380 superjumbo planes
    from Airbus for 220 million Euro.
  • Airbus arranged with China Southern for the sale
    of five A380 superjumbo planes at a price of 220
    million Euro.
  • Five A380 superjumbo planes will go for 220
    million Euro to China Southern.

(seller, buyer, goods, price)
19
COMMERCIAL_TRANSACTION
  • SELLER: Airbus
  • BUYER: China Southern
  • GOODS: five A380 superjumbo planes
  • PRICE: 220 million Euro
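
A short sketch of why the frame analysis is useful: different surface sentences reduce to one and the same frame instance. The class `FrameInstance`, the helper `make_instance`, and the hand-written analyses are assumptions for illustration; no parser is involved, and this is not FrameNet's data format. The frame element set is the partial list given on the slides.

```python
from dataclasses import dataclass

# Frame elements named on the slides (the list there is not exhaustive).
FRAME_ELEMENTS = {
    "COMMERCIAL_TRANSACTION": {"SELLER", "BUYER", "GOODS", "PRICE"},
}


@dataclass(frozen=True)
class FrameInstance:
    frame: str
    elements: tuple  # sorted (frame element, filler) pairs


def make_instance(frame, **fillers):
    """Build a frame instance, rejecting roles the frame does not define."""
    unknown = set(fillers) - FRAME_ELEMENTS[frame]
    if unknown:
        raise ValueError(f"{sorted(unknown)} are not frame elements of {frame}")
    return FrameInstance(frame, tuple(sorted(fillers.items())))


# Hand-written analysis of "Airbus sells ... to China Southern for ...".
sells = make_instance(
    "COMMERCIAL_TRANSACTION",
    SELLER="Airbus",
    BUYER="China Southern",
    GOODS="five A380 superjumbo planes",
    PRICE="220 million Euro",
)

# Hand-written analysis of "China Southern buys ... from Airbus for ...".
buys = make_instance(
    "COMMERCIAL_TRANSACTION",
    BUYER="China Southern",
    GOODS="five A380 superjumbo planes",
    SELLER="Airbus",
    PRICE="220 million Euro",
)
```

Because the fillers are stored in a canonical order, `sells == buys` holds even though the two sentences realize the roles in different grammatical positions.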

20
The Berkeley FrameNet (1996)
  • Current release: 700 frames (8,000 lexical units)
  • http://framenet.icsi.berkeley.edu/

21
FrameNet
Frame based
22
UNL Knowledge base UW
Constraint based
23
TCLLEX (TCL's Computational Lexicon)
  • Design the frame-based lexicon representation
  • Create the Ontology and Terminology
  • Propose a computational framework
  • Reuse the existing conceptual hierarchy
    (thesaurus) and lexicon

24
TCL's Computational Lexicon
Constraint based
  • Design...
  • Representativity
  • Logical and Semantic constraints
  • Expressiveness
  • Thoroughness and incrementality
  • Computationality (Operations)
  • Similarity...Differentiation
  • Relativity
  • Inheritance
  • Unification

25
Logical Constraints
Representativity
  • Vertical relation
  • Is-a: 189 classes
  • The logical constraints can be attached to a word
    of any category type. They illustrate the logical
    relationship among word senses in the lexicon.

26
Semantic Constraints
  • The semantic constraints are attached to a verb
    or an adjective. They represent the relationship
    among thematic roles in a verb or adjective
    pattern.
  • Horizontal relation: 16 relations

27
Semantic Constraints
28
Representation
  • ???
    • Morphological
    • Syntactic
      • Category: V
      • Subcategory: VACT
      • V Pattern: SUBVOBJ
    • Semantic
      • Logical Constraint
        • Is-a: drive
      • Semantic Constraint
        • Agent: Individual
        • Complement: Vehicle
    • English: drive
  • ???
    • Morphological
    • Syntactic
      • Category: V
      • Subcategory: VACT
      • V Pattern: SUBVOBJ
    • Semantic
      • Logical Constraint
        • Is-a: displace
      • Semantic Constraint
        • Agent: Person
        • Object: Person
    • English: expel

29
Representation
  • ???
    • Morphological
    • Syntactic
      • Category: V
      • Subcategory: VACT
      • V Pattern: SUBVOBJ
    • Semantic
      • Logical Constraint
        • Is-a: change
      • Semantic Constraint
        • Agent: Organic structure
        • Object: Material
    • English: eliminate
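
The semantic constraints of the three verb entries (drive, expel, eliminate) can be used to choose between senses. The sketch below assumes that the three entries compete for the same headword, which the transcript does not confirm because the headwords are unreadable. The is-a hierarchy, the example classes Kidney and Waste, and the function names are hypothetical; only the role constraints are taken from the slides.

```python
# Hypothetical is-a hierarchy: child class -> parent class.
IS_A = {
    "Person": "Individual",
    "Vehicle": "Artifact",
    "Kidney": "Organic structure",
    "Waste": "Material",
}

# Logical and semantic constraints of the three entries from the slides.
SENSES = [
    {"english": "drive", "is_a": "drive",
     "roles": {"Agent": "Individual", "Complement": "Vehicle"}},
    {"english": "expel", "is_a": "displace",
     "roles": {"Agent": "Person", "Object": "Person"}},
    {"english": "eliminate", "is_a": "change",
     "roles": {"Agent": "Organic structure", "Object": "Material"}},
]


def is_subclass(cls, ancestor):
    """True if cls equals ancestor or reaches it by following is-a links."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = IS_A.get(cls)
    return False


def matching_senses(arguments):
    """Return the English glosses of all senses whose constraints are met.

    `arguments` maps a role name to the semantic class of its filler.
    """
    result = []
    for sense in SENSES:
        roles = sense["roles"]
        same_roles = set(arguments) == set(roles)
        if same_roles and all(is_subclass(arguments[r], roles[r]) for r in roles):
            result.append(sense["english"])
    return result
```

For instance, a Person agent with a Vehicle complement satisfies only the drive entry, because Person inherits from Individual in the assumed hierarchy, while a Kidney agent with a Waste object satisfies only eliminate.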

30
Representation
  • ????????????
    • Morphological
    • Syntactic
      • Category: N
      • Subcategory: NCMN
    • Semantic
      • Logical Constraint
        • Is-a: Career
    • English: tailor
  • ??
    • Morphological
    • Syntactic
      • Category: N
      • Subcategory: NCMN
    • Semantic
      • Logical Constraint
        • Is-a: Vehicle
    • English: car

31
Representation
  • ???
    • Morphological
    • Syntactic
      • Category: N
      • Subcategory: NCMN
    • Semantic
      • Logical Constraint
        • Is-a: Container
    • English: bowl
  • ???
    • Morphological
    • Syntactic
      • Category: V
      • Subcategory: VACT
      • V Pattern: SUBV
    • Semantic
      • Logical Constraint
        • Is-a: utter
      • Semantic Constraint
        • Agent: Fowl
    • English: crow

32
Representation
  • ????????????? ??? 1 Object: Container
  • ???????? ??? 2 Object: Implement
  • ????????????????????
  • ??????????????????????????
  • ???????????????? ??? 3 Object: Collector
  • ??????????????????? ??? 4 Object: Lane
  • ?????? ??? 5 Object: Body part
  • ?????? ??? 5 Object: Body part
  • ??????????????????????????????

33
Expressiveness (content words)
  • Express all parts of speech (content words)
  • Distinguish senses by the framework
  • ???
    • Morphological
    • Syntactic
      • Category: N
      • Subcategory: NCMN
    • Semantic
      • Logical Constraint
        • Is-a: Container
    • English: bowl
  • ???
    • Morphological
    • Syntactic
      • Category: V
      • Subcategory: VACT
      • V Pattern: SUBV
    • Semantic
      • Logical Constraint
        • Is-a: utter
      • Semantic Constraint
        • Agent: Fowl
    • English: crow

34
Computationality
  • Similarity

(Figure: similarity between entries linked by the object relation; class labels shown: vehicle, material)
35
Computationality
  • Similarity

(Figure: similarity between entries linked by the object and result relations; class labels shown: plant, body part, lane, clothing, monetary, price)
36
TCLLEX Statistics
46
Asian WordNet
  • English WordNet
  • Asian Language Terminology

47
Asian WordNet Construction
54
http://www.tcllab.org/tcllex