LEXUS: a flexible web-based lexicon application

1
LEXUS: a flexible web-based lexicon application
Jacquelijn Ringersma
www.mpi.nl/lexus lexus_at_mpi.nl
June 2007
2
Outline
  • Background (ISO/TC 37/SC 4 group: LMF and DCR)
  • Why Lexus?
  • Lexus demo
3
Background - Problems with lexica
  • Heterogeneity of lexica (structure): the structure of a lexicon depends on language, linguistic theory, and purpose
4
Background - Problems with lexica
  • Heterogeneity of lexica (structure): the structure of a lexicon depends on language, linguistic theory, and purpose
  • Heterogeneity of lexica (conceptual): large variation in the naming of linguistic concepts (attributes) and values, e.g. the concept noun: N, n, no, noun
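As a minimal sketch of the naming problem on this slide, the snippet below folds the four spellings of the noun concept into one shared label. The names `NOUN_VARIANTS` and `normalize_pos` are hypothetical illustrations and are not part of LEXUS.

```python
# Hypothetical table: the local spellings of "noun" listed on the slide.
NOUN_VARIANTS = {"N", "n", "no", "noun"}


def normalize_pos(value: str) -> str:
    """Map a local part-of-speech label onto one shared concept name."""
    label = value.strip()
    if label in NOUN_VARIANTS:
        return "noun"
    # Labels we do not know are passed through unchanged.
    return label


labels = [normalize_pos(v) for v in ["N", "n", "no", "noun", "v"]]
```

Without such a mapping, a search for "noun" across two lexica would miss every entry that spells the value differently.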
5
Background - Problems with lexica
  • Heterogeneity of lexica (structure): the structure of a lexicon depends on language, linguistic theory, and purpose
  • Heterogeneity of lexica (conceptual): large variation in the naming of linguistic concepts (attributes) and values
  • Heterogeneity of lexica (format): large variation in formats (XML, Shoebox, Chat, Word)
6
Background - Problems with lexica
Heterogeneity of lexica
  • Structure of a lexicon depends on language, linguistic theory, and purpose
  • Large variation in the naming of linguistic concepts (attributes) and values
  • Large variation in formats (XML, Shoebox, Chat, Word)
Data interoperability problem: cross-lexica searches, merging, comparison
Archive requirements
  • Representation format (XML)
  • One archive exploitation framework
7
ISO TC37/SC4 and LMF
ISO TC37/SC4 is about standardization in LR (language resource) management
  • LMF: Lexical Markup Framework (structure)
  • DCR: Data Category Registries (concepts)
8
ISO TC37/SC4 and LMF
ISO TC37/SC4 is about standardization in LR (language resource) management
  • LMF: Lexical Markup Framework
  • DCR: Data Category Registries
LMF
In LMF, a lexicon schema is seen as lexical attributes (data categories) that are grouped together into data components, which are embedded in a tree structure.
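The tree described above can be sketched as a small data structure. This is only an illustration: the class names `DataCategory` and `Component`, the helper `leaves`, the example values, and the placement of each category under a particular component are all assumptions, not the LMF or LEXUS data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataCategory:
    """A lexical attribute with its value, e.g. POS = noun."""
    name: str
    value: str


@dataclass
class Component:
    """A data component: groups data categories and nests other components."""
    name: str
    categories: List[DataCategory] = field(default_factory=list)
    children: List["Component"] = field(default_factory=list)


def leaves(component, path=()):
    """Walk the tree depth-first and yield (path, category, value) per leaf."""
    here = path + (component.name,)
    for category in component.categories:
        yield "/".join(here), category.name, category.value
    for child in component.children:
        yield from leaves(child, here)


# Hypothetical entry; where each category sits in the tree is assumed.
entry = Component(
    "LexicalEntry",
    [DataCategory("lemma", "house"), DataCategory("POS", "noun")],
    [
        Component("Form", [DataCategory("orthography", "houses"),
                           DataCategory("number", "plural")]),
        Component("Sense"),
    ],
)

for leaf in leaves(entry):
    print(leaf)
```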
9
Lexical DB
/lemma/ /POS/ /gender/ /key form/
Lexical Entry
Form
Sense
/orthography/ /gender/ /number/ /tense/ /person/ /mood/
11
ISO TC37/SC4 and LMF
What makes LMF a standard?
  • Every lexical entry consists of Form and Sense
  • Every leaf of the lexicon tree consists of a pair (DataCategory, value)
  • Using these pairs makes the data in the lexicon interoperable (lexica can be merged)
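To illustrate why (DataCategory, value) pairs make merging straightforward, here is a minimal sketch in which each lexicon is a mapping from a lemma to a set of such pairs. The function `merge_lexica` and the sample data are hypothetical, and this is not how LEXUS implements merging.

```python
def merge_lexica(first, second):
    """Union the (DataCategory, value) pairs of two lexica, lemma by lemma."""
    # Copy the sets so neither input lexicon is modified.
    merged = {lemma: set(pairs) for lemma, pairs in first.items()}
    for lemma, pairs in second.items():
        merged.setdefault(lemma, set()).update(pairs)
    return merged


# Hypothetical lexica that share the data categories POS and number.
lexicon_a = {"house": {("POS", "noun")}}
lexicon_b = {
    "house": {("POS", "noun"), ("number", "singular")},
    "run": {("POS", "verb")},
}

merged = merge_lexica(lexicon_a, lexicon_b)
```

Because both lexica name the category `POS` identically, the duplicate pair collapses into one. With differing names, the same information would be stored twice.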
12
ISO TC37/SC4 and LMF
ISO TC37/SC4 is about standardization in LR (language resource) management
  • LMF: Lexical Markup Framework
  • DCR: Data Category Registries
DCR
  • Flat list of linguistic concepts
  • Contains is_a relations that are part of the concept definition (transitive verb is_a verb)
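A flat concept list with is_a relations can be sketched as a simple lookup table. The table `IS_A` holds only the example from the slide, and the function `is_a` is a hypothetical helper, not an interface of the ISO DCR.

```python
# Flat list of concepts: each concept points to its parent, if it has one.
IS_A = {"transitive verb": "verb"}


def is_a(concept, ancestor):
    """Return True if `concept` reaches `ancestor` by following is_a links."""
    seen = set()
    while concept in IS_A and concept not in seen:
        seen.add(concept)  # guard against accidental cycles in the table
        concept = IS_A[concept]
        if concept == ancestor:
            return True
    return False
```

A search for all verbs could then also return entries tagged as transitive verb.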
13
ISO TC37/SC4 and LMF
14
ISO TC37/SC4 and LMF
15
ISO TC37/SC4 and LMF
ISO TC37/SC4 is about standardization in LR (language resource) management
  • LMF: Lexical Markup Framework
  • DCR: Data Category Registries
DCR
  • Flat list of linguistic concepts
  • Contains is_a relations that are part of the concept definition (transitive verb is_a verb)
The aim of LMF/DCR: a modular structure for content interoperability between (all aspects of) lexical resources.
16
Back to LEXUS?
  • LEXUS is based on the ISO recommendations
  • LMF
      - Default structure in Lexus is LMF compliant
      - User may extend this default structure as desired
        (but not adjust the default structure!)
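The rule that users may extend but not adjust the default structure could be modelled as below. This is a sketch under stated assumptions: the class `LexiconSchema`, its methods, and the component name `Example` are hypothetical and do not describe the actual LEXUS implementation.

```python
class LexiconSchema:
    """Hypothetical schema: default components are fixed, extensions are free."""

    # Assumed default structure: every lexical entry has a Form and a Sense.
    DEFAULT = {"LexicalEntry": ["Form", "Sense"]}

    def __init__(self):
        # Copy the default tree so every schema starts from the same structure.
        self.children = {p: list(kids) for p, kids in self.DEFAULT.items()}

    def add_component(self, parent, name):
        """Extend the schema with a new component under an existing one."""
        known = set(self.children)
        for kids in self.children.values():
            known.update(kids)
        if parent not in known:
            raise KeyError(f"unknown parent component: {parent}")
        self.children.setdefault(parent, []).append(name)

    def remove_component(self, parent, name):
        """Remove a user-added component; default components are protected."""
        if name in self.DEFAULT.get(parent, []):
            raise ValueError(f"{name} is part of the default structure")
        self.children[parent].remove(name)


schema = LexiconSchema()
schema.add_component("Sense", "Example")  # extending is allowed
```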

17
Back to LEXUS?
LEXUS is based on the ISO recommendations. It uses:
  • LMF: default structure is LMF compliant
  • DCR: LEXUS offers the user the ISO Data Category Registry
18
Why LEXUS?
LEXUS is based on the ISO recommendations. It uses:
  • LMF: default structure is LMF compliant
  • DCR: Data Category Registries
In LEXUS:
  • Search: through one or multiple lexica
  • Merge: merging of lexica (in development)
  • Multi-media lexica: lexical encyclopedia
  • Relations: creation of visual semantic networks
  • Linking: linking to the archive resources
19
Why LEXUS?
LEXUS is based on the ISO recommendations. It uses:
  • LMF: default structure is LMF compliant
  • DCR: Data Category Registries
  • lexoBlocks: users are offered standard lexicon structures, which are LMF compliant and use DCR (in development)
Above all, LEXUS is a presentation platform!
  • Web-based: available to users worldwide
  • Multi-media/relations: suitable for speaker communities to view the data
20
Lexus workspace
After login, you enter your workspace.
In your workspace you may:
  • create new lexica
  • add and remove lexica
  • import lexica from other formats
  • edit existing lexica
  • search the lexica in your workspace
21
Lexus: create new lexica
Lexical Resource → New
First build the lexicon schema (data components and data categories)
22
Lexus: create new lexica
First build the lexicon schema (data components and data categories)
Data components and data categories can be added to the default schema: the user has maximum flexibility!
23
Lexus: create new lexica
Lexicon schema: data categories may be selected from the DCR (Shoebox or ISO), or users may define their own data categories
24
Lexus: print view and html view
Before inserting lexical entries, define the print view and the html view
25
Lexus: lexical entries
Inserting/editing lexical entries
The value of a data category can easily be changed, and new entries can be added.
26
Lexus: import from Shoebox
Shoebox files may be imported with or without a .typ file
27
Lexus: import from Shoebox
Shoebox data categories will be imported directly under the LexicalEntry component, for now in random order
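A flat import of this kind can be sketched as follows. The sketch assumes the usual Shoebox text layout, in which each field starts with a backslash marker and each record starts with `\lx`. The function `parse_shoebox` and the sample data are hypothetical, and this is not the LEXUS importer. Unlike the import described above, the sketch keeps the markers in file order.

```python
def parse_shoebox(text, record_marker="lx"):
    """Collect each record's (marker, value) fields under LexicalEntry."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # this sketch skips blank lines and wrapped field text
        marker, _, value = line[1:].partition(" ")
        if marker == record_marker:
            entries.append({"LexicalEntry": []})  # a new record begins
        if entries:  # header fields before the first record are ignored
            entries[-1]["LexicalEntry"].append((marker, value.strip()))
    return entries


# Hypothetical sample with a header line and two records.
sample = r"""\_sh v3.0  400  MDF

\lx casa
\ps n
\ge house

\lx perro
\ps n
\ge dog
"""

entries = parse_shoebox(sample)
```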
28
Lexus: import from Shoebox
Shoebox content of lexical entries will be copied into the Lexus lexical entries
29
Lexus, multi media
LEXUS allows you to add to lexical entries:
  • Audio file
  • Video file
  • Image file
These can be represented in the html view of the lexical entry
30
Lexus, multi media
31
Lexus, relations
Relations: Semset definitions may be translated into a visual representation, e.g.
  • nmo birds is_a tpile tpoo animals
  • te fish is_a tpile tpoo animals
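One way to turn such is_a relations into a picture is to emit Graphviz DOT text, which a graph renderer can then draw. The sketch below uses only the English glosses from the slide. The function `to_dot` is hypothetical, and LEXUS may build its visual networks quite differently.

```python
def to_dot(relations):
    """Render (child, relation, parent) triples as a Graphviz DOT digraph."""
    lines = ["digraph semantic_network {"]
    for child, relation, parent in relations:
        # One labelled, directed edge per relation.
        lines.append(f'  "{child}" -> "{parent}" [label="{relation}"];')
    lines.append("}")
    return "\n".join(lines)


relations = [("birds", "is_a", "animals"), ("fish", "is_a", "animals")]
print(to_dot(relations))
```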
32
Lexus, relations
33
Lexus, links in images
LEXUS in development (Marquesan project)
Links in images: parts of images can be selected, and links can be inserted from these selections to the corresponding lexical entries
34
Lexus, links in images
35
Lexus, links to archived files
LEXUS in development (Marquesan project)
Link to the resources in the archive: with ANNEX, the link can be made precisely to the time location where the lexical entry is uttered.
36
Lexus, links to archived files
37
Lexus user guides
  • LEXUS manual: available from www.mpi.nl/lexus (follow the documentation link)
  • LEXUS A4 guide (new lexicon only): www.mpi.nl/corpus/a4guides/lexus.pdf
If you need support, please contact Jacquelijn.Ringersma_at_mpi.nl