TMF a tutorial

1
TMF - a tutorial
  • TMF - Terminological Markup Framework
  • Laurent Romary - Laboratoire Loria

2
Three parts
  • Part 1: Basic concepts
  • Part 2: Representing data categories
  • Part 3: Designing (schemas and) filters

3
TMF - a tutorial, Part 1: Basic concepts
  • TMF - Terminological Markup Framework
  • Laurent Romary - Laboratoire Loria

4
  • Background - ISO etc.
  • The need for abstraction
  • Structure and content of terminological data -
    picture virtual-actual
  • The meta-model (structural skeleton)
  • Describing data categories
  • Styles and vocabularies
  • XTMF as a mapping tool - examples
  • Further work: extending the model to a wider
    scope (language engineering)

5
Overview
6
General principles
  • Expressing constraints on the representation of
    computerized terminologies
  • What is the underlying structure of computerized
    terminologies?
  • Which data-category is used and under which
    conditions?
  • Maintaining interoperability between
    representations
  • Providing a conceptual tool to compare two given
    formats

7
Definitions
  • TMF: Terminological Mark-up Framework
  • Definition of the underlying structures and
    mechanisms needed for the computer representation
    of terminological data
  • Independence with regard to any specific format
  • GMT: Generic Mapping Tool
  • Abstract XML format equivalent to the underlying
    model of TMF

8
Definitions - cont.
  • TML: Terminological Mark-up Language
  • One specific representation format generated
    within TMF
  • E.g. DXLT is a possible TML

9
A family of formats
  • TMF
  • TML1, TML2, TML3, ... TMLi (e.g. DXLT, Geneter)
  • GMT

10
Meta-model
  • Representing the underlying structure of
    terminological data

12
Meta-model description
  • Terminological Data Collection (TDC)
  • A collection of data containing information on
    concepts of specific concept fields.
  • Terminological Entry (TE)
  • An entry containing information on terminological
    units (i.e., subject-specific concepts, terms,
    etc.).
  • Example: domain description, conceptual
    relations, etc.

13
Meta-model description - cont.
  • Language Section (LS)
  • The part of a terminological entry containing
    information related to one language.
  • Note: one terminological entry may contain
    information on one, two or more languages.
  • Term Section (TS)
  • The part of a language section giving information
    about a term.
  • Example: term status (e.g. abbreviation), usage
    information (temporal, geographical, etc.)

14
Meta-model description - cont.
  • Term Component Section (TCS)
  • The section of a term section giving information
    about components of a term.
  • Example: component grammatical information (part
    of speech)

15
Meta-model description - cont.
  • Global Information (GI)
  • Technical and administrative information applying
    to the entire data collection.
  • Example: title of the data collection, revision
    history
  • Complementary Information (CI)
  • Information supplementary to terminology-related
    information.
  • Example: bibliographical source, documentary
    language or description thereof.

16
The structural skeleton
Terminological Data Collection (TDC)
  Global Information (GI)
  Complementary Information (CI)
  Terminological Entry (TE)
    Language Section (LS)
      Term Section (TS)
        Term Component Section (TCS)

17
How does this work?
  • Walking through an example

18
DXLT example
  • <termEntry id='ID67'>
  • <descrip type='subjectField'>manufacturing</descrip>
  • <descrip type='definition'>A value between 0 and
    1 used in ...</descrip>
  • <langSet lang='en'>
  • <tig>
  • <term>alpha smoothing factor</term>
  • <termNote type='termType'>fullForm</termNote>
  • </tig>
  • </langSet>
  • <langSet lang='hu'>
  • <tig>
  • <term>Alfa ...</term>
  • </tig>
  • </langSet>
  • </termEntry>

19
Identifying the structural skeleton
TE: Terminological Entry; LS: Language Section; TS: Term Section

20
TMF information model
TE
  id = ID67
  subjectField = manufacturing
  definition = A value ...
  LS
    lang = en
    TS
      term = alpha smoothing factor
      termType = fullForm
  LS
    lang = hu
    TS
      term = Alfa ...

21
GMT representation
  • <struct type="TE">
  • <feat type="id">ID67</feat>
  • <feat type="subjectField">manufacturing</feat>
  • <feat type="definition">A value between 0 and 1
    used in ...</feat>
  • <struct type="LS">
  • <feat type="lang">en</feat>
  • <struct type="TS">
  • <feat type="term">alpha smoothing factor</feat>
  • <feat type="termType">fullForm</feat>
  • </struct>
  • </struct>
  • <struct type="LS">
  • <feat type="lang">hu</feat>
  • <struct type="TS">
  • <feat type="term">Alfa ...</feat>
  • </struct>
  • </struct>
  • </struct>
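
The walk-through from the DXLT entry to its GMT form can be sketched in Python. This is only an illustration, not a reference filter: the `LEVELS` table and the helper names `add_feat` and `to_gmt` are assumptions introduced here, and the rule "attributes and non-structural children become feat elements" is a simplification of the mapping shown on the slides.

```python
import xml.etree.ElementTree as ET

DXLT_ENTRY = """
<termEntry id="ID67">
  <descrip type="subjectField">manufacturing</descrip>
  <descrip type="definition">A value between 0 and 1 used in ...</descrip>
  <langSet lang="en">
    <tig>
      <term>alpha smoothing factor</term>
      <termNote type="termType">fullForm</termNote>
    </tig>
  </langSet>
  <langSet lang="hu">
    <tig>
      <term>Alfa ...</term>
    </tig>
  </langSet>
</termEntry>
"""

# DXLT element name -> meta-model level (assumed mapping for this sketch).
LEVELS = {"termEntry": "TE", "langSet": "LS", "tig": "TS"}


def add_feat(parent, datcat, value):
    """Attach one data category to a locus as a GMT feat element."""
    feat = ET.SubElement(parent, "feat", {"type": datcat})
    feat.text = value


def to_gmt(node):
    """Convert a structural DXLT element into a GMT struct element."""
    struct = ET.Element("struct", {"type": LEVELS[node.tag]})
    # Attribute-style data categories (id, lang) become feat elements.
    for name, value in node.attrib.items():
        add_feat(struct, name, value)
    # Non-structural children: a type attribute names the data category,
    # otherwise the element name does (e.g. <term>).
    for child in node:
        if child.tag not in LEVELS:
            add_feat(struct, child.get("type", child.tag), child.text)
    # Structural children are converted recursively, after the features.
    for child in node:
        if child.tag in LEVELS:
            struct.append(to_gmt(child))
    return struct


gmt = to_gmt(ET.fromstring(DXLT_ENTRY))
print(ET.tostring(gmt, encoding="unicode"))
```

Features are emitted before nested structures so that the output keeps the same ordering as the GMT example above.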

23
TML à la mode ISO
  • Ingredients
  • A structural skeleton
  • (take the TMF Metamodel)
  • A reference Data Category Registry
  • ISO 12620 is a good place to find one
  • Recipe
  • Choose some data categories from the registry
  • You can even constrain the values of your DatCats
  • Associate a style and a vocabulary with each DatCat
  • You can take inspiration from others (DXLT)
  • Serve it hot to your software guy with a piece of
    SALT software

24
GMT
  • Generic Mapping Tool

25
Background
  • Interoperability principle
  • If any two TMLs have exactly the same data
    category selection (DCS), even though they differ
    radically in style and vocabulary, they are
    equivalent.
  • Consequence
  • It is always possible to define a filter from one
    TML to another when they are interoperable
  • GMT is the intermediate representation to do so
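
The interoperability principle can be illustrated with a small Python sketch. The two TML descriptions below are hypothetical, and reducing a DCS to a plain set of data category names is an assumption made for brevity (a real selection may also constrain values and anchoring levels).

```python
# Each hypothetical TML is described as: data category -> (style, vocabulary).
TML_A = {
    "definition": ("typedElement", ("descrip", "definition")),
    "term": ("element", "term"),
    "languageIdentifier": ("attribute", "lang"),
}
TML_B = {
    "definition": ("element", "def"),
    "term": ("element", "t"),
    "languageIdentifier": ("attribute", "language"),
}


def interoperable(tml_a, tml_b):
    """Two TMLs are treated as interoperable when they select the same
    data categories, whatever their styles and vocabularies."""
    return set(tml_a) == set(tml_b)


print(interoperable(TML_A, TML_B))
```

Styles and vocabularies differ for every data category here, yet the check succeeds because only the selected categories are compared.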

26
From one TML to another
  • GMT - Generic Mapping Tool
  • an abstract XML representation
  • identification of levels
  • <struct type="LS"></struct>
  • a recursive element
  • representation of data categories
  • <feat type="definition"></feat>

27
The tmf element
  • Description
  • The tmf element is the root element of any valid
    XTMF document. It contains the global
    information that corresponds to a terminological
    data collection, the collection itself, and the
    complementary information (in particular external
    resources) needed for describing the various
    terminological entries.
  • Content model
  • <!ELEMENT tmf (struct)>

28
The struct element
  • Description
  • The struct element should be used to represent a
    locus in a given structural skeleton. The struct
    element is recursive and may also contain feat
    and/or brack elements to express attributes
    belonging to the corresponding level of the meta
    model.
  • Attributes
  • type: level in the meta-model (TDC, TE, LS, TS or
    TCS)
  • Content model
  • <!ELEMENT struct ((feat | brack)*, struct*)>
  • <!ATTLIST struct type (TDC | TE | LS | TS | TCS)
    #REQUIRED>
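
A minimal Python sketch of a checker for this content model follows. It assumes the model is read as "any number of feat or brack elements, followed by any number of nested struct elements"; the function name `check_struct` and the wording of the messages are illustrative only.

```python
import xml.etree.ElementTree as ET

# Levels allowed as values of the type attribute, as listed on the slide.
LEVELS = {"TDC", "TE", "LS", "TS", "TCS"}


def check_struct(struct):
    """Return a list of problems found in a struct element and its descendants."""
    problems = []
    if struct.get("type") not in LEVELS:
        problems.append(f"unknown level: {struct.get('type')!r}")
    seen_struct = False
    for child in struct:
        if child.tag == "struct":
            seen_struct = True
            problems.extend(check_struct(child))
        elif child.tag in ("feat", "brack"):
            # Features must come before any nested struct.
            if seen_struct:
                problems.append(f"{child.tag} after a nested struct")
        else:
            problems.append(f"unexpected element: {child.tag}")
    return problems


good = ET.fromstring(
    '<struct type="TE"><feat type="id">ID67</feat>'
    '<struct type="LS"><feat type="lang">en</feat></struct></struct>'
)
bad = ET.fromstring(
    '<struct type="XX"><struct type="LS"/><feat type="id">ID67</feat></struct>'
)
print(check_struct(good))
print(check_struct(bad))
```

The sketch checks only element order and the level names; it does not verify that levels nest in the order given by the meta-model.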

29
The feat element
  • Description
  • The feat element represents any feature that is
    directly attached to a locus in the
    structural skeleton (represented by a struct
    element).
  • The feat element accepts the following
    attribute
  • type: categorises the feat element through the
    reference to the name of the corresponding data
    category.
  • Content model (DTD)
  • <!ELEMENT feat (#PCDATA | annot)*>
  • <!ATTLIST feat type CDATA #REQUIRED>

30
Bracketing information
31
Rationale
  • Describing the context of use of a given data
    category
  • Example 1
  • Classification code: AG1
  • Classification system: Lenoc
  • Example 2
  • Transaction type: modification
  • Responsible person: Mr. X
  • Date: 23 April 1988

32
Formal model
  • Hierarchical feature structure
  • Constraint: the type is given by the "main"
    (first) data category

33
GMT description
  • Bracketing features
  • <brack>
  • <feat type="classificationCode">xxx</feat>
  • <feat type="classificationSystem">Lenoc</feat>
  • </brack>
  • Remark: no type attribute for brack
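
Reading a bracketed group back can be sketched in Python as follows. The sketch takes the first feat as the "main" data category that types the group and treats the remaining feat elements as its context; the helper name `read_brack` and the shape of the returned dictionary are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

GMT = """
<struct type="TE">
  <brack>
    <feat type="classificationCode">AG1</feat>
    <feat type="classificationSystem">Lenoc</feat>
  </brack>
</struct>
"""


def read_brack(brack):
    """Split a brack group into its main feature and its context features."""
    feats = [(f.get("type"), f.text) for f in brack.findall("feat")]
    (main_type, main_value), context = feats[0], dict(feats[1:])
    return {"type": main_type, "value": main_value, "context": context}


entry = ET.fromstring(GMT)
groups = [read_brack(b) for b in entry.findall("brack")]
print(groups)
```

Because brack carries no type attribute of its own, the order of the feat elements is what identifies the main data category in this reading.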

34
Annotating content
35
Rationale
  • Why should we annotate specific content?
  • To identify components which are not explicitly
    expressed as a specific part of a terminological
    entry
  • E.g. Characteristics of a concept
  • To relate a component to another entry or an
    external resource
  • E.g. bibliographical reference

36
Formal model
?
37
XML model
  • Mixed content
  • <!ELEMENT feat (#PCDATA | annot)*>
  • Attributes
  • type: categorises the annot element through the
    reference to the name of the corresponding data
    category.
  • Remark: problems with mixed content in XML schemas

38
GMT description
  • Annotating information
  • <feat type="definition">
  • pencil whose
  • <annot type="characteristic"> casing </annot>
  • is fixed around a central graphite medium which
    is used for writing or making marks
  • </feat>
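
Mixed content of this kind can be read with Python's standard `xml.etree.ElementTree` module, as in the sketch below. The whitespace inside the sample string is chosen for readability and is an assumption; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

FEAT = (
    '<feat type="definition">pencil whose '
    '<annot type="characteristic">casing</annot>'
    ' is fixed around a central graphite medium which is used for '
    'writing or making marks</feat>'
)

feat = ET.fromstring(FEAT)

# The full definition text, with the annotation markup flattened away.
plain_text = "".join(feat.itertext())

# The annotated spans, each with the data category given by its type attribute.
annotations = [(a.get("type"), a.text) for a in feat.iter("annot")]

print(plain_text)
print(annotations)
```

The annotation identifies "casing" as a characteristic without removing it from the running text of the definition.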

40
Representation of relations
41
XML links
  • Transparency as to the actual location of a
    resource (internal vs. external)
  • Maybe useful to identify ontologies
  • External links between concepts

(Figure: links between entry i and entry j)

42
Representation in GMT
  • Two attributes
  • target: a pointer to a struct element, used when
    the feature expresses a relation between the
    current locus and another locus in the structural
    skeleton
  • source: a pointer to a struct element, used when
    the feature is described outside the locus to
    which it is supposed to be attached

43
Some examples
  • Simple atomic feature attached directly to a
    locus
  • <feat type="conceptIdentifier">ID67</feat>
  • Basic feature whose value is a reference to a
    locus in the structural skeleton
  • <feat type="partWhole" target="TE24"/>
  • Basic feature anchored at the locus in the
    structural skeleton whose id attribute value is
    TE24
  • <feat type="conceptIdentifier" source="TE24">ID67</feat>
  • Compound feature anchored at TE23 and which
    makes reference to TE24
  • <feat type="partWhole" source="TE23" target="TE24"/>
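
How the two attributes combine can be sketched in Python by collecting every relation as a (source, type, target) triple. Several things here are assumptions made for the sketch: loci are identified by a feat of type id, as in the GMT representation example; the identifiers TE23 to TE25 and the helper names are illustrative; and only features carrying a target attribute are treated as relations.

```python
import xml.etree.ElementTree as ET

GMT = """
<tmf>
  <struct type="TDC">
    <feat type="partWhole" source="TE24" target="TE25"/>
    <struct type="TE">
      <feat type="id">TE23</feat>
      <feat type="partWhole" target="TE24"/>
    </struct>
    <struct type="TE"><feat type="id">TE24</feat></struct>
    <struct type="TE"><feat type="id">TE25</feat></struct>
  </struct>
</tmf>
"""


def locus_id(struct):
    """Return the identifier of a locus, or None when it has no id feature."""
    for feat in struct.findall("feat"):
        if feat.get("type") == "id":
            return feat.text
    return None


def relations(struct):
    """Collect (source, relation type, target) triples below a struct."""
    edges = []
    own_id = locus_id(struct)
    for feat in struct.findall("feat"):
        target = feat.get("target")
        if target is not None:
            # An explicit source overrides the locus that contains the feature.
            edges.append((feat.get("source", own_id), feat.get("type"), target))
    for child in struct.findall("struct"):
        edges.extend(relations(child))
    return edges


root = ET.fromstring(GMT)
edges = relations(root.find("struct"))
print(edges)
```

The first relation is declared outside both entries and relies on source; the second is declared inside TE23 and takes that locus as its implicit source.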

44
Styles and vocabularies
46
Implementing a DatCat
  • Definitions
  • "style": the way a given DatCat is implemented
    as an XML object
  • "vocabulary": the symbols needed to express the
    implementation of a given DatCat in its
    associated style
  • E.g.
  • DatCat: /definition/
  • Vocabulary: def
  • Style: element
  • <def>pencil whose casing </def>

47
Implementing a DatCat (cont.)
  • Definition
  • "anchor": the XML element(s) to which the
    implementation of a given DatCat can be attached
  • E.g.
  • <tig>
  • <term>alpha smoothing factor</term>
  • </tig>

48
Styles - element
  • Element
  • Def.: the DatCat is implemented as an element,
    child of its anchor
  • Vocabulary: the name of the corresponding
    element
  • E.g.
  • <def>pencil whose casing </def>
  • <term>alpha smoothing factor</term>

49
Styles - typedElement
  • typedElement
  • Def.: the DatCat is implemented as a generic XML
    element, which is a child of the anchor, and
    which is further specified by means of a type
    attribute. Its content is the value of the
    feature in the structural skeleton.
  • Vocabulary: the element name and the value of
    the type attribute
  • E.g.
  • <termNote type="definition">Bla, bla,
    bla</termNote>

50
Styles - attribute
  • Attribute
  • Def.: the DatCat is implemented as an attribute
    of its anchor
  • Vocabulary: the name of the corresponding
    attribute
  • E.g.
  • <termEntry id='ID67'> </termEntry>
  • <ldl language='en'> </ldl>
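
The three styles described so far (element, typedElement, attribute) can be sketched in Python as one function that attaches a DatCat value to its anchor. The function name `implement` and its signature are hypothetical; the element and attribute names are taken from the DXLT examples in this tutorial.

```python
import xml.etree.ElementTree as ET


def implement(anchor, style, vocabulary, value):
    """Attach a DatCat value to its anchor according to the given style."""
    if style == "element":
        # Vocabulary: the element name.
        ET.SubElement(anchor, vocabulary).text = value
    elif style == "typedElement":
        # Vocabulary: the element name and the value of its type attribute.
        name, type_value = vocabulary
        ET.SubElement(anchor, name, {"type": type_value}).text = value
    elif style == "attribute":
        # Vocabulary: the attribute name.
        anchor.set(vocabulary, value)
    else:
        raise ValueError(f"unsupported style: {style}")
    return anchor


entry = ET.Element("termEntry")
implement(entry, "attribute", "id", "ID67")
implement(entry, "typedElement", ("descrip", "definition"),
          "A value between 0 and 1 used in ...")
tig = ET.SubElement(entry, "tig")
implement(tig, "element", "term", "alpha smoothing factor")
print(ET.tostring(entry, encoding="unicode"))
```

Changing only the style and vocabulary arguments yields a different TML for the same data categories, which is the point of keeping them separate from the structural skeleton.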

51
  • ValuedElement
  • TypedValuedElement