Title: TMF a tutorial
1TMF - a tutorial
- TMF - Terminological Markup Framework
- Laurent Romary - Laboratoire Loria
2Three parts
- Part 1: Basic concepts
- Part 2: Representing data categories
- Part 3: Designing (schemas and) filters
3TMF - a tutorial, Part 1: Basic concepts
4- Background - ISO etc.
- The need for abstraction
- Structure and content of terminological data - picture virtual-actual
- The meta-model (structural skeleton)
- Describing data categories
- Styles and vocabularies
- XTMF as a mapping tool - examples
- Further work: extending the model to a wider scope (language engineering)
5Overview
6General principles
- Expressing constraints on the representation of computerized terminologies
- What is the underlying structure of computerized terminologies?
- Which data category is used and under which conditions?
- Maintaining interoperability between representations
- Providing a conceptual tool to compare two given formats
7Definitions
- TMF: Terminological Markup Framework
- Definition of the underlying structures and mechanisms needed for the computer representation of terminological data
- Independence with regard to any specific format
- GMT: Generic Mapping Tool
- Abstract XML format equivalent to the underlying model of TMF
8Definitions - cont.
- TML: Terminological Markup Language
- One specific representation format generated within TMF
- E.g. DXLT is a possible TML
9A family of formats
[Figure: TMF with its family of formats TML1, TML2, TML3, ... TMLi (DXLT and Geneter shown as examples) and GMT]
10Meta-model
- Representing the underlying structure of
terminological data
12Meta-model description
- Terminological Data Collection (TDC)
- A collection of data containing information on concepts of specific concept fields.
- Terminological Entry (TE)
- An entry containing information on terminological units (i.e., subject-specific concepts, terms, etc.).
- Example: domain description, conceptual relations, etc.
13Meta-model description - cont.
- Language Section (LS)
- The part of a terminological entry containing information related to one language.
- Note: one terminological entry may contain information on one, two or more languages.
- Term Section (TS)
- The part of a language section giving information about a term.
- Example: term status (e.g. abbreviation), usage information (temporal, geographical, etc.)
14Meta-model description - cont.
- Term Component Section (TCS)
- The part of a term section giving information about components of a term.
- Example: component grammatical information (part of speech)
15Meta-model description - cont.
- Global Information (GI)
- Technical and administrative information applying to the entire data collection.
- Example: title of the data collection, revision history
- Complementary Information (CI)
- Information supplementary to terminology-related information.
- Example: bibliographical source, documentary language or description thereof.
16The structural skeleton
Terminological Data Collection (TDC)
Global Information (GI)
Complementary Information (CI)
Terminological Entry (TE)
Language Section (LS)
Term Level (TL)
Term Component Level (TCL)
17How does this work?
- Walking through an example
18DXLT example
<termEntry id='ID67'>
  <descrip type='subjectField'>manufacturing</descrip>
  <descrip type='definition'>A value between 0 and 1 used in ...</descrip>
  <langSet lang='en'>
    <tig>
      <term>alpha smoothing factor</term>
      <termNote type='termType'>fullForm</termNote>
    </tig>
  </langSet>
  <langSet lang='hu'>
    <tig>
      <term>Alfa ...</term>
    </tig>
  </langSet>
</termEntry>
19Identifying the structural skeleton
TE: Terminological Entry; LS: Language Section; TS: Term Section
20TMF information model
TE: id = ID67, subjectField = manufacturing, definition = A value
  LS: lang = en
    TS: term = alpha smoothing factor, termType = fullForm
  LS: lang = hu
    TS: term
21GMT representation
<struct type="TE">
  <feat type="id">ID67</feat>
  <feat type="subjectField">manufacturing</feat>
  <feat type="definition">A value between 0 and 1 used in ...</feat>
  <struct type="LS">
    <feat type="lang">en</feat>
    <struct type="TS">
      <feat type="term">alpha smoothing factor</feat>
      <feat type="termType">fullForm</feat>
    </struct>
  </struct>
  <struct type="LS">
    <feat type="lang">hu</feat>
    <struct type="TS">
      <feat type="term">Alfa ...</feat>
    </struct>
  </struct>
</struct>
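The walk from the DXLT entry to its GMT form can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table LEVELS (which DXLT element marks which level) and the function name to_gmt are hypothetical, and the rule "use the type attribute if present, else the element name" is a simplification of the style and vocabulary mechanism, not the TMF definition of a filter.

```python
import xml.etree.ElementTree as ET

DXLT = """<termEntry id='ID67'>
  <descrip type='subjectField'>manufacturing</descrip>
  <langSet lang='en'>
    <tig>
      <term>alpha smoothing factor</term>
      <termNote type='termType'>fullForm</termNote>
    </tig>
  </langSet>
</termEntry>"""

# Assumption: which DXLT elements mark a level of the structural skeleton.
LEVELS = {"termEntry": "TE", "langSet": "LS", "tig": "TS"}

def to_gmt(elem):
    """Turn a DXLT level element into a GMT <struct> (hypothetical helper)."""
    struct = ET.Element("struct", {"type": LEVELS[elem.tag]})
    # XML attributes of the level element (id, lang) become features.
    for name, value in elem.attrib.items():
        ET.SubElement(struct, "feat", {"type": name}).text = value
    # Non-level children become features; emit them before nested structs.
    for child in elem:
        if child.tag not in LEVELS:
            feat_type = child.get("type", child.tag)
            ET.SubElement(struct, "feat", {"type": feat_type}).text = child.text
    # Level children are converted recursively.
    for child in elem:
        if child.tag in LEVELS:
            struct.append(to_gmt(child))
    return struct

gmt = to_gmt(ET.fromstring(DXLT))
print(ET.tostring(gmt, encoding="unicode"))
```

Features are emitted before nested struct elements so that the output keeps all attributes of a level together ahead of its sub-levels.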
23TML à la mode ISO
- Ingredients
- A structural skeleton
- (take the TMF meta-model)
- A reference Data Category Registry
- ISO 12620 is a good place to find one
- Recipe
- Choose some data categories from the registry
- You can even constrain the values of your datcats
- Associate a style and vocabulary with each datcat
- You can take inspiration from others (DXLT)
- Serve it hot to your software guy with a piece of SALT software
24GMT
25Background
- Interoperability principle
- If any two TMLs have exactly the same DCS, even though they differ radically in style and vocabulary, they are equivalent.
- Consequence
- It is always possible to define a filter from one TML to another when they are interoperable
- GMT is the intermediate representation to do so
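As a rough illustration of the interoperability principle, the Python sketch below describes each TML as a table from data category to (style, vocabulary) and compares only the data categories. The three TML descriptions and the function name interoperable are invented for the example, and reducing a DCS to a plain set of data category names is a simplifying assumption.

```python
# Hypothetical TML descriptions: data category -> (style, vocabulary).
TML_A = {
    "definition": ("typedElement", ("descrip", "definition")),
    "term": ("element", "term"),
}
TML_B = {
    "definition": ("element", "def"),
    "term": ("element", "t"),
}
TML_C = {
    "term": ("element", "term"),
}

def interoperable(tml1, tml2):
    """Same data categories on both sides, whatever the style and vocabulary."""
    return set(tml1) == set(tml2)

print(interoperable(TML_A, TML_B))  # same DCS, different styles and vocabularies
print(interoperable(TML_A, TML_C))  # TML_C lacks /definition/
```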
26From one TML to another
- GMT - Generic Mapping Tool
- an abstract XML representation
- identification of levels
- <struct type="LS"></struct>
- a recursive element
- representation of data categories
- <feat type="definition"></feat>
27The tmf element
- Description
- The tmf element is the root element for any valid XTMF document. It contains the global information that corresponds to a terminological data collection, the collection itself, and the complementary information (external resources in particular) needed for describing the various terminological entries.
- Content model
- <!ELEMENT tmf (struct)>
28The struct element
- Description
- The struct element should be used to represent a locus in a given structural skeleton. The struct element is recursive and may also contain feat and/or brack elements to express attributes belonging to the corresponding level of the meta-model.
- Attributes
- type: level in the meta-model (TDC, TE, LS, TS or TCS)
- Content model
- <!ELEMENT struct ((feat | brack)*, struct*)>
- <!ATTLIST struct type (TDC | TE | LS | TS | TCS) #REQUIRED>
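The constraints on struct can be checked mechanically. The following Python sketch assumes the content model is read as "any number of feat or brack elements, followed by any number of struct elements" and that type must be one of the five level names; the function name check_struct and the error messages are illustrative only.

```python
import xml.etree.ElementTree as ET

LEVELS = ("TDC", "TE", "LS", "TS", "TCS")

def check_struct(struct, errors=None):
    """Collect violations of the assumed struct content model, recursively."""
    errors = [] if errors is None else errors
    level = struct.get("type")
    if level not in LEVELS:
        errors.append(f"unknown level: {level!r}")
    seen_struct = False
    for child in struct:
        if child.tag == "struct":
            seen_struct = True
            check_struct(child, errors)
        elif child.tag in ("feat", "brack"):
            # feat and brack must come before any nested struct
            if seen_struct:
                errors.append(f"{child.tag} after struct in {level}")
        else:
            errors.append(f"unexpected element: {child.tag}")
    return errors

good = ET.fromstring(
    '<struct type="TE"><feat type="id">ID67</feat><struct type="LS"/></struct>'
)
bad = ET.fromstring(
    '<struct type="XX"><struct type="LS"/><feat type="id">1</feat></struct>'
)
print(check_struct(good))
print(check_struct(bad))
```

The check looks only at element order and level names; it does not verify that levels nest in the order of the structural skeleton.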
29The feat element
- Description
- The feat element represents any feature that is directly attached to a locus in the structural skeleton (represented by a struct element).
- The feat element accepts the following attributes
- type: categorises the feat element by reference to the name of the corresponding data category.
- Content model (DTD)
- <!ELEMENT feat (#PCDATA | annot)*>
- <!ATTLIST feat type CDATA #REQUIRED>
30Bracketing information
31Rationale
- Describing the context of use of a given data category
- Example 1
- Classification code: AG1
- Classification system: Lenoc
- Example 2
- Transaction type: modification
- Responsible person: Mr. X
- Date: 23 April 1988
32Formal model
- Hierarchical feature structure
- Constraint: type given by the main (first) data category
33GMT description
- Bracketing features
- <brack>
- <feat type="classificationCode">xxx</feat>
- <feat type="classificationSystem">Lenoc</feat>
- </brack>
- Note: brack has no type attribute
34Annotating content
35Rationale
- Why should we annotate specific content?
- To identify components which are not explicitly expressed as a specific part of a terminological entry
- E.g. characteristics of a concept
- To relate a component to another entry or an external resource
- E.g. bibliographical reference
36Formal model
?
37XML model
- Mixed content
- <!ELEMENT feat (#PCDATA | annot)*>
- Attributes
- type: categorises the annot element by reference to the name of the corresponding data category.
- Note: problems with mixed content in XML schemas
38GMT description
- Annotating information
<feat type="definition">
  pencil whose
  <annot type="characteristic">casing</annot>
  is fixed around a central graphite medium which is used for writing or making marks
</feat>
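Because feat has mixed content, a consumer has to read text and annot children together. The Python sketch below shows one way to recover both the plain definition and the annotated spans with the standard library; the variable names and the shortened definition text are chosen for the example only.

```python
import xml.etree.ElementTree as ET

GMT = (
    '<feat type="definition">pencil whose '
    '<annot type="characteristic">casing</annot>'
    ' is fixed around a central graphite medium</feat>'
)

feat = ET.fromstring(GMT)

# itertext() walks the text before, inside and after each annot, in order.
plain_text = "".join(feat.itertext())

# Each annot carries the data category name in its type attribute.
annotations = [(a.get("type"), a.text) for a in feat.iter("annot")]

print(plain_text)
print(annotations)
```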
40Representation of relations
41XML links
- Transparency as to the actual location of a resource (internal vs. external)
- May be useful to identify ontologies
- External links between concepts
[Figure: links between entry i and entry j]
42Representation in GMT
- Two attributes
- target: a pointer to a struct element, used when the feature expresses a relation between the current locus and another locus in the structural skeleton
- source: a pointer to a struct element, used when the feature is described external to the locus to which it is supposed to be attached.
43Some examples
- Simple atomic feature attached directly to a locus
- <feat type="conceptIdentifier">ID67</feat>
- Basic feature whose value is a reference to a locus in the structural skeleton
- <feat type="partWhole" target="TE24"/>
- Basic feature anchored at the locus in the structural skeleton whose id attribute value is TE24
- <feat type="conceptIdentifier" source="TE24">ID67</feat>
- Compound feature anchored at TE23 and which makes reference to TE24
- <feat type="partWhole" source="TE23" target="TE24"/>
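Resolving the two pointers can be sketched as a lookup by id. In the Python example below, the id attribute on struct, the surrounding TDC wrapper and the helper name resolve are assumptions made for illustration; only the source and target attributes on feat come from the slides.

```python
import xml.etree.ElementTree as ET

GMT = """<struct type="TDC">
  <feat type="partWhole" source="TE23" target="TE24"/>
  <struct type="TE" id="TE23"/>
  <struct type="TE" id="TE24"/>
</struct>"""

root = ET.fromstring(GMT)

# Index every struct that carries an id (assumed attribute) for pointer lookup.
loci = {s.get("id"): s for s in root.iter("struct") if s.get("id")}

def resolve(feat):
    """Return the (anchor, target) struct elements named by source and target.

    Either item is None when the attribute is absent; a feat without
    source is simply attached to the struct that contains it.
    """
    return loci.get(feat.get("source")), loci.get(feat.get("target"))

anchor, target = resolve(root.find("feat"))
print(anchor.get("id"), "->", target.get("id"))
```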
44Styles and vocabularies
46Implementing a DatCat
- Definitions
- style: the way a given DatCat is implemented as an XML object
- vocabulary: the symbols needed to express the implementation of a given DatCat in its associated style
- E.g.
- DatCat: /definition/
- Vocabulary: def
- Style: Element
- <def>pencil whose casing </def>
DatCat value
47Implementing a DatCat (Cont.)
- Definition
- anchor: the XML element(s) to which the implementation of a given DatCat can be attached
- E.g.
- <tig>
- <term>alpha smoothing factor</term>
- </tig>
48Styles - element
- Element
- Def.: the DatCat is implemented as an element, child of its anchor
- Vocabulary: the name of the corresponding element
- E.g.
- <def>pencil whose casing </def>
- <term>alpha smoothing factor</term>
DatCat value
49Styles - typedElement
- typedElement
- Def.: the DatCat is implemented as a generic XML element, which is a child of the anchor, and which is further specified by means of a type attribute. Its content is the value of the feature in the structural skeleton.
- Vocabulary: the element name and the value of the type attribute
- E.g.
- <termNote type='definition'>Bla, bla, bla</termNote>
DatCat value
50Styles - attribute
- Attribute
- Def.: the DatCat is implemented as an attribute of its anchor
- Vocabulary: the name of the corresponding attribute
- E.g.
- <termEntry id='ID67'> </termEntry>
- <ldl language='en'> </ldl>
DatCat value
51- ValuedElement
- TypedValuedElement
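The three styles described in detail (element, typedElement, attribute) can be sketched as one small dispatch in Python. The function name implement and its signature are hypothetical; the vocabulary is passed as the element name, as an (element name, type value) pair, or as the attribute name, following the definitions given for each style. ValuedElement and TypedValuedElement are not covered.

```python
import xml.etree.ElementTree as ET

def implement(anchor, style, vocabulary, value):
    """Attach one DatCat value to its anchor in the given style (sketch)."""
    if style == "element":
        # vocabulary: the element name
        ET.SubElement(anchor, vocabulary).text = value
    elif style == "typedElement":
        # vocabulary: (generic element name, value of its type attribute)
        name, type_value = vocabulary
        ET.SubElement(anchor, name, {"type": type_value}).text = value
    elif style == "attribute":
        # vocabulary: the attribute name
        anchor.set(vocabulary, value)
    else:
        raise ValueError(f"unsupported style: {style}")
    return anchor

entry = ET.Element("termEntry")
implement(entry, "attribute", "id", "ID67")
implement(entry, "typedElement", ("descrip", "subjectField"), "manufacturing")
tig = ET.SubElement(entry, "tig")
implement(tig, "element", "term", "alpha smoothing factor")

xml_text = ET.tostring(entry, encoding="unicode")
print(xml_text)
```

Keeping the style and vocabulary as data, separate from the value, mirrors the idea that two TMLs can carry the same data categories while differing only in how they are written out.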