ISO TC37SC4 N433 - PowerPoint PPT Presentation


Title: ISO TC37SC4 N433


1
ISO TC37/SC4 N433, Busan 2007
ONTOLOGIES AND TAXONOMIES
Bodil Nistrup Madsen
Department of International Language Studies and Knowledge Technology, www.cbs.isv.dk
DANTERMcentret, www.danterm.dk
Copenhagen Business School
2
Overview
  • Terminological ontologies
  • Concept clarification: ontology, taxonomy, data
    model etc.
  • An ontology of ontologies
  • A taxonomy for lexical data
  • Terminological concept modelling vs. conceptual
    data modelling
  • Modelling partial equivalence between concepts

4
Terminological Ontologies
  • Principles
    • feature specifications (modelling
      characteristics of concepts)
    • dimensions / subdividing dimensions (modelling
      subdivision criteria)
    • dimension specifications
    • constraints
  • Tools
    • i-Term and i-Model, DANTERMcentret
    • CAOS 2 prototype (Computer-Aided Ontology
      Structuring), Group of Computational Linguistics

5
Terminological principles presented by means of
examples from i-Term and i-Model, Terminology and
Knowledge Management System, DANTERMcentret,
www.i-term.dk
6
Extract of an ontology for prevention in i-Model (figure)
Elements pointed out in the figure: subdivision criteria,
feature specification (attribute-value pair), polyhierarchy,
inheritance
7
In terminological concept modelling only relevant
subconcepts are registered. This means that not
all possible combinations of concepts from two
or more groups (dimensions) will be registered,
e.g. a concept universal secondary prevention is
not relevant.
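The point about registering only relevant subconcepts can be sketched in a few lines of Python. This is only an illustration under assumptions: the dimension name PHASE, the values "indicated", "primary" and "tertiary", and the two registered concepts are hypothetical and are not taken from the i-Model ontology.

```python
from itertools import product

# TARGET GROUP is named in the slides; PHASE and most values are
# hypothetical and only serve the illustration.
DIMENSIONS = {
    "TARGET GROUP": ["universal", "selective", "indicated"],
    "PHASE": ["primary", "secondary", "tertiary"],
}


def all_combinations(dimensions):
    """Every theoretically possible combination, one value per dimension."""
    names = list(dimensions)
    return [dict(zip(names, values)) for values in product(*dimensions.values())]


# Hypothetical selection: only combinations judged relevant are registered.
registered = [
    {"TARGET GROUP": "universal", "PHASE": "primary"},
    {"TARGET GROUP": "selective", "PHASE": "secondary"},
]


def unregistered(dimensions, registered_concepts):
    """Combinations that are possible but were not registered as concepts."""
    return [c for c in all_combinations(dimensions) if c not in registered_concepts]


possible = all_combinations(DIMENSIONS)
skipped = unregistered(DIMENSIONS, registered)
```

With these assumed values, the combination of "universal" and "secondary" is among the skipped ones, mirroring the example of universal secondary prevention.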
8
(No Transcript)
9
Concept-oriented!
10
Associative concept relations
  • OntoQuery: Ontology-based Querying
  • www.OntoQuery.dk
  • Madsen, Bodil Nistrup, Bolette Sandford
    Pedersen and Hanne Erdman Thomsen: The Role of
    Semantic Relations in a Content-based Querying
    System: a Research Presentation from the
    OntoQuery Project. In: Simov, Kiril and Atanas
    Kiryakov (eds.): Proceedings from OntoLex 2000,
    Workshop on Ontologies and Lexical Knowledge
    Bases, Sept. 8-10 2000, Sozopol, Bulgaria.

11
Madsen, Pedersen and Thomsen, 2001: The relations
location, activity and role (figure)
Relation types shown in the concept system: semantic
relation; location relation, dynamic location relation,
static location relation, source relation, target
relation; activity relation; role relation, agent
relation, patient relation, instrument relation, result
relation.
Relation types with examples:
  • source-target relation: ship, quay
  • activity-source relation: disembarkation, ship
  • activity-target relation: landing, airport
  • activity static location relation: swim, water
  • entity static location relation: oasis, desert
  • activity-agent relation: heal, doctor
  • activity-patient relation: teach, student
  • activity-instrument relation: paint, brush
  • activity-result relation: coffee making, coffee
  • agent-patient relation: drawer, drawee
  • agent-instrument relation: surgeon, scalpel
  • agent-result relation: drawer, draft
  • patient-instrument relation: wood, plane
  • patient-result relation: student, graduate
  • instrument-result relation: coffee machine, coffee
12
CAOS Version
13
CAOS: Computer-Aided Ontology Structuring
  • Bodil Nistrup Madsen
  • Hanne Erdman Thomsen
  • Carl Vikner
  • Bo Krantz Simonsen
  • Jacob M. Christensen
  • Group of Computational Linguistics
  • ISV

14
CAOS implements more restrictive terminological
principles than i-Model. CAOS helps the user in
setting up consistent ontologies adhering to the
terminological principles. CAOS is based on the
UML notation, but extensions are needed.
15
The backbone of this concept modelling is
constituted by characteristics modelled by formal
feature specifications, i.e. attribute-value
pairs.
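A minimal sketch of feature specifications as attribute-value pairs, including inheritance from superordinate concepts, might look as follows. The class and the attribute values are hypothetical and do not reproduce the CAOS implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    parents: list = field(default_factory=list)
    # Feature specifications introduced on this concept: attribute -> value.
    features: dict = field(default_factory=dict)

    def all_features(self):
        """Own feature specifications plus those inherited from superordinates."""
        merged = {}
        for parent in self.parents:
            merged.update(parent.all_features())
        merged.update(self.features)
        return merged


# Hypothetical example values.
prevention = Concept("prevention", features={"PURPOSE": "avoid disease"})
universal = Concept(
    "universal prevention", [prevention], {"TARGET GROUP": "whole population"}
)
inherited = universal.all_features()
```

Here the subordinate concept carries its own attribute-value pair and inherits the one specified on its superordinate concept.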
16
(No Transcript)
17
Three subordinate concepts automatically
generated on the basis of the dimension
specification. No terms yet!
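The automatic generation of subordinate concepts from a dimension specification could be sketched like this. The function name, the dictionary layout and the value "indicated" are assumptions made for illustration; the actual CAOS behaviour may differ.

```python
def generate_subconcepts(superordinate, dimension, values):
    """Create one subordinate concept per dimension value; terms are added later."""
    return [
        {"term": None, "superordinate": superordinate, "features": {dimension: value}}
        for value in values
    ]


# Hypothetical dimension specification with three values.
subconcepts = generate_subconcepts(
    "prevention", "TARGET GROUP", ["universal", "selective", "indicated"]
)
```

Each generated concept is identified only by its feature specification until the terminologist assigns a term.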
18
  • Attempt at creating an illegal polyhierarchy: a
    concept universal selective prevention with two
    superordinate concepts within the same group
    (dimension TARGET GROUP).
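A constraint of this kind can be checked mechanically. The sketch below is an assumption about how such a check might be written, not the CAOS code; the dimension name PHASE is hypothetical.

```python
def illegal_polyhierarchy(parents, dimension_of):
    """Return the dimensions that contribute more than one superordinate concept."""
    seen = {}
    for parent in parents:
        seen.setdefault(dimension_of[parent], []).append(parent)
    return {dim: ps for dim, ps in seen.items() if len(ps) > 1}


# Which dimension each candidate superordinate concept belongs to.
dimension_of = {
    "universal prevention": "TARGET GROUP",
    "selective prevention": "TARGET GROUP",
    "secondary prevention": "PHASE",  # hypothetical dimension name
}

conflict = illegal_polyhierarchy(
    ["universal prevention", "selective prevention"], dimension_of
)
legal = illegal_polyhierarchy(
    ["selective prevention", "secondary prevention"], dimension_of
)
```

The first call reports a conflict within TARGET GROUP; the second returns an empty result because the two superordinate concepts come from different dimensions.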

19
  • The CAOS prototype is based on UML notation, but
    extensions are needed for terminological concept
    modelling
  • Limitations of UML class diagrams:
    • not possible to represent several dimensions,
      from which one may be chosen as the subdividing
      dimension
    • no notation for the specification of dimension
      values, at least not in the way it is done in
      CAOS
    • no notation for feature specifications (one
      facility of UML comes close to feature
      specifications as used in CAOS: in
      specializations it is possible to introduce
      attributes with initial values).

20
Overview
  • Terminological ontologies
  • Concept clarification: ontology, taxonomy, data
    model etc.
  • An ontology of ontologies
  • A taxonomy for lexical data
  • Terminological concept modelling vs. conceptual
    data modelling
  • Modelling partial equivalence between concepts

21
Clarification needed!
classification, taxonomy, thesaurus, wordnet,
ontology, concept system
22
class, category, keyword, (syn)set, type, concept, term
23
eCat Terminology task
CEN CWA 15045: Multilingual Catalogue Strategies for
eCommerce and eBusiness
Bodil Nistrup Madsen, DANTERMcentret, the Danish
Terminology Centre, www.danterm.dk
Håvard Hjulstad, Standards Norway,
havard_at_hjulstad.com or hhj_at_standard.no
24
Based on eCat Terminology task, CEN CWA 15045:
Multilingual Catalogue Strategies for eCommerce and
eBusiness
Concept diagram (figure) with the concepts: knowledge
representation, classification system, subject
classification system, taxonomy, model, data model,
ontology
25
Translations of definitions from the OIO Concept
database
28
CEN CWA 15045
ontology
NOTE: An ontology may comprise all kinds of relations
between concepts, e.g. generic, partitive and temporal
relations.
Synonyms: concept model, concept system
A data model should always be based on an ontology, but
sometimes a data model, represented by means of an ER or
a UML diagram, is referred to as an ontology. Our
recommendation is to use the term ontology only as
defined here. Please observe that the term conceptual
model refers to a kind of data model.
An ontology may typically be used for the precise
description of concepts.
29
CEN CWA 15045
taxonomy
NOTE: A taxonomy is a kind of classification system
that comprises exclusively generic relations between
the categories, in contrast to an ontology, which is a
kind of model that may comprise all kinds of relations
between concepts.
A taxonomy may typically be used for defining the types
of data categories used within a specific field, e.g.
within the field of product description.
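The distinction in the NOTE can be expressed as a simple check on relation types. This is an illustrative sketch only; the triple layout and the partitive example (wheel, car) are assumptions, while the two generic relations restate what the slides say about taxonomy and ontology.

```python
def is_taxonomy(relations):
    """True if every relation in the system is a generic (type-of) relation."""
    return all(rel_type == "generic" for _, rel_type, _ in relations)


# (subordinate, relation type, superordinate)
generic_only = [
    ("taxonomy", "generic", "classification system"),
    ("ontology", "generic", "model"),
]
# Adding a partitive relation (hypothetical example) makes it more than a taxonomy.
mixed = generic_only + [("wheel", "partitive", "car")]

generic_result = is_taxonomy(generic_only)
mixed_result = is_taxonomy(mixed)
```

A system that passes the check fits the description of a taxonomy; one that also contains partitive or temporal relations would be an ontology in the sense used here.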
32
CEN CWA 15045
conceptual data model: data model that represents an
abstract view of the real world
(ISO/IEC 11179-3:2003(E), 3.2.8)
information model: data model that represents the
organization of information in a manner that reflects
the structure of an information system
(amended from ISO/IEC FCD 11179-3:2003(E), 3.2.13)
33
  • concept in an ontology: concept definition
  • data category in a taxonomy: data category definition
  • class in a classification system: class description
  • class in a data model: class description
Concept definitions form the basis for the
definitions / descriptions of data categories,
classes in classification systems and classes in
data models.
34
Overview
  • Terminological ontologies
  • Concept clarification: ontology, taxonomy, data
    model etc.
  • An ontology of ontologies
  • A taxonomy for lexical data
  • Terminological concept modelling vs. conceptual
    data modelling
  • Modelling partial equivalence between concepts

35
Ontology of ontologies (figure): the concept ontology
subdivided according to several criteria.
Subdivision criteria: METHOD, POINT OF VIEW, LEVEL,
SUBJECT, PURPOSE, LANGUAGE, FORMALIZATION
Concepts shown: terminological ontology, domain specific
ontology, philosophical ontology, pragmatic ontology,
general ontology, task specific ontology, task
independent ontology, formal ontology, not formal
ontology, universal ontology, language independent
ontology, application specific ontology, top level
ontology, specific ontology
With input from Guarino, Nicola (1998): Formal
Ontology and Information Systems.
Bodil Nistrup Madsen: Alting på sin plads og
plads til alting. Om at ordne og udnytte viden om
verden. In: Anita Nuopponen, Bertha Toft, Johan
Myking (eds.): I Terminologins tjänst. Festskrift
för Heribert Picht på 60-årsdagen. Proceedings of
the University of Vaasa, Reports, Vaasa 2000,
pp. 71-91.
36
(No Transcript)
37
(No Transcript)
38
Hanne Erdman Thomsen, on the basis of Gómez-Pérez
et al. (2004): Ontological Engineering
39
TC 37 Terminology and other language and content
resources
Ontology Task Force
Provo, August 2007
40
Members of TC37 Ontology Task Force team
  • SC 1: Donald Chapin, Hanne Erdman Thomsen,
    Hendrick Kockaert
  • SC 2: Gerhard Budin
  • SC 3: Bodil Nistrup Madsen (convenor),
    Klaus-Dirk Schmitz, Sue Ellen Wright
41
  • SC 4: Koiti Hasida, Jae Sung Lee, Key-Sun Choi,
    Nicoletta Calzolari
  • ISO/IEC JTC 1 SC32: Bruce Bargmeyer
  • TC37 Secretariat: Christian Galinski
42
Organization of work
  • 1) Concept clarification
    • Different types of knowledge representation
      resources
    • What is the difference between ontology,
      taxonomy, thesaurus etc.?
    • Result: systematic overview (concept system with
      definitions)
  • 2) Overview of ontologies and projects 'outside'
    TC37 (examples!)

43
Organization of work
  • 3) Overview of related ongoing projects, existing
    standards and proposals for future projects
    within TC 37
  • 4) Proposal for a strategy for TC 37, including
    future co-ordination by the Ontology Task Force
44
ISO/TC 37/SC 3 N542
New title (2005-08-25): Systems to manage
terminology, knowledge and content
New scope (2005-08-25): Standardization of
specifications and modelling principles for
systems to manage terminology, knowledge and
content with respect to semantic interoperability
Examples of future projects
45
  • Principles for building taxonomies for metadata
  • Examples
  • A taxonomy for lexical and terminological data.
  • A taxonomy for any other kind of data
    collection.
  • TC 37 should be a pioneer within this field!
  • Such taxonomies, which describe the contents of
    data collections, also comprise systematic
    definitions and examples which will make it
    easier to classify data elements.
  • Motivation
  • It is extremely important to be able to describe
    elements of data collections systematically in
    order to build databases / IT systems for
    storage, management and exchange of data. Many
    metadata vocabularies are not built on the basis
    of the principles of taxonomies, which means that
    they may be incomplete, inconsistent and
    difficult to use.

46
  • Principles for the use of concept models
    (ontologies) in developing data models
  • Examples
  • Concept model for central concepts that form the
    basis of the development of a data model for a
    terminology database.
  • Concept model for any other kind of database /
    IT system (e.g. Electronic Health Care Systems)
    and the corresponding data model.
  • This project is not the same as the NWI in TC 37
    SC 1 on Guidelines for applying concept modelling
    in terminology work (N 273).
  • Motivation
  • Many data models are still developed without
    being based on a concept model. The two concepts
    concept model (ontology) and conceptual data
    model are very often mixed up. Ontology is often
    used for conceptual data model.

47
Principles for the development and use of meta
models
This project should comprise guidelines based on the
experience gained from the development of the meta
model in ISO 16642. It is related to the previous
proposal.
Motivation: It is a non-trivial job to develop a meta
model.
  • How detailed should / could a meta model be?
  • How to build specific data models on the basis
    of a meta model?
48
Overview
  • Terminological ontologies
  • Concept clarification: ontology, taxonomy, data
    model etc.
  • An ontology of ontologies
  • A taxonomy for lexical data
  • Terminological concept modelling vs. conceptual
    data modelling
  • Modelling partial equivalence between concepts

49
Proposal for a Taxonomy of Lexical Metadata
Categories for ISO TC 37 Terminology and Other
Language Applications
50
ISO TC 37 published a standard in 1999 specifying
data categories used in terminological resources:
ISO 12620:1999, Computer assisted terminology
management - Data Categories. In 2003, TC 37/SC
3 initiated a revision of the existing document
with the intention of creating a family of data
category standards designed to meet the needs of
terminologists and other language experts
developing a variety of electronic linguistic
resources. The intention was to include data
categories for a variety of applications,
including for example terminological and
lexicographical data collections as well as
machine translation lexica, cf. SC 3 (Systems to
manage terminology, knowledge and content), SC 2
(Terminographical and lexicographical working
methods) and SC 4 (Language resource management).
51
At the same time it was suggested to set up a
Data Category Registry (DCR) for all the above
mentioned kinds of lexical data, cf. also Ide
and Romary (2004). The DCR is intended to be
compliant with ISO 11179-3, Information
technology - Metadata registries (MDR) - Part 3:
Registry metamodel and basic attributes.
52
The data categories of ISO 12620:1999 were
classified in three major groups:
  • Term and term-related data categories
  • Descriptive data categories
  • Administrative data categories
The groups were further subdivided into ten
sub-groups:
  • A.1 term
  • A.2 term-related information
  • A.3 equivalence
  • A.4 subject field
  • A.5 concept-related description
  • A.6 concept relation
  • A.7 conceptual structures
  • A.8 note
  • A.9 documentary language
  • A.10 administrative information
53
This structure is not homogeneous, i.e. it
reflects various subdividing criteria
(dimensions), and it does not give a very clear
overview of the data categories. One dimension is
for example term-related information vs.
concept-related description. Here it is not clear
why e.g. subject field and concept relation do
not fall within the group concept-related
description. An example of term-related
information is A.2.1.18.1 collocation, while an
example of concept-related information is A.5.3
context (a text or part of a text in which a term
occurs). Types of contexts can, among others,
include defining context (a context that
contains substantial information about a concept,
but that does not possess the formal rigor of a
definition) and linguistic context (context that
illustrates the function of a term in discourse,
but that provides no conceptual information).
54
It seems as if the structure of ISO 12620:1999 is
to some extent based on the structure typically
found in a terminological entry. Since the above
mentioned DCR of TC 37 will also include data
categories of dictionaries, this structure is not
very appropriate. Consequently it was decided
to give up a classification of the categories.
It is however difficult to ensure completeness,
consistency, user-friendliness and extensibility
of the above mentioned DCR, if there is no
structure of the data categories.
55
The structure of the DCR
As already mentioned, the DCR will contain data
categories that are relevant in various areas, such
as terminology, lexicography and machine translation.
These areas are referred to as thematic domains. In
August 2007 there was an introductory meeting for the
TC 37 Data Category Registry, in which all
Sub-Committees and Working Groups that have any
activities involving data categories were
requested to nominate experts to serve on the
Thematic Domain Groups (TDGs). The idea was that
each TDG should be charged with the specification
of domain-specific data categories for a specific
data processing environment within TC 37.
56
Figure 1 (from Wright 2004) illustrates clearly
that the various subsets of the DCR, i.e. the
thematic domains, will overlap. For example, the
data categories part of speech and grammatical
gender will be relevant in all three different
thematic domains.
57
Collections of Lexical data - description of
data categories and data structure - Part 1:
Taxonomy for the classification of information
types (STANLEX)
Danish Standard DS 2394-1 (1998)
58
STANLEX taxonomy
Main categories, based on linguistic disciplines:
  • etymological information
  • grammatical information
  • graphical information
  • phonetic information
  • semantic information
  • usage
  • In addition to these categories there are some
    categories for administrative information and
    structural information.
  • This taxonomy was developed by a group of
    terminologists, lexicographers and people working
    with machine translation and other kinds of
    natural language processing.

59
Examples
60
(No Transcript)
61
All main groups, categories and subcategories are
defined and exemplified. The structure of this
taxonomy gives a much clearer overview of the
data categories than the original structure of
ISO 12620:1999, and it is clearly better than a
plain alphabetical list. The use of a taxonomy
makes it much easier to check whether the DCR of
ISO TC 37 comprises all relevant data categories
within a certain group. In the case of proposals
for new data categories it is also much easier to
check whether the category is already in the DCR,
maybe under another category name.
62
Proposal for a taxonomy for lexical metadata
categories
Given the above mentioned advantages of using a
taxonomy for the classification of metadata
categories, it is suggested that the principles of
the taxonomy of DS 2394-1:1998 are used for
structuring the data categories in the DCR for
lexical data in ISO TC 37. There will no doubt be a
need for more categories and subcategories than those
found in DS 2394-1:1998, but it will be easy to fit
new categories into the structure, as long as they
are mutually independent. There may also be a
need for adjustments of the structure, since
there are different ways of classifying
lexical data.
63
  • However, DS 2394-1:1998 is a good starting point,
    and using the principles of this taxonomy will
    ensure
  • completeness
  • consistency
  • user-friendliness
  • extensibility

64
Overview
  • Terminological ontologies
  • Concept clarification: ontology, taxonomy, data
    model etc.
  • An ontology of ontologies
  • A taxonomy for lexical data
  • Terminological concept modelling vs. conceptual
    data modelling
  • Modelling partial equivalence between concepts

65
  • Ontologies (concept models) and conceptual data
    models have different aims
  • ontologies aim at concept clarification and
    mutual understanding of concepts and consistent
    use of terms
  • conceptual data models aim at specifying the
    information types of an IT system and their
    mutual relationships
  • In order to produce a well-functioning database
    it is necessary to know the concept model for the
    domain underlying the database structure, i.e.
    you have to be familiar with the central concepts
    of the domain in which the database is going to
    function.

66
NB! The attributes of the data model give no
information about the meaning of the classes, but
only a specification of what kind of information
will be given about the entities represented by
the classes in question.
67
Terminological concept modelling vs. conceptual
data modelling
  • ontologies and data models do have something in
    common
  • but
  • there is no one-to-one correspondence between an
    ontology and the data model of the database
  • There is no one-to-one mapping between concepts
    and characteristics in the ontology and classes
    and attributes in the data model.
  • Some concepts correspond to attributes or values
    in the data model - some concepts may not
    correspond to classes, attributes or values.

68
Draft concept system NORDTERM Terminology of
terminology in i-Model (here translated into
English)
69
More elaborate Danish version
70
Conceptual data modelling for DANTERM / CAOS
databases represented in UML
71
UML class diagram (figure) with callouts explaining the
notation: class, attributes, association, multiplicity
Multiplicities: 0..* zero, one or more; 1..* one or more
Association labels: is expressed by, belongs to, is
related to
72
Information about primary key (pk), foreign keys
(fk) and data types (String) may be added to the
attributes.
73
Extra class between classes in a many-to-many
relationship.
74
Reflexive association: one concept in one
position in a concept system is related to one or
several concepts in the same concept system.
75
A concept system for concepts may comprise
concepts such as superordinate concept and
subordinate concept (both subordinate concepts
to concept). These concepts are not found in
the data model.
76
Conceptual data model: there are no classes
corresponding to superordinate concept and subordinate
concept in the conceptual data model. Rather, they are
represented by means of the attributes C-ID1
and C-ID2 on the class concSystRel; the
corresponding table concSystRel relates two
concepts to each other, together with a
specification of which relation type (attribute
R-ID) holds between them.
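A relation table of this kind could be sketched with SQLite as follows. Only the names concSystRel, C-ID1, C-ID2 and R-ID come from the slide; the other tables and columns, the sample rows, and the convention that C-ID1 holds the superordinate concept are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (c_id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE relType (r_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- One row per relation between two concepts in the concept system.
CREATE TABLE concSystRel (
    c_id1 INTEGER NOT NULL REFERENCES concept(c_id),
    c_id2 INTEGER NOT NULL REFERENCES concept(c_id),
    r_id  INTEGER NOT NULL REFERENCES relType(r_id),
    PRIMARY KEY (c_id1, c_id2, r_id)
);
""")
conn.executemany(
    "INSERT INTO concept VALUES (?, ?)",
    [(1, "prevention"), (2, "universal prevention")],
)
conn.execute("INSERT INTO relType VALUES (1, 'generic')")
# Assumed convention: c_id1 is the superordinate, c_id2 the subordinate concept.
conn.execute("INSERT INTO concSystRel VALUES (1, 2, 1)")


def subordinates(connection, label):
    """Labels of the concepts recorded as subordinate to the given concept."""
    rows = connection.execute(
        """SELECT sub.label FROM concSystRel r
           JOIN concept sup ON sup.c_id = r.c_id1
           JOIN concept sub ON sub.c_id = r.c_id2
           WHERE sup.label = ?""",
        (label,),
    ).fetchall()
    return [row[0] for row in rows]


children = subordinates(conn, "prevention")
leaf_children = subordinates(conn, "universal prevention")
```

Note that "superordinate" and "subordinate" never appear as tables or classes: they are roles that a concept plays in a row of concSystRel.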
77
Draft concept system NORDTERM Terminology of
terminology in i-Model (here translated into
English)
Another example: concepts such as intension and
extension, which are very important in a concept
system for the understanding of central concepts
like concept and characteristic, will not be
found in a UML diagram for a terminology
database.
78
Models as the basis for development of IT systems
Conceptual data model
Logical data model
Physical data model
Ontology
Recommendation: Always develop an ontology before
developing a conceptual data model!
79
Overview
  • Terminological ontologies
  • Concept clarification: ontology, taxonomy, data
    model etc.
  • An ontology of ontologies
  • A taxonomy for lexical data
  • Terminological concept modelling vs. conceptual
    data modelling
  • Modelling partial equivalence between concepts

80
Modelling partial equivalence between concepts
In multilingual terminology work, concept systems
will be established for all languages in a
project. In some cases one concept in one
language may be partially equivalent to several
concepts in one or several other languages. This
also means that the concept systems of the
individual languages differ. Therefore it should
be possible in a terminology database to
establish an equivalence relationship between one
concept in one language and two or three concepts
in another language.
81
Diagram: Concept 1 English, Concept 2 English,
Concept 1 German and Concept 2 German, connected by
three partial equivalence relations.
This could for example be the case in terminology
within the field of education. One level in the
English education system may correspond to two
levels in the German education system and vice
versa.
82
Diagram: Entry 1, Entry 2 and Entry 3 link the
concepts Concept 1 English, Concept 2 English,
Concept 1 German and Concept 2 German.
This means that each concept in a terminology
database may be linked to several terminological
entries.
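A minimal sketch of this many-to-many link between concepts and entries is given below. The class names and fields are hypothetical; the pairing of concepts in the three entries follows the pairs named on the slide about the simpler model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    label: str
    language: str


@dataclass(frozen=True)
class Entry:
    entry_id: int
    concepts: tuple  # the concepts linked by this terminological entry
    degree_of_equivalence: str = "partial"


c1_en = Concept("Concept 1", "en")
c2_en = Concept("Concept 2", "en")
c1_de = Concept("Concept 1", "de")
c2_de = Concept("Concept 2", "de")

entries = [
    Entry(1, (c1_en, c1_de)),
    Entry(2, (c1_en, c2_de)),
    Entry(3, (c1_de, c2_en)),
]


def entries_for(concept, all_entries):
    """IDs of all entries a concept is linked to (possibly more than one)."""
    return [e.entry_id for e in all_entries if concept in e.concepts]


en_links = entries_for(c1_en, entries)
de_links = entries_for(c1_de, entries)
```

Because each concept object exists only once and is referenced from several entries, no duplication of Concept 1 English or Concept 1 German is needed.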
83
In many cases the Terminology Management System
will build on a simpler model: Concept 1 in
English and Concept 1 in German are duplicated in
the database, since one concept cannot be linked
to several entries.
  • Entry 1: Concept 1 English Version 1, partial
    equivalence, Concept 1 German Version 1
  • Entry 2: Concept 1 English Version 2, partial
    equivalence, Concept 2 German
  • Entry 3: Concept 1 German Version 2, partial
    equivalence, Concept 2 English
84
UML class diagram (figure)
SC 3 Design, implementation and use of
terminology management systems
Classes shown: terminological entry, entry_concept,
concept, term, definition, context, grammar, note,
subject field, degree of equivalence, originating
person, origination date, source identifier
Multiplicities shown on the associations: 0..*, 1..*,
0..1 and 1
85
The system of courts in Italy: 22 concepts
86
The Danish system: 13 concepts
87

Generic concepts within Health Care in Denmark
(top core ontology translated into English).
Problems in translating e.g. activity, conduct,
action.
88
Conclusions
  • Domain specific ontologies should be based on the
    principles of terminological ontologies.
  • Concept clarification is needed with respect to
    ontology, taxonomy, data model etc.
  • An ontology of ontologies would help in this
    concept clarification
  • A taxonomy for lexical data for TC 37 will ensure
    completeness, consistency, user-friendliness and
    extensibility
  • Terminological ontologies and conceptual data
    models differ
  • Introducing multilingualism in ontologies
    requires modelling of partial equivalence between
    concepts

89
Handelshøjskolen i København
Thank you for your attention!
Bodil Nistrup Madsen
Department for International Language Studies and
Knowledge Technology, www.cbs.isv.dk
DANTERMcentret, www.danterm.dk
Copenhagen Business School