Ontology Merging - PowerPoint PPT Presentation

Slides: 41
Provided by: csd6

Transcript and Presenter's Notes

Title: Ontology Merging


1
Ontology Merging
  • Kyriakos Kritikos (??)
  • Miltos Stratakis (MET)

2
Representation Matching
  • Problem of creating semantic mappings between two
    data representations
  • Mapping examples
  • Element location of one representation maps to
    element address of the other
  • Contact-phone maps to agent-phone
  • Listed-price maps to price * (1 + tax-rate)
  • Fundamental step in numerous data management
    applications
  • However, creating semantic mappings by hand has
    become labor-intensive as these applications have
    proliferated
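
The mapping examples above can be sketched as plain functions from a source record to target elements. This is only an illustrative sketch: the record, its values, the tax rate, and the reading of the price mapping as listed-price = price * (1 + tax-rate) are assumptions, not taken from any particular system.

```python
# Hypothetical source record; field names and values are illustrative only.
source = {"location": "Heraklion", "contact-phone": "555-0100", "price": 100.0}

TAX_RATE = 0.25  # assumed value, for illustration

# Each semantic mapping: target element -> function of the source record.
mappings = {
    "address": lambda r: r["location"],                     # 1-1 renaming
    "agent-phone": lambda r: r["contact-phone"],            # 1-1 renaming
    "listed-price": lambda r: r["price"] * (1 + TAX_RATE),  # computed mapping
}

def translate(record, mappings):
    """Apply every mapping to produce a record in the target representation."""
    return {target: fn(record) for target, fn in mappings.items()}

target = translate(source, mappings)
```

The first two mappings are simple renamings, while the third needs a computation, which is part of what makes mappings hard to discover automatically.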

3
Applications of Representation Matching (I)
  • Schema integration (early 1980s)
  • Need to merge a set of given schemas into a
    single global schema
  • Data warehousing - Data mining (early 1990s)
  • Need to translate data between multiple databases
  • Data coming from multiple sources must be
    transformed to data conforming to a single target
    schema
  • Knowledge Base construction (late 1980s, all
    1990s)
  • Used in AI
  • KBs store complex types of entities and
    relationships, using extended database schemas
    (ontologies)
  • Requirement of semantic mapping between the
    involved ontologies (ontology matching problem)

4
Applications of Representation Matching (II)
  • Data integration systems (recent years)
  • Provide a uniform query interface to a large
    number of data sources, by enabling users to pose
    queries against a mediated schema
  • Need to use a set of semantic mappings between
    the mediated schema and the local schemas of the
    data sources
  • Peer data management systems (recent years)
  • Allow peers to query and retrieve data directly
    from each other
  • Need to create semantic mappings among the
    peers

5
Using Ontologies as Representations
  • Ontology: an explicit specification of a
    conceptualization
  • Can be used
  • In an integration task to
  • Describe the semantics of the information sources
  • Make the content explicit
  • For the identification and association of
    semantically corresponding information concepts

6
Content Explication
  • The way the ontologies are employed for content
    explication can be different
  • We can identify three different directions
  • Single ontology approaches
  • Multiple ontology approaches
  • Hybrid ontology approaches

7
Single Ontology Approaches
  • Use one global ontology providing a shared
    vocabulary for the specification of the semantics
  • Can be applied to integration problems where all
    information sources to be integrated provide
    nearly the same view on a domain
  • Not effective if one information source has a
    different view on a domain

8
Multiple Ontology Approaches
  • Each information source is described by its own
    ontology
  • Each source ontology can be developed without
    respect to other sources or their ontologies
  • Can simplify the integration task
  • Supports the change of sources
  • Not effective in comparing different source
    ontologies, due to the lack of a common
    vocabulary

9
Hybrid Ontology Approaches
  • Semantics of each source is described by its own
    ontology, but these ontologies are built from a
    global shared vocabulary to make them comparable
  • The shared vocabulary contains basic terms of a
    domain which are combined in the local ontologies
    in order to describe more complex semantics
  • New sources can easily be added without the need
    of modification
  • However, existing ontologies cannot easily be reused

10
The need for Ontology Matching (Integration)
  • Semantic Web evolution
  • Requirement for formal descriptions of parts of
    our human environment (i.e. descriptions of parts
    of the real world)
  • These descriptions, in various degrees of
    formalness and specificity, are the ontologies
  • To form a real web of semantics, ontologies from
    different sources should be linked and related to
    each other
  • Problem: the reuse of existing ontologies is
    often not possible without considerable effort
  • Ontologies need to
  • Be integrated (i.e. merged into a new ontology)
  • Be aligned (i.e. they have to be brought into
    mutual agreement)

11
Ontology Integration Process
  • Consists of three steps
  • Find the places in the ontologies where they
    overlap
  • Relate concepts that are semantically close via
    equivalence and subsumption relations (aligning)
  • Check the consistency, coherency and
    non-redundancy of the result

12
Technical Problems with Ontology Combination
  • The technical problems that underlie the
    difficulties in ontology merging and aligning
    are
  • The mismatches that may exist between separate
    ontologies (Mismatches between Ontologies)
  • The synchronization of the changes made to an
    ontology with the revisions to the applications
    and data sources that use them (Ontology
    Versioning)

13
Mismatches between Ontologies
  • Key type of problems that hinder the combined use
    of independently developed ontologies
  • We distinguish two levels at which these
    mismatches may appear
  • Language or meta-model level
  • Level of the language primitives that are used to
    specify an ontology
  • Mismatches at this level are between the
    mechanisms to define classes, relations etc.
  • Ontology or model level
  • Level of the actual ontology of a domain
  • A mismatch at this level is a difference in the
    way the domain is modelled

14
Language level Mismatches
  • Occur in combinations of ontologies written in
    different ontology languages
  • We distinguish four types of mismatch at this
    level
  • Syntax
  • Different ontology languages often use different
    syntaxes
  • Constitutes probably the simplest kind of
    language level mismatch
  • Logical representation
  • Existence of different representations of logical
    notions
  • The question is which language constructs should
    be used to express something, not whether it can
    be expressed at all
  • Semantics of primitives
  • Sometimes, although the same name is used for a
    language construct in two languages, the
    semantics may differ (e.g. when there are several
    interpretations of A equalTo B)
  • Language expressivity
  • Implies that some languages are able to express
    things that are not expressible in other
    languages (e.g. some languages have constructs to
    express negation and others do not)

15
Ontology level Mismatches
  • Occur when combining two or more ontologies
    that describe (partly) overlapping domains
  • We classify the mismatches at this level
    into four categories
  • Conceptualization mismatch
  • A difference in the way a domain is interpreted,
    which results in different ontological concepts
    or different relations between those concepts
  • Explication mismatch
  • A difference in the way the conceptualization is
    specified
  • Terminological mismatch
  • A difference in the way the terms are described
  • Encoding mismatch
  • Values in the ontologies may be encoded in
    different formats (e.g. a date may be represented
    as dd/mm/yyyy or as mm-dd-yy)
  • Terminological and encoding mismatches can be
    considered as specialized explication mismatches

16
Conceptualization Mismatches
  • We distinguish two types of these mismatches
  • Scope
  • When two classes seem to represent the same
    concept, but do not have exactly the same
    instances (e.g. several administrations use
    slightly different concepts of employee)
  • Model coverage and granularity
  • The mismatches of this level are in the part of
    the domain that is covered by the ontology or in
    the level of detail to which that domain is
    modelled
  • For example, one ontology might model cars but
    not trucks, another might represent trucks but
    only classify them into a few categories, while a
    third one might make very specific distinctions
    between types of trucks based on their general
    physical structure, weight etc.

17
Explication Mismatches
  • We distinguish two types of these mismatches
    focused on the style of modeling
  • Paradigm
  • Different paradigms can be used to represent
    concepts such as time, action, plans etc.
  • For example, the use of different top-level
    ontology is a mismatch of this type
  • Concept description
  • Several choices can be made for the modeling of
    concepts in the ontology
  • For example, we can consider the place where the
    distinction between scientific and non-scientific
    publications is made
  • A dissertation can be modelled as dissertation <
    book < scientific publication < publication, or
    as dissertation < scientific book < book <
    publication

18
Terminological Mismatches
  • We distinguish two kinds of terms that give rise
    to these mismatches
  • Synonym terms
  • Concepts could be represented by different names
  • For example, an ontology may use the term car
    and another ontology may use the term
    automobile
  • Homonym terms
  • The meaning of a term could be different in
    another context
  • For example, the term conductor has a different
    meaning in a music domain than in an electrical
    engineering domain
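
A minimal sketch of how the two cases differ in practice: synonyms can be resolved from the term alone, whereas homonyms need the context as well. The lookup tables and concept identifiers below are hypothetical; a real tool would draw on a lexicon or on user input.

```python
# Hypothetical lookup tables, for illustration only.
SYNONYMS = {"automobile": "car"}  # different names, same concept
HOMONYMS = {                      # same name, the context decides
    ("conductor", "music"): "orchestra-conductor",
    ("conductor", "electrical engineering"): "electrical-conductor",
}

def resolve(term, context):
    """Map a (term, context) pair to a canonical concept identifier."""
    term = SYNONYMS.get(term, term)             # collapse synonyms first
    return HOMONYMS.get((term, context), term)  # then split homonyms
```

For example, `resolve("automobile", "transport")` yields the same concept as `resolve("car", "transport")`, while `resolve("conductor", ...)` yields different concepts for the two domains.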

19
Ontology Versioning
  • In an open domain, the changes in the ontologies
    used are unavoidable, so it becomes very
    important to keep track of these changes
  • Although the problem is introduced by subsequent
    changes to one specific ontology, the most
    important problems are caused by the dependencies
    on that ontology
  • A versioning scheme should pay attention to the
    following aspects
  • The relation between succeeding revisions of one
    ontology
  • The relation between the ontology and its
    dependencies
  • Instance data that conforms to the ontology
  • Other ontologies that are built from or import
    the ontology
  • Applications that use the ontology

20
Versioning Scheme Requirements
  • Identification
  • For every use of a concept or a relation, a
    versioning framework should provide a distinct
    reference to the intended definition
  • Change tracking
  • A versioning framework should make the relation
    of one version of a concept or relation to other
    versions of that construct explicit
  • Transparent translating
  • A versioning framework should as far as possible
    automatically perform conversions from one
    version to another, to enable transparent access
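
The three requirements can be illustrated with a small sketch, assuming that a concept is identified by a (name, version) pair and that changes are recorded in a log. The concept names and the change log are hypothetical.

```python
# Identification: a concept is referenced as a (name, version) pair.
# Change tracking: the log relates each version to its successor.
CHANGES = {
    ("Employee", 1): ("StaffMember", 2),     # renamed in version 2
    ("StaffMember", 2): ("StaffMember", 3),  # unchanged in version 3
}

def translate(concept, version, target_version):
    """Transparent translation: follow the log up to the target version."""
    while version < target_version:
        concept, version = CHANGES[(concept, version)]
    return concept, version
```

Here `translate("Employee", 1, 3)` follows both log entries, so data written against version 1 can be read under version 3 without manual conversion.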

21
Practical Problems with Ontology Combination
  • Finding alignments
  • It is difficult to find the terms that need to be
    aligned
  • Diagnosis
  • The consequences of a specific mapping
    (unforeseen implications) are difficult to see
  • Repeatability of merges
  • The sources that are used for the merging
    continue to evolve
  • The alignments that are created for the merging
    should be as reusable as possible for the
    merging of the revised ontologies
  • Very important in the context of ontology
    maintenance

22
Problems Overview

23
Super-imposed Metamodel
  • Transforms information between representations.
  • Approach
  • Represent information from different models in a
    uniform way
  • Provide a mapping formalism.
  • Technique
  • Ontology languages are represented in a meta-model
    through RDF triples.
  • Mapping specified by production rules over RDF
    triples.
  • Mapping rules provide integration at schema and
    instance level.
  • Disadvantages
  • Handles only language mismatches but not
    expressivity.
  • Mappings are specified manually.
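
As a rough illustration of production rules over triples, the sketch below rewrites the predicate of each matching triple. The triples, the `lang1:`/`lang2:` prefixes, and the rule table are hypothetical and do not reproduce the actual rule language of the approach.

```python
# Triples are (subject, predicate, object) tuples; all names are hypothetical.
source = {
    ("Car", "lang1:subClassOf", "Vehicle"),
    ("Truck", "lang1:subClassOf", "Vehicle"),
    ("Car", "lang1:label", "car"),
}

# A production rule here simply rewrites one predicate into another.
RULES = {"lang1:subClassOf": "lang2:isA"}

def apply_rules(triples, rules):
    """Rewrite every triple whose predicate matches a rule; keep the rest."""
    return {(s, rules.get(p, p), o) for s, p, o in triples}

target = apply_rules(source, RULES)
```

Because the rule table is written by hand, the sketch also shows why manually specified mappings are a cost of this approach.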

24
OKBC
  • A generic interface to knowledge representation
    systems (KRS).
  • A KR language is mapped to the OKBC Knowledge
    Model (KM).
  • Interoperability achieved at the level of OKBC
    KM.
  • Solves language mismatches but not expressivity.
  • Disadvantages
  • Notions requiring higher level of expressivity
    are lost.
  • Does not express terminological axioms like
    covering, disjointness, partition, exclusion.

25
OntoMorph (I)
  • Transformation system for symbolic knowledge.
  • Facilitates
  • Ontology merging.
  • Rapid generation of KB translators.
  • Provides 2 mechanisms
  • Syntactic rewriting via pattern-directed rewrite
    rules.
  • Semantic rewriting, which modulates syntactic
    rewriting via semantic models and logical
    inference supported by an integrated KR system.
  • OntoMorph architecture facilitates incremental
    development and scripted replay of transforms.

26
OntoMorph (II)
  • Focuses on aligning ontologies through 3 steps
  • Designing transforms to bring sources into mutual
    agreement.
  • Editing sources to carry out the transforms.
  • Taking the union of the morphed sources.
  • Steps
  • Step 2 is facilitated by transforming the
    ontologies into a common format.
  • Step 1 is less automatable and involves human
    negotiation.
  • Handles language mismatches, but not expressivity.
  • Handles ontology level mismatches, but not
    coverage of model.
  • Supports repeatability.
  • Disadvantages
  • Transforms are expressed manually.
  • Merging is not dealt with at all.

27
Scalable Knowledge Composition
  • Developed an algebra for ontology composition that
  • Operates on directed labeled graphs, such as
    ontologies.
  • Each operator takes a graph of semi-structured
    data as input and transforms it into another
    graph, so operators are composable.
  • Operations are knowledge driven by using
    articulation rules that are
  • Logical rules (semantic implication between
    terms)
  • Functional rules (conversion between terms across
    ontos)
  • The intersection operator produces an articulation
    ontology that contains the related terms and
    their relations.
  • Solves conceptual and terminological mismatches.
  • Rules are expressed by engineer and lexical
    knowledge.
  • Repeatability.
  • Disadvantages
  • Most rules specified manually.
  • No support for merging.

28
Chimaera (I)
  • Chimaera is an ontology merging and diagnosis tool.
  • Supports ontology browsing and editing.
  • It is targeted at lightweight ontologies.
  • Supports 2 merging tasks
  • Joins two similar terms under the same name.
  • Identifies terms that should be related by
    subsumption, disjointness or instance relations
    and provides support for the introduction of
    these relations.
  • Chimaera also generates by heuristics
  • Name resolution lists for related terms.
  • Taxonomy resolution lists where it suggests
    taxonomy areas for reorganization.

29
Chimaera (II)
  • Has diagnostic support for
  • Verifying
  • Validating
  • Critiquing ontologies.
  • Solves mismatches at terminological and scope of
    concept level.
  • Helps alignment by providing possible edit
    points.
  • Diagnosis of the merging process
  • Disadvantages
  • Not automatic: everything requires user
    interaction.
  • No repeatability.
  • Use of local context for edit points.

30
Prompt
  • Prompt is an interactive ontology-merging tool.
  • Guides the user by
  • Making suggestions based on linguistic-similarity
    matches and syntactic clues.
  • Detecting the conflicts that arise when a
    suggestion is carried out.
  • Proposing conflict resolution strategies.
  • For every operation it populates 3 sets
  • Changes performed automatically.
  • New suggestions for the user.
  • Conflicts introduced like name conflicts,
    dangling references, redundancy in
    class-hierarchy and inconsistencies.
  • Prompt points to places requiring change and for
    every place it proposes new actions.
  • Advantages and disadvantages are the same as for
    Chimaera, except that Prompt supports
    repeatability.
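
A minimal sketch of suggestions driven by name similarity, using the standard-library `difflib.SequenceMatcher` as a stand-in similarity measure. This is not Prompt's actual algorithm; the function name, the threshold, and the term lists are assumptions for illustration.

```python
from difflib import SequenceMatcher

def suggest_merges(terms_a, terms_b, threshold=0.8):
    """Return (a, b, score) for name pairs at least `threshold` similar."""
    suggestions = []
    for a in terms_a:
        for b in terms_b:
            # Character-level similarity of the lower-cased names.
            score = SequenceMatcher(None, a.lower(), b.lower()).ratio()
            if score >= threshold:
                suggestions.append((a, b, round(score, 2)))
    # Best candidates first, so the user reviews them in that order.
    return sorted(suggestions, key=lambda s: -s[2])

suggestions = suggest_merges(["Employee", "Car"], ["Employees", "Automobile"])
```

The sketch proposes merging Employee with Employees, but it cannot relate Car to Automobile, since synonyms share no surface similarity. This is one reason such tools keep the user in the loop.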

31
FCA-Merge (I)
  • FCA-Merge
  • A bottom-up approach for ontology-merging
  • Offers a global structural description of the
    merge process
  • Its mechanism is based on instances of the 2
    ontologies.
  • The merge process contains 3 steps
  • Instance extraction by natural language
    techniques and computation of 2 formal contexts
    based on extracted instances.
  • Derivation of a common context and computation of
    the pruned concept lattice using the mathematical
    techniques of FCA.
  • Generation of the merged ontology based on the
    concept lattice, with the help of the engineer
    and OntoEdit
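
The second step can be illustrated with a naive computation of formal concepts from a small, hand-written context. The documents and concept names are hypothetical, and the sketch omits the instance extraction and the pruning that FCA-Merge performs; it only shows how a shared extent hints at concepts worth merging.

```python
from itertools import combinations

# Hypothetical formal context: document -> concepts (from ontologies A and B)
# that the document is an instance of.
context = {
    "doc1": {"A:Hotel", "B:Accommodation"},
    "doc2": {"A:Hotel", "B:Accommodation", "B:Hostel"},
    "doc3": {"A:Event"},
}

def formal_concepts(context):
    """Enumerate all (extent, intent) pairs of the context (naive closure)."""
    all_attrs = frozenset().union(*context.values())
    intents = {all_attrs} | {frozenset(v) for v in context.values()}
    # Close the set of intents under pairwise intersection.
    changed = True
    while changed:
        changed = False
        for x, y in combinations(list(intents), 2):
            if x & y not in intents:
                intents.add(x & y)
                changed = True
    # The extent of an intent is every object carrying all its attributes.
    return {
        (frozenset(o for o, attrs in context.items() if i <= attrs), i)
        for i in intents
    }

concepts = formal_concepts(context)
```

In this toy context, A:Hotel and B:Accommodation end up in the same intent with extent {doc1, doc2}, which would make them candidates for merging in the third step.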

32
FCA-Merge (II)
  • Restrictions
  • Input documents should be domain-dependent.
  • Each document should cover all concepts from the
    source ontologies.
  • Each document must separate the concepts well
    enough; if the method does not separate the
    concepts correctly, the engineer should provide
    more and better documents.
  • Pros and cons
  • Terminological and scope of concepts mismatches.
  • Finding alignments with the help of the lattice.
  • Diagnosis of results by using OntoEdit.
  • Repeatability by storing the pruned concept
    lattice.

33
GLUE (I)
  • Applies machine learning techniques for
    alignment.
  • 3 main points
  • Computation of the joint probability distribution
    (JPD) of the concepts involved. In this way
  • Any similarity measure can be computed from the JPD.
  • The approach is applicable to a broad range of
    ontology-matching problems.
  • Multi-strategy learning for computing the JPD. In
    this way
  • Many types of info can be used to maximize the
    matching accuracy.
  • System extensible to new learners.
  • Exploits domain restrictions and general
    heuristics for maximizing matching accuracy by
    using relaxation labeling.
  • The process is composed of 3 main steps performed
    by the automatable components: Distribution
    Estimator, Similarity Estimator and Relaxation
    Labeler.
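
To show how a similarity measure follows from a joint probability distribution, the sketch below computes the Jaccard coefficient P(A and B) / P(A or B) for two concepts A and B. The probability values are made up; in practice they would be estimated from instance data by the learners.

```python
def jaccard(p_ab, p_a_notb, p_nota_b):
    """Jaccard similarity P(A and B) / P(A or B) from joint probabilities."""
    union = p_ab + p_a_notb + p_nota_b  # P(A or B)
    return p_ab / union if union else 0.0

# Assumed joint distribution, e.g. estimated from instance counts.
sim = jaccard(p_ab=0.25, p_a_notb=0.125, p_nota_b=0.125)
```

Other measures can be defined over the same three probabilities, which is why estimating the joint distribution once supports many notions of similarity.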

34
GLUE (II)
  • Restrictions
  • Only 1-1 mapping of concepts.
  • Some nodes are not matched because of
    insufficient training data.
  • Implementation of base learners resulted in
    single general-purpose text classification.
  • Some nodes are not matched because they are
    ambiguous. User interaction is needed in such
    cases.
  • Some pairs of nodes should not be examined at all.
  • Pros and cons
  • Local scope of concepts and proper
    classification.
  • Finding alignments and repeatability are automatic.
  • Different encoding is solved by adding
    appropriate learner.

35
Anchor-Prompt (I)
  • Takes as input a set of pairs of similar terms
    (anchors), provided by the user or by heuristics.
  • Its algorithm analyzes the paths in the ontology
    sub-graph and determines which classes frequently
    appear in similar positions.
  • Extends the approaches used in Prompt.
  • It is implemented upon OKBC protocol.
  • It finds only 1-1 mappings between concepts
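
The idea of scoring classes by their position on paths can be sketched as follows. This is a strongly simplified illustration, not the published algorithm: the graphs, class names, function names and path-length limit are all hypothetical.

```python
from collections import Counter

def paths(graph, start, end, max_len):
    """All simple paths from start to end with at most max_len edges."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        if path[-1] == end:
            yield path
            continue
        if len(path) <= max_len:
            for nxt in graph.get(path[-1], []):
                if nxt not in path:
                    stack.append(path + [nxt])

def score_pairs(g1, g2, anchor_start, anchor_end, max_len=4):
    """Count classes found at the same position on equal-length paths."""
    scores = Counter()
    for p1 in paths(g1, anchor_start[0], anchor_end[0], max_len):
        for p2 in paths(g2, anchor_start[1], anchor_end[1], max_len):
            if len(p1) == len(p2):
                # Skip the endpoints: they are the anchors themselves.
                for a, b in zip(p1[1:-1], p2[1:-1]):
                    scores[(a, b)] += 1
    return scores

# Two toy ontologies as adjacency lists, with two anchor pairs.
g1 = {"Trial": ["Person"], "Person": ["Crossover"]}
g2 = {"TRIAL": ["PERSON"], "PERSON": ["DESIGN"]}
scores = score_pairs(g1, g2, ("Trial", "TRIAL"), ("Crossover", "DESIGN"))
```

Pairs with high scores are candidate 1-1 mappings; here the only intermediate pair is (Person, PERSON).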

36
Anchor-Prompt (II)
  • Limitations
  • Very long paths do not produce accurate results.
  • Path length = 0 (Chimaera), path length = 1
    (Prompt).
  • Incidental matches can be produced (similarity
    limit).
  • When comparing a deep ontology with many slots
    and a shallow ontology whose slots relate top
    classes, the results are the same as with Prompt.
  • Pros
  • Concept scope mismatches are dealt with.
  • Finding alignments and repeatability are
    automatic tasks.

37
SHOE
  • An HTML-based ontology language.
  • Provides a rule mechanism for alignment
  • Common items are mapped by inference rules.
  • Terminological diffs are mapped by if-and-only-if
    rules.
  • Scope diffs require mapping of categories where
    one subsumes the other.
  • Encoding diffs handled by mapping individual
    values.
  • Provides version numbers to ontologies and
    facilitates both identification of the revisions
    and explicit specification of its relation to
    other revisions (change-tracking).

38
Conclusions (I)
  • Discovered 4 different approaches that handle
    interoperability at the language level
  • Aligning the meta-model.
  • Layered interoperability.
  • Transformation rules.
  • Mapping onto a common knowledge model.
  • We found tools that suggest alignments and
    mappings with the use of heuristics. There are
    two types of heuristics
  • Linguistic-based matches (FCA-Merge).
  • Structural and model similarity (Chimaera and
    Prompt).

39
Conclusions (II)
  • We found tools that semi-automate or fully
    automate the merging process, but produce only
    1-1 mappings of concepts, using different
    techniques
  • Computation of pruned concept lattice
    (FCA-Merge). Linguistic and FCA techniques.
  • Machine learning techniques (GLUE).
  • Using global instead of local context
    (Anchor-Prompt).
  • Interoperability at the model level can be
    achieved through a common top-level ontology,
    i.e. by conforming to a common standard.

40
Conclusions (III)
  • Different approaches for diagnosing or checking
    the results of alignments
  • Domain-independent verification and validation
    checks: name conflicts, dangling references etc.
  • Validation that requires reasoning: redundancy in
    the class hierarchy, violated value restrictions
    etc.
  • Several tools support an executable specification
    of mappings and transforms (SKC, OntoMorph, Prompt,
    FCA-Merge, GLUE, Anchor-Prompt).
  • Most techniques and tools do not deal with
    versioning.