oMAP: Combining Classifiers for Aligning Automatically OWL Ontologies

1
oMAP: Combining Classifiers for Aligning
Automatically OWL Ontologies
Raphaël Troncy, Umberto Straccia
Tuesday 22nd of November, 2005
2
Agenda
  • Motivations
  • oMAP
  • A formal framework
  • The different classifiers used
  • Evaluation
  • Conclusion

3
Motivations
  • Heterogeneity of information systems
  • Ontologies as a solution to data heterogeneity on
    the Web
  • Ontologies are themselves heterogeneous
  • knowledge representation language
  • degree of formalization
  • Semantic Web
  • More and more OWL/RDF ontologies on the Web
  • Need for comparing/reusing/merging ontologies
  • partially covering the same domain
  • different versions of the same ontology

4
Aligning Ontologies
  • A matching operator [KnowledgeWeb, 2005]
  • Input: a set of discrete entities (tables, XML
    elements, classes, properties)
  • Output:
  • relationship holding between the entities
    (subsumption, equivalence, disjointness)
  • a confidence measure
  • Automatic vs manual techniques
  • Numerous works from various communities:
  • schema matching, machine learning, data
    integration

5
Example
[Figure: example correspondences illustrating equivalence, subsumption and disjointness]
6
Agenda
  • Motivations
  • oMAP
  • A formal framework
  • The different classifiers used
  • Evaluation
  • Conclusion

7
oMAP: A Formal Framework
  • Inspirations
  • Formal work in data exchange [Fagin et al., 2003]
  • GLUE: combining several specialized components
    for finding the best set of mappings [Doan et
    al., 2003]
  • Notations
  • A mapping is a tuple $M = (T, S, \Sigma)$
  • $S$ and $T$ are the source and target ontologies
  • $S_i$ is an OWL entity (class, datatype property,
    object property) of the ontology
  • $\Sigma$ is a set of mapping rules $\alpha_{ij}\, T_j \leftarrow S_i$
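
The notation above can be pictured as a small data structure. The following is a minimal sketch, not the oMAP implementation: the class names `MappingRule` and `Mapping`, the helper `best_for`, and the assumption that weights lie in [0, 1] are all illustrative choices that do not come from the slides.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class MappingRule:
    """One weighted rule: the target entity T_j is mapped from the source entity S_i."""
    source: str    # S_i, an entity of the source ontology
    target: str    # T_j, an entity of the target ontology
    weight: float  # confidence attached to the rule (assumed to lie in [0, 1])


@dataclass
class Mapping:
    """A mapping M = (T, S, Sigma): two ontologies plus a set of weighted rules."""
    target_ontology: str
    source_ontology: str
    rules: List[MappingRule] = field(default_factory=list)

    def add(self, source: str, target: str, weight: float) -> None:
        # Reject weights outside the assumed [0, 1] range.
        if not 0.0 <= weight <= 1.0:
            raise ValueError("weight must lie in [0, 1]")
        self.rules.append(MappingRule(source, target, weight))

    def best_for(self, target: str) -> Optional[MappingRule]:
        """Return the highest-weighted rule for a target entity, or None."""
        candidates = [r for r in self.rules if r.target == target]
        return max(candidates, key=lambda r: r.weight, default=None)


# Usage with made-up entity names.
m = Mapping(target_ontology="T", source_ontology="S")
m.add("s:Paper", "t:Article", 0.8)
m.add("s:Book", "t:Article", 0.3)
best = m.best_for("t:Article")
```

Keeping the weight on the rule itself, rather than in a separate table, mirrors the way each rule carries its own confidence in the notation.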

8
oMAP: Overall Strategy
  • A three-step process
  • Form possible $\Sigma$ sets and estimate their quality
    based on the quality measures of their mapping
    rules
  • For each mapping rule $T_j \leftarrow S_i$, estimate its
    confidence $\alpha_{ij}$, which also depends on the $\Sigma$ it
    belongs to
  • Use heuristics to iteratively build the final set
    of mappings

9
oMAP: Combining Classifiers
  • Weight of a mapping rule
  • $\alpha_{ij} = w(S_i, T_j, \Sigma)$
  • Using different classifiers
  • $w(S_i, T_j, CL_k)$ is the classifier's approximation
    of the rule $T_j \leftarrow S_i$
  • Combining the approximations
  • Use of a priority list $CL_1 \prec CL_2 \prec \dots \prec CL_n$
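
One way to read the priority list is sketched below. The slide does not spell out the combination rule, so this assumes that the first classifier in priority order that returns a non-zero weight decides the result. The two toy classifiers (`same_name`, `shared_prefix`) and the function `combine_by_priority` are hypothetical names introduced only for illustration.

```python
from typing import Callable, Sequence

# A classifier takes two entity names and returns a weight in [0, 1].
Classifier = Callable[[str, str], float]


def same_name(s: str, t: str) -> float:
    """Toy classifier: 1.0 when the names match ignoring case, else 0.0."""
    return 1.0 if s.lower() == t.lower() else 0.0


def shared_prefix(s: str, t: str) -> float:
    """Toy classifier: length of the common prefix over the longer name's length."""
    s, t = s.lower(), t.lower()
    n = 0
    for a, b in zip(s, t):
        if a != b:
            break
        n += 1
    return n / max(len(s), len(t), 1)


def combine_by_priority(s: str, t: str, classifiers: Sequence[Classifier]) -> float:
    """Return the weight of the first classifier (in priority order) that is non-zero.

    Assumption: a zero weight means "no opinion", so the next classifier is asked.
    """
    for classify in classifiers:
        weight = classify(s, t)
        if weight > 0.0:
            return weight
    return 0.0


priority = [same_name, shared_prefix]
exact = combine_by_priority("Author", "author", priority)
partial = combine_by_priority("Author", "Authors", priority)
none = combine_by_priority("Book", "Paper", priority)
```

Other combination schemes (for example averaging or taking the maximum) would fit the same interface; the priority reading is only the one suggested by the slide's wording.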

10
Terminological Classifiers
  • Same entity names (or URI)
  • Same entity name stems

11
Terminological Classifiers
  • String distance name
  • WordNet distance name
  • lcs is the longest common substring between $S_i$
    and $T_j$
  • sim
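
The similarity formula itself did not survive in this transcript. As a rough illustration of a name classifier built on the longest common substring, the sketch below normalises the length of the lcs by the length of the longer name. That normalisation and the function names are assumptions, not the formula from the slide.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest contiguous substring shared by a and b (first one found)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


def lcs_similarity(s: str, t: str) -> float:
    """Assumed normalisation: |lcs| divided by the length of the longer name."""
    s, t = s.lower(), t.lower()
    if not s or not t:
        return 0.0
    return len(longest_common_substring(s, t)) / max(len(s), len(t))


common = longest_common_substring("hasauthor", "authorname")
score = lcs_similarity("hasAuthor", "authorName")
```

The dynamic-programming table is kept to two rows, so memory grows with the length of one name only.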

12
Machine Learning-Based Classifiers
  • Collecting individuals
  • label for the named individuals
  • data value for the datatype properties
  • type for the anonymous individuals and the range
    of object properties
  • Recursion on the OWL definition
  • depth parameter

13
Machine Learning-Based Classifiers
  • Example
    Individual (x1 type (Conference)
      value (label "Int. Conf. on WISE")
      value (location x2) )
    Individual (x2 type (Address)
      value (city "New York City")
      value (country "USA") )
  • u1 = ("Int. Conf. on WISE", "Address")
  • u2 = ("Address", "New York City", "USA")
  • Naïve Bayes text classifier
  • kNN text classifier
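
The collection step illustrated by u1 and u2 can be sketched as follows. This assumes a simple dictionary representation of individuals and one possible reading of the depth parameter (at depth 0 an object property contributes only the type of its target; at greater depth the target's own text is added too). The data layout and the function `collect` are hypothetical.

```python
from typing import Dict, List, Optional, Set

# Hypothetical in-memory form of the two individuals from the example.
individuals: Dict[str, dict] = {
    "x1": {
        "type": "Conference",
        "label": "Int. Conf. on WISE",   # named individual: its label is used
        "data": {},                       # datatype property values
        "objects": {"location": "x2"},    # object property -> target individual
    },
    "x2": {
        "type": "Address",
        "label": None,                    # anonymous individual: its type is used
        "data": {"city": "New York City", "country": "USA"},
        "objects": {},
    },
}


def collect(ind_id: str, inds: Dict[str, dict], depth: int = 0,
            seen: Optional[Set[str]] = None) -> List[str]:
    """Gather the text fragments describing one individual."""
    seen = set() if seen is None else seen
    if ind_id in seen:          # guard against cyclic definitions
        return []
    seen.add(ind_id)
    ind = inds[ind_id]
    # Label for named individuals, type for anonymous ones.
    tokens = [ind["label"] if ind["label"] is not None else ind["type"]]
    # Data values of the datatype properties.
    tokens.extend(ind["data"].values())
    # Type of the individuals reached through object properties.
    for target_id in ind["objects"].values():
        tokens.append(inds[target_id]["type"])
        if depth > 0:
            tokens.extend(collect(target_id, inds, depth - 1, seen))
    return tokens


u1 = tuple(collect("x1", individuals))
u2 = tuple(collect("x2", individuals))
```

The resulting tuples would then serve as the text documents handed to a text classifier such as Naïve Bayes or kNN.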

14
Structural and Semantics-Based Classifier
  • If $S_i$ and $T_j$ are property names
  • If $S_i$ and $T_j$ are concept names (1)

(1) Where $D = D(S_i) \times D(T_j)$; $D(S_i)$ represents the
set of concepts that are direct parents of $S_i$
15
Structural and Semantics-Based Classifier
  • Let $C_S = (Q\,R.C)$ and $D_T = (Q'\,R'.D)$, then (1)
  • Let $C_S = (op\ C_1 \dots C_m)$ and $D_T = (op'\ D_1 \dots D_m)$, then (2)

(1) Where $Q, Q'$ are quantifiers, $R, R'$ are property
names and $C, D$ are concept expressions
(2) Where $op, op'$ are concept constructors and $n, m \geq 1$
16
Structural and Semantics-Based Classifier
  • Complexity
  • number of mapping rules
  • number of possible $\Sigma$ sets
  • Reduction of the space
  • considering $\Sigma$ sets that contain mapping rules for
    the classes
  • considering the range of the datatype properties
    (XML Schema taxonomy)
  • Local maximum heuristic
  • pick a concept and consider only the entities
    involved in its closure definition (detect
    cycles!)
  • choose the best local $\Sigma$ set
  • iterate the process until convergence

17
Structural and Semantics-Based Classifier
  • Possible values for the $w_{op}$ and $w_Q$ weights

18
Agenda
  • Motivations
  • oMAP
  • A formal framework
  • The different classifiers used
  • Evaluation
  • Conclusion

19
Evaluation
  • More and more techniques / tools for aligning
    ontologies [KW D2.2.3, 2005]
  • difficult to compare all the approaches
    theoretically
  • pragmatism: evaluation campaigns and contests
  • I3CON: based on the NIST Text Retrieval
    Conference model
  • EON: systematic benchmark tests on all OWL
    constructs
  • OAEI: http://oaei.inrialpes.fr
  • Alignment API [Euzenat, ISWC 2004]
  • common format for representing / exchanging the
    alignments found
  • tools and metrics for evaluating these alignments
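
The metrics used to score an alignment against a reference are typically precision, recall and F-measure. The sketch below computes them in plain Python over sets of (source, target) pairs. It does not use the Alignment API; the function name `evaluate` and the entity names are made up for illustration.

```python
from typing import Iterable, Tuple

Pair = Tuple[str, str]  # (source entity, target entity)


def evaluate(found: Iterable[Pair], reference: Iterable[Pair]) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) of a found alignment vs. a reference."""
    found_set, reference_set = set(found), set(reference)
    correct = len(found_set & reference_set)
    precision = correct / len(found_set) if found_set else 0.0
    recall = correct / len(reference_set) if reference_set else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Made-up alignments: 2 of the 4 found pairs are in the 3-pair reference.
reference = {("s:Paper", "t:Article"), ("s:Author", "t:Writer"), ("s:Book", "t:Book")}
found = {("s:Paper", "t:Article"), ("s:Book", "t:Book"),
         ("s:Title", "t:Name"), ("s:Year", "t:Date")}
precision, recall, f_measure = evaluate(found, reference)
```

This version ignores the confidence values and the relation type of each correspondence; a fuller evaluation would have to decide how to treat both.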

20
Evaluation: EON Contest
  • 4 competitors: Karlsruhe, INRIA, Fujitsu,
    Stanford
  • 3 series of tests on bibliographic ontologies
  • simple tests: identity, specialization/generalization
    of the language
  • systematic tests: some features of the initial
    ontology are progressively discarded
  • complex tests: aligning 4 real ontologies
    available on the Web
  • Results: 2 groups, but inadequacy /
    incompleteness of the tests

21
Evaluation: oMAP and EON
22
Evaluation: oMAP and OAEI
23
Agenda
  • Motivations
  • oMAP
  • A formal framework
  • The different classifiers used
  • Evaluation
  • Conclusion

24
Conclusion
  • oMAP: a formal framework for automatically
    aligning OWL ontologies
  • Combining several specific classifiers
  • terminological classifiers
  • machine learning-based classifiers
  • structural and semantics-based classifier
  • Empirical evaluation on benchmark tests
  • using traditional information retrieval metrics
  • machine resources, memory, computation time not
    yet considered

25
Future Work
  • Using additional classifiers
  • kNN, KL-distance, WordNet or other terminological
    resources
  • straightforward theoretically but practically
    difficult
  • Finding complex alignment
  • e.g. mapping name to firstName and lastName
  • OWL and rule-based languages
  • take into account this additional expressivity

26
Useful Links
  • oMAP: http://homepages.cwi.nl/troncy/oMAP/
  • Tutorial: Schema and Ontology Matching @ ESWC
    http://dit.unitn.it/accord/Presentations/ESWC'05-MatchingHandOuts.pdf
  • Alignment API: http://co4.inrialpes.fr/align/align.html
  • OAEI: http://oaei.inrialpes.fr/
  • State of the Art
  • P. Shvaiko and J. Euzenat: A Survey of
    Schema-based Matching Approaches. Journal on Data
    Semantics (JoDS), 2005
  • KW Consortium: State of the Art on Ontology
    Alignment. Knowledge Web D2.2.3, 2004