1
An Empirical Study of Instance-Based Ontology
Mapping
  • Antoine Isaac, Lourens van der Meij, Stefan
    Schlobach, Shenghui Wang
  • STITCH@CATCH, funded by NWO
  • Vrije Universiteit Amsterdam
  • Koninklijke Bibliotheek Den Haag
  • Max Planck Institute Nijmegen

2
Metamotivation
  • Ontology mapping in practice
  • Based on real problems at the host institution, the
    Dutch Royal Library
  • Task-driven
    • Annotation support
    • Merging of thesauri
  • Real thesauri (100 years of tradition)
    • Really messy
    • Conceptually difficult
    • Inexpressive
  • Generic solutions to specific questions/tasks
  • Using Semantic Web standards (SKOSification)

3
Overview
  • Use-case
  • Instance-based mapping
  • Evaluation
  • Experiments
  • Results
  • Conclusions

4
The Alignment Task Context
  • National Library of the Netherlands (KB)
  • 2 main collections
  • Legal Deposit: all Dutch printed books
  • Scientific Collections: history, language
  • Each described (indexed) by its own thesaurus

5
A need for thesaurus mapping
  • The KB wants to
  • (Scenario 1) possibly discontinue one of the two
    annotation and retrieval methods
  • (Scenario 2) possibly merge the thesauri
  • We explore mapping for
  • (Task 1) In case of a single/new/merged retrieval
    system, find books annotated with the old system,
    facilitated by mappings
  • (Task 2) Candidate terms for merged thesaurus
  • We make use of the doubly annotated corpus to
    calculate Instance-Based mappings

6
Overview
  • Use-case
  • Instance-based mapping
  • Evaluation
  • Experiments
  • Results
  • Conclusions

7
Calculating mappings using Concept Extensions
8
Standard approach (Jaccard)
  • Use a co-occurrence measure to calculate the
    similarity between two concepts, e.g. Jaccard:
  • Jacc(B,G) = |ext(B) ∩ ext(G)| / |ext(B) ∪ ext(G)|,
    where ext(C) is the set of books annotated with C
    (a minimal code sketch follows below)

[Figure: Venn diagrams over the set of books in the
library, showing the elements of B, the elements of G,
and their joint elements. Example similarities: 5/9 ≈
55% overlap ("degree of greenness"), and 1/7 ≈ 14%.]
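As a minimal sketch of the idea (not the authors' code), a concept
extension can be represented as the set of book identifiers annotated
with that concept; Jaccard similarity is then a small set computation.
The book identifiers below are hypothetical.

    def jaccard(ext_b: set, ext_g: set) -> float:
        """Jaccard similarity between two concept extensions
        (sets of book identifiers)."""
        union = ext_b | ext_g
        if not union:
            return 0.0
        return len(ext_b & ext_g) / len(union)

    # Hypothetical extensions mirroring the 5/9 example above.
    ext_b = {"b1", "b2", "b3", "b4", "b5", "b6", "b7"}
    ext_g = {"b3", "b4", "b5", "b6", "b7", "b8", "b9"}
    print(jaccard(ext_b, ext_g))  # 5 joint books / 9 books total = 0.555...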
9
Issues with this measure (sparse data)
  • What is more reliable?
  • We need
  • more reliable measures
  • Or thresholds (at least n doubly annotated books)

Which is more reliable: Jacc = 1/1 = 100%, or
Jacc = 18/21 ≈ 86%?
The 100% score is actually the worse (less reliable) of
the two: the single shared book b may carry both
b ∈ B (MemberOfParliament) and b ∈ G (Cricket) by
coincidence. (A threshold sketch follows below.)
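One way to realise the threshold mentioned above is to discard
candidate pairs supported by fewer than n doubly annotated books; a
sketch under that assumption (the cut-off value and book identifiers
are illustrative):

    def jaccard_with_threshold(ext_b: set, ext_g: set, n: int = 10):
        """Return the Jaccard similarity, or None when fewer than n
        doubly annotated books support the candidate mapping."""
        joint = ext_b & ext_g
        if len(joint) < n:
            return None  # too little evidence: skip this candidate
        return len(joint) / len(ext_b | ext_g)

    # The 1/1 = 100% case is filtered out; the 18/21 case survives.
    print(jaccard_with_threshold({"b1"}, {"b1"}))              # None
    ext_b = {f"b{i}" for i in range(1, 20)}                    # 19 books
    ext_g = {f"b{i}" for i in range(1, 19)} | {"b20", "b21"}   # 20 books
    print(jaccard_with_threshold(ext_b, ext_g))                # 18/21 = 0.857...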
10
Issue with this measure (hierarchy)
Consider a hierarchy:
  • Non-hierarchical extensions: Jacc(B,G) = 1/2 = 50%
  • Hierarchical extensions (books annotated with
    descendants of B count as instances of B):
    Jacc(B,G) = 2/6 ≈ 33%
(A sketch of the hierarchical extension follows below.)

[Figure: Venn diagrams over the set of books in the
library, comparing the non-hierarchical and the
hierarchical elements of B against G.]
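A sketch of the hierarchical variant: before comparing two concepts,
each extension is expanded with the books annotated with any of its
descendants. The toy hierarchy and extensions are hypothetical.

    # Hypothetical concept hierarchy: concept -> its direct children.
    children = {"B": ["B1", "B2"], "B1": [], "B2": []}

    # Hypothetical direct extensions: concept -> books annotated with it.
    direct_ext = {"B": {"b1", "b2"}, "B1": {"b3"}, "B2": {"b4", "b5"}}

    def hierarchical_ext(concept: str) -> set:
        """Extension of a concept including the books of all its
        descendants in the hierarchy."""
        ext = set(direct_ext.get(concept, set()))
        for child in children.get(concept, []):
            ext |= hierarchical_ext(child)
        return ext

    print(hierarchical_ext("B"))  # b1..b5: B's own books plus its descendants'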
11
An empirical study of instance-based OM
  • We experimented along three dimensions (enumerated
    in the sketch below):
  • Similarity measure: Jaccard, corrected Jaccard,
    Pointwise Mutual Information, Log Likelihood Ratio,
    Information Gain
  • Threshold: 0 or 10
  • Hierarchy: yes or no
  • Why only 2 thresholds? Because of evaluation costs!
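The resulting 5 x 2 x 2 grid of twenty configurations can be
enumerated directly; a sketch (the measure names are labels only, not
the authors' identifiers):

    from itertools import product

    measures = ["jaccard", "corrected_jaccard", "pmi", "llr", "info_gain"]
    thresholds = [0, 10]
    hierarchy = [False, True]

    # 5 measures x 2 thresholds x 2 hierarchy settings = 20 runs.
    for measure, threshold, use_hierarchy in product(measures, thresholds, hierarchy):
        print(measure, threshold, use_hierarchy)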
12
Overview
  • Use-case
  • Instance-based mapping
  • Evaluation
  • Experiments
  • Results
  • Conclusions

13
Evaluation: building a gold standard
Possible thesaurus relations (cf. SKOS): equivalent,
broader than, narrower than, related to, or no link,
judged between GTT and Brinkman concepts.
14
User Evaluation Statistics
  • 3 evaluators, 1500 evaluations
  • 90% agreement on ONLYEQ
  • If one evaluator says "equivalent", 73% of the
    other evaluators' judgements say the same
  • Comparing two evaluators, agreement on assignments
    is best for "Equivalent", followed by "No Link",
    "Narrower Than" and "Broader Than" (all at or above
    50% agreement); "Related To" has only 35% agreement
  • There are correlations between evaluators: for
    example, Ev1 and Ev2 agreed with each other much
    more on "no link" than with Ev3

15
Evaluation interpretation: what is a good mapping?
  • What counts as a good mapping is use-case specific.
    We considered:
  • ONLYEQ: only "Equivalent" answers → correct
  • NOTREL: EQ, BT, NT → correct
  • ALL: EQ, BT, NT, RT → correct
  • ONLYEQ ⊆ NOTREL ⊆ ALL
  • The obvious question: do they produce the same
    results?

16
Evaluation: validity of the (different) methods
The answer is yes: all evaluation interpretations
produce the same results (on different scales).
17
A remark about Evaluation
  • The use of mappings is strongly task-dependent
  • Scenario 1 (legacy data / annotation support) and
    Scenario 2 (thesaurus merging) require different
    mappings
  • Our evaluation is appropriate (correct) for
    Scenario 2 (intensional)
  • Scenario 1 can be evaluated differently (e.g.
    cross-validation on test data)
  • See our paper at the Cultural Heritage Workshop.

18
Overview
  • Use-case
  • Instance-based mapping
  • Evaluation
  • Experiments
  • Results
  • Conclusions

19
Experiments: setup, data and thesauri
  • We calculated
  • 5 different similarity measures, with
  • threshold 0 or 10, and
  • hierarchy yes or no
  • Based on
  • 24,061 GTT concepts and
  • 4,990 Brinkman concepts, using
  • 243,886 books with double annotations

20
Experiments: result calculation
  • Average precision at similarity position i:
    P_i = N_good,i / N_i
  • (take the first i mappings, and return the
    percentage of correct ones)
  • Example: of the first 798 mappings, 86% were
    correct
  • Recall is estimated based on lexical mappings
  • F-measure is calculated as usual
  • (a code sketch of these calculations follows below)

[Figure: precision plotted against mapping rank, on a
scale of 100%; at the 798th mapping, precision is 86%.]
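A sketch of these calculations under the definitions above: precision
over the first i ranked mappings, recall estimated against a reference
set (the paper uses lexical mappings), and the usual F-measure. The
ranked judgements and reference pairs are hypothetical.

    def precision_at(ranked_correct: list, i: int) -> float:
        """P_i = N_good,i / N_i over the first i ranked mappings;
        ranked_correct[k] is True when mapping k was judged correct."""
        return sum(ranked_correct[:i]) / i

    def estimated_recall(found: set, reference: set) -> float:
        """Recall estimated against a reference set of mappings."""
        return len(found & reference) / len(reference)

    def f_measure(p: float, r: float) -> float:
        """Harmonic mean of precision and recall."""
        return 2 * p * r / (p + r) if p + r else 0.0

    ranked_correct = [True, True, False, True, True]
    p = precision_at(ranked_correct, 5)  # 4 of the first 5 correct: 0.8
    r = estimated_recall({("g1", "b1"), ("g2", "b2")},
                         {("g1", "b1"), ("g2", "b2"), ("g3", "b3")})
    print(p, r, f_measure(p, r))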
21
Overview
  • Use-case
  • Instance-based mapping
  • Evaluation
  • Experiments
  • Results
  • Conclusions

22
Results: three research questions
  • What is the influence of the choice of threshold?
  • What is the influence of hierarchical
    information?
  • What is the best measure and setting for
    instance-based mapping?

23
What is the influence of the choice of threshold?
Threshold needed for Jaccard
Threshold NOT needed for LLR
24
What is the influence of hierarchical information?
Results are inconclusive!
25
Best measure and setting for instance-based
mapping?
We have two winners: Jaccard with threshold 10, and the
corrected Jaccard measure.
26
Conclusion
  • Summary
  • About 80% precision at an estimated 80% recall
  • Simple measures perform better if a statistical
    correction is applied (a threshold, or an explicit
    correction in the measure)
  • Hierarchical aspects remain unresolved
  • Some measures are really unsuited
  • Future work
  • Generalize the results
  • Other use cases, e.g. web directories
  • Study other measures

27
Thank you.
28
Similarity measures: formulae
  • Jaccard: Jacc(B,G) = |B ∩ G| / |B ∪ G|, over the
    sets of books annotated with each concept
  • Corrected Jaccard assigns a smaller score to less
    frequently co-occurring annotations, replacing the
    numerator with sqrt(|B ∩ G| · (|B ∩ G| - 0.8))
    (a sketch of both measures follows below)
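A sketch of both measures side by side. The slide only names the
formulae (the images are lost in this transcript), so the exact
correction constant 0.8 is taken from the accompanying paper and
should be treated as an assumption here.

    from math import sqrt

    def jaccard(ext_b: set, ext_g: set) -> float:
        union = ext_b | ext_g
        return len(ext_b & ext_g) / len(union) if union else 0.0

    def corrected_jaccard(ext_b: set, ext_g: set) -> float:
        """Corrected Jaccard: shrinks the score of pairs with few
        co-occurring annotations (correction constant 0.8 assumed)."""
        joint = len(ext_b & ext_g)
        union = ext_b | ext_g
        if joint == 0 or not union:
            return 0.0
        return sqrt(joint * (joint - 0.8)) / len(union)

    one_book = {"b1"}
    print(jaccard(one_book, one_book))            # 1.0
    print(corrected_jaccard(one_book, one_book))  # sqrt(0.2) = 0.447...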

29
Information Theoretic Measures
  • Pointwise Mutual Information
  • Measures the reduction of uncertainty that the
    annotation with one concept yields for the
    annotation with another concept:
    PMI(B,G) = log( P(B,G) / (P(B) · P(G)) )
  • → disadvantage: inadequate for sparse data
    (see the PMI sketch below)
  • Log Likelihood Ratio
  • Information Gain
  • Information gain is the difference in entropy; it
    determines the attribute that best distinguishes
    between positive and negative examples
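A sketch of Pointwise Mutual Information over annotation counts, with
probabilities estimated from the doubly annotated corpus; the counts
below are hypothetical.

    from math import log2

    def pmi(n_joint: int, n_b: int, n_g: int, n_total: int) -> float:
        """PMI(B,G) = log( P(B,G) / (P(B) * P(G)) ), with probabilities
        estimated as annotation counts over n_total books; requires
        n_joint > 0, which is exactly where sparse data hurts."""
        p_joint = n_joint / n_total
        p_b = n_b / n_total
        p_g = n_g / n_total
        return log2(p_joint / (p_b * p_g))

    # Hypothetical counts over the 243,886 doubly annotated books.
    print(pmi(n_joint=50, n_b=120, n_g=90, n_total=243886))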