Benchmarking ontology-based annotation tools for the Semantic Web (presentation transcript)

1
Benchmarking ontology-based annotation tools for
the Semantic Web
  • Diana Maynard
  • University of Sheffield, UK

2
(image slide, no text transcribed)

3
What?
  • Work in the context of the EU Network of
    Excellence KnowledgeWeb
  • Case studies in field of bioinformatics
  • Developing benchmarking tools and test suites for
    ontology generation and evaluation
  • New metrics for evaluation
  • New visualisation tools
  • Development of usability criteria

4
Why?
  • Increasing interest in the use of ontologies in
    bioinformatics, as a means of accessing
    information automatically from large databases
  • Ontologies such as GO enable annotation and
    querying of large databases such as SWISS-PROT.
  • Methods for IE have become extremely important in
    these fields.
  • Development of OBIE applications is hampered by
    lack of standardisation and suitable metrics for
    testing and evaluation
  • The main focus until now has been on performance
    rather than practical aspects such as usability
    and accessibility.

5
Gene Ontology
  • Collaborative ontology construction has been
    practiced in the gene ontology community for a
    long time compared with other communities.
  • This makes it a good case study for testing
    applications and metrics.
  • Used in KnowledgeWeb to show that the
    state-of-the-art (SOA) tools supporting
    communities creating their own ontologies can be
    further advanced by suitable evaluation
    techniques, amongst other things.

6
Automatic Annotation Tools
  • Semantic annotation is used to create metadata
    linking the text to one or more ontologies (see
    the example after this list)
  • Enables us to combine and associate existing
    ontologies, to perform more detailed analysis of
    the text, and to extract deeper and more accurate
    knowledge
  • Semantic annotation generally relies on
    ontology-based IE techniques
  • Suitable evaluation metrics and tools for these
    new techniques are currently lacking
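
As an illustration (not from the original slides), such ontology-linked metadata for a text mention might look like the following Python sketch; the URIs and field names are hypothetical:

# Hypothetical illustration of semantic annotation metadata:
# a text span linked to an ontology class and an instance.
# The URIs are invented for the example.
annotation = {
    "text": "p53 regulates the cell cycle",
    "span": (0, 3),  # character offsets of the mention "p53"
    "ontology_class": "http://example.org/ontology#Protein",
    "instance": "http://example.org/kb#p53",
}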

7
(image slide, no text transcribed)

8
Requirements for Semantic Annotation Tools
  • Expected functionality: level of automation,
    target domain, text size, speed
  • Interoperability: ontology format, annotation
    format, platform, browser
  • Usability: installation, documentation, ease of
    use, aesthetics
  • Accessibility: flexibility of design, input and
    display alternatives
  • Scalability: text and ontology size
  • Reusability: range of applications

9
Performance Evaluation Metrics
  • Evaluation metrics mathematically define how to
    measure the system's performance against a
    human-annotated gold standard
  • Scoring program implements the metric and
    provides performance measures (a minimal sketch
    follows this list)
  • for each document and over the entire corpus
  • for each type of annotation
  • may also evaluate changes over time
  • A gold standard reference set also needs to be
    provided; this may be time-consuming to produce
  • Visualisation tools show the results graphically
    and enable easy comparison
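
A minimal sketch (not part of the original deck) of such a scoring program, assuming annotations are (start, end, type) triples and exact-match scoring; corpus-level figures follow by summing the counts over documents:

from collections import defaultdict

def score(gold, response):
    """Per-type precision and recall for one document, comparing
    response annotations against a human-annotated gold standard."""
    correct, spurious, missing = defaultdict(int), defaultdict(int), defaultdict(int)
    gold_set, resp_set = set(gold), set(response)
    for ann in resp_set:
        if ann in gold_set:
            correct[ann[2]] += 1   # exact match on span and type
        else:
            spurious[ann[2]] += 1  # response not in the gold standard
    for ann in gold_set - resp_set:
        missing[ann[2]] += 1       # gold annotation the system missed
    results = {}
    for t in set(correct) | set(spurious) | set(missing):
        p_den = correct[t] + spurious[t]
        r_den = correct[t] + missing[t]
        results[t] = (correct[t] / p_den if p_den else 0.0,
                      correct[t] / r_den if r_den else 0.0)
    return results

# Example usage with hypothetical annotations:
gold = [(0, 3, "Protein"), (21, 31, "Process")]
resp = [(0, 3, "Protein"), (10, 14, "Protein")]
print(score(gold, resp))  # Protein: (0.5, 1.0), Process: (0.0, 0.0)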

10
GATE AnnotationDiff Tool
11
Correct and incorrect instances attached to
concepts
12
Evaluation of instances by source
13
Methods of evaluation
  • Traditional IE is evaluated in terms of
    Precision, Recall and F-measure (defined below)
  • But these are not sufficient for ontology-based
    IE, because the distinction between right and
    wrong is less obvious
  • Recognising a Person as a Location is clearly
    wrong, but recognising a Research Assistant as a
    Lecturer is not so wrong
  • Similarity metrics need to be integrated so that
    a wrong response that is closer to the key
    concept in the hierarchy receives a higher score
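
For reference, the standard definitions behind these measures (standard IE usage, not reproduced from the slides), with Correct, Spurious and Missing counting true positives, false positives and false negatives:

\[
P = \frac{Correct}{Correct + Spurious}, \qquad
R = \frac{Correct}{Correct + Missing}, \qquad
F = \frac{2PR}{P + R}
\]

A similarity-weighted variant replaces the binary correct/wrong judgement with a graded score in [0, 1], as the BDM below does.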

14
Learning Accuracy
  • LA [Hahn98] was originally defined to measure how
    well a concept had been added at the right level
    of the ontology, i.e. ontology generation
  • Later used to measure how well an instance has
    been added in the right place in the ontology,
    i.e. ontology population.
  • The main snag is that it doesn't consider the
    height of the Key concept, only the height of the
    Response concept.
  • This also means that similarity is not
    bidirectional, which is intuitively wrong.

15
Balanced Distance Metric
  • We propose BDM as an improvement over LA
  • Considers the relative specificity of the
    taxonomic positions of the key and response
  • Does not distinguish the directionality of this
    relative specificity: the key can be a specific
    concept (e.g. 'car') and the response a general
    one (e.g. 'relation'), or vice versa.
  • Distances are normalised with respect to the
    average chain length
  • Makes the penalty in terms of node traversal
    relative to the semantic density of the concepts
    in question

16
BDM: the metric
  • BDM is calculated for all correct and partially
    correct responses
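
The formula itself appeared as an image and is not reproduced in this transcript. A reconstruction consistent with the legend below and with the published BDM definition (the per-term averages n2 and n3, the average chain lengths for chains containing the key and the response respectively, come from the published papers rather than from this slide):

\[
BDM = \frac{CP/n_1}{CP/n_1 + DPK/n_2 + DPR/n_3}
\]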

where
  • CP = distance from the root to the MSCA (the most
    specific common abstraction of key and response)
  • DPK = distance from the MSCA to the Key concept
  • DPR = distance from the MSCA to the Response
    concept
  • n1 = average length of the set of chains
    containing the key or the response concept,
    computed from the root concept
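
A tiny helper (hypothetical, matching the reconstruction above) makes the bidirectionality claim from the previous slide concrete: swapping the key and response terms leaves the score unchanged.

def bdm(cp, dpk, dpr, n1, n2, n3):
    """Reconstructed BDM: normalised common path over the total
    normalised path length between key and response."""
    num = cp / n1
    return num / (num + dpk / n2 + dpr / n3)

# Swapping key and response (dpk/n2 vs dpr/n3) gives the same score:
print(bdm(2, 1, 1, 4.0, 2.0, 4.0))  # 0.4
print(bdm(2, 1, 1, 4.0, 4.0, 2.0))  # 0.4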
17
Augmented Precision and Recall
BDM is integrated with traditional Precision and
Recall in the following way
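
The integration formulas also appeared as an image. In the published version of this work, the summed BDM scores of correct and partially correct responses replace the binary count of correct matches; roughly (a reconstruction, not a verbatim transcription):

\[
AP = \frac{\sum BDM}{n + Spurious}, \qquad
AR = \frac{\sum BDM}{n + Missing}
\]

where n is the number of correct and partially correct responses and the sum runs over their BDM scores.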
18
Conclusions
  • Semantic annotation evaluation requires
  • New metrics
  • Usability evaluation
  • Visualisation software
  • The bioinformatics field is a good test bed, e.g.
    evaluation of protein name taggers
  • Implementation in GATE
  • Knowledge Web benchmarking suite for evaluating
    ontologies and ontology-based tools

19

A final thought on evaluation
We didn't underperform. You overexpected.