Title: Evaluating Ontology Search
1 Evaluating Ontology Search
- Towards Benchmarking in Ontology Search
- Paul Buitelaar, Thomas Eigner
- Competence Center Semantic Web
- Language Technology Lab
- DFKI GmbH
- Saarbrücken, Germany
2 Overview
- Ontology Search
- Knowledge reuse (integration with Ontology Learning)
- OntoSelect
- Browse (ontologies, labels, classes, properties)
- Search by topic
- Evaluating Ontology Search
- Benchmark (evaluation) data set
- Experiment (compare SWOOGLE, OntoSelect)
- Conclusions
3 Ontology Search
- There are more and more ontologies published on the (Semantic) Web
- Available as RDFS or OWL files (also still DAML)
- Opens up possibilities for reuse of knowledge
- Access through ontology search engines and/or (manual/automatic) organization in ontology libraries
- But increasingly harder to find the right one for your application
- Increasing research in ontology search/selection (Alani et al., Buitelaar et al., Ding et al., Sabou et al.): SWOOGLE, OntoSelect, Watson
4 OntoSelect
- Ontology Library and Search Engine
- http://olp.dfki.de/OntoSelect
- Monitors the web for ontologies with automatic harvesting and indexing
- Browse and search
- On ontologies, classes, properties and (multilingual) labels
- Ontology search integrates Wikipedia-based relevance feedback for the search term
- Ontology publishing
- Submit ontologies; they will be integrated automatically
- Statistics
- On formats, languages, labels used, ontology publishing
Paul Buitelaar, Thomas Eigner, Thierry Declerck. OntoSelect: A Dynamic Ontology Library with Support for Ontology Selection. In: Proc. of the Demo Session at the International Semantic Web Conference, Hiroshima, Japan, Nov. 2004.
5 OntoSelect Browse
6 Ontology Search
7 Keyword as Wikipedia Topic
8 Keyword Expansion (Extraction): Relevance Feedback from Wikipedia
9 Ranked Results (Browsable)
10 Search Criteria
- Relevance criteria address ontology content, structure, status
- Coverage - Term Matching
- How many of the terms in a text collection are covered by labels for classes and properties?
- Structure - Properties Relative to Classes
- How detailed is the knowledge structure that the ontology represents?
- Connectedness - Number of Included Ontologies
- Is the ontology connected to other ontologies and how well established are these?
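The three criteria above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `OntologyStats` record, the exact formulas, and the linear weighting in `score` are hypothetical and are not taken from the OntoSelect implementation.

```python
from dataclasses import dataclass, field


@dataclass
class OntologyStats:
    """Hypothetical summary of one indexed ontology (not an OntoSelect type)."""
    labels: set[str] = field(default_factory=set)  # class and property labels
    num_classes: int = 0
    num_properties: int = 0
    num_included: int = 0  # ontologies that this ontology includes


def coverage(terms: set[str], onto: OntologyStats) -> float:
    """Fraction of the terms that match a label (case-insensitive)."""
    if not terms:
        return 0.0
    labels = {label.lower() for label in onto.labels}
    return sum(1 for term in terms if term.lower() in labels) / len(terms)


def structure(onto: OntologyStats) -> float:
    """Number of properties relative to the number of classes."""
    if onto.num_classes == 0:
        return 0.0
    return onto.num_properties / onto.num_classes


def connectedness(onto: OntologyStats) -> int:
    """Number of included ontologies."""
    return onto.num_included


def score(terms: set[str], onto: OntologyStats,
          weights: tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Assumed linear combination of the three criteria."""
    w_cov, w_str, w_con = weights
    return (w_cov * coverage(terms, onto)
            + w_str * structure(onto)
            + w_con * connectedness(onto))


example = OntologyStats(labels={"City", "Country", "Lake"},
                        num_classes=4, num_properties=6, num_included=2)
example_terms = {"city", "river", "country", "road"}
```

With these toy values, two of the four terms are matched by a label, so `coverage(example_terms, example)` is 0.5, while `structure(example)` is 1.5 and `connectedness(example)` is 2. Changing `weights` shifts the ranking toward one criterion or another.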
11 Evaluation Benchmark
- Benchmark: 15 Wikipedia topics and 57 manually assigned ontologies out of 1056 cached through OntoSelect
- 15 Wikipedia topics were selected out of the set of all (37284) class/property labels in OntoSelect, by
- Filtering out labels that did not correspond to a Wikipedia page => 5658 labels / topics
- 5658 labels were used as search terms in SWOOGLE to filter out labels that returned fewer than 10 ontologies (out of the 1056 in OntoSelect) => 3084 labels / topics
- Out of 3084 labels we manually selected useful topics, e.g. we left out very short labels ("v") and very abstract ones ("thing") => 50 topics
- We randomly selected 15 for which we manually checked the ontologies retrieved from OntoSelect and SWOOGLE => 15 topics with 57 assigned ontologies
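The selection steps above form a filtering funnel, which can be sketched as follows. This is only an illustration under stated assumptions: `has_wikipedia_page`, `ontologies_for`, and `is_useful` are hypothetical callables supplied by the caller, and `is_useful` stands in for a step that was done manually in the benchmark.

```python
import random


def select_topics(labels, has_wikipedia_page, ontologies_for, is_useful,
                  k=15, min_ontologies=10, seed=0):
    """Return (size after each step, random sample of at most k topics)."""
    # Step 1: keep labels that correspond to a Wikipedia page.
    with_page = [label for label in labels if has_wikipedia_page(label)]
    # Step 2: drop labels whose search returns fewer than min_ontologies.
    with_hits = [label for label in with_page
                 if len(ontologies_for(label)) >= min_ontologies]
    # Step 3: stand-in for the manual selection of useful topics.
    useful = [label for label in with_hits if is_useful(label)]
    # Step 4: random sample; a fixed seed keeps the draw reproducible.
    rng = random.Random(seed)
    sample = rng.sample(useful, min(k, len(useful)))
    sizes = [len(labels), len(with_page), len(with_hits), len(useful), len(sample)]
    return sizes, sample


# Toy data, invented for this example only.
toy_labels = ["v", "thing", "city", "biology", "foo", "rare"]
toy_pages = {"v", "thing", "city", "biology", "rare"}
toy_hits = {"v": 12, "thing": 50, "city": 11, "biology": 10, "foo": 20, "rare": 3}

toy_sizes, toy_sample = select_topics(
    toy_labels,
    has_wikipedia_page=lambda label: label in toy_pages,
    ontologies_for=lambda label: list(range(toy_hits[label])),
    is_useful=lambda label: len(label) > 2 and label != "thing",
    k=1,
)
```

In the benchmark itself the corresponding set sizes were 37284, 5658, 3084, 50, and 15.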
12 Evaluation Benchmark by Topic
- 15 (Wikipedia) topics with number of assigned ontologies
- Atmosphere (2)
- Biology (11)
- City (3)
- http://www.mindswap.org/2003/owl/geo/geoFeatures.owl
- http://www.glue.umd.edu/~katyn/CMSC828y/location.daml
- http://www.daml.org/2001/02/geofile/geofile-ont
- Communication (10)
- Economy (1)
- Infrastructure (2)
- Institution (1)
- Math (3)
- Military (5)
- Newspaper (2)
- Oil (0)
- Production (1)
- Publication (6)
- Railroad (1)
- Tourism (9)
13 Evaluation Experiment
- Comparison of (average) results between SWOOGLE and OntoSelect
- Use OntoSelect benchmark
- 15 topics (queries)
- 57 assigned ontologies (relevance assessments)
- 1056 ontologies (data set)
- Use different configurations for OntoSelect
- With/without keyword expansion/extraction
- With/without class names (in addition to labels)
- With/without property labels
- Weighting of relevance criteria
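A comparison of this kind can be sketched as averaging a retrieval measure over the topics for each system or configuration. The slides do not name the measure used, so precision and recall at a cutoff `k` are an assumption here, and the ontology identifiers in the toy data are hypothetical.

```python
def precision_at_k(ranked, relevant, k):
    """Share of the top-k results that are assigned to the topic."""
    if k <= 0:
        return 0.0
    return sum(1 for onto in ranked[:k] if onto in relevant) / k


def recall_at_k(ranked, relevant, k):
    """Share of assigned ontologies found in the top k; None if none are assigned."""
    if not relevant:
        return None
    return len(set(ranked[:k]) & relevant) / len(relevant)


def evaluate(run, assessments, k=10):
    """Average precision@k and recall@k over all topics.

    run:         topic -> ranked list of ontology identifiers (one system/configuration)
    assessments: topic -> set of manually assigned ontology identifiers
    """
    precisions, recalls = [], []
    for topic, relevant in assessments.items():
        ranked = run.get(topic, [])
        precisions.append(precision_at_k(ranked, relevant, k))
        recall = recall_at_k(ranked, relevant, k)
        if recall is not None:  # skip topics without assigned ontologies
            recalls.append(recall)

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return {"precision": mean(precisions), "recall": mean(recalls)}


# Toy data, invented for this example only.
toy_assessments = {"City": {"a", "b", "c"}, "Oil": set()}
toy_run = {"City": ["a", "x", "b", "y"], "Oil": ["z"]}
toy_result = evaluate(toy_run, toy_assessments, k=2)
```

Running `evaluate` once per system or per OntoSelect configuration (with or without keyword expansion, class names, property labels, and with different criterion weights) gives one pair of averages per row of a comparison. Topics without assigned ontologies, such as Oil (0), are left out of the recall average because recall is undefined for them.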
14 Evaluation Results
15 Evaluation: Weighting of title
16 Conclusions
- It is too early to draw conclusions from the evaluation
- Many more configurations (weights) to compare
- Extend the benchmark
- Comparison with other ontology search engines
- Main contribution of the presented work
- First comprehensive benchmark for topic-driven evaluation of ontology search
- (Extended) Benchmark will be made publicly available
- http://olp.dfki.de/OntoSelect