Title: Development and Alignment of a Domain-Specific Ontology for Question Answering
- Shiyan Ou (1), Viktor Pekar (1), Constantin Orasan (1), Christian Spurk (2), Matteo Negri (3)
- (1) Research Group in Computational Linguistics, University of Wolverhampton, UK
- (2) German Research Centre for Artificial Intelligence GmbH (DFKI), Germany
- (3) Fondazione Bruno Kessler (FBK), Italy
2 Structure
- Introduction to QALL-ME
- The QALL-ME ontology
- Alignment to WordNet and SUMO
- How the ontology is used for data encoding
- Conclusions
3 Introduction to QALL-ME
- QALL-ME (Question Answering Learning technologies Multilingual Multimodal Environment) is an EU-funded project which aims to establish a shared infrastructure for multilingual and multimodal question answering in the tourism domain.
- Project website: http://qallme.fbk.eu/
- In the QALL-ME system:
  - users pose natural language questions in several languages (in both textual and speech modalities) using a variety of input devices (e.g. mobile phones), and
  - the system returns a list of specific answers formatted in the most appropriate modality, ranging from short texts to maps, videos, and pictures.
- A domain-specific ontology for the tourism domain was developed and shared among all the partners.
4 The ontology in the project
- QALL-ME ontology
- WP3: Multilingual question interpretation
- WP4: Annotation of entities; indexing of data; retrieval of data
- WP5: Multilingual answer extraction
- WP9: Evaluation
- See more in O39 Multilingual Resources (Ambasadeurs) at 13:05
5 Design of the ontology
- Analysis of data from content providers
- Analysis of user requirements
- Inspired by similar ontologies such as Harmonise, eTourism, Hi-Touch, TAGA, GETESS
  - Harmonise and eTourism focus on static information (e.g. accommodation and events/activities), rather than on dynamic information related to the travel business (e.g. customers and itineraries), as the TAGA and Hi-Touch ontologies do.
  - Similar to eTourism in that it is written in OWL rather than RDFS
  - but with wider coverage than any individual existing ontology
- Introspection
6 Technical details of the ontology
- Encoded using OWL DL, since it has more expressive power than OWL Lite and more efficient reasoning support than OWL Full
- Used Protégé-OWL as the editor and RacerPro as the reasoner
- The ontology contains:
  - 122 classes (concepts),
  - 55 datatype properties, and
  - 52 object properties, which indicate the relationships among the 122 classes.
  - 15 top-level classes.
- The class hierarchy has a maximum depth of 4.
7 Part of the ontology (cinema/movies)
8 Ontology alignment
- The QALL-ME ontology was designed as a model of the narrow knowledge domain of tourism.
- The QALL-ME ontology was complemented with information from WordNet (and implicitly MultiWordNet) and SUMO via alignment.
- The QALL-ME ontology is still changing, so fully manual alignment was not a solution.
- Fully automatic alignment is not precise enough, but semi-automatic alignment may be a solution.
9 Ontology alignment (II)
- The alignment relied on:
  - String similarity of element identifiers (e.g. chalet -> chalet_1, SiteFacilityForChildren -> facility_)
  - Structural similarity for disambiguation (e.g. using the semantic distance to already aligned concepts)
  - Definition similarity for disambiguation (similarity between comments in the ontology and WordNet glosses is used)
  - Structural similarity for unmatched concepts, calculated against all the nouns in WordNet
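The string-similarity step listed above can be illustrated with a small sketch. This is not the project's actual algorithm: the tokenisation rule, the `difflib` ratio, the 0.8 threshold, and the toy lemma list are all assumptions chosen for illustration. It shows how an identifier such as SiteFacilityForChildren could be matched to a lemma like "facility" by comparing each CamelCase token against candidate lemmas.

```python
import re
from difflib import SequenceMatcher

# Hypothetical toy lemma list standing in for WordNet noun lemmas.
LEMMAS = ["chalet", "facility", "cinema", "hotel"]


def split_identifier(identifier):
    """Split a CamelCase or underscore identifier into lowercase tokens."""
    parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+|\d+", identifier)
    return [p.lower() for p in parts]


def similarity(a, b):
    """Character-level similarity in [0, 1] (difflib ratio, an assumed measure)."""
    return SequenceMatcher(None, a, b).ratio()


def best_match(identifier, lemmas, threshold=0.8):
    """Return (lemma, score) for the lemma closest to the identifier or one of its tokens.

    Returns (None, score) when the best score is below the threshold.
    """
    tokens = split_identifier(identifier)
    candidates = ["".join(tokens)] + tokens
    best_lemma, best_score = None, 0.0
    for lemma in lemmas:
        for candidate in candidates:
            score = similarity(candidate, lemma)
            if score > best_score:
                best_lemma, best_score = lemma, score
    if best_score < threshold:
        return None, best_score
    return best_lemma, best_score


if __name__ == "__main__":
    for name in ["chalet", "SiteFacilityForChildren"]:
        print(name, "->", best_match(name, LEMMAS))
```

A match found this way only proposes a lemma; choosing among the senses of that lemma is the job of the structural and definition similarity steps.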
10 Ontology alignment (III)
- The overall accuracy of the fully automatic alignment is clearly suboptimal (precision of 49% and recall of 31%).
- Error analysis:
  - For concept names with unambiguous matches in WordNet, the algorithm performs without any errors.
  - The poor disambiguation performance is due to the very different depths of the two ontologies.
  - Only a few concepts have comments which are useful for definition similarity.
- Semi-automatic alignment requires under 30 minutes to obtain a perfect alignment.
11 Example of alignment
QALL-ME | SUMO | WN2.1 | WN2.1 gloss
Accommodation | _at_inhabits | 02647858 | living quarters provided for public convenience; "overnight accommodations are available"
Chalet | _at_Building | 02973228 | a Swiss house with a sloping roof and wide eaves or a house built in this style
PostOffice | _at_Organization | 08034771 | an independent agency of the federal government responsible for mail delivery
12 Semantic annotation and database organization
- The ontology was used to encode the data
- Annotated data from the content providers was converted to RDF triples
- The RDF documents can be stored in databases or plain text files
- The Jena RDF API was used for the operations
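As a rough illustration of converting an annotated record into RDF triples, the sketch below uses plain Python rather than Jena (which is a Java library). The `DATA` base IRI, the record identifiers, the sample values, and the helper names `Ref`, `Literal`, `record_to_triples` and `to_ntriples` are all hypothetical. The ontology namespace is taken from the query slide, with the trailing `#` assumed.

```python
from typing import NamedTuple

# Ontology namespace from the query slide; the trailing '#' is an assumption.
QME = "http://qallme.itc.it/ontology/qallme-tourism.owl#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DATA = "http://example.org/data/"  # hypothetical base IRI for instances


class Ref(NamedTuple):
    """Reference to another record (becomes an IRI object)."""
    record_id: str


class Literal(NamedTuple):
    """Plain string literal (becomes a quoted object)."""
    value: str


def record_to_triples(record_id, class_name, fields):
    """Turn one annotated record into (subject, predicate, object) triples."""
    subject = DATA + record_id
    triples = [(subject, RDF_TYPE, QME + class_name)]
    for prop, value in fields.items():
        if isinstance(value, Ref):
            obj = DATA + value.record_id
        else:
            obj = Literal(str(value))
        triples.append((subject, QME + prop, obj))
    return triples


def to_ntriples(triples):
    """Serialise triples in N-Triples syntax, one statement per line."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            escaped = o.value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up record for illustration only.
    record = {"name": "Example Cinema", "hasPostalAddress": Ref("address1")}
    print(to_ntriples(record_to_triples("cinema1", "Cinema", record)))
```

The resulting text can be written to a plain file or loaded into a triple store, which corresponds to the two storage options mentioned above.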
13 Semantic annotation and database organization
14 Content retrieval
- SPARQL is used for retrieval
- SPARQL is a query language for accessing RDF graphs, defined by the W3C RDF Data Access Working Group
- SPARQL provides interoperability between languages
15
- What movie starring Halle Berry is on in Birmingham?
- Class: MovieShow
  - Property: isInSite, Range: Cinema
    - Property: hasPostalAddress, Range: PostalAddress
      - Property: isInDestination, Range: Destination
        - Property: name, Range: string <Birmingham>
  - Property: hasEventContent, Range: Movie
    - Property: name, Range: string <unknown>
    - Property: hasStar, Range: Star
      - Property: name, Range: string <Halle Berry>
16
PREFIX qme: <http://qallme.itc.it/ontology/qallme-tourism.owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?movieName
WHERE {
  ?MovieShow qme:isInSite ?Cinema .
  ?Cinema qme:hasPostalAddress ?PostalAddress .
  ?PostalAddress qme:isInDestination ?Destination .
  ?Destination qme:name "Birmingham"^^xsd:string .
  ?MovieShow qme:hasEventContent ?Movie .
  ?Movie qme:name ?movieName .
  ?Movie qme:hasStar ?Star .
  ?Star qme:name "Halle Berry"^^xsd:string .
}
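To make the behaviour of the query above concrete, the following sketch evaluates the same chain of triple patterns against a tiny in-memory graph. It is a toy pattern matcher written for illustration, not a SPARQL engine and not the project's retrieval code. The sample triples, the identifiers such as `show1`, and the movie titles are made up, and prefixes and datatypes are omitted for brevity.

```python
# Made-up toy graph: (subject, predicate, object) with plain strings.
TRIPLES = [
    ("show1", "isInSite", "cinema1"),
    ("cinema1", "hasPostalAddress", "addr1"),
    ("addr1", "isInDestination", "dest1"),
    ("dest1", "name", "Birmingham"),
    ("show1", "hasEventContent", "movie1"),
    ("movie1", "name", "Movie A"),
    ("movie1", "hasStar", "star1"),
    ("star1", "name", "Halle Berry"),
    ("show2", "isInSite", "cinema2"),
    ("cinema2", "hasPostalAddress", "addr2"),
    ("addr2", "isInDestination", "dest2"),
    ("dest2", "name", "Wolverhampton"),
    ("show2", "hasEventContent", "movie2"),
    ("movie2", "name", "Movie B"),
    ("movie2", "hasStar", "star1"),
]

# The triple patterns of the query; terms starting with '?' are variables.
PATTERNS = [
    ("?MovieShow", "isInSite", "?Cinema"),
    ("?Cinema", "hasPostalAddress", "?PostalAddress"),
    ("?PostalAddress", "isInDestination", "?Destination"),
    ("?Destination", "name", "Birmingham"),
    ("?MovieShow", "hasEventContent", "?Movie"),
    ("?Movie", "name", "?movieName"),
    ("?Movie", "hasStar", "?Star"),
    ("?Star", "name", "Halle Berry"),
]


def match_pattern(pattern, triple, bindings):
    """Unify one pattern with one triple; return extended bindings or None."""
    extended = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or check consistency with an earlier binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def query(patterns, triples):
    """Join all patterns, returning every consistent set of variable bindings."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match_pattern(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


if __name__ == "__main__":
    print([s["?movieName"] for s in query(PATTERNS, TRIPLES)])  # prints ['Movie A']
```

Only the show in Birmingham survives the join: the second show features the same star, but its destination fails the name constraint.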
17 Conclusions
- The QALL-ME ontology was specifically designed for the domain of tourism
- The ontology plays an important role in several parts of the project
- The ontology went through several revisions before reaching its current stage (and it may change again!)
- Both the ontology and its alignment to WordNet and SUMO will be made freely available on the project's website
18 Thank you!
Project website: http://qallme.fbk.eu/