Title: Ontology-Based Free-Form Query Processing for the Semantic Web
1. Ontology-Based Free-Form Query Processing for the Semantic Web
- Mark Vickers
- Brigham Young University
- MS Thesis Defense
Supported by
2. Presentation Overview
- Web Queries
- Explanation of AskOntos
- Demo
- Evaluation
- Future Work and Conclusion
3. Web Queries: Challenges
- Example: Searching for a car
- Cannot specify constraints
- Documents returned (usually too many)
- Takes time to read through documents
- Determine relevance
- Find information (price, year, etc.)
4. Web Queries: Opportunities
- Semantic web
- Proposed ontology-based framework for making information machine-readable
- Uses markup languages to identify information
- A search program can look for only those pages that refer to a precise concept (Tim Berners-Lee)
- How should the semantic web be searched?
5. Solution: AskOntos, a Query System for the Semantic Web
- Allows free-form queries over semantically annotated pages
- Processes queries using information extraction
- Returns tables of extracted values
6. AskOntos Overview
7. Extraction Ontologies
- Object sets
- Relationship sets
- Participation constraints
- Lexical
- Non-lexical
- Primary object set
- Aggregation
- Generalization/Specialization
8. Extraction Ontologies: Data Frame
- Internal Representation: float
- Value Phrase
- Value Expression: \s\s(\d{1,3})(\.\d{2})?
- Left Context
- Key Word Phrase
- Key Word Expression: ([Pp]rice)|([Cc]ost)
- Operation Phrase
- Operator: >
- Expression: (more\sthan)|(more\scostly)
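A data frame of this kind can be pictured as a small record of regular expressions. The following is a minimal Python sketch of that idea, not the AskOntos implementation: the class name `DataFrame`, its field and method names, and the simplified value pattern are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class DataFrame:
    """Hypothetical stand-in for an extraction-ontology data frame."""
    name: str
    value_expr: str    # recognizes values of the object set
    keyword_expr: str  # recognizes words that signal the object set
    operators: dict    # operator symbol -> expression that signals it

    def find_values(self, text):
        return [m.group(0) for m in re.finditer(self.value_expr, text)]

    def has_keyword(self, text):
        return re.search(self.keyword_expr, text) is not None

    def find_operator(self, text):
        for symbol, expr in self.operators.items():
            if re.search(expr, text):
                return symbol
        return None


# Simplified patterns, assumed for illustration only.
price = DataFrame(
    name="Price",
    value_expr=r"\d{1,3}(?:,\d{3})*(?:\.\d{2})?",
    keyword_expr=r"[Pp]rice|[Cc]ost",
    operators={">": r"more\sthan|more\scostly"},
)

text = "cost more than 5,000"
print(price.has_keyword(text), price.find_values(text), price.find_operator(text))
```

Keeping the value, keyword, and operator expressions together in one record means the same object can both recognize values on a page and recognize constraints in a query.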
9. Annotating Web Pages
10. Annotating Web Pages
11. Step 1. Parse Query
Find me the price and mileage of all red Nissans. I want a 1996 or newer.
Matched terms:
- price
- mileage
- red
- Nissan
- 1996
- or newer (> operator)
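The parsing step can be sketched as running a set of recognizers over the query text. The Python below is a rough illustration under stated assumptions, not the AskOntos parser: the pattern tables `KEYWORDS`, `VALUES`, and `OPERATORS`, and the rule that an operator phrase attaches to the value it directly follows, are hypothetical.

```python
import re

# Hypothetical recognizers for a car-ads ontology (assumed for illustration).
KEYWORDS = {"Price": r"\bprice\b", "Mileage": r"\bmileage\b"}
VALUES = {
    "Color": r"\b(red|blue|green|black|white)\b",
    "Make": r"\b(Nissan|Toyota|Ford|Honda)",
    "Year": r"\b(19|20)\d{2}\b",
}
# The slide maps the phrase "or newer" to the > operator.
OPERATORS = {">": r"or\s+newer\b"}


def parse_query(query):
    """Return (requested object sets, conditions) found in a free-form query."""
    requested = [name for name, pat in KEYWORDS.items()
                 if re.search(pat, query, re.IGNORECASE)]
    conditions = []
    for name, pat in VALUES.items():
        m = re.search(pat, query, re.IGNORECASE)
        if not m:
            continue
        op = "="  # default when no operator phrase follows the value
        rest = query[m.end():]
        for symbol, op_pat in OPERATORS.items():
            if re.match(r"\s*" + op_pat, rest, re.IGNORECASE):
                op = symbol
        conditions.append((name, op, m.group(0)))
    return requested, conditions


query = "Find me the price and mileage of all red Nissans. I want a 1996 or newer."
print(parse_query(query))
```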
12. Step 2. Find Related Ontology
Find me the price and mileage of all red Nissans. I want a 1996 or newer.
- Similarity value: 5
- Similarity value: 2
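One way to picture ontology selection is to score each ontology by how many of its recognizers fire on the query and keep the highest score. The slides give only the resulting similarity values, so the scoring rule below (a simple count of matching patterns) is an assumption, and the two pattern lists are hypothetical toy data chosen so the example produces scores of 5 and 2.

```python
import re

# Hypothetical recognizer sets; each ontology is reduced to a list of patterns.
ONTOLOGIES = {
    "Car": [r"\bprice\b", r"\bmileage\b", r"\bred\b", r"\bNissan", r"\b(19|20)\d{2}\b"],
    "House": [r"\bprice\b", r"\bbedrooms?\b", r"\b(19|20)\d{2}\b"],
}


def similarity(patterns, query):
    """Assumed scoring rule: number of recognizers that match the query."""
    return sum(1 for p in patterns if re.search(p, query, re.IGNORECASE))


def best_ontology(query):
    scores = {name: similarity(pats, query) for name, pats in ONTOLOGIES.items()}
    return max(scores, key=scores.get), scores


query = "Find me the price and mileage of all red Nissans. I want a 1996 or newer."
print(best_ontology(query))
```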
13. Step 3. Formulate XQuery Expression
- Conjunctive and aggregate queries run over the selected ontology's extracted values
- Value-phrase-matching words determine conditions
- Conditions
- Color = red
- Make = Nissan
- Year > 1996 (> operator)
14. Step 3. Formulate XQuery Expression
- For
- Let
- Where
- Return
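The four clauses can be assembled by simple string templating once the return names and conditions are known. The Python sketch below is illustrative only: the function name `build_xquery`, the file URI in the usage line, and the `let` bindings of the form `$Record/Name` are assumptions; only the overall for/let/where/return shape and the `or empty(...)` condition pattern come from the slides.

```python
def build_xquery(document_uri, return_names, conditions):
    """Assemble a for/let/where/return expression as text.

    conditions is a list of (name, operator, value) triples.
    """
    # Bind every name that is returned or constrained (assumed data layout).
    names = list(dict.fromkeys(list(return_names) + [c[0] for c in conditions]))
    lets = "\n".join(f"let ${n} := $Record/{n}" for n in names)
    where = " and\n      ".join(
        f'(${name} {op} "{value}" or empty(${name}))'
        for name, op, value in conditions
    )
    fields = "\n".join(f"  <{n}>{{${n}}}</{n}>" for n in return_names)
    return (
        f'for $doc in document("{document_uri}")/rdf:RDF\n'
        f"for $Record in $doc/owl:Thing\n"
        f"{lets}\n"
        f"where {where}\n"
        f"return <Record>\n{fields}\n</Record>"
    )


# Hypothetical file location, used only for this example.
print(build_xquery(
    "file:///data/Car.owl",
    ["Price", "Mileage"],
    [("Color", "=", "red"), ("Make", "=", "Nissan"), ("Year", ">", "1996")],
))
```

The `or empty(...)` disjunct keeps records that lack a value for a constrained object set instead of discarding them.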
15. Step 4. Run XQuery Expression Over Ontology's Extracted Data
- Uses Qexo 1.7, GNU's XQuery engine for Java
- Orders results according to number of values
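Ordering by number of values can be sketched as a sort on how many fields of each returned record are filled in. This is a guess at the idea rather than the system's code: the descending direction, the dictionary record layout, and the function name `order_by_value_count` are assumptions.

```python
def order_by_value_count(records):
    """Sort records so those with the most non-empty values come first."""
    def value_count(record):
        return sum(1 for v in record.values() if v not in (None, ""))
    # sorted() is stable, so records with equal counts keep their input order.
    return sorted(records, key=value_count, reverse=True)


# Hypothetical extracted records.
records = [
    {"Price": "5000", "Color": "red", "Year": ""},
    {"Price": "4000", "Color": "red", "Year": "1996"},
    {"Price": "", "Color": "red", "Year": ""},
]
for record in order_by_value_count(records):
    print(record)
```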
16. Demo
17. Evaluation of AskOntos
- Success measure: ability to translate free-form queries into formal queries
- Extraction ontologies: car ads, house ads, countries, movies, and diamond ads
- 3 rounds of testing
- 50 queries each (gathered from other CS students)
- 1st round discarded due to queries
- Minor improvements to the system between rounds
18. Query Translation Metrics
Query: Find me the price and mileage of all red Nissans. I want a 1996 or newer.

Automated conversion
    for $doc in document("file:///.../Car.OWL")/rdf:RDF
    for $Record in $doc/owl:Thing
    where ($Color = "red" or empty($Color)) and
          ($Make = "Nissan" or empty($Make)) and
          ($Year = "1996" or empty($Year))
    return <Record ID="{$id}">
        <Price>{$Price}</Price>
        <Color>{$Color}</Color>
        <Make>{$Make}</Make>
        <Year>{$Year}</Year>
    </Record>
- Return-Clause Names: Price, Color, Make, Year
- Conditions: (Color, =, red), (Make, =, Nissan), (Year, =, 1996)

Human conversion
- Return-Clause Names: Price, Mileage, Color, Make, Year
- Conditions: (Color, =, red), (Make, =, Nissan), (Year, >, 1996)

                     Precision  Recall
Return-Clause Names  100%       80%
Conditions           66%        66%
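The precision and recall figures on this slide follow from comparing the automated conversion against the human conversion as sets. The short Python sketch below reproduces that arithmetic; the function name `precision_recall` is an assumption, and the condition triples write the equality operator as "=" and the greater-than operator as ">".

```python
def precision_recall(automated, human):
    """Precision and recall of the automated items against the human items."""
    automated, human = set(automated), set(human)
    correct = automated & human
    precision = len(correct) / len(automated) if automated else 0.0
    recall = len(correct) / len(human) if human else 0.0
    return precision, recall


human_names = ["Price", "Mileage", "Color", "Make", "Year"]
auto_names = ["Price", "Color", "Make", "Year"]
human_conds = [("Color", "=", "red"), ("Make", "=", "Nissan"), ("Year", ">", "1996")]
auto_conds = [("Color", "=", "red"), ("Make", "=", "Nissan"), ("Year", "=", "1996")]

print(precision_recall(auto_names, human_names))
print(precision_recall(auto_conds, human_conds))
```

The automated conversion misses Mileage, which lowers recall on return-clause names to 4 of 5, and it gets the Year operator wrong, so only 2 of the 3 conditions match in either direction (2/3, shown on the slide as 66).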
19. Results
20. Result Analysis
- Common reasons for errors
- 1. Word not in lexicon
- 5 Bedrooms, 3 Bath, study, game room, 2 car garage, and < 250,000
21. Result Analysis
- 2. Mistakes in regular expressions
- Which countries use the euro?
22. Result Analysis
- What are the models from 2005?
23. Conclusion/Contributions
- AskOntos
- Is a free-form query system for the semantic web
- Applies information extraction for query processing
- Answers questions with extracted data values
- Contributions
- Web queries that use semantic annotations
- Web queries returning answers from extracted data
- Processing free-form queries using ontologies
24. Future Work
- Disjunction and negation
- Fuzzy queries
- Spellchecker
26. TREC 2004 QA Question Topics
27. Related Research
QUEST (1999)
- Similarities: Uses ontologies
- Differences: Graphic-based interface; returns generated documents and graphs
SHOE (2000)
- Similarities: Returns tables of data
- Differences: Form-based interface
AQUA (2004)
- Similarities: Natural language interface; uses ontology as part of query translation process
- Differences: For single-domain environment; part-of-speech recognition; uses ontology for term replacement; returns passages
28. Related Research
Bernstein et al. (2005)
- Similarities: Natural language interface
- Differences: Allows only a subset of English (Attempto Controlled English) queries
SWSE (2005)
- Similarities: Natural language interface; returns semantically annotated data; no part-of-speech recognition
- Differences: Query context found by matching RDF labels, comments, and literals; uses WordNet
NaLIX (2006)
- Similarities: Converts natural language query to same XML query language
- Differences: Limited to parsing ability of MINIPAR; for XML database; query terms expanded with WordNet
29. Simple Multiple-Record Documents
Genealogy domain from Troy Walker's thesis
Highest-Fanout Separator
VSM Separator
document  records  returned  correct  precision (%)  recall (%)
simple1   19       20        19       95.00          100.00
simple2   19       17        17       100.00         89.47
simple3   11       11        11       100.00         100.00
simple4   9        9         9        100.00         100.00
simple5   12       13        11       84.62          91.67
simple6   12       11        10       90.91          83.33
simple7   14       10        10       100.00         71.43
simple8   5        7         5        71.43          100.00
simple9   14       14        14       100.00         100.00
simple10  15       15        15       100.00         100.00
Total     130      127       121      95.28          93.08
document  records  returned  correct  precision (%)  recall (%)
simple1   19       22        19       86.36          100.00
simple2   19       20        0        0.00           0.00
simple3   11       14        11       78.57          100.00
simple4   9        10        9        90.00          100.00
simple5   12       16        12       75.00          100.00
simple6   12       23        9        39.13          75.00
simple7   14       22        13       59.09          92.86
simple8   5        10        0        0.00           0.00
simple9   14       16        14       87.50          100.00
simple10  15       16        0        0.00           0.00
Total     130      169       87       51.48          66.92
30. Complex Multiple-Record Documents
document   records  returned  missed  extra  correct  precision (%)  recall (%)
complex1   10       10        0       0      10       100.00         100.00
complex2   15       15        0       0      15       100.00         100.00
complex3   12       12        0       0      12       100.00         100.00
complex4   7        9         1       3      6        66.67          85.71
complex5   16       15        1       0      15       100.00         93.75
complex6   15       16        2       3      13       81.25          86.67
complex7   13       12        1       0      12       100.00         92.31
complex8   10       10        0       0      10       100.00         100.00
complex9   19       20        1       2      18       90.00          94.74
complex10  10       10        1       1      9        90.00          90.00
complex11  15       11        4       0      11       100.00         73.33
complex12  15       15        0       0      15       100.00         100.00
complex13  11       11        0       0      11       100.00         100.00
complex14  16       18        1       3      15       83.33          93.75
complex15  8        8         2       2      6        75.00          75.00
complex16  8        9         0       1      8        88.89          100.00
complex17  10       11        0       0      11       100.00         110.00
complex18  4        1         3       0      1        100.00         25.00
complex19  8        11        0       3      8        72.73          100.00
complex20  16       13        4       1      12       92.31          75.00
Total      238      237       21      19     218      91.98          91.60
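The precision and recall columns in these record-separation tables are consistent with precision = correct / returned and recall = correct / records, both expressed as percentages. The Python sketch below recomputes the Total rows under that reading; the function name `precision_recall_from_counts` is an assumption.

```python
def precision_recall_from_counts(records, returned, correct):
    """Percent precision and recall from record counts, rounded to 2 places."""
    precision = 100.0 * correct / returned
    recall = 100.0 * correct / records
    return round(precision, 2), round(recall, 2)


# Total row of the complex-document table: 238 records, 237 returned, 218 correct.
print(precision_recall_from_counts(238, 237, 218))
# Total rows of the two simple-document tables.
print(precision_recall_from_counts(130, 127, 121))
print(precision_recall_from_counts(130, 169, 87))
```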
31. Scaling to the Web
- Ontologies crawl and harvest web pages
- Ontologies extract values from pages
- Ontologies indexed
- Queries extracted by relevant ontologies
- Rely on Google-like technology