Evaluation of Partial Path Queries on XML Data

Slides: 56
Provided by: TD1

1
Evaluation of Partial Path Queries on XML Data
Stefanos Souldatos
2
Evaluation of Partial Path Queries on XML Data
  • Querying XML data
  • Partial path queries
  • Query evaluation
  • Experiments
  • Conclusion

3
Difficulties on querying XML Data
Creta
4
Difficulties on querying XML Data
Search for hotel
Name: Xiaoying Wu
Place: Athens Center, Heraklio
Purpose: Sightseeing
→ structural difference
Parthenon (438 BC)
Phaistos Disk (1700 BC)
Creta
5
Difficulties on querying XML Data
Search for hotel
Name: Theodore Dalamagas
Place: Islands
Purpose: Sea sports
→ structural inconsistency
Windsurf
Jet ski
Creta
6
Difficulties on querying XML Data
Search for hotel
Name: Dimitri Theodoratos
Place: Heraklio
Purpose: HDMS Conference
→ unknown structure
HDMS 2008
Creta
7
Difficulties on querying XML Data
Search for hotel
Name: Stefanos Souldatos
Place: Any island
Purpose: Escape from PhD!
→ multiple sources
Creta
?
theHotel.gr
1400 islands
hotels.gr
holidays.gr
8
Difficulties on querying XML Data
Q1. Can we use XPath to express our queries?
Q2. Can we use existing techniques to evaluate our queries?
Creta
9
Can we Use XPath?
Path queries expressed in XPath
[Figure: scale "% of structure" from 0 to 100, labeled with keyword search and path queries]
10
Can we Use XPath?
Path queries expressed in XPath
[Figure: scale "% of structure" from 0 to 100, labeled with keyword search and path queries]
1. //City descendant-or-self ancestor-or-self Island
2. /City//Island
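As a rough illustration of why a fully specified path query can miss data, the sketch below runs a descendant-axis query over two tiny documents with Python's standard `xml.etree.ElementTree`. The two documents and the helper names `city_then_island` and `same_path` are made up for this example. ElementTree implements only a subset of XPath (no reverse axes), so the "same path, either order" condition is emulated here with two forward queries rather than with the axes named on the slide.

```python
import xml.etree.ElementTree as ET

# Two hypothetical sources that nest the same elements in opposite order.
doc_a = ET.fromstring("<Greece><City><Island><Hotel/></Island></City></Greece>")
doc_b = ET.fromstring("<Greece><Island><City><Hotel/></City></Island></Greece>")

def city_then_island(root):
    # Path query: an Island somewhere below a City.
    return root.findall(".//City//Island")

def same_path(root):
    # Partial-path-style condition: City and Island on one path, in any order.
    return root.findall(".//City//Island") + root.findall(".//Island//City")

print(len(city_then_island(doc_a)), len(city_then_island(doc_b)))
print(len(same_path(doc_a)), len(same_path(doc_b)))
```

The fixed-order query matches only the first document, while the order-free condition matches both.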
11
Can we Use XPath?
Path queries expressed in XPath
[Figure: scale "% of structure" from 0 to 100, labeled with keyword search, partial path queries, and path queries]
12
Evaluation of Partial Path Queries on XML Data
  • Querying XML data
  • Partial path queries
  • Query evaluation
  • Experiments
  • Conclusion

13
Partial Path Queries
  • Query processing
  • Full form (13 inference rules)
  • Unsatisfiability (cycles)
  • Redundant nodes (4 patterns)
  • Canonical form
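
The unsatisfiability item in the list above can be pictured as cycle detection: if the structural relationships of a query force a node to lie both above and below another node, no XML tree can match. A minimal sketch, assuming the query is given as a directed graph of (ancestor, descendant) edges; this representation and the name `is_unsatisfiable` are assumptions of the example, not the presentation's notation.

```python
def is_unsatisfiable(edges):
    """Return True if the directed query graph contains a cycle."""
    graph = {}
    for parent, child in edges:
        graph.setdefault(parent, []).append(child)
        graph.setdefault(child, [])

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, done
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for nxt in graph[node]:
            if color[nxt] == GRAY:  # back edge: nxt is its own ancestor
                return True
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)

print(is_unsatisfiable([("r", "a"), ("a", "b"), ("r", "b")]))
print(is_unsatisfiable([("a", "b"), ("b", "c"), ("c", "a")]))
```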

22
Evaluation of Partial Path Queries on XML Data
  • Querying XML data
  • Partial path queries
  • Query evaluation
  • Experiments
  • Conclusion

23
Naive Techniques
NT1. Producing all possible path queries
24
Naive Techniques
NT1. Producing all possible path queries
[Figure: query graph with nodes r, a, c, b, d, e, f, g]
25
Naive Techniques
NT1. Producing all possible path queries
[Figure: four path queries, each over nodes r, a, c, b, d, e, f, g]
26
Naive Techniques
NT1. Producing all possible path queries
→ too many queries to evaluate
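
To get a feel for how quickly NT1 blows up, the sketch below enumerates every total order of the query nodes that respects the query's edges, which approximates "all possible path queries". The edge list `EDGES` is hypothetical (the slide's figure did not survive extraction), the function name `all_path_queries` is invented, and the sketch ignores the case where two query nodes map to the same XML node.

```python
def all_path_queries(nodes, edges):
    """Yield every total order of `nodes` that respects all `edges`."""
    preds = {n: set() for n in nodes}
    for u, v in edges:
        preds[v].add(u)

    def extend(prefix, remaining):
        if not remaining:
            yield tuple(prefix)
            return
        placed = set(prefix)
        for n in sorted(remaining):
            # A node may be placed once all of its predecessors are placed.
            if preds[n] <= placed:
                yield from extend(prefix + [n], remaining - {n})

    yield from extend([], frozenset(nodes))

# Hypothetical query graph: b/c and e/f are unordered with respect to each other.
EDGES = [("r", "a"), ("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "g"), ("f", "g")]
queries = list(all_path_queries("rabcdefg", EDGES))
print(len(queries))
```

With two unordered pairs this gives four path queries; a query with n mutually unordered nodes would give n! of them.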
27
Naive Techniques
NT2. Decomposing into binary relationships
28
Naive Techniques
NT2. Decomposing into binary relationships
[Figure: the query decomposed into binary relationships over nodes r, a, b, c, d, e, f, g, labeled "Stack-Tree-Desc or PathStack"]
29
Naive Techniques
NT2. Decomposing into binary relationships
[Figure: the binary relationships over nodes r, a, b, c, d, e, f, g, labeled "Merge-join"]
30
Naive Techniques
NT2. Decomposing into binary relationships
→ intermediate results
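
For readers unfamiliar with how a single binary ancestor-descendant relationship is evaluated, here is a simplified stack-based structural join in the spirit of Stack-Tree-Desc. It is a sketch, not a faithful reproduction of that algorithm. The region encoding (`start`, `end`), the `Node` type, and the function name `structural_join` are assumptions of this example.

```python
from collections import namedtuple

# Region encoding (an assumption of this sketch): start/end positions from a
# document-order traversal, so x is an ancestor of y iff
# x.start < y.start and y.end < x.end.
Node = namedtuple("Node", "label start end")

def structural_join(ancestors, descendants):
    """Return (ancestor, descendant) pairs; both inputs sorted by start."""
    pairs, stack = [], []
    i = j = 0
    while j < len(descendants) and (i < len(ancestors) or stack):
        if i < len(ancestors) and ancestors[i].start < descendants[j].start:
            a = ancestors[i]
            # Drop stacked ancestors that end before this one begins.
            while stack and stack[-1].end < a.start:
                stack.pop()
            stack.append(a)
            i += 1
        else:
            d = descendants[j]
            while stack and stack[-1].end < d.start:
                stack.pop()
            # Everything left on the stack encloses d.
            pairs.extend((anc, d) for anc in stack)
            j += 1
    return pairs

a_nodes = [Node("a", 2, 7)]
b_nodes = [Node("b", 3, 4), Node("b", 5, 6), Node("b", 8, 9)]
result = structural_join(a_nodes, b_nodes)
print(result)
```

Every binary relationship of the query produces such a list of pairs, and these lists are the intermediate results that the later merge-join has to combine.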
31
Naive Techniques
NT3. Decomposing into root-to-leaf paths
32
Naive Techniques
NT3. Decomposing into root-to-leaf paths
PathStack
33
Naive Techniques
NT3. Decomposing into root-to-leaf paths
→ overlapping between paths → intermediate results
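
The overlap that NT3 suffers from can be seen by listing the root-to-leaf paths of a query graph. The sketch below assumes an acyclic query given as (parent, child) edges; the edge list `EDGES` and the name `root_to_leaf_paths` are hypothetical, since the slide's figure is not available.

```python
def root_to_leaf_paths(edges, root):
    """List every path from `root` to a node without outgoing edges."""
    children = {}
    for u, v in edges:
        children.setdefault(u, []).append(v)

    def walk(node, prefix):
        prefix = prefix + [node]
        if node not in children:  # leaf of the query graph
            yield prefix
            return
        for child in children[node]:
            yield from walk(child, prefix)

    return list(walk(root, []))

# Hypothetical query graph with two diamonds (a-b/c-d and d-e/f-g).
EDGES = [("r", "a"), ("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "g"), ("f", "g")]
paths = root_to_leaf_paths(EDGES, "r")
print(len(paths), sum(len(p) for p in paths))
```

Here eight query nodes yield four paths with 24 node occurrences in total, so shared nodes such as r, a, d, and g are evaluated repeatedly before the path results are joined.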
34
Advanced Techniques
PartialMJ. Using a spanning tree
35
Advanced Techniques
PartialMJ. Using a spanning tree
Remove edges to create a spanning tree
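
A minimal sketch of this step, assuming the query is a rooted directed acyclic graph given as (parent, child) edges. Which edges the presentation actually removes is not stated; the breadth-first choice, the edge list `EDGES`, and the name `spanning_tree` are assumptions of this example.

```python
from collections import deque

def spanning_tree(edges, root):
    """Split `edges` into spanning-tree edges and removed (non-tree) edges."""
    children = {}
    for u, v in edges:
        children.setdefault(u, []).append(v)

    tree, removed, seen = [], [], {root}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in children.get(u, []):
            if v in seen:
                removed.append((u, v))  # v already has a tree parent
            else:
                seen.add(v)
                tree.append((u, v))
                queue.append(v)
    return tree, removed

# Hypothetical query graph; d and g each have two incoming edges.
EDGES = [("r", "a"), ("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "g"), ("f", "g")]
tree, removed = spanning_tree(EDGES, "r")
print(removed)
```

The relationships carried by the removed edges still have to hold in every answer, so they must be checked again when the results of the tree's paths are combined.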
36
Advanced Techniques
PartialMJ. Using a spanning tree
37
Advanced Techniques
PartialMJ. Using a spanning tree
PathStack
38
Advanced Techniques
PartialMJ. Using a spanning tree
Join conditions (identity, structural, path)
39
Advanced Techniques
PartialMJ. Using a spanning tree
Join conditions (identity, structural, path)
40
Advanced Techniques
PartialMJ. Using a spanning tree
Join conditions (identity, structural, path)
41
Advanced Techniques
PartialMJ. Using a spanning tree
→ overlapping between paths → intermediate results
42
Advanced Techniques
PartialPathStack. Employ a topological order
[Figure: query graph with nodes r, a, c, b, d, e, f, g]
43
Advanced Techniques
PartialPathStack. Employ a topological order
PartialPathStack
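
A topological order of the query nodes can be computed with Kahn's algorithm, sketched below. This shows only the ordering step, not PartialPathStack itself; the edge list `EDGES` and the name `topological_order` are hypothetical.

```python
from collections import deque

def topological_order(nodes, edges):
    """Return the nodes so that every edge goes from earlier to later."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for u, v in edges:
        children[u].append(v)
        indegree[v] += 1

    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in children[u]:
            indegree[v] -= 1
            if indegree[v] == 0:  # all predecessors of v are already ordered
                ready.append(v)

    if len(order) != len(nodes):
        raise ValueError("query graph has a cycle")
    return order

# Hypothetical query graph over the node labels shown on the slide.
EDGES = [("r", "a"), ("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "g"), ("f", "g")]
order = topological_order(list("rabcdefg"), EDGES)
print(order)
```

A graph is usually ordered in more than one valid way; for this edge list, r, a, c, b, d, e, f, g would be equally acceptable.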
44
Advanced Techniques
  • PathStack
      • Path queries
      • Indegree ≤ 1
      • Outdegree ≤ 1
      • O(input + output)
  • PartialPathStack
      • Partial path queries
      • Indegree ≥ 1
      • Outdegree ≥ 1
      • O(input × indegree + output × outdegree)

45
Evaluation of Partial Path Queries on XML Data
  • Querying XML data
  • Partial path queries
  • Query evaluation
  • Experiments
  • Conclusion

46
Queries for Experiments
Q1/Q5
Q2/Q6
Q3/Q7
Q4/Q8
47
Experiment 1 (fixed datasets)
Execution time on fixed datasets
  • Synthetic Data
      • IBM AlphaWorks XML generator
      • 2.5 million nodes
  • Benchmark Data
      • Treebank
      • 2.5 million nodes

48
Experiment 1 (fixed datasets)
Treebank
49
Experiment 1 (fixed datasets)
Treebank
path queries
50
Experiment 1 (fixed datasets)
Treebank
too many results
51
Experiment 1 (fixed datasets)
Synthetic
52
Experiment 2 (size of dataset)
Execution time varying the size of the tree
  • Synthetic
      • IBM AlphaWorks XML generator
      • 1 - 3 million nodes

53
Experiment 2 (size of dataset)
[Figure: execution time of PartialMJ and PartialPathStack, one panel each for Q2, Q3, and Q7]
54
Evaluation of Partial Path Queries on XML Data
  • Querying XML data
  • Partial path queries
  • Query evaluation
  • Experiments
  • Conclusion

55
Conclusion
  • Partial path queries
  • PartialPathStack

56
Future Work
57
Questions?
  • Querying XML data
  • Partial path queries
  • Query evaluation
  • Experiments
  • Conclusion
