Title: VOX Order-sensitive View Maintenance of Materialized XQuery Views
1VOX Order-sensitive View Maintenance of Materialized XQuery Views
- ER 2003, October 14th, 2003
- Katica Dimitrova, Maged El-Sayed and Elke Rundensteiner
- Worcester Polytechnic Institute
- Now at Microsoft
2Motivation
- Views in general
- Information integration
- Access control, privacy, etc.
- Data warehouses
- XML views (especially useful)
- Information inter-portability
- Crossing gaps between different data models
- Materialized Views
- Fast access over complex views
- Increased availability
- Query optimization
[Figure: a view defined by a view definition query over RDB, XML, and other sources]
3Maintaining Materialized Views
When sources are updated, a materialized view may become inconsistent.
- Methods of view maintenance
- Recomputation
- recompute the view from scratch from the base data
- Incremental view maintenance
- compute changes to the view in response to changes to the base sources
Incremental view maintenance is usually cheaper than full recomputation.
[Figure: an update to one of Source 1, Source 2, Sources 3..n reaches the view through the view definition query as a view update]
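The two maintenance methods can be contrasted with a minimal sketch. It is not the VOX algorithm: the view is a plain filter over a list of prices, and the names `view_query`, `recompute`, and `apply_insert` are hypothetical, introduced only for illustration.

```python
from typing import List

def view_query(source: List[float], limit: float = 60.0) -> List[float]:
    """View definition (assumed): keep every source value below `limit`."""
    return [v for v in source if v < limit]

def recompute(source: List[float]) -> List[float]:
    """Recomputation: evaluate the view definition over all base data."""
    return view_query(source)

def apply_insert(view: List[float], inserted: List[float]) -> List[float]:
    """Incremental maintenance: evaluate the view definition over the
    inserted values only and add the result to the old view."""
    return view + view_query(inserted)

source = [65.95, 39.95]
view = recompute(source)            # initial materialization

delta = [55.48]                     # values inserted into the source
source = source + delta
incremental = apply_insert(view, delta)
full = recompute(source)

# Both methods agree on the view content; order is ignored in this sketch.
assert sorted(incremental) == sorted(full)
```

The incremental path touches only `delta`, while recomputation scans the whole source again. The sketch deliberately ignores order, which is the part the rest of the talk addresses.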
4Goal
- Incrementally maintaining XQuery views
- Why is it a challenge?
- XML features
- Hierarchical
- Optional elements
- Self-typed
- IDRefs
- Ordered
- Expressiveness of XQuery language
- Complex operations: tagging, unnesting, aggregation, ...
- Expected large auxiliary information
[Figure: a view defined by a view definition XQuery over several XML sources]
5Basics of VOX Approach: Algebraic
- General approaches to view maintenance
- Algorithmic: a fixed procedure exists for a fixed view type
- Algebraic: update propagation rules for each algebra operator and each update type
[Figure: along a time axis, execution turns the XQuery definition into an algebra tree of operators over the XML sources and produces the XML view; during view maintenance a source update is passed from operator to operator (D1 update, D2 update) until it reaches the XML view]
6Example
Bib.xml
<bib>
  <book>
    <price>65.95</price>
    <title>Advanced Programming in the Unix environment</title>
  </book>
  <book>
    <title>TCP/IP Illustrated</title>
  </book>
  <book>
    <price>39.95</price>
    <title>Data on the Web</title>
  </book>
</bib>
View: List all books that cost less than 60
Source update: Insert element <price>55.48</price> into second book
View update: <cheap_book><title>TCP/IP Illustrated</title></cheap_book>
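The effect of this update on the view can be checked by plain recomputation. The sketch below uses Python's standard `xml.etree.ElementTree` in place of an XQuery engine; the function names `cheap_books` and `titles` are hypothetical, and the `result`/`cheap_book` element names follow the view used later in the talk.

```python
import copy
import xml.etree.ElementTree as ET

BIB_XML = """
<bib>
  <book><price>65.95</price><title>Advanced Programming in the Unix environment</title></book>
  <book><title>TCP/IP Illustrated</title></book>
  <book><price>39.95</price><title>Data on the Web</title></book>
</bib>
""".strip()

def cheap_books(bib: ET.Element, limit: float = 60.0) -> ET.Element:
    """Recompute the view: one <cheap_book> per book priced below `limit`."""
    result = ET.Element("result")
    for book in bib.findall("book"):          # document order
        price = book.find("price")
        if price is not None and float(price.text) < limit:
            cheap = ET.SubElement(result, "cheap_book")
            cheap.append(copy.deepcopy(book.find("title")))
    return result

def titles(view: ET.Element) -> list:
    return [t.text for t in view.iter("title")]

bib = ET.fromstring(BIB_XML)
before = titles(cheap_books(bib))

# Source update: insert <price>55.48</price> into the second book.
price = ET.Element("price")
price.text = "55.48"
bib.findall("book")[1].insert(0, price)

after = titles(cheap_books(bib))
```

After the update the new `cheap_book` has to appear before the existing one, because its book precedes the other in document order. An incremental method must reproduce that position without re-reading the whole source.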
7Background on XML Algebra XAT
- XQuery → XAT algebra tree [ZR02]
- XAT Operators
- XAT SQL operators: Select, Project
- XAT XML operators: Navigate Unnest, Navigate Collection, Tagger, Combine, ...
View definition XQuery:
<result>{
  for $b in document("bib.xml")/bib/book
  where $b/price/text() < 60
  return <cheap_book>{ $b/title }</cheap_book>
}</result>
XAT algebra tree for the view (root first, source last):
Expose col1
Tagger <result>col2</result> → col1
Combine col2
Tagger <cheap_book>col3</cheap_book> → col2
Select (col5 < 60.0)
Navigate $b, price/text() → col5
Navigate $b, title → col3
Navigate $s6, /book → $b
Source bib.xml → $s6
8Background on XML Algebra XAT Data Model
- XAT Data Model (XAT Table)
- Order sensitive table of tuples
- Columns denote user-specified or internally generated variable bindings
- A cell in a tuple holds an XML node or a sequence of XML nodes
- The XAT algebra has ordered bag semantics
[Figure: the XAT tree of the view, with the input and output XAT tables of the operator Navigate $b, price/text() → col5]
Input:
col3 | $b
<title>Advanc ..</title> | <book><price>65.95</price><title>Advanc ..</title></book>
<title>TCP/IP</title> | <book><title>TCP/IP</title></book>
<title>Data on ..</title> | <book><price>39.95</price><title>Data on ..</title></book>
Output:
col5 | col3
65.95 | <title>Advanc ..</title>
(empty) | <title>TCP/IP</title>
39.95 | <title>Data on ..</title>
9Order in XAT Context View Maintenance
[Figure: Select (col5 < 60.0) maintained after 55.48 is inserted as the price of TCP/IP Illustrated, non order-sensitive versus order-sensitive]
Input of Select:
col5 | col3
65.95 | <title>Advanced Prog</title>
55.48 (inserted) | <title>TCP/IP Illustrated</title>
39.95 | <title>Data on the Web</title>
Output col3 before the update: <title>Data on the Web</title>
Non order-sensitive: the new tuple is appended, giving <title>Data on the Web</title>, <title>TCP/IP Illustrated</title>
Order-sensitive: the new tuple takes its document-order position, giving <title>TCP/IP Illustrated</title>, <title>Data on the Web</title>
10Our Approach to Maintaining Order
- Use node identity
- Why?
- Already present as concept in XQuery
- Can serve as a reference to the base XML data
- Can encode structure and order
11Lexicographical Keys (LexKeys)
- Multi-level lexicographical keys
- Comparison
- b.h < b.t, bab < bd.cc, b.b < b.b.c
- Advantages
- It is always possible to generate a key between two keys
- The deletion of a LexKey in a sequence does not affect other LexKeys
[Figure: bib.xml as a tree annotated with LexKeys: bib (b); book elements b.h, b.n, b.t; price elements b.h.k, b.n.f, b.t.k; title elements b.h.r, b.n.m, b.t.r]
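The comparison and the "key between two keys" property can be sketched as follows. The representation is an assumption, not the authors' implementation: a LexKey is a tuple of lowercase strings (one per level), and `midpoint` and `key_between` are hypothetical helpers that additionally assume no level string ends in `a`.

```python
from typing import Optional, Tuple

LexKey = Tuple[str, ...]              # assumed form: one lowercase string per level

def parse(key: str) -> LexKey:
    """'b.h.k' -> ('b', 'h', 'k'); tuples compare level by level."""
    return tuple(key.split("."))

def midpoint(a: str, b: Optional[str]) -> str:
    """Return a level string s with a < s < b (b=None means no upper bound).
    Assumes a < b and that neither string ends in 'a'."""
    if b is not None:
        n = 0
        # Skip the common prefix, reading `a` as if padded with 'a'.
        while n < len(b) and (a[n] if n < len(a) else "a") == b[n]:
            n += 1
        if n > 0:
            return b[:n] + midpoint(a[n:], b[n:])
    lo = ord(a[0]) - ord("a") if a else 0
    hi = ord(b[0]) - ord("a") if b is not None else 26
    if hi - lo > 1:                   # room for a letter strictly in between
        return chr(ord("a") + (lo + hi) // 2)
    if b is not None and len(b) > 1:  # adjacent letters: a prefix of b fits
        return b[0]
    return chr(ord("a") + lo) + midpoint(a[1:], None)

def key_between(left: LexKey, right: LexKey) -> LexKey:
    """New sibling key strictly between two siblings of the same parent."""
    assert left[:-1] == right[:-1] and left < right
    return left[:-1] + (midpoint(left[-1], right[-1]),)

new_key = key_between(parse("b.h"), parse("b.t"))
```

Inserting between `b.h` and `b.t` leaves both existing keys untouched, which is the property the advantages above rely on. Deleting a key likewise changes no other key, since keys carry no positional counters.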
12LexKeys - References to source XML nodes
[Figure: the output table of Navigate $b, title → col3 holds LexKeys instead of XML fragments; the Storage Manager resolves each LexKey to its node in bib.xml]
With XML fragments in the cells:
col3 | $b
<title>Advanc ..</title> | <book><price>65.95</price><title>Advanc ..</title></book>
<title>TCP/IP</title> | <book><title>TCP/IP</title></book>
<title>Data on ..</title> | <book><price>39.95</price><title>Data on ..</title></book>
With LexKeys in the cells:
col3 | $b
b.h.r | b.h
b.n.m | b.n
b.t.r | b.t
13LexKeys - References to constructed nodes
[Figure: the Tagger <cheap_book>col3</cheap_book> → col2 creates constructed nodes, which the Storage Manager keeps as skeletons identified by their own LexKeys: cheap_book y.b refers to source node b.t.r and cheap_book y.c refers to source node b.n.m. The XAT tables hold only keys (col3: b.n.m, b.t.r; col2: y.c, y.b), while the source nodes stay in bib.xml.]
14Order Among XAT Tuples
- Notion: designate an order schema for XAT tables
- Ordering by the LexKeys in the columns of the order schema yields the correct tuple order.
- Comparison operation < on tuples.
[Figure: output of Navigate $b, title → col3 with order schema ($b); the tuples (b.h.r, b.h), (b.n.m, b.n), (b.t.r, b.t) are ranked 1, 2, 3 because b.h < b.n < b.t]
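Ordering tuples by an order schema can be sketched as a sort over the LexKeys in the schema columns. The dictionary-based row layout and the function name `sort_by_order_schema` are assumptions made for this illustration.

```python
from typing import Dict, List, Tuple

LexKey = Tuple[str, ...]
Row = Dict[str, LexKey]

def parse(key: str) -> LexKey:
    return tuple(key.split("."))

def sort_by_order_schema(rows: List[Row], order_schema: List[str]) -> List[Row]:
    """Order tuples by the LexKeys found in the order-schema columns."""
    return sorted(rows, key=lambda row: tuple(row[col] for col in order_schema))

# Tuples stored in arbitrary physical order.
rows = [
    {"col3": parse("b.t.r"), "b": parse("b.t")},
    {"col3": parse("b.h.r"), "b": parse("b.h")},
    {"col3": parse("b.n.m"), "b": parse("b.n")},
]

ordered = sort_by_order_schema(rows, ["b"])
book_order = [".".join(row["b"]) for row in ordered]
```

Because the order lives in the keys, the table itself can be stored and updated as an unordered bag and sorted only when the order is actually needed.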
15Order Schema Computation
- Calculated in a postorder traversal of the tree
- Schema Computation Rules
Operator | op(R) | Order schema OS_Q, where Q = op(R)
Tagger | T pattern, col (R) | OS_R
16Order Among Nodes in a Cell
- Most collections of XML nodes are in document order
- Navigate Collection, XML Union
- Combine creates a collection in which nodes may be in an order different from the one encoded in node identity
- Concept of overriding order
LexKey with overriding order = Key (LexKey) + Overriding Order (LexKey)
- Key: node identity part, by default also represents order
- Overriding Order: optional, only represents order when present
- Notation: key[order]
- Examples
- b.c.b[h]
- b.c.b
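A LexKey with overriding order can be modeled as a key plus an optional second key that is used for ordering only when present. The class `OrderedKey` and its field names are hypothetical; the sample values are the constructed cheap_book nodes y.b and y.c from the view maintenance example, with overriding orders b.t and b.n.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

LexKey = Tuple[str, ...]

def parse(key: str) -> LexKey:
    return tuple(key.split("."))

@dataclass(frozen=True)
class OrderedKey:
    key: LexKey                      # node identity; also the order by default
    order: Optional[LexKey] = None   # overriding order, used only when present

    def sort_key(self) -> LexKey:
        return self.order if self.order is not None else self.key

# Constructed nodes whose position follows the source books they wrap.
nodes = [
    OrderedKey(parse("y.b"), parse("b.t")),
    OrderedKey(parse("y.c"), parse("b.n")),
]

by_identity = [".".join(n.key) for n in sorted(nodes, key=lambda n: n.key)]
by_order = [".".join(n.key) for n in sorted(nodes, key=OrderedKey.sort_key)]
```

Sorting by identity alone would place y.b first, while the overriding order places y.c first, matching the document order of the underlying books.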
17The Impact of Using LexKeys on View Maintenance
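- The XML algebra now has (non-ordered) bag semantics
- Gained distributivity with regard to bag union and difference
- Compact intermediate results
[Figure: Select (col5 < 60.0) over XAT tables of LexKeys. Input (col5, col3, $b): (b.h.k.m, b.h.r, b.h), (b.n.f.m, b.n.m, b.n), (b.t.k.m, b.t.r, b.t), where b.n.f.m is newly inserted. The output (col3, $b) grows from (b.t.r, b.t) to (b.t.r, b.t), (b.n.m, b.n); the new tuple is simply added to the bag.]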
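Distributivity over bag union and difference can be checked on a small case. In this sketch a bag is a `collections.Counter` of tuples, `select` is a hypothetical stand-in for the Select operator, and col5 holds the price value directly rather than a LexKey so that the predicate stays short.

```python
from collections import Counter

def select(bag: Counter, pred) -> Counter:
    """Bag selection: keep each qualifying tuple with its multiplicity."""
    return Counter({t: n for t, n in bag.items() if pred(t)})

def cheap(t) -> bool:
    return t[0] < 60.0               # assumed layout: t = (col5, col3, b)

R = Counter({
    (65.95, "b.h.r", "b.h"): 1,
    (39.95, "b.t.r", "b.t"): 1,
})
inserted = Counter({(55.48, "b.n.m", "b.n"): 1})
deleted = Counter({(39.95, "b.t.r", "b.t"): 1})

# Selection distributes over bag union ...
union_ok = select(R + inserted, cheap) == select(R, cheap) + select(inserted, cheap)
# ... and over bag difference.
diff_ok = select(R - deleted, cheap) == select(R, cheap) - select(deleted, cheap)
```

With these identities an operator only has to process the changed tuples and combine the result with its old output. No position bookkeeping is needed, because the order can be recovered from the LexKeys.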
18Update Propagation Strategy
[Figure: update propagation in Rainbow: an update XQuery on an XML source is propagated through the XAT to the XML view; the XML sources are held by the Storage Manager]
19Update Propagation Rules
- Use distributivity with regard to bag union
- Reuse rules from relational view maintenance for XAT SQL operators
- Provide rules for XAT XML operators
20Update Propagation Rules Example - Navigate Unnest on Insertion of Tuples
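- $Q_{old} = \phi^{col'}_{col,path}(R_{old})$
- $R_{new} = R_{old} \uplus \Delta R$
- $Q_{new} = \phi^{col'}_{col,path}(R_{old} \uplus \Delta R) = \phi^{col'}_{col,path}(R_{old}) \uplus \phi^{col'}_{col,path}(\Delta R) = Q_{old} \uplus \Delta Q$
- $\uplus$ represents bag union
[Figure: along a time axis, at execution the operator $\phi^{col'}_{col,path}$ maps R to Q; at view maintenance it receives the update u(ΔR) and propagates u(ΔQ)]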
21View Maintenance Example
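Insert element <price>55.48</price> into second book
[Figure: the update is propagated bottom-up through the XAT tree of the view, next to the Storage Manager's copy of bib.xml and the constructed XDOMs kept in Rainbow]
- Storage Manager: the new price element gets LexKey b.n.f under book b.n; its text node b.n.f.m holds 55.48 (existing prices: b.h.k.m = 65.95, b.t.k.m = 39.95)
- Source bib.xml → $s6 and Navigate $s6, /book → $b: the insertion of b.n.f at book[b.n]/price[b.n.f] is passed on as an update to column $b
- Navigate $b, title → col3: the update passes through unchanged, still addressed to tuple 2 ($b = b.n)
- Navigate $b, price/text() → col5: emits an update of col5 in tuple 2 with the new value b.n.f.m
- Select (col5 < 60.0): tuple 2 now qualifies, so the tuple (b.n, b.n.m) is inserted; the output tids grow from 3 to 3, 2
- Tagger <cheap_book>col3</cheap_book> → col2: constructs cheap_book y.c with overriding order b.n and inserts the tuple (b.n, y.c)
- Combine col2: inserts y.c (overriding order b.n) into the collection in col2
- Tagger <result>col2</result> → col1: inserts y.c (overriding order b.n) into the result node x in col1
Output of Navigate $s6, /book → $b:
$b | tid | pid
b.h | 1 | 1
b.n | 2 | 1
b.t | 3 | 1
Output of Navigate $b, price/text() → col5 after the update:
col5 | col3 | $b | tid
b.h.k.m | b.h.r | b.h | 1
b.n.f.m | b.n.m | b.n | 2
b.t.k.m | b.t.r | b.t | 3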
22View Maintenance Example
Insert element ltpricegt55.48lt/pricegt into second
book
[Figure: state after maintenance. Constructed XDOMs in Rainbow: the result node x now holds cheap_book y.c (overriding order b.n, title b.n.m, "TCP/IP Illustrated") followed by cheap_book y.b (overriding order b.t, title b.t.r, "Data on the Web"). Storage Manager: bib.xml with books b.h, b.n, b.t, prices b.h.k (65.95), b.n.f (55.48), b.t.k (39.95) and titles b.h.r, b.n.m, b.t.r.]
23Experimental Evaluation
- Implemented in Java on top of Rainbow system
- Experimental evaluation
[Charts: basic performance comparison; varying size of insert. 637 elements of interest, selectivity 50]
24Related work
- Relational
- [GMS93] Survey
- [GL95] Algebraic approach to maintain relational views with duplicates
- [BLT86], [CW91], [ZGHW95], [Q96], [MK00], [PSCP02]
- Object-Oriented
- [KR96] MultiView. Object algebra, exploits OO features like inheritance, path indexes.
- [AFP02] Algebraic approach. Stores OIDs rather than actual data.
- XML-like data models
- [ZM98] Select-Project graph structured views as collections of objects.
- [AMRVW98] Semistructured data model OEM, query language LOREL. Only atomic updates. Does not handle order.
- [QLR02] Dynamic web data. Based on XPath. Maintains path index structure.
- [LD00] Hierarchical semistructured data. View defined with WHAX-QL. Does not handle order.
- [EWDR02] Motivation for this work. Algebraic approach. Does not handle order. Large intermediate results.
25Conclusions
- Proposed an order-encoding scheme that migrates the XML algebra from ordered to non-ordered bag semantics
- Gave the first solution to order-sensitive XQuery view maintenance
- Handles the core of XQuery
- Handles complex updates
- Proved correctness of the approach
- Implemented the solution within Rainbow
- Experimental evaluation confirms feasibility of the solution
26For more information
- The Rainbow project
- http://davis.wpi.edu/dsrg/rainbow/
- Related publications
- K. Dimitrova, M. El-Sayed and E. Rundensteiner. Order-sensitive View Maintenance of Materialized XQuery Views. Technical Report WPI-CS-TR-03-17, May 2003.
- M. El-Sayed, K. Dimitrova and E. Rundensteiner. Efficiently Supporting Order in XML Query Processing. WIDM'03, New Orleans, Nov. 2003.
- X. Zhang, K. Dimitrova, L. Wang, B. Pielech, L. Ding, B. Murphy, M. El-Sayed and E. Rundensteiner. RainbowII: Multi-XQuery Optimization Using Materialized XML Views. SIGMOD Demo, Jun. 2003.
- M. El-Sayed, L. Wang, L. Ding and E. Rundensteiner. An Algebraic Approach for Incremental Maintenance of Materialized XQuery Views. In Proceedings of WIDM'02, page 88, 2002.
- X. Zhang, B. Pielech and E. Rundensteiner. Honey, I Shrunk the XQuery! An XML Algebra Optimization Approach. In Proceedings of WIDM'02, 2002.
- X. Zhang, M. Mulchandani, S. Christ, B. Murphy and E. Rundensteiner. Rainbow: Mapping-Driven XQuery Processing System. In Proceedings of SIGMOD'02, Demo Session, page 614, 2002.
27Thank you !