Title: IVOX Incremental View Maintenance for Ordered XML
1 IVOX: Incremental View Maintenance for Ordered XML
- DSRG Talk
- WPI, February 20th, 2003
- Students: Katica Dimitrova, Maged El Sayed
- Advisor: Prof. Elke Rundensteiner
2 Outline
- Motivation
- Problem Description
- Background
  - XML Algebra
  - Order in XML Algebra
- The IVOX Approach
  - Order Encoding
  - Overall strategy
- System Architecture
- Related Work
- Future Work
4 Motivation
- Views in general
  - Data warehouses
  - Information integration
  - Access control, privacy, etc.
- XML Views (EXTRA useful)
  - Information inter-portability
  - Crossing gaps between different data models
- Materialized Views
  - Speed up data retrieval
  - Query optimization
  - Increased availability
[Figure: a view, defined by a view definition query, over RDB, XML, and other sources]
5 Maintaining Materialized Views
- When sources are updated, a materialized view may become inconsistent.
- Methods of view maintenance
  - Recomputation
    - recompute the view from scratch from the base data
  - Incremental view maintenance
    - compute changes to the view in response to changes to the base sources
- Heuristic: incremental view maintenance is usually cheaper than full recomputation.
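The two maintenance methods can be contrasted with a minimal Python sketch. It is a simplification, not the IVOX algorithm: books are modeled as (title, price) tuples, the view keeps books cheaper than 60, the update is the insertion of a whole book, and order is ignored. The names `qualifies`, `recompute`, and `maintain_on_insert` are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumption: a book is modeled as (title, price); the price may be absent.
Book = Tuple[str, Optional[float]]


def qualifies(book: Book) -> bool:
    """View predicate: the book has a price and it is below 60."""
    _, price = book
    return price is not None and price < 60


def recompute(source: List[Book]) -> List[Book]:
    """Recomputation: rebuild the view from the whole base data."""
    return [book for book in source if qualifies(book)]


def maintain_on_insert(view: List[Book], inserted: Book) -> List[Book]:
    """Incremental maintenance: only the inserted book is examined."""
    return view + [inserted] if qualifies(inserted) else view


source = [
    ("Advanced Programming in the Unix environment", 65.95),
    ("Data on the Web", 39.95),
]
view = recompute(source)
new_book = ("TCP/IP Illustrated", 55.48)
incremental_view = maintain_on_insert(view, new_book)
recomputed_view = recompute(source + [new_book])
```

Both routes yield the same bag of books here, but the incremental route touches one book instead of the whole source.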
7 The Problem
- Previous work for
  - Relational [GMS93], bag semantics [GL95], [ZGHW95], [PSCP02]
  - Object-Relational [LVM00]
  - Object-Oriented [AFP02]
  - Structured data models [AMRVW98], [ZM98]
  - XML data model not handling order [LD00]
- Can techniques for other data models be reused for XML?
8 Is Maintaining XML Views Different?
- XML features
  - Hierarchical
  - Optional elements
  - Self-typed
  - References
  - Ordered
- Expressiveness of view definition language
  - Complex operations
    - tagging, unnesting, aggregation, ...
- Expected large auxiliary information
9 Example
Bib.xml
  <bib>
    <book>
      <price>65.95</price>
      <title>Advanced Programming in the Unix environment</title>
    </book>
    <book>
      <title>TCP/IP Illustrated</title>
    </book>
    <book>
      <price>39.95</price>
      <title>Data on the Web</title>
    </book>
  </bib>
View Definition Query
List all books that cost less than 60, including their title and price
  <result>
  {
    for $b in document("bib.xml")/bib/book
    where $b/price/text() < 60
    return <book> { $b/title, $b/price } </book>
  }
  </result>
View Extent
  <result>
    <book>
      <title>Data on the Web</title>
      <price>39.95</price>
    </book>
  </result>
10 Example
Update: insert element <price>55.48</price> into the second book
Bib.xml (before the update)
  <bib>
    <book>
      <price>65.95</price>
      <title>Advanced Programming in the Unix environment</title>
    </book>
    <book>
      <title>TCP/IP Illustrated</title>
    </book>
    <book>
      <price>39.95</price>
      <title>Data on the Web</title>
    </book>
  </bib>
View Definition Query
  <result>
  {
    for $b in document("bib.xml")/bib/book
    where $b/price/text() < 60
    return <book> { $b/title, $b/price } </book>
  }
  </result>
View Extent (before the update)
  <result>
    <book>
      <title>Data on the Web</title>
      <price>39.95</price>
    </book>
  </result>
Book added to the View Extent by the update
  <book>
    <title>TCP/IP Illustrated</title>
    <price>55.48</price>
  </book>
11 Our Goal
- Design an incremental view maintenance strategy for XQuery views that
  - Correctly updates the view
  - Is order sensitive
    - Returns the view in proper order
    - Allows for updates that specify order
  - Covers at least the core of XQuery language views
  - Minimizes auxiliary information requirements
12 Basics of the IVOX Approach: Algebraic
- Update propagation rules for each algebra operator and each update type
[Figure: the XQuery definition is translated into an algebra tree of operators over the XML sources. Over time, execution produces the XML view; during view maintenance, an update to an XML source is propagated upward through the operators (labeled D1 Update, D2 Update) to the XML view.]
13 Why Algebraic?
- Robust: easily adaptable to operator semantic changes
- Extensible: new operators can be added
- Allows for reuse of techniques for known operators
- Language independent: independent of syntax changes (of XQuery by W3C)
- Formal basis for provable correctness
15 Background on XML Algebra XAT
- XAT Operators
  - SQL Operators: Select, Project
  - Special Operators: Source, FOR
  - XML Operators: Navigate, Tagger, ...
- XAT Data Model (XAT Table)
  - Order-sensitive table of tuples
  - Columns denote user-specified or internally generated variable bindings
  - A cell in a tuple holds an XML node or a sequence of XML nodes
[Figure: XAT tables for a Navigate operator (col1, price -> col3)]
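16 Order in XAT Context
- Order among tuples
- Order among XML nodes in a cell
[Figure: XAT tables for a Navigate operator (col1, price -> col3)]
17 Order in the XAT Context
- Order among the tuples
- Order among XML nodes in a single cell
[Figure: XAT tables for the Agg col5 operator, whose output cell holds a sequence of XML nodes]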
18 Order in XAT Context: View Maintenance
- On update, worry about
  - Order among tuples
  - Order among XML nodes in a cell
[Figure: XAT tables for a Navigate operator (col1, price -> col3)]
19 Order in XAT Context: View Maintenance
- On update, worry about
  - Order among the tuples
  - Order among XML nodes in a single cell
[Figure: XAT tables for the Agg col5 operator, whose output cell holds a sequence of XML nodes]
20 Duplicate Information in XAT Context
- Complex operations require auxiliary information
- Auxiliary information can be too large in the XAT context
- May be expensive to maintain it
[Figure: XAT tables for a Navigate operator (col1, price -> col3), with duplicated storage marked]
22 Possible Solutions to Order Preservation (I)
- Sequential storage
  - (XPROP approach by Maged, Ling Luping)
  - Assume intermediate results stored sequentially
  - Inserts and deletes are performed in physical order
  - No order encoding
  - Special support required for secondary storage
  - May require iteration over many tuples to determine order
23 Possible Solutions to Order Preservation (II)
- Naïve order encoding for tuples and sequences of XML nodes
  - Assign order numbers to tuples and to XML nodes in a sequence
  - Requires frequent renumbering on inserts
[Figure: XAT tables for a Navigate operator (col1, price -> col3) with order numbers]
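The renumbering cost of naïve order encoding can be illustrated with a small Python sketch. It assumes consecutive integer order numbers starting at 1; the function name `insert_with_renumbering` is hypothetical and only serves to count how many existing entries an insert has to touch.

```python
from typing import List, Tuple

Row = Tuple[int, str]  # (order number, value)


def insert_with_renumbering(rows: List[Row], position: int, value: str):
    """Insert `value` at order number `position`.

    Every existing row at or after `position` must be shifted by one.
    Returns the new rows and the number of rows that were renumbered.
    """
    new_rows: List[Row] = []
    renumbered = 0
    for number, existing in rows:
        if number >= position:
            new_rows.append((number + 1, existing))
            renumbered += 1
        else:
            new_rows.append((number, existing))
    new_rows.append((position, value))
    new_rows.sort()
    return new_rows, renumbered


rows = [(1, "book A"), (2, "book B"), (3, "book C")]
new_rows, renumbered = insert_with_renumbering(rows, 1, "book X")
# An insert at the front renumbers all three existing rows.
```

In the worst case every insert rewrites the whole table, which is the problem an order encoding with insertable keys avoids.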
24 Using Node Identity
- Idea: use node identity
- Usage
  - For encoding order and structure
  - As a reference to base data
25 What Encoding For Node Identity?
- Existing techniques for encoding order for XML
  - Global Order (UW)
  - Local Order (UW)
  - Dewey Order (UW)
  - Lexicographical Order (MASS)
[Figure: the bib.xml tree (bib, its three book children, and their price and title children) with each node labeled by its Global Order number]
26 What Encoding For Node Identity?
- Existing techniques for encoding order for XML
  - Global Order (UW)
  - Local Order (UW)
  - Dewey Order (UW)
  - Lexicographical Order (MASS)
[Figure: the bib.xml tree with each node labeled by its Local Order number, its position among its siblings (1, 2, 3)]
27 What Encoding For Node Identity?
- Existing techniques for encoding order for XML
  - Global Order (UW)
  - Local Order (UW)
  - Dewey Order (UW)
  - Lexicographical Order (MASS)
[Figure: the bib.xml tree with Dewey Order labels: bib is 1; the book elements are 1.1, 1.2, 1.3; their price and title children carry labels such as 1.1.1, 1.1.2, 1.3.1, 1.3.2]
28 What Encoding For Node Identity?
- Existing techniques for encoding order for XML
  - Global Order (UW)
  - Local Order (UW)
  - Dewey Order (UW)
  - Lexicographical Order (MASS): the winner
[Figure: the bib.xml tree with lexicographical keys: bib is b; the book elements are b.b, b.d, b.f; their price and title children carry keys such as b.b.b, b.d.b, b.d.f, b.f.l]
29 Lexicographical Keys (LexKeys)
- What are LexKeys?
  - Multi-level lexicographical keys
  - Examples: c, ba.c.b
- Examples of comparison
  - b < b.c
  - bab < bd.cc
  - b.b < b.b.c
- Advantages
  - All LexKeys form a totally ordered set with respect to <
  - It is always possible to generate a key between two keys
  - The deletion of a LexKey in a sequence does not affect other LexKeys
- Usage
  - Reference to XML nodes
  - Encoding order
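The comparison examples and the "key between two keys" property can be sketched in Python. This is an illustrative sketch, not the MASS key generator: it assumes each level is a string of lowercase letters that never ends in 'a', and the names `parse`, `level_between`, and `sibling_between` are hypothetical.

```python
from typing import Optional, Tuple

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
LexKey = Tuple[str, ...]


def parse(text: str) -> LexKey:
    """Split 'b.d.f' into its levels ('b', 'd', 'f').

    Python compares tuples level by level, and a proper prefix sorts first.
    """
    return tuple(text.split("."))


def level_between(lo: str, hi: Optional[str]) -> str:
    """Return a level string s with lo < s < hi.

    lo == "" means no lower bound; hi is None means no upper bound.
    Assumption: level strings never end in 'a' (the smallest letter).
    """
    if hi is not None and not lo < hi:
        raise ValueError("expected lo < hi")
    if hi is not None:
        # Copy the prefix shared by lo (padded with 'a') and hi.
        n = 0
        while (lo[n] if n < len(lo) else "a") == hi[n]:
            n += 1
        if n > 0:
            return hi[:n] + level_between(lo[n:], hi[n:])
    a = ALPHABET.index(lo[0]) if lo else 0
    b = ALPHABET.index(hi[0]) if hi is not None else len(ALPHABET)
    if b - a > 1:
        return ALPHABET[(a + b) // 2]  # a letter strictly in between
    if hi is not None and len(hi) > 1:
        return hi[0]  # a proper prefix of hi
    return ALPHABET[a] + level_between(lo[1:], None)  # extend lo


def sibling_between(left: LexKey, right: LexKey) -> LexKey:
    """Key for a node inserted between two siblings with the same parent."""
    if left[:-1] != right[:-1]:
        raise ValueError("keys must share a parent")
    return left[:-1] + (level_between(left[-1], right[-1]),)


new_key = sibling_between(parse("b.d.b"), parse("b.d.f"))
```

No existing key changes when `new_key` is generated, and removing a key from a sorted sequence leaves the remaining keys valid, which matches the advantages listed on the slide.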
30 LexKeys in XAT Tables
[Figure: XAT tables for a Navigate operator (b, price -> col2), with cells holding LexKeys]
31 Order Among XAT Tuples
- Notion: designate an order schema for XAT tables
- Ordering the tuples by the LexKeys in the columns of the order schema yields the correct tuple order.
[Figure: XAT tables annotated with their order schema]
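Ordering by an order schema can be sketched as a multi-column sort over LexKeys. This Python sketch assumes a table is a list of dictionaries mapping column names to LexKey strings and that the order schema is a list of column names; `lexkey` and `sort_by_order_schema` are hypothetical names.

```python
from typing import Dict, List, Tuple


def lexkey(text: str) -> Tuple[str, ...]:
    """Split a LexKey such as 'b.d.f' into comparable levels."""
    return tuple(text.split("."))


def sort_by_order_schema(
    table: List[Dict[str, str]], order_schema: List[str]
) -> List[Dict[str, str]]:
    """Sort tuples by the LexKeys in the order-schema columns, in schema order."""
    return sorted(
        table, key=lambda row: tuple(lexkey(row[col]) for col in order_schema)
    )


# Tuples stored in arbitrary physical order.
table = [
    {"b": "b.f", "col2": "b.f.cm"},
    {"b": "b.d", "col2": "b.d.f"},
    {"b": "b.b", "col2": "b.b.b"},
    {"b": "b.d", "col2": "b.d.b"},
]
ordered = sort_by_order_schema(table, ["b", "col2"])
```

Because the order is derived from cell contents, tuples can be stored in any physical order and sorted only when the ordered result is needed.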
32 Calculating Order Schema
- Rules for each operator
- Calculated in a postorder traversal of the tree
- Sample Rules
33 Order Among Tuples: Example
[Figure: XAT tables for a Navigate operator (b, price -> col2), with order schema positions marked]
34 Order in Collection within a cell?
[Figure: XAT tables for the Agg col5 operator; input tuples are collected into a sequence of XML nodes within a single cell]
35 SmartKeys
- A SmartKey has two parts
  - Key part: by default also represents order
  - Order part: optional, only represents order when present
- Notation: key(order)
- Examples
  - b.c.b (h)
  - b.c.b
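One way to read the key(order) notation is as a small record with an optional order part. The Python sketch below assumes that, when the order part is present, it is used for ordering instead of the key part; that reading, and the names `SmartKey`, `parse`, and `sort_key`, are assumptions rather than the definition used in the talk.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SmartKey:
    key: str                      # identifies the node; orders it by default
    order: Optional[str] = None   # optional part that represents only order

    @classmethod
    def parse(cls, text: str) -> "SmartKey":
        """Parse 'b.c.b (h)' or 'b.c.b'."""
        text = text.strip()
        if text.endswith(")") and "(" in text:
            key, order = text[:-1].split("(", 1)
            return cls(key.strip(), order.strip())
        return cls(text)

    def sort_key(self) -> Tuple[str, ...]:
        """Assumption: the order part, when present, takes over ordering."""
        effective = self.order if self.order is not None else self.key
        return tuple(effective.split("."))


keys = [SmartKey.parse("b.c.b (h)"), SmartKey.parse("b.d")]
ordered = sorted(keys, key=SmartKey.sort_key)
# Without its order part, b.c.b would sort before b.d; with (h) it sorts after.
```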
36 SmartKeys in XAT Tables
[Figure: XAT tables for the Agg col5 operator, with cells holding SmartKeys]
37 The Impact of SmartKeys on View Maintenance
38 Order Among XAT Tuples during View Maintenance
- Not touching other tuples in the XAT table
- No reordering ever needed
- Gaining distributivity with respect to bag union on the tuple level
[Figure: XAT tables for a Navigate operator (col1, price -> col3) before and after an inserted tuple]
39 Order in a Sequence during View Maintenance
- Not touching other members of the sequence
- No reordering ever needed
- Gaining distributivity with respect to bag union on the cell level
[Figure: XAT tables for the Agg col5 operator before and after an insert into the sequence]
40 Update Propagation Rules
- Use distributivity with respect to bag union
- Reuse rules from relational view maintenance for most SQL XAT operators
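41 Update Propagation Rules: Example (Navigate Unnest on Insert Tuple)
- $T_2^{old} = \phi_{col,path}^{col'}(T_1^{old})$
- $T_1^{new} = T_1^{old} \uplus \Delta T_1$
- $T_2^{new} = \phi_{col,path}^{col'}(T_1^{old} \uplus \Delta T_1)$
  $= \phi_{col,path}^{col'}(T_1^{old}) \uplus \phi_{col,path}^{col'}(\Delta T_1)$
  $= T_2^{old} \uplus \Delta T_2$
- $\uplus$ represents bag union
[Figure: over time, execution applies the operator to $T_1$ to produce $T_2$; view maintenance applies the same operator to $\Delta T_1$ to produce $\Delta T_2$]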
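The rule above says that Navigate Unnest distributes over bag union, so only the inserted tuples need to be processed. A minimal Python sketch checks this on the running example. It is not the Rainbow implementation: XML nodes are `xml.etree.ElementTree` elements, tables are lists of dictionaries, and `navigate_unnest` and `as_bag` are hypothetical helpers.

```python
from collections import Counter
from typing import Dict, List
import xml.etree.ElementTree as ET

Row = Dict[str, ET.Element]


def navigate_unnest(table: List[Row], col: str, path: str, out_col: str) -> List[Row]:
    """One output tuple per node reached from row[col] along `path`."""
    out: List[Row] = []
    for row in table:
        for node in row[col].findall(path):
            new_row = dict(row)
            new_row[out_col] = node
            out.append(new_row)
    return out


def as_bag(table: List[Row], col: str) -> Counter:
    """Bag (multiset) of the serialized nodes found in column `col`."""
    return Counter(ET.tostring(row[col], encoding="unicode") for row in table)


t1_old = [{"col1": ET.fromstring(
    "<book><price>39.95</price><title>Data on the Web</title></book>")}]
delta_t1 = [{"col1": ET.fromstring(
    "<book><title>TCP/IP Illustrated</title><price>55.48</price></book>")}]

t2_old = navigate_unnest(t1_old, "col1", "price", "col3")
delta_t2 = navigate_unnest(delta_t1, "col1", "price", "col3")      # maintenance
t2_new = navigate_unnest(t1_old + delta_t1, "col1", "price", "col3")  # recomputation
```

The bag of `t2_new` equals the bag union of `t2_old` and `delta_t2`; the position of the new tuple is left to the keys rather than to physical placement.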
42 Update Propagation Strategy
[Figure: update propagation from the XML sources, through a translator and the XAT, to the XML view; the sources are held by the Storage Manager]
43 Update Primitives (The Format of Delta)
- XML Update Primitives (xup): apply to the original XML document
  - Insert (xmlFragment, path)
  - Delete (path)
  - InsertAtt (name, value, path)
  - DeleteAtt (name, path)
  - Replace (oldValue, newValue, path)
- XML Key Update Primitives (keyup): express the update on the original XML data in terms of LexKeys
  - Insert (el, path)
  - Delete (path)
  - Replace (el, pos)
- XAT Update Primitives (xatup): apply to the XAT table
  - InsertTuple (tuple)
  - DeleteTuple (tupleId)
  - ChangeTuple (Keyup, columnName, tupleId)
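As an illustration of the last group, two of the XAT update primitives can be sketched as Python records applied to a table. The field names, the table layout (a dictionary keyed by tuple id), and the function `apply_xatup` are assumptions made for this sketch; the slides give only the primitive names and their parameters, and ChangeTuple is left out.

```python
from dataclasses import dataclass
from typing import Dict, Union

Row = Dict[str, str]          # assumption: column name -> LexKey string
XatTable = Dict[str, Row]     # assumption: tuple id -> tuple


@dataclass(frozen=True)
class InsertTuple:
    tuple_id: str   # assumption: the inserted tuple carries its own id
    row: Row


@dataclass(frozen=True)
class DeleteTuple:
    tuple_id: str


XatUpdate = Union[InsertTuple, DeleteTuple]


def apply_xatup(table: XatTable, update: XatUpdate) -> XatTable:
    """Return a new table with the update applied; other tuples are untouched."""
    new_table = dict(table)
    if isinstance(update, InsertTuple):
        new_table[update.tuple_id] = update.row
    elif isinstance(update, DeleteTuple):
        new_table.pop(update.tuple_id, None)
    else:
        raise TypeError(f"unsupported update: {update!r}")
    return new_table


table: XatTable = {"t1": {"b": "b.f", "col2": "b.f.cm"}}
after_insert = apply_xatup(table, InsertTuple("t2", {"b": "b.d", "col2": "b.d.f"}))
after_delete = apply_xatup(after_insert, DeleteTuple("t1"))
```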
44 A Complete Example
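45 [Figure: XAT algebra tree for the view definition query during execution, over bib.xml held by the Storage Manager, with constructed XDOMs. Operators from the root down:]
- Tagger: <result>col5</result> -> col6
- Agg: col5
- Tagger: <book>col4 col2</book> -> col5
- Select: col3 < 60
- Navigate: b, title -> col4
- Navigate: b, price -> col2
- Navigate: col1, book -> b
- Navigate: S1, bib -> col1
- Source: bib.xml -> S1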
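46 [Figure: the same XAT algebra tree during view maintenance, over bib.xml held by the Storage Manager, with constructed XDOMs. Operators from the root down:]
- Tagger: <result>col5</result> -> col6
- Agg: col5
- Tagger: <book>col4 col2</book> -> col5
- Select: col3 < 60
- Navigate: b, title -> col4
- Navigate: b, price -> col2
- Navigate: col1, book -> b
- Navigate: S1, bib -> col1
- Source: bib.xml -> S1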
48 System Architecture
[Figure: system architecture, divided into Execution and View Maintenance.
Legend: process; data; one-time occurrence; on-update occurrence.
Labels: User; View Definition XQuery; Update XQuery; Materialized XML View; XML Query Engine; Rainbow; Update Primitive Generator; XTUP; XML View Maintainer; IVOX; VM Initializer; Update Propagation Rules Repository; Executer; XML Algebra Tree; Persistent Data Storage; Materialized Auxiliary Views; XML Sources; Storage Manager.]
50 Related Work
- A. Gupta, I. S. Mumick. Maintenance of Materialized Views: Problems, Techniques, and Applications. In Bulletin of the Technical Committee on Data Engineering, 1995.
- T. Griffin, L. Libkin. Incremental Maintenance of Views with Duplicates. In SIGMOD, 1995.
- H. Liefke, S. Davidson. View Maintenance for Hierarchical Semistructured Data. In DaWaK, 2000.
- S. Abiteboul, J. McHugh, M. Rys, V. Vassalos, J. Wiener. Incremental Maintenance for Materialized Views over Semistructured Data. In VLDB, 1998.
52 Future Work
- Near Future
  - Launch the system
  - Batch update coming
  - Experiments and Evaluation
    - Compare the system's performance to recomputation
- ... and Beyond
  - Batching updates coming from different sources
  - Integrity constraints
  - Algebra tree rewrite rules