Title: Querying XML Views of Relational Data
1Querying XML Views of Relational Data
- Jayavel ShanmugasundaramCornell University
Jerry KiernanEugene ShekitaCatalina FanJohn
FunderburkIBM Almaden Research Center
2Introduction
- XML is becoming the standard for
- Data integration
- Data exchange on the web (especially B2B)
- But! Most data will continue to be stored in
relational databases - Need some way to convert relational data to XML
- XPERANTO allows existing relational data to be
viewed and queried as XML
3Web Services Example
Supplier provides an XML View of its Data
XQuery over Catalog
Internet
Buyer
XQuery Result
XQuery
XQuery Result
Application CodeConvert XQuery toSQL Query
Application CodeConvert RelationalData to XML
Supplier
SQL Query
SQL Result
Relational Database
4High-Level Architecture
XQuery Query
Query Result
XPERANTO
XQuery to SQLConverter
Tagger
SQL Query
SQL Result
Relational Database
push data- and memory-intensive computationdown
to relational engine
5Outline
- Motivation and Introduction
- System Architecture
- Query Processing
- Conclusion and Related Work
6Example Relational Data
order
item
payment
7XML View for Partners
ltorder id10gt ltcustomergt Smith
Construction lt/customergt ltitemsgt
ltitem descriptiongenerator gt
ltcostgt 8000 lt/costgt lt/itemgt
ltitem descriptionbackhoegt
ltcostgt 24000 lt/costgt lt/itemgt
lt/itemsgt ltpaymentsgt
ltpayment due1/10/01gt
ltamountgt 20000 lt/amountgt lt/paymentgt
ltpayment due6/10/01gt
ltamountgt 12000 lt/amountgt
lt/paymentgt lt/paymentsgtlt/ordergt
8Default XML View
ltdbgt ltordergt ltrowgt ltidgt10 lt/idgt
ltcustnamegt Smith Construction lt/custnamegt
lt/rowgt ltrowgt ltidgt 9 lt/idgt
ltcustnamegtWestern Builders lt/custnamegt lt/rowgt
lt/ordergt ltitemgt ltrowgt ltoidgt 10
lt/oidgt ltdescgt generator lt/descgt ltcostgt 8000
lt/costgt lt/rowgt ltrowgt ltoidgt 10 lt/oidgt
ltdescgt backhoe lt/descgt ltcostgt 24000 lt/costgt
lt/rowgt lt/itemgt ltpaymentgt
similar to ltordergt and ltitemgt
lt/paymentgt lt/dbgt
9Creating an XPERANTO View
create view orders as ( for order in
view(default)/order/row return ltorder
idorder/idgt ltcustomergt
order/custname lt/customergt
ltitemsgt lt/itemsgt
ltpaymentsgt
lt/paymentsgt lt/ordergt)
for item in view(default)/item/row where
order/id item/oid return ltitem
descriptionitem/desc gt
ltcostgt item/cost lt/costgt lt/itemgt
for payment in
view(default)/item/row
where order/id payment/oid
return ltpayment duepayment/dategt
ltamountgt payment/amount lt/amountgt
lt/paymentgt
sortby(_at_due)
10Allow Partners to Query View
Get all orders of customer Smith
for order in view(orders)where
order/customer/text() like Smith return order
11Outline
- Motivation and Introduction
- System Architecture
- Query Processing
- Conclusion and Related Work
12Query Processing in XPERANTO
XQuery
Query Result
XPERANTO Query Engine
XQuery Parser
XQGM
Query Rewrite View Composition
XQGM
Computation Pushdown
TaggerRuntime
Tagger Graph
SQL Query
Tuples
RDBMS
13Outline
- Motivation and Introduction
- System Architecture
- Query Processing
- XQGM
- View Composition
- Computation Pushdown
- Conclusion and Related Work
14XQGM
- Intermediate representation needs to be
- General enough to capture semantics of a powerful
language such as XQuery - Be amenable to an easy translation to SQL
- XQGM was designed with these in mind
- Borrows from other work on XML algebras
- An extension of DB2s QGM
15XQGM (contd.)
- XQGM consists of
- Operators
- Functions (invoked inside operators)
- Operators capture manipulation of relationships
(similar to relational operators) - Functions capture manipulation of XML entities
(elements, attributes, etc.) - XML construction functions
- XML navigation functions
16XQGM Operators
- Table
- Select
- Project
- Join
- Group by
- Order by
- Union
- Unnest
- View
17XML Construction Functions
- Scalar
- cr8elem(T, A, E) ? Element
- cr8attList(A1, ..., An) ? List
- cr8att(N, V) ? Attribute
- cr8XMLFragList(E1, ..., En) ? List
- Aggregate
- aggXMLFragments(V) ? List
18XML Navigation Functions
- Scalar
- getTagName(E) ? String
- getContents(E) ? List
- getAttributes(E) ? List
- getAttName(A) ? String
- getAttValue(A) ? String
- isElement(E) ? Boolean
- isText(T) ? Boolean
- Superscalar
- unnest(L) ? ?
19for order in view(orders)where
order/customer/text() like
Smith return order
20create view orders as ( for order in
view(default)/order/row return ltorder
idorder/idgt ltcustomergt
order/custname lt/customergt
ltitemsgt lt/itemsgt
ltpaymentsgt
lt/paymentsgt lt/ordergt)
for item in view(default)/item/row where
order/id item/oid return ltitem
descriptionitem/desc gt
ltcostgt item/cost lt/costgt lt/itemgt
for payment in
view(default)/item/row
where order/id payment/oid
return ltpayment duepayment/dategt
ltamountgt payment/amount lt/amountgt
lt/paymentgt
sortby(_at_due)
21(No Transcript)
22ltorder ididgt ltcustomergt custname
lt/customergt ltitemsgt items lt/itemsgt
ltpaymentsgt pmts lt/paymentsgtlt/ordergt
cr8Elem(order, cr8AttList(cr8Att(id,
id)), cr8XMLFragList(cr8Elem(custo
mer,
cr8AttList(),
cr8XMLFragList(cus
tname)),
cr8Elem(items,
cr8AttList(),
cr8XMLFragList(items)),
cr8Elem(payments,
cr8AttList(),
cr8XMLFragList(pmts))
)
)
23Outline
- Motivation and Introduction
- System Architecture
- Query Processing
- XQGM
- View Composition
- Computation Pushdown
- Conclusion and Related Work
24View Composition
- XML views with nesting are constructed from flat
relational tables - Navigational operations (expressed as XPath)
traverse nested elements - Thus navigational operations undo the effects of
construction - All XML navigation can thus be eliminated
25Benefits of View Composition
- Intermediate XML fragments are eliminated
- Only the construction of desired XML fragments
are computed - Also enables predicates to be pushed down to
relational engine - Will see example shortly
26View Composition
27for order in view(orders)where
order/customer/text() like
Smith return order
28View
Query
order
join (correlated)
order
custname
project order ltordergt
custname
custname
pmts
items
id
correlation on order.id
select custname like Smith
join (correlated)
pmts
items
groupby orderby (on due) pmts
aggXMLFrags(pmt)
groupby items aggXMLFrags(item)
pmt
item
due
project item ltitemgt
project pmt ltpaymentgt
cost
desc
amt
due
select oid id
select oid id
Predicate pushdown
id
custname
select custname like Smith
due
oid
amt
id
custname
cost
desc
oid
table payment
table order
table item
29Outline
- Motivation and Introduction
- System Architecture
- Query Processing
- XQGM
- View Composition
- Computation Pushdown
- Conclusion and Related Work
30Computation Pushdown
- Correlated queries are expensive
- Decorrelation
- Sorted outer union SQL plan SSB99 and
extensions FMS00 shown to be most efficient - Separate SQL part from Tagger part (Tagger
Pull-up) - Most computation in SQL part
- Simple tagger run-time produces XML results in a
single pass over SQL results
31Generated SQL Query
Select o.id, p.amt, p.dueFrom order o,
payment p Where custname like Smith
and o.id p.oidOrder by o.id
Select o.id, i.desc, i.costFrom order o, item
i Where o.custname like Smith and
o.id i.oidOrder by o.id
Select o.id, o.custnameFrom order o Where
custname like SmithOrder by o.id
32Outline
- Motivation and Introduction
- System Architecture
- Query Processing
- Conclusion and Related Work
33Conclusion
- XPERANTO allows users to publish relational data
as XML - Using a high-level XML query language
- Eliminating the need for application code
- Prototype Java implementation supports a
significant sub-set of XQuery - Works on top of any relational database system
34Conclusion (contd.)
- Performance results
- Query compilation time
- Query execution time
- Query compilation time
- Order of milliseconds (200 ms for query over 12
tables) - Query execution time
- Uses sorted outer union plan SSB99 and
extensions FMS00
35Related Work
- Commercial database systems
- Most do not support queries over XML views
- XML Integration Systems (e.g., MIX, YAT)
- Not optimized for RDBMSs
- SilkRoute FTS99
- No XQuery support
- Agora MFK01
- Different approach based on materialized view
matching