Title: Querying XML Database Using Relational Database System
1. Querying XML Database Using Relational Database System
- Rucha Patel
- MS CS (Spring 2008)
- Advanced Database Systems CSc 8712
- Instructor Dr. Yingshu Li
9/20/2014
2. Outline of Presentation
- Background Information regarding XML
- Storing XML documents in relational DB system
- Querying and Manipulating XML data
- XML Data Models for Query Processing
- XML Labeling Schemes
- Structural Joins
- General Technique for Querying XML Documents using Relational DB System
- XQL (XML Query Language)
- Conclusion
3. Background Information - XML
- Evolved from a document markup language
- For exchange of structured and semi-structured data
- For self-describing data exchanged between heterogeneous data sources
- XML Data Management Systems
  - Specialized system: only for XML documents
  - General system: manages XML along with other data formats
4. Background Information - XML (Contd)
- XML is a recommendation of W3C
- XML Schema: type system for XML
- XPath: a language for navigating within XML documents
- XSLT: an XML transformation language
- XQuery: a general-purpose XML query language
  - Based on XML Schema types
  - Includes XPath as a subset
5. Storing XML Documents in RDB System
- 1) The simplest approach is to use a long character string data type, such as CLOB in SQL
  - Stores the entire document as a character string
  - Textual fidelity
  - Fails to take advantage of structural information available in XML markup
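A minimal sketch of the character-string approach, assuming SQLite as the relational system (its TEXT column stands in for CLOB) and a made-up sample document. The table and column names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are made up for illustration.
DOC = "<book><title>XML Basics</title><price>30</price></book>"

conn = sqlite3.connect(":memory:")
# SQLite has no dedicated CLOB type; a TEXT column plays that role here.
conn.execute("CREATE TABLE xml_docs (doc_id INTEGER PRIMARY KEY, content TEXT)")
conn.execute("INSERT INTO xml_docs (doc_id, content) VALUES (?, ?)", (1, DOC))

# Textual fidelity: the stored string comes back character for character.
stored = conn.execute(
    "SELECT content FROM xml_docs WHERE doc_id = 1").fetchone()[0]

# The database sees only an opaque string, so any structure-aware access
# means parsing the document again in the application.
title = ET.fromstring(stored).findtext("title")
print(stored == DOC, title)
```

The database cannot index or filter on the title here, which is the drawback the slide names.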
6. Storing XML Documents in RDB System (Contd)
- 2) Shredding
  - Distributes XML information across one or more columns of tables, preserving both data values and structural relationships
  - For XML schema -> tables
    - Levels of elements
    - At each level, different tables for elements in the hierarchy
  - Schema-based shredding
  - Not efficient with
    - Sparse elements
    - Elements with varying contents
    - Mixed content (text and child elements)
  - Fails to preserve
    - Document ordering
    - Processing instructions of XML documents
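The sketch below illustrates schema-based shredding under simple assumptions: a hypothetical invoice document, one table per element level, and SQLite as the relational system. All table, column and element names are invented for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document with two element levels: invoice and item.
DOC = """
<invoices>
  <invoice id="1"><customer>Ann</customer>
    <item><name>pen</name><qty>2</qty></item>
    <item><name>ink</name><qty>1</qty></item>
  </invoice>
  <invoice id="2"><customer>Bob</customer>
    <item><name>pad</name><qty>5</qty></item>
  </invoice>
</invoices>
"""

def shred(conn, xml_text):
    """Store each level of the hierarchy in its own table."""
    conn.execute(
        "CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY, customer TEXT)")
    conn.execute(
        "CREATE TABLE item (invoice_id INTEGER REFERENCES invoice(invoice_id), "
        "name TEXT, qty INTEGER)")
    for inv in ET.fromstring(xml_text).findall("invoice"):
        inv_id = int(inv.get("id"))
        conn.execute("INSERT INTO invoice VALUES (?, ?)",
                     (inv_id, inv.findtext("customer")))
        for item in inv.findall("item"):
            conn.execute("INSERT INTO item VALUES (?, ?, ?)",
                         (inv_id, item.findtext("name"),
                          int(item.findtext("qty"))))

conn = sqlite3.connect(":memory:")
shred(conn, DOC)
# Sibling order survives only through SQLite's implicit rowid; the schema
# itself records no document order, which is the weakness noted above.
rows = conn.execute(
    "SELECT i.customer, t.name, t.qty FROM invoice i "
    "JOIN item t ON t.invoice_id = i.invoice_id "
    "ORDER BY i.invoice_id, t.rowid").fetchall()
print(rows)
```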
7. Storing XML Documents in RDB System (Contd)
- 3) XML Publishing
  - To reconstruct XML documents from relational tables, systems usually provide the inverse operation of shredding, called XML publishing
  - Systems that offer both shredding and XML publishing are said to provide relational fidelity
    - The authoritative form of the data is relational, not XML
- 4) Native XML storage with XML fidelity
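As an illustration of XML publishing, the sketch below rebuilds an XML document from relational rows. It assumes two hypothetical tables, invoice and item, held in SQLite; the output element names are likewise made up.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
# Hypothetical relational data: the authoritative form is the tables.
conn.executescript("""
CREATE TABLE invoice (invoice_id INTEGER PRIMARY KEY, customer TEXT);
CREATE TABLE item (invoice_id INTEGER, name TEXT, qty INTEGER);
INSERT INTO invoice VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO item VALUES (1, 'pen', 2), (1, 'ink', 1), (2, 'pad', 5);
""")

def publish(conn):
    """Build one XML document from the invoice and item rows."""
    root = ET.Element("invoices")
    invoices = conn.execute(
        "SELECT invoice_id, customer FROM invoice ORDER BY invoice_id").fetchall()
    for inv_id, customer in invoices:
        inv = ET.SubElement(root, "invoice", id=str(inv_id))
        ET.SubElement(inv, "customer").text = customer
        items = conn.execute(
            "SELECT name, qty FROM item WHERE invoice_id = ? ORDER BY rowid",
            (inv_id,))
        for name, qty in items:
            item = ET.SubElement(inv, "item")
            ET.SubElement(item, "name").text = name
            ET.SubElement(item, "qty").text = str(qty)
    return ET.tostring(root, encoding="unicode")

xml_text = publish(conn)
print(xml_text)
```

The published text is generated fresh from the rows, so details such as original whitespace or comments are not recoverable, consistent with relational rather than textual fidelity.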
8. Querying and Manipulating XML Data
- XML storage facility -> interface to access and manipulate stored data
- XPath: good navigation within documents, but
  - Cannot transform structures
  - Cannot construct new elements
- XSLT: transformation and construction, but
  - Recursive, template-driven nature is unsuitable for optimization
- XQuery: complete set of query facilities
9. Querying and Manipulating XML Data (Contd)
- XML Data Model
  - XML documents as ordered, labeled, finite, unranked trees
  - Relative order of nodes = order of siblings
- Region encoding labeling scheme
  - <doc, start, end, level>
  - doc: the document to which the node belongs
  - start, end: start and end positions of the element in the document
  - level: level of the node in the tree
  - x is an ancestor of y if and only if
    - x.start < y.start and x.end > y.end
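A small sketch of region encoding follows. It assumes positions are taken from a depth-first counter rather than byte offsets, and it adds a same-document check to the ancestor test; the names Label, region_labels and is_ancestor are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple

class Label(NamedTuple):
    doc: int
    start: int
    end: int
    level: int

def region_labels(root, doc_id=1):
    """Assign <doc, start, end, level> to every element in one depth-first walk."""
    labels = {}
    counter = 0

    def visit(node, level):
        nonlocal counter
        counter += 1              # position of the opening tag
        start = counter
        for child in node:
            visit(child, level + 1)
        counter += 1              # position of the closing tag
        labels[node] = Label(doc_id, start, counter, level)

    visit(root, 0)
    return labels

def is_ancestor(x, y):
    """x is an ancestor of y iff x's region strictly contains y's region."""
    return x.doc == y.doc and x.start < y.start and x.end > y.end

# Hypothetical sample document.
root = ET.fromstring("<book><front><author/></front><body/></book>")
labels = region_labels(root)
by_tag = {node.tag: label for node, label in labels.items()}
print(by_tag["front"], by_tag["author"])
print(is_ancestor(by_tag["front"], by_tag["author"]))
```

With labels in place, the ancestor test needs only two integer comparisons and no tree traversal, which is what makes it usable as a join predicate in SQL.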
10. Querying and Manipulating XML Data (Contd)
- XML Labeling Schemes
  - To evaluate queries in XPath, XSLT and XQuery, either
    - Maintain results in document order throughout the evaluation
      - Restricts the choice of query plans
      - Impossible if the query requires data to be re-sorted along a different axis at some point
    - Or apply a sort operator at appropriate times
      - Assign each node a label denoting its relative order
      - For example, the region encoding scheme
  - Ancestor-descendant problem
  - Variable-size labeling scheme
    - Does not need to relabel a node on update
    - Difficult to allocate a fixed portion of each record for the label
11. Querying and Manipulating XML Data (Contd)
- Storing XML in RDBMS
  - Labeling scheme + edge shredding form a single relation for storing an XML doc: the Edge relation
  - Global encoding scheme
    - Edge(id, parent-id, end, path-id, value)
  - Local encoding scheme
    - Edge(id, parent-id, sIndex, path-id, value)
    - sIndex: position of a node among its siblings
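The local encoding scheme can be sketched as follows. This is an illustration only: it assumes SQLite, ignores attributes, and resolves path-id through a separate path table, which is an assumption of this example and not something the slide states.

```python
import sqlite3
import xml.etree.ElementTree as ET

def shred_to_edge(conn, root):
    """Fill Edge(id, parent_id, sIndex, path_id, value) from an element tree."""
    conn.execute(
        "CREATE TABLE path (path_id INTEGER PRIMARY KEY, path TEXT UNIQUE)")
    conn.execute(
        "CREATE TABLE edge (id INTEGER PRIMARY KEY, parent_id INTEGER, "
        "sIndex INTEGER, path_id INTEGER, value TEXT)")
    path_ids = {}
    next_id = 0

    def visit(node, parent_id, s_index, parent_path):
        nonlocal next_id
        next_id += 1
        node_id = next_id
        path = parent_path + "/" + node.tag
        if path not in path_ids:
            path_ids[path] = len(path_ids) + 1
            conn.execute("INSERT INTO path VALUES (?, ?)",
                         (path_ids[path], path))
        # Leaf text becomes the value; whitespace-only text is stored as NULL.
        value = (node.text or "").strip() or None
        conn.execute("INSERT INTO edge VALUES (?, ?, ?, ?, ?)",
                     (node_id, parent_id, s_index, path_ids[path], value))
        # sIndex is the 1-based position of each child among its siblings.
        for position, child in enumerate(node, start=1):
            visit(child, node_id, position, path)

    visit(root, None, 1, "")

# Hypothetical sample document.
DOC = "<book><front><author>Seuss</author><author>Geisel</author></front></book>"
conn = sqlite3.connect(":memory:")
shred_to_edge(conn, ET.fromstring(DOC))
edges = conn.execute("SELECT * FROM edge ORDER BY id").fetchall()
paths = conn.execute("SELECT * FROM path ORDER BY path_id").fetchall()
print(edges)
```

Because sIndex is local to a parent, inserting a new sibling only renumbers that parent's children, whereas a global position would shift for every later node in the document.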
12. General Technique for Querying XML Doc in RDBMS
- To store and query an XML doc:
  - Relational schema generation (table creation)
  - Shredding (storing the XML doc)
  - Converting queries over the stored XML into SQL queries over the created tables
- Each relational schema generation technique normally requires its own query processor to convert the queries
- But the same query processor can be used...
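To make the query-conversion step concrete, the sketch below turns a child-only path such as book/front/author into a chain of SQL self-joins. It is not the translator from the general technique; it assumes a simplified, hypothetical edge table with a tag column, and the function name path_to_sql is invented.

```python
import sqlite3

def path_to_sql(path):
    """Translate a child-only path into SQL self-joins over the edge table."""
    steps = path.strip("/").split("/")
    froms, wheres = [], []
    for i, _tag in enumerate(steps):
        froms.append(f"edge e{i}")
        wheres.append(f"e{i}.tag = ?")          # one placeholder per step
        if i == 0:
            wheres.append("e0.parent_id IS NULL")   # first step is the root
        else:
            wheres.append(f"e{i}.parent_id = e{i - 1}.id")
    last = len(steps) - 1
    sql = (f"SELECT e{last}.id, e{last}.value FROM " + ", ".join(froms)
           + " WHERE " + " AND ".join(wheres)
           + f" ORDER BY e{last}.id")
    return sql, steps

conn = sqlite3.connect(":memory:")
# Hypothetical shredded document: two author elements under different parents.
conn.executescript("""
CREATE TABLE edge (id INTEGER PRIMARY KEY, parent_id INTEGER, tag TEXT, value TEXT);
INSERT INTO edge VALUES
  (1, NULL, 'book', NULL),
  (2, 1, 'front', NULL),
  (3, 2, 'author', 'Seuss'),
  (4, 1, 'body', NULL),
  (5, 4, 'author', 'Other');
""")

sql, params = path_to_sql("book/front/author")
result = conn.execute(sql, params).fetchall()
print(sql)
print(result)
```

A translator written this way is tied to one table layout, which illustrates why each schema generation technique would otherwise need its own query processor.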
13. General Technique for Querying XML Doc in RDBMS (Contd)
- To use the same query processor regardless of the relational schema generation technique when converting queries:
  - Along with shredding, a Reconstruction XML View is created over the relational tables
  - It virtually reconstructs the stored XML doc from the shredded rows
  - It behaves just like a normal view over the stored XML doc
  - Queries on the stored XML = queries over the Reconstruction XML View
15. General Technique for Querying XML Doc in RDBMS (Contd)
- For relational schema generation, a program that
  - Generates the desired relational schema
  - Produces an XML shredder object
  - Creates the reconstruction XML view
- Either for
  - Shared relational schema
  - Edge relational schema
16. General Technique for Querying XML Doc in RDBMS (Contd)
- Shared Relational Schema
  - Steps to generate the relational schema
    - Create a DTD graph: nodes are XML elements, attributes and operators
    - Create a relation for the root element in the graph
    - All children of an element are represented in the same relation as the element, EXCEPT
      - "*" nodes: a child under "*" is set-valued, which cannot be captured in a single relational row
      - So, create a separate relation for these nodes
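The steps above can be sketched roughly as below. This covers only the two rules listed on the slide (inline children, separate relations for children under "*"). The DTD graph encoding, the column naming and the function name generate_schema are assumptions of this example, and recursive element definitions are not handled.

```python
# Hypothetical DTD graph: element -> list of (child, repeated) pairs, where
# repeated=True marks a child reached through a "*" operator node.
DTD = {
    "book": [("title", False), ("author", True)],
    "author": [("name", False), ("address", False), ("phone", True)],
    "address": [("city", False)],
    "title": [], "name": [], "city": [], "phone": [],
}

def generate_schema(dtd, root):
    """Return {relation name: column list} for a non-recursive DTD graph."""
    relations = {}

    def build(element, has_parent):
        # A relation created for a "*" child needs a key back to its parent.
        columns = ["id", "parent_id"] if has_parent else ["id"]
        if not dtd[element]:
            columns.append("value")   # a repeated leaf keeps its own text
        relations[element] = columns

        def inline(node, prefix):
            for child, repeated in dtd[node]:
                if repeated:
                    build(child, True)                    # separate relation
                elif dtd[child]:
                    inline(child, prefix + child + "_")   # inline descendants
                else:
                    columns.append(prefix + child)        # inline a leaf

        inline(element, "")

    build(root, False)
    return relations

schema = generate_schema(DTD, "book")
for name, cols in schema.items():
    print(name, cols)
```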
20. XQL (XML Query Language)
- Structured queries: relational / OO DB
- Unstructured queries: documents
- Semi-structured queries: XML documents
- Features such as:
  - Allows the user to combine information from multiple sources
  - Uses links as a part of a query
  - Search based on text containment
- E.g.
  - Doc 1: recommended books
  - Doc 2: book prices
  - Doc 3: reviews of books
  - Then a query -> list recommended books with their prices and reviews
21. XQL (XML Query Language) Contd
- Difference between SQL and XQL queries

SQL | XQL
The database is a set of tables. | The database is a set of one or more XML documents.
Uses the structure of tables as a basic model. | Uses the structure of XML documents as a basic model.
The FROM clause determines the tables which are examined by the query. | A query is given a list of input nodes from one or more documents.
The result of a query is a table containing a set of rows; this table may serve as the basis for further queries. | The result of a query is a list of XML document nodes, which may serve as the basis for further queries.
22. XQL (XML Query Language) Contd
- Basic Concepts of XQL
  - A simple string is an element name
    - E.g. table
  - "/" is the child operator and indicates hierarchy
    - E.g. front/author
  - front/author='Theodore Seuss Geisel'
  - front/author/address/@type='email'
  - front//address
  - //address
  - front/author/address[@type='email']
  - front/author='Theodore Seuss Geisel'[@gender='male' and shoesize='9EEEE']
  - section[1, 3 to 5, 8, -1]
  - section[@level='3'][1 to 2]
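XQL itself is not available in common libraries, so the sketch below uses the limited XPath subset of Python's xml.etree.ElementTree as a stand-in to show how several of the path expressions above behave. The sample document is hypothetical, and XQL-only forms such as section[1, 3 to 5, 8, -1] have no direct equivalent here.

```python
import xml.etree.ElementTree as ET

# Hypothetical document shaped to match the path expressions above.
DOC = """
<book>
  <front>
    <author gender="male">
      <name>Theodore Seuss Geisel</name>
      <shoesize>9EEEE</shoesize>
      <address type="email">seuss@example.org</address>
      <address type="home">La Jolla</address>
    </author>
  </front>
  <section level="3">one</section>
  <section level="3">two</section>
  <section level="2">three</section>
</book>
"""
book = ET.fromstring(DOC)

# Child operator: front/author
authors = book.findall("front/author")
# Attribute filter: front/author/address[@type='email']
emails = book.findall("front/author/address[@type='email']")
# Descendant operator below front: front//address
nested = book.findall("front//address")
# Descendant operator from the context node: //address
anywhere = book.findall(".//address")
# Combined filters, written as two predicates instead of XQL's "and"
filtered = book.findall("front/author[@gender='male'][shoesize='9EEEE']")
# Position among same-named siblings (1-based): section[2]
second = book.findall("section[2]")

print([a.text for a in emails], [s.text for s in second])
```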
25. XQL (XML Query Language) Contd
- Grouping of results
  - A query that lists the products on invoices might want to group products by invoice, placing each group of products within an invoice tag.
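The grouping idea can be illustrated in Python as below. This is not XQL syntax; it only shows the shape of the result, assuming a hypothetical flat list of product elements that each carry an invoice attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat query result: one <product> per invoice line.
FLAT = """
<products>
  <product invoice="1">pen</product>
  <product invoice="2">pad</product>
  <product invoice="1">ink</product>
</products>
"""

def group_by_invoice(flat_root):
    """Place each group of products inside its own <invoice> element."""
    grouped = ET.Element("invoices")
    groups = {}
    for product in flat_root.findall("product"):
        key = product.get("invoice")
        if key not in groups:
            # First product of this invoice: open a new group.
            groups[key] = ET.SubElement(grouped, "invoice", id=key)
        ET.SubElement(groups[key], "product").text = product.text
    return grouped

result = ET.tostring(group_by_invoice(ET.fromstring(FLAT)), encoding="unicode")
print(result)
```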
26. XQL (XML Query Language) Contd
- Join
  - Combine information from multiple sources to create one unified view.
  - Queries can be written to express such joins.
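As an illustration of a join across documents, the sketch below combines three hypothetical documents (recommended books, prices, reviews) on an assumed isbn attribute. It is plain Python rather than XQL and is not the query from the original slide.

```python
import xml.etree.ElementTree as ET

# Three hypothetical sources that share an isbn attribute as the join key.
RECOMMENDED = ("<books><book isbn='1'>XML Basics</book>"
               "<book isbn='2'>SQL Primer</book></books>")
PRICES = ("<prices><price isbn='1'>30</price><price isbn='2'>25</price>"
          "<price isbn='3'>40</price></prices>")
REVIEWS = ("<reviews><review isbn='1'>Clear</review>"
           "<review isbn='1'>Dense</review></reviews>")

def join_books(recommended, prices, reviews):
    """Return (title, price, reviews) for every recommended book."""
    price_by_isbn = {p.get("isbn"): p.text for p in ET.fromstring(prices)}
    reviews_by_isbn = {}
    for review in ET.fromstring(reviews):
        reviews_by_isbn.setdefault(review.get("isbn"), []).append(review.text)
    return [
        (book.text,
         price_by_isbn.get(book.get("isbn")),
         reviews_by_isbn.get(book.get("isbn"), []))
        for book in ET.fromstring(recommended)
    ]

view = join_books(RECOMMENDED, PRICES, REVIEWS)
print(view)
```

Books without reviews still appear with an empty list, which corresponds to an outer join on the recommended books.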
27. Conclusion
- XML documents can be stored efficiently in a relational database system using a number of approaches.
- The General Technique for storing and querying XML documents using an RDBMS eliminates the need for separate query processors for XML query translation.
- Using the General Technique, a Reconstruction XML View can be generated for both shared and edge-based relational schemas.
- Stored XML documents can be queried effectively through the use of XQuery, XPath, XSLT or XQL.
29. Thank You.