Title: Views and XML Views of Relations
1Views and XML Views of Relations
- Zachary G. Ives
- University of Pennsylvania
- CIS 550 Database Information Systems
- December 14, 2009
Some slide content courtesy of Susan Davidson,
Dan Suciu, Raghu Ramakrishnan
2Administrivia
- Homework 4 due 11/4
- XQuery
- Project teams due Monday (4 people)
- Read the Shanmugasundaram paper (posted on the
Web site) by (or before) Sunday 11/8 - Post a 1-page summary to the newsgroup
- What problems did they address? Basic techniques?
Strengths weaknesses of their methods?
3Recall XQuery
- FLWOR statement note case sensitivity!
- for iterators that bind variables
- let collections
- where conditions
- order by order-conditions (older version was
SORTBY) - return output constructor
- XQuery is very SQL-like, but cleaner and more
orthogonal - It is based on paths and binding tuples, with
collections and trees as its first-class objects - See www.w3.org/TR/xquery/ for more details on the
language
4XQuery Works Well with Schema,and Validates
Against It (Incl. Keys)
- ltxsdschema xmlnsxsd"http//www.w3.org/2001/XMLS
chema"gt - ltxsdcomplexType nameThesisType"gt
- ltxsdattribute namekey" type"xsdstring"/gt
-
- ltxsdsequencegt
- ltxsdelement nameauthor" typexsdstring"/gt
- ...
- ltxsdelement nameschool" typexsdstring/gt
-
- lt/xsdsequencegt
- lt/xsdcomplexTypegt
- ltxsdelement namemastersthesis"
typeThesisType"gt - ltxsdkey namemtId"gt ltxsdselector
xpath./gt ltxsdfield xpath"_at_key"/gt
lt/xsdkeygt - ltxsdkeyref nameschoolRef referschoolId"gt
ltxsdselector xpath./school/gt
ltxsdfield xpath./text()"/gt lt/xsdkeyrefgt - lt/xsdelementgt
- lt/xsdschemagt
5Alternate Data Representations
- XQuery enables us to go XML ? XML
- or XML ? HTML, or XML ? text
- In SQL, we go relations ? relations
- What about relations ? XML and XML ? relations?
- Lets start with XML ? XML, relations ? relations
6Views in SQL and XQuery
- A view is a named query
- We use the name of the view to invoke the query
(treating it as if it were the relation it
returns) - SQL
- CREATE VIEW V(A,B,C) AS
- SELECT A,B,C FROM R WHERE R.A 123
- XQuerydeclare function V() as element(content)
- for r in doc(R)/root/tree,
- a in r/a, b in r/b, c in r/c
- where a 123
- return ltcontentgta, b, clt/contentgt
-
Using the views
SELECT FROM V, RWHERE V.B 5 AND V.C R.C
for v in V()/content, r in doc(r)/root/tree
where v/b r/breturn v
7Whats Useful about Views
- Providing security/access control
- We can assign users permissions on different
views - Can select or project so we only reveal what we
want! - Can be used as relations in other queries
- Allows the user to query things that make more
sense - Describe transformations from one schema (the
base relations) to another (the output of the
view) - The basis of converting from XML to relations or
vice versa - This will be incredibly useful in data
integration, discussed soon - Allow us to define recursive queries
8Materialized vs. Virtual Views
- A virtual view is a named query that is actually
re-computed every time it is merged with the
referencing query - CREATE VIEW V(A,B,C) AS
- SELECT A,B,C FROM R WHERE R.A 123
- A materialized view is one that is computed once
and its results are stored as a table - Think of this as a cached answer
- These are incredibly useful!
- Techniques exist for using materialized views to
answer other queries - Materialized views are the basis of relating
tables in different schemas
SELECT FROM V, RWHERE V.B 5 AND V.C R.C
9Views Should Stay Fresh
- Views (sometimes called intensional relations)
behave, from the perspective of a query language,
exactly like base relations (extensional
relations) - But theres an association that should be
maintained - If tuples change in the base relation, they
should change in the view (whether its
materialized or not) - If tuples change in the view, that should reflect
in the base relation(s)
10View Maintenance and the View Update Problem
- There exist algorithms to incrementally recompute
a materialized view when the base relations
change - We can try to propagate view changes to the base
relations - However, there are lots of views that arent
easily updatable - We can ensure views are updatable by enforcing
certain constraints (e.g., no aggregation),but
this limits the kinds of views we can have
R
S
R?S
delete?
11Views as a Bridge between Data Models
- A claim weve made several times
- XML cant represent anything that cant be
expressed in in the relational model - If this is true, then we must be able to
represent XML in relations - Store a relational view of XML (or create an XML
view of relations)
12A View as a Translation betweenXML and Relations
- Youll be reviewing the most-cited paper in this
area (Shanmugasundaram et al), and there are many
more (Fernandez et al., ) - Techniques already making it into commercial
systems - XPERANTO at IBM Research will appear in DB2 v9
- SQL Server has some XML support Oracle is also
doing some XML - Next time youll see how it works!
13Issues in Mapping Relational ? XML
- We know the following
- XML is a tree
- XML is SEMI-structured
- Theres some structured stuff
- There is some unstructured stuff
- Issues relate to describing XML structure,
particularly parent/child in a relational
encoding - Relations are flat
- Tuples can be connected via foreign-key/primary-
key links
14The Simplest Way to Encode a Tree
- Suppose we hadlttree id0gt ltcontent id1gt
ltsub-contentgtXYZ lt/sub-contentgt
lti-contentgt14 lt/i-contentgt
lt/contentgtlt/treegt - If we have no IDs, we CREATE values
- BinaryLikeEdge(key, label, type, value, parent)
What are shortcomings here?
15Florescu/Kossmann Improved Edge Approach
- Consider order, typing separate the values
- Vint(vid, value)
- Vstring(vid, value)
- Edge(parent, ordinal, label, flag, target)
16How Do You Compute the XML?
- Assume we know the structure of the XML tree
(well see how to avoid this later) - We can compute an XML-like SQL relation using
outer unions we first this technique in
XPERANTO - Idea if we take two non-union-compatible
expressions, pad each with NULLs, we can UNION
them together - Lets see how this works
17A Relation that Mirrors theXML Hierarchy
- Output relation would look like
18A Relation that Mirrors theXML Hierarchy
- Output relation would look like
19A Relation that Mirrors theXML Hierarchy
- Output relation would look like
Colors are representative of separate SQL queries
20SQL for Outputting XML
- For each sub-portion we preserve the keys
(target, ord) of the ancestors - Root
- select E.label AS rLabel, E.target AS rid, E.ord
AS rOrd, null AS cLabel, null AS cid, null AS
cOrd, null AS subOrd, null AS sid, null AS str,
null AS intfrom Edge Ewhere parent IS NULL - First-level children
- select null AS rLabel, E.target AS rid, E.ord AS
rOrd, E1.label AS cLabel, E1.target AS cid,
E1.ord AS cOrd, null AS from Edge E, Edge
E1where E.parent IS NULL AND E.target E1.parent
21The Rest of the Queries
- Grandchild
- select null as rLabel, E.target AS rid, E.ord AS
rOrd, null AS cLabel, E1.target AS cid, E1.ord AS
cOrd, E2.label as sLabel, E2.target as sid,
E2.ord AS sOrd, null as from Edge E, Edge E1,
Edge E2where E.parent IS NULL AND E.target
E1.parent AND E1.target E2.parent - Strings
- select null as rLabel, E.target AS rid, E.ord AS
rOrd, null AS cLabel, E1.target AS cid, E1.ord AS
cOrd, null as sLabel, E2.target as sid, E2.ord AS
sOrd, Vi.val AS str, null as intfrom Edge E,
Edge E1, Edge E2, Vint Vi where E.parent IS NULL
AND E.target E1.parent AND E1.target
E2.parent AND Vi.vid E2.target - How would we do integers?
22Finally
- Union them all together
- ( select E.label as rLabel, E.target AS rid,
E.ord AS rOrd, from Edge E where parent IS
NULL)UNION ( select null as rLabel, E.target
AS rid, E.ord AS rOrd, E1.label AS cLabel,
E1.target AS cid, E1.ord AS cOrd, null as
from Edge E, Edge E1 where E.parent IS NULL AND
E.target E1.parent) UNION ( - .
- ) UNION ( .
- )
- Then another module will add the XML tags, and
were done!
23Another Example
attribute
root
p-i
element
Root
text
universities
?xml
university
mastersthesis
country
name
key
author
title
year
school
Univ.
USA
1999
ms/Brown92
Univ.
PRPL
Kurt P.