1
Managing XML and Semistructured Data
  • Lecture 9: Query Languages - StruQL and XSL

Prof. Dan Suciu
Spring 2001
2
In this lecture
  • Website management with Strudel
  • Background on Skolem functions
  • Skolem functions in StruQL
  • Structural recursion
  • XSL
  • Resources
    • Catching the Boat with Strudel. VLDBJ 2001
    • UnQL: A Query Language and Algebra for Semistructured Data Based on Structural Recursion. Buneman, Fernandez, Suciu. VLDBJ 2000
    • Data on the Web. Abiteboul, Buneman, Suciu. Sections 5.2, 6.4, 6.5

3
Strudel and StruQL
  • Strudel: a Website management tool
  • Idea: separate the following three tasks
    • Management of data: use some database
    • Management of the site's structure: use StruQL
    • Management of the site's presentation: use HTML templates (this was before XML...)

4
Example: Bibliography Data
Input data:
{ Bib: { paper: { author: "Jones",
                  author: "Smith",
                  title: "The Comma",
                  year: 1994 },
         paper: { author: "Jones",
                  title: "The Dot",
                  year: 1998 },
         paper: { author: "Mark",
                  .... },
         . . . } }
[Figure: the same data drawn as a tree. Bib has one paper edge per paper; each paper node has author, title and year edges ending in leaves such as Jones, Smith and The Comma.]
5
Simple Website Definition in StruQL
StruQL query:
WHERE  Root -> Bib.paper.author -> A
CREATE Root(), HomePage(A)
LINK   Root() -> person -> HomePage(A),
       HomePage(A) -> name -> A,
       HomePage(A) -> home -> Root()
Result:
[Figure: the output graph. Each author (Smith, Jones, Mark) gets one page with a name edge to the author and a home edge back to the root.]
Root() and HomePage(A) are Skolem functions (more later).
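The effect of the simple site definition can be sketched in Python. This is only an illustration under assumptions, not Strudel's implementation: the input graph is stored as (source, label, target) triples, the node names root, bib, p1, p2, p3 are made up, and a Skolem term is modeled as a tuple of its name and arguments, so that equal arguments always denote the same output node.

```python
# Input graph as (source, label, target) edges; node names are hypothetical.
EDGES = [
    ("root", "Bib", "bib"),
    ("bib", "paper", "p1"), ("p1", "author", "Jones"), ("p1", "author", "Smith"),
    ("p1", "title", "The Comma"), ("p1", "year", 1994),
    ("bib", "paper", "p2"), ("p2", "author", "Jones"),
    ("p2", "title", "The Dot"), ("p2", "year", 1998),
    ("bib", "paper", "p3"), ("p3", "author", "Mark"),
]


def follow(nodes, label):
    """Targets of all `label` edges leaving any node in `nodes`."""
    return [t for (s, l, t) in EDGES if l == label and s in nodes]


def skolem(name, *args):
    """A Skolem term is its name plus arguments: equal inputs give the same node."""
    return (name,) + args


def build_site():
    # WHERE Root -> Bib.paper.author -> A
    authors = follow(follow(follow(["root"], "Bib"), "paper"), "author")
    out = set()
    for a in authors:
        root, home = skolem("Root"), skolem("HomePage", a)
        out.add((root, "person", home))   # Root() -> person -> HomePage(A)
        out.add((home, "name", a))        # HomePage(A) -> name -> A
        out.add((home, "home", root))     # HomePage(A) -> home -> Root()
    return out


site = build_site()
```

Jones is an author of two papers, but HomePage("Jones") is the same Skolem term both times, so the output contains a single home page for Jones.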
6
Complex Website Definition in StruQL
WHERE  Root -> Bib -> X, X -> paper -> P,
       P -> author -> A, P -> title -> T, P -> year -> Y
CREATE Root(), HomePage(A), YearPage(A,Y), PubPage(P)
LINK   Root() -> person -> HomePage(A),
       HomePage(A) -> yearentry -> YearPage(A,Y),
       YearPage(A,Y) -> publication -> PubPage(P),
       PubPage(P) -> author -> HomePage(A),
       PubPage(P) -> title -> T
7
Example: A Complex Web Site
[Figure: the site generated by the complex definition; its pages show the titles The Comma and The Dot.]
8
Skolem Functions
  • Maier, 1986: in OO systems
  • Kifer et al., 1989: F-logic
  • Hull and Yoshikawa, 1990: deductive db (ILOG)
  • Papakonstantinou et al., 1996: semistructured db (MSL)

9
Skolem Functions in Logic
  • Origins: First Order Logic
  • The satisfiability problem: given a formula φ, does it have a model?

10
Skolem Functions in Logic
  • Example: does φ have a model?
  • Skolem functions: replace ∃ with functions, drop ∀
  • Fact: φ has a model iff φ' has a model
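As a small worked illustration (the formula below is chosen for this note and is not taken from the slides), Skolemization replaces each existentially quantified variable by a fresh function of the universally quantified variables in whose scope it appears. Here f is an assumed fresh function symbol, and the free variable x of the result is read as implicitly universally quantified.

```latex
\varphi \;=\; \forall x\, \exists y\; P(x, y)
\qquad\leadsto\qquad
\varphi' \;=\; P\bigl(x, f(x)\bigr)
```

The two formulas are not logically equivalent, but one has a model exactly when the other does: any model of the first can interpret f by picking, for each x, a witness y.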

11
Skolem Functions in Databases
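  • Recall Datalog:
    Answer(title, author) :- Paper(author, title, year)
  • Means: for all author, title, year, if Paper(author, title, year) holds, then Answer(title, author) holds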
12
Skolem Functions in Databases
  • Now consider:
    Answer(author, x) :- Paper(author, title, year)
  • I want to create a new object x. What does this mean?
13
Skolem Functions in Databases
  • Better: use Skolem functions directly in Datalog
  • Choices:
    Answer(author, NewObj(author)) :- Paper(author, title, year)
    Answer(author, NewObj(author,title)) :- Paper(author, title, year)
    Answer(author, NewObj(title,year)) :- Paper(author, title, year)
    Answer(author, NewObj()) :- Paper(author, title, year)
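The four choices differ in how many new objects they create. The following Python sketch makes that concrete under stated assumptions: the Paper facts are the first two papers of the bibliography example flattened to one fact per author, the object ids obj1, obj2, ... are invented, and a Skolem function is modeled as a table that hands out one fresh id per distinct argument tuple.

```python
from itertools import count

PAPER = [  # (author, title, year) facts, one per author of each paper
    ("Jones", "The Comma", 1994),
    ("Smith", "The Comma", 1994),
    ("Jones", "The Dot", 1998),
]


def make_new_obj():
    """Return a Skolem function: one fresh object id per distinct argument tuple."""
    ids, fresh = {}, count(1)

    def new_obj(*args):
        if args not in ids:
            ids[args] = "obj%d" % next(fresh)
        return ids[args]

    return new_obj


def answer(skolem_args):
    """Evaluate Answer(author, NewObj(<skolem_args>)) :- Paper(author, title, year)."""
    new_obj = make_new_obj()
    return {(a, new_obj(*skolem_args(a, t, y))) for (a, t, y) in PAPER}


per_author = answer(lambda a, t, y: (a,))          # NewObj(author)
per_author_title = answer(lambda a, t, y: (a, t))  # NewObj(author, title)
per_title_year = answer(lambda a, t, y: (t, y))    # NewObj(title, year)
single = answer(lambda a, t, y: ())                # NewObj()
```

With these facts, NewObj(author) creates one object per author, NewObj(author,title) one per fact, NewObj(title,year) one per paper, and NewObj() a single object shared by every answer.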
14
Skolem Functions in StruQL
  • StruQL's semantics
    • Input: graph (Node, Edge)
    • Output: graph (Node, Edge)
  • Example

WHERE  Root -> Bib.paper.author -> A
CREATE Root(), HomePage(A)
LINK   Root() -> person -> HomePage(A),
       HomePage(A) -> name -> A,
       HomePage(A) -> home -> Root()

Node(Root()) :-
Node(HomePage(A)) :- Edge(Root,Bib,X), Edge(X,paper,Y), Edge(Y,author,A)
Edge(Root(),person,HomePage(A)) :- Edge(Root,Bib,X), Edge(X,paper,Y), Edge(Y,author,A)
Edge(HomePage(A),name,A) :- Edge(Root,Bib,X), Edge(X,paper,Y), Edge(Y,author,A)
Edge(HomePage(A),home,Root()) :- Edge(Root,Bib,X), Edge(X,paper,Y), Edge(Y,author,A)
15
A Different Paradigm: Structural Recursion
  • Data as sets with a union operator
  • {a: 3, a: {b: "one", c: 5}, b: 4}
  • = {a: 3} U {a: {b: "one", c: 5}} U {b: 4}

16
Structural Recursion
  • Example: retrieve all integers in the data

f(T1 U T2) = f(T1) U f(T2)
f({L: T})  = f(T)
f({})      = {}
f(V)       = if isInt(V) then {result: V} else {}
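The integer-retrieval function can be tried out with a minimal Python sketch. The encoding is an assumption of this sketch: a tree is a list of (label, subtree) pairs, the empty tree {} is the empty list, union is list concatenation, and an atomic value is any non-list.

```python
def f(t):
    """Retrieve all integers in the data, each under a `result` label."""
    if isinstance(t, list):
        out = []                      # f({}) = {}
        for label, sub in t:          # f(T1 U T2) = f(T1) U f(T2)
            out += f(sub)             # f({L: T}) = f(T)
        return out
    # f(V) = if isInt(V) then {result: V} else {}
    return [("result", t)] if isinstance(t, int) else []


# {a: 3, a: {b: "one", c: 5}, b: 4}
data = [("a", 3), ("a", [("b", "one"), ("c", 5)]), ("b", 4)]
ints = f(data)
```

The loop handles the union case one edge at a time, which is why each clause of the definition maps onto exactly one line of the function.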
17
Structural Recursion
  • What does this do?

f(T1 U T2) = f(T1) U f(T2)
f({L: T})  = if L = a then {b: f(T)} else {L: f(T)}
f({})      = {}
f(V)       = V

Returns the same tree with a-edges replaced by b-edges.
18
Structural Recursion
  • What does this do?

f(T1 U T2) = f(T1) U f(T2)
f({L: T})  = {L: {L: f(T)}}
f({})      = {}
f(V)       = V

Input: tree with n nodes. Output: tree with 2n nodes (every edge is doubled).
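Both "What does this do?" functions differ only in their {L: T} case, so they can be checked with one small Python helper. This is a sketch under assumptions: trees are lists of (label, subtree) pairs, and the names struct_rec, relabel, double and count_edges are invented here.

```python
def struct_rec(edge_case):
    """Build f from its {L: T} case; the other three cases are fixed."""
    def f(t):
        if not isinstance(t, list):
            return t                           # f(V) = V
        out = []                               # f({}) = {}
        for label, sub in t:                   # f(T1 U T2) = f(T1) U f(T2)
            out += edge_case(label, f(sub))    # f({L: T}) = E(L, f(T))
        return out
    return f


# f({L: T}) = if L = a then {b: f(T)} else {L: f(T)}
relabel = struct_rec(lambda l, ft: [("b" if l == "a" else l, ft)])
# f({L: T}) = {L: {L: f(T)}}
double = struct_rec(lambda l, ft: [(l, [(l, ft)])])


def count_edges(t):
    """Number of edges in a tree; atomic values have none."""
    if not isinstance(t, list):
        return 0
    return sum(1 + count_edges(sub) for _, sub in t)


data = [("a", 3), ("a", [("b", "one"), ("c", 5)]), ("b", 4)]
relabeled = relabel(data)
doubled = double(data)
```

On this input, relabel turns every a-edge into a b-edge and leaves the shape alone, while double produces a tree with twice as many edges.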
19
Structural Recursion
  • Example: increase all engine prices by 10

20
Structural Recursion
  • Retrieve all subtrees reachable by (a.b)*.a

[Figure: a path of edges labeled a, b, a.]
21
Structural Recursion: General Form
f1(T1 U T2) = f1(T1) U f1(T2)
f1({L: T})  = E1(L, f1(T), ..., fk(T), T)
f1({})      = {}
f1(V)       = ...
. . . .
fk(T1 U T2) = fk(T1) U fk(T2)
fk({L: T})  = Ek(L, f1(T), ..., fk(T), T)
fk({})      = {}
fk(V)       = ...

Each of E1, ..., Ek consists only of {_ : _}, U, if_then_else_
22
Evaluating Structural Recursion
  • Recursive evaluation
    • Compute the functions recursively, starting with f1 at the root
    • Termination is guaranteed.
  • How efficiently can we evaluate this?

23
Structural Recursion
  • Consider this:

f(T1 U T2) = f(T1) U f(T2)
f({L: T})  = {L: f(T), L: f(T)}
f({})      = {}
f(V)       = V
24
Naive Recursive Evaluation
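[Figure: an input chain with edges a, b, c, d next to the output tree, in which the number of copies of each edge doubles at every level.]
Input tree: n nodes. Output tree: 2^(n+1) - 1 nodes.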
25
Efficient Recursive Evaluation
Recursive evaluation with function memoization: PTIME complexity.
f(T1 U T2) = f(T1) U f(T2)
f({L: T})  = {L: f(T), L: f(T)}
f({})      = {}
f(V)       = V
Alternatively: apply the function in parallel to each input edge → Bulk Evaluation
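The gap between naive and memoized evaluation of the edge-duplicating function can be measured with a short Python sketch. The setup is assumed for illustration: the input is a chain stored as a dictionary from made-up node ids to outgoing (label, child) edges, and memoization uses functools.lru_cache keyed on the node id.

```python
from functools import lru_cache

# A chain a -> b -> c -> d; node ids are hypothetical, node 4 is a leaf.
GRAPH = {0: [("a", 1)], 1: [("b", 2)], 2: [("c", 3)], 3: [("d", 4)], 4: []}

calls = {"naive": 0, "memo": 0}


def f_naive(node):
    """f({L: T}) = {L: f(T), L: f(T)}, re-evaluating f(T) for each occurrence."""
    calls["naive"] += 1
    out = []
    for label, child in GRAPH[node]:
        out += [(label, f_naive(child)), (label, f_naive(child))]
    return tuple(out)


@lru_cache(maxsize=None)
def f_memo(node):
    """Same function; the cache makes each input node evaluate only once."""
    calls["memo"] += 1
    out = []
    for label, child in GRAPH[node]:
        out += [(label, f_memo(child)), (label, f_memo(child))]
    return tuple(out)


naive_result = f_naive(0)
memo_result = f_memo(0)
```

On this chain of 4 edges the naive version runs its body 31 times, the memoized one 5 times. The memoized result is really a DAG with shared subtrees; written out as a tree it is just as large as the naive result.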
26
Bulk Evaluation
Sometimes f doesn't return anything → use ε edges
f(T1 U T2) = f(T1) U f(T2)
f({L: T})  = if L = c then T else f(T)
f({})      = {}
f(V)       = V
27
Epsilon Edges
  • Meaning of ε edges

[Figure: an example graph with edges labeled a, b, c, d and one ε edge.]
28
Epsilon Edges
  • Note: union becomes easy to draw with ε edges
  • Example

[Figure: T1 U T2 drawn as a new root with ε edges to T1 and T2, followed by a concrete example on trees with edges labeled a, b, c, d, e.]
29
Bulk Evaluation
  • Idea: apply E1, ..., Ek independently on each edge, then connect with ε edges → PTIME
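A minimal Python sketch of bulk evaluation, for the single function f({L: T}) = if L = c then T else f(T). Everything concrete here is an assumption of the sketch: the input graph, its node ids, the use of None as the ε label, and the pair ("f", n) as the name of the output node that stands for f applied at input node n.

```python
EPS = None  # label used for ε edges

# Input graph with a cycle between nodes 0 and 1 (node ids are hypothetical).
EDGES = [(0, "a", 1), (1, "b", 0), (1, "c", 2), (2, "d", 3), (0, "c", 4), (4, "e", 3)]


def bulk(edges):
    """Apply the {L: T} case to every edge independently, linking with ε edges."""
    out = list(edges)                                   # T itself may be returned
    for src, label, dst in edges:
        if label == "c":
            out.append((("f", src), EPS, dst))          # then T
        else:
            out.append((("f", src), EPS, ("f", dst)))   # else f(T)
    return out


def eps_closure(edges, start):
    """All nodes reachable from `start` through ε edges only."""
    seen, todo = {start}, [start]
    while todo:
        node = todo.pop()
        for src, label, dst in edges:
            if src == node and label is EPS and dst not in seen:
                seen.add(dst)
                todo.append(dst)
    return seen


def outgoing(edges, root):
    """Eliminate ε edges at `root`: the labeled edges leaving its ε-closure."""
    closure = eps_closure(edges, root)
    return sorted((l, d) for s, l, d in edges if s in closure and l is not EPS)


result = outgoing(bulk(EDGES), ("f", 0))
```

Each input edge is processed exactly once and no recursion follows the a/b cycle, so the cycle causes no trouble. The result at the root is the union of the subtrees found below c edges.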

30
Bulk Evaluation
Recall: (a.b)*.a
[Figure: bulk evaluation of (a.b)*.a on an example graph with edges labeled a, b, c, d.]
31
Structural Recursion
  • Can evaluate in two ways
    • Recursively: memoize the functions' results
    • Bulk: apply all functions on all edges, in parallel, connect, eliminate what is useless
  • Complexity: PTIME
    • More precisely: NLOGSPACE
  • Works on graphs with cycles too!

32
XSL
  • XSLT 1.0 (a recommendation)
    • http://www.w3.org/TR/xslt.html
  • XSLT 1.1 (a working draft)
    • http://www.w3.org/TR/xslt11/
  • In commercial products (e.g. IE5.0)

33
XSL
  • Purpose: stylesheet specification language
    • stylesheet: XML -> HTML
    • in general: XML -> XML
  • Uses XPath

34
XSL Program
  • XSL program = template-rule ... template-rule
  • template-rule = match pattern + template

Example: Retrieve all book titles

<xsl:template match="*|/">
  <xsl:apply-templates/>
</xsl:template>

<xsl:template match="/bib//title">
  <result>
    <xsl:value-of select="."/>
  </result>
</xsl:template>
35
Simple XSL Program
  • Copy the input

<xsl:template match="/">
  <xsl:apply-templates/>
</xsl:template>

<xsl:template match="text()">
  <xsl:value-of select="."/>
</xsl:template>

<xsl:template match="*">
  <xsl:element name="{name(.)}">
    <xsl:apply-templates/>
  </xsl:element>
</xsl:template>
36
Flow Control in XSL
<xsl:template match="*|/">
  <xsl:apply-templates/>
</xsl:template>

<xsl:template match="a">
  <A><xsl:apply-templates/></A>
</xsl:template>

<xsl:template match="b">
  <B><xsl:apply-templates/></B>
</xsl:template>

<xsl:template match="c">
  <C><xsl:value-of select="."/></C>
</xsl:template>
37
Input:
<a> <e> <b> <c> 1 </c>
            <c> 2 </c>
        </b>
        <a> <c> 3 </c>
        </a>
    </e>
    <c> 4 </c>
</a>

Output:
<A> <B> <C> 1 </C>
        <C> 2 </C>
    </B>
    <A> <C> 3 </C>
    </A>
    <C> 4 </C>
</A>
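The way the flow-control rules walk the input can be imitated in Python with the standard xml.etree.ElementTree module. This is a hand-written stand-in for the four template rules, not an XSLT processor: the names WRAP, apply_templates and transform, and the doc and out wrapper elements, are invented for the sketch, whitespace is left out of the input, and text outside c elements is ignored.

```python
import xml.etree.ElementTree as ET

# Rules that wrap their recursive result in a new element: input tag -> output tag.
WRAP = {"a": "A", "b": "B"}


def apply_templates(node, parent_out):
    """Process the children of `node`, appending results to `parent_out`."""
    for child in node:
        if child.tag == "c":
            # match="c": emit <C> with the string value of c, no recursion
            ET.SubElement(parent_out, "C").text = "".join(child.itertext())
        elif child.tag in WRAP:
            # match="a" / match="b": new element, then recurse into the children
            apply_templates(child, ET.SubElement(parent_out, WRAP[child.tag]))
        else:
            # catch-all rule: produce nothing for this element, keep recursing
            apply_templates(child, parent_out)


def transform(xml_text):
    doc = ET.Element("doc")            # stand-in for the document root "/"
    doc.append(ET.fromstring(xml_text))
    out = ET.Element("out")            # hypothetical holder for the result
    apply_templates(doc, out)
    return "".join(ET.tostring(e, encoding="unicode") for e in out)


SOURCE = "<a><e><b><c>1</c><c>2</c></b><a><c>3</c></a></e><c>4</c></a>"
result = transform(SOURCE)
```

The e element has no rule of its own, so it disappears from the output while its children are still processed, which is exactly why B and the inner A end up directly under the outer A.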

38
XSL is Structural Recursion
  • Equivalent to

f(T1 U T2) = f(T1) U f(T2)
f({L: T})  = if L = c then {C: T}           ← <xsl:template match="c">
             else if L = b then {B: f(T)}   ← <xsl:template match="b">
             else if L = a then {A: f(T)}   ← <xsl:template match="a">
             else f(T)                      ← <xsl:template match="*|/">
f({})      = {}
f(V)       = V

XSL query = single function
XSL query with modes = multiple functions (next)
39
Modes in XSLT
Compute the path (a.b)*

f(T1 U T2) = f(T1) U f(T2)
f({a: T})  = {result: T} U g(T)
f({})      = {}
f(V)       = V

g(T1 U T2) = g(T1) U g(T2)
g({b: T})  = f(T)
g({})      = {}
g(V)       = V

<xsl:template match="/">
  <xsl:apply-templates mode="f"/>
</xsl:template>

<xsl:template match="*" mode="f"/>

<xsl:template match="a" mode="f">
  <result> <xsl:copy-of select="."/> </result>
  <xsl:apply-templates mode="g"/>
</xsl:template>

<xsl:template match="*" mode="g"/>

<xsl:template match="b" mode="g">
  <xsl:apply-templates mode="f"/>
</xsl:template>

<xsl:copy-of ... > copies the input to the output.
Ignoring modes, this computes (a|b)*.
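The two mutually recursive functions can be run directly in Python. The sketch assumes trees are lists of (label, subtree) pairs and, as a simplification, lets atomic values and edges with other labels contribute nothing, which is what the empty catch-all templates for modes f and g do on the XSLT side. The sample data is made up.

```python
def f(t):
    """f({a: T}) = {result: T} U g(T); other edges contribute nothing."""
    out = []
    if isinstance(t, list):
        for label, sub in t:
            if label == "a":
                out += [("result", sub)] + g(sub)
    return out


def g(t):
    """g({b: T}) = f(T); other edges contribute nothing."""
    out = []
    if isinstance(t, list):
        for label, sub in t:
            if label == "b":
                out += f(sub)
    return out


inner = [("b", [("a", [("c", 1)])]), ("a", [("c", 2)])]
data = [("a", inner), ("b", [("a", [("c", 3)])])]
results = f(data)
```

Only subtrees reached by an a edge, or by a.b.a, a.b.a.b.a and so on, are returned. The subtree under a.a and the one under the top-level b are skipped because the wrong function is active when they are reached, which is the role modes play in the stylesheet.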
40
Modes in XSLT
  • Mode = a name for a group of template rules
  • No mode = empty mode
  • Same as having multiple recursive functions

41
Conflict Resolution for Template Rules
  • If several template rules match, choose the one with the highest priority.
  • Explicit priority: <xsl:template match="abc" priority="1.41">
  • Computing implicit priority: ad-hoc rules given by the W3C, based on match
    • match="P1 | P2 | ..." → transform to a set of template rules.
    • match="abc" → the priority is 0.
    • match="[... some namespace name ...]:*" → the priority is -0.25.
    • match="node()" → the priority is -0.5.
    • Otherwise, the priority is 0.5

It is an error if this leaves more than one
matching template rule.
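The implicit-priority rules can be approximated by a few regular expressions. The Python sketch below is a simplification and not a full pattern parser: the helper names are invented, names are matched with a rough ASCII-oriented pattern, processing-instruction patterns with a literal argument are not handled, and alternatives are split on every "|", which would break on a "|" inside a predicate.

```python
import re

NAME = r"[A-Za-z_][\w.-]*"
AXIS = r"(@|child::|attribute::)?"
NODE_TESTS = r"(\*|node\(\)|text\(\)|comment\(\)|processing-instruction\(\))"


def default_priority(alternative):
    """Implicit priority of one '|'-free match pattern (simplified rules)."""
    p = alternative.strip()
    if re.fullmatch(AXIS + NAME + "(:" + NAME + ")?", p):
        return 0.0        # a plain name such as "abc"
    if re.fullmatch(AXIS + NAME + r":\*", p):
        return -0.25      # a namespace wildcard such as "ns:*"
    if re.fullmatch(AXIS + NODE_TESTS, p):
        return -0.5       # a bare node test such as "node()" or "*"
    return 0.5            # anything more specific, e.g. a path or a predicate


def priorities(match):
    """Treat match="P1 | P2 | ..." as one rule per alternative."""
    return [default_priority(p) for p in match.split("|")]
```

For instance, priorities("text()|@*") yields one value per alternative, mirroring the rule that a pattern with alternatives is treated as a set of template rules.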
42
Built-in Template Rules
  • Keeps us going:
    <xsl:template match="*|/"> <xsl:apply-templates/> </xsl:template>
    There is one such rule for each mode.
  • Copies what we forgot:
    <xsl:template match="text()|@*"> <xsl:value-of select="."/> </xsl:template>
    There is only one rule, for the empty mode.
  • Lowest priorities among all rules; hence, can be easily overridden

43
XSL Template
  • <xsl:template match="expression" mode="name" priority="number" name="name">
      Body
    </xsl:template>
  • Defaults: mode = (empty mode); priority = (computed as explained earlier); name: when no match, no mode
  • Body
    • XML constructors: <myTag> ... </myTag>, <b> ... </b>, ...
    • XSL instructions
      • <xsl:apply-templates> (= recursive call)
      • <xsl:value-of> (= copy the value)
      • <xsl:copy> (= shallow copy)
      • <xsl:copy-of> (= deep copy)
      • <xsl:element> (= more flexible than XML constructors)
      • <xsl:attribute> (= add an attribute to the element)
      • <xsl:if> (= conditional)
      • <xsl:for-each>
      • Instructions for variables

44
XSL Apply Templates
  • <xsl:apply-templates select="expression" mode="name">
      Body
    </xsl:apply-templates>
  • Default
    • select: (children)
    • mode: (empty mode)
  • Body
    • Sort instructions
    • Parameter instructions

45
XSL Variables
  • Declaring a variable
    • <xsl:variable name="vname" select="value"> value </xsl:variable>
    • Value: either in select, or in body
    • Either in <xsl:template> ... </xsl:template> or at top level
  • Declaring a parameter
    • <xsl:param name="pname" select="value"> value </xsl:param>
    • In <xsl:template> ... </xsl:template>, at the beginning
  • Passing a parameter
    • <xsl:with-param name="pname" select="value"> value </xsl:with-param>
    • In <xsl:apply-templates> ... </xsl:apply-templates>
  • Using variables: $vname

46
XSL and Structural Recursion
  • XSL
    • mainly on trees
    • may loop
  • Structural Recursion
    • arbitrary graphs
    • always terminates

To see the loop, add the following rule:
<xsl:template match="e">
  <xsl:apply-templates select="/"/>
</xsl:template>
Result: stack overflow on IE 5.0