XML - PowerPoint PPT Presentation

About This Presentation
Title:

XML

Description:

Roots: SGML (a very nasty language). After the roots: a format for sharing data. Why XML is of Interest to Us. XML is just syntax for data ... – PowerPoint PPT presentation

Slides: 50
Provided by: alon69
Learn more at: https://ics.uci.edu
Tags: xml


Transcript and Presenter's Notes

Title: XML


1
XML
2
XML
  • eXtensible Markup Language
  • XML 1.0: a W3C Recommendation, 1998
  • Roots: SGML (a very nasty language).
  • After the roots: a format for sharing data

3
Why XML is of Interest to Us
  • XML is just syntax for data
  • Note: we have no syntax for relational data
  • But XML is not relational: it is semistructured
  • This is exciting because:
  • Can translate any data to XML
  • Can ship XML over the Web (HTTP)
  • Can input XML into any application
  • Thus: data sharing and exchange on the Web

4
XML Data Sharing and Exchange
[Figure: applications and object-relational, relational and legacy data sources exchange XML data over the Web (HTTP). The XML data is integrated, transformed and warehoused: specific data management tasks.]
5
From HTML to XML
HTML describes the presentation
6
HTML
  • Bibliography
  • Foundations of Databases. Abiteboul, Hull, Vianu. Addison Wesley, 1995
  • Data on the Web. Abiteboul, Buneman, Suciu. Morgan Kaufmann, 1999

7
XML
  • Foundations
  • Abiteboul
  • Hull
  • Vianu
  • Addison Wesley
  • 1995

XML describes the content
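To make "XML describes the content" concrete, here is a minimal Python sketch that marks up the book entry above and reads it back by field name. The element names (bibliography, book, title, author, publisher, year) are assumptions, since the transcript preserves only the values.

```python
import xml.etree.ElementTree as ET

# Element names are assumed; only the text values survive in the transcript.
BIB_XML = """
<bibliography>
  <book>
    <title>Foundations</title>
    <author>Abiteboul</author>
    <author>Hull</author>
    <author>Vianu</author>
    <publisher>Addison Wesley</publisher>
    <year>1995</year>
  </book>
</bibliography>
"""

root = ET.fromstring(BIB_XML)
book = root.find("book")

# Because the tags name the content, a program can ask for fields by name.
authors = [a.text for a in book.findall("author")]
year = int(book.findtext("year"))
print(authors, year)
```

With the HTML version, a program would have to guess which italic or line-broken fragment is a title or a year.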
8
Web Services
  • A new paradigm for creating distributed
    applications?
  • Systems communicate via messages, contracts.
  • Example: order processing system.
  • MS .NET, J2EE: some of the platforms
  • XML: a part of the story, the data format.

9
XML Terminology
  • tags: book, title, author, ...
  • start tag: <book>, end tag: </book>
  • elements: <book>...</book>, <author>...</author>
  • elements are nested
  • empty element: <author></author>, abbreviated <author/>
  • an XML document: single root element

well-formed XML document: if it has matching tags
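The well-formedness rule can be checked mechanically. The sketch below uses Python's standard XML parser, which rejects documents with mismatched tags or more than one root element; the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<book><title>T</title><author/></book>"))  # nested, matching tags
print(is_well_formed("<book><title>T</book></title>"))           # tags cross each other
print(is_well_formed("<book/><book/>"))                          # two root elements
```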
10
More XML: Attributes
  • Foundations of Databases
  • Abiteboul
  • 1995

attributes are alternative ways to represent data
11
More XML: Oids and References
  • Jane
  • Mary
  • idref="o123 o555" (an empty element whose idref attribute points to two oids)
  • John

oids and references in XML are just syntax
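Since oids and references are only syntax, the application has to resolve them itself. A minimal Python sketch, assuming a people wrapper element, a children element that carries the idref list, and the oid o456 for Mary (none of these names survive in the transcript):

```python
import xml.etree.ElementTree as ET

# Wrapper element, the children element and Mary's oid are assumptions.
PEOPLE_XML = """
<people>
  <person id="o555"><name>Jane</name></person>
  <person id="o456"><name>Mary</name><children idref="o123 o555"/></person>
  <person id="o123"><name>John</name></person>
</people>
"""

root = ET.fromstring(PEOPLE_XML)

# Build an index from oid to element; the parser does not do this for us.
by_id = {p.get("id"): p for p in root.iter("person")}

mary = by_id["o456"]
refs = mary.find("children").get("idref").split()
child_names = [by_id[ref].findtext("name") for ref in refs]
print(child_names)
```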
12
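XML Semantics: a Tree!
[Figure: the document drawn as a tree. The root element data has two person children. The first person has an id attribute (o555), a name (Mary) and an address with street (Maple), no (345) and city (Seattle). The second person has a name (John), an address (Thailand) and a phone (23456).]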
Order matters !!!
13
XML Data
  • XML is self-describing
  • Schema elements become part of the data
  • Relational schema: persons(name,phone)
  • In XML, <persons>, <name>, <phone> are part of the
    data, and are repeated many times
  • Consequence: XML is much more flexible
  • XML = semistructured data

14
Relational Data as XML
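Relation person:
  name   phone
  John   3634
  Sue    6343
  Dick   6363
XML: a person element with one row child per tuple; each row has a name child and a phone child:
  • row: name John, phone 3634
  • row: name Sue, phone 6343
  • row: name Dick, phone 6363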
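The translation from a relation to XML is mechanical. A minimal Python sketch using the person(name, phone) data from this slide; the function name relation_to_xml is hypothetical:

```python
import xml.etree.ElementTree as ET

rows = [("John", "3634"), ("Sue", "6343"), ("Dick", "6363")]


def relation_to_xml(table_name, columns, rows):
    """Encode a relation as <table><row><col>value</col>...</row>...</table>."""
    table = ET.Element(table_name)
    for values in rows:
        row = ET.SubElement(table, "row")
        for column, value in zip(columns, values):
            ET.SubElement(row, column).text = value
    return table


person = relation_to_xml("person", ["name", "phone"], rows)
xml_text = ET.tostring(person, encoding="unicode")
print(xml_text)
```

Note how the column names, which a relational system stores once in the schema, are repeated in every row of the XML.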

15
XML is Semi-structured Data
  • Missing attributes
  • Could represent in a table with nulls

John: phone 1234
Joe: no phone!
16
XML is Semi-structured Data
  • Repeated attributes
  • Impossible in tables

Mary: phones 2345 and 3456 (two phones!)
In a table, Mary's single phone column would hold: ???
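The two cases above (a missing phone and a repeated phone) can be seen side by side in a small Python sketch. The persons and person element names are assumptions; the values come from the slides.

```python
import xml.etree.ElementTree as ET

# Element names are assumed; values are the ones shown on the slides.
DOC = """
<persons>
  <person><name>John</name><phone>1234</phone></person>
  <person><name>Joe</name></person>
  <person><name>Mary</name><phone>2345</phone><phone>3456</phone></person>
</persons>
"""

root = ET.fromstring(DOC)

# XML view: every person simply has a list of zero or more phones.
phones = {
    p.findtext("name"): [ph.text for ph in p.findall("phone")]
    for p in root.findall("person")
}

# Flat table view with one phone column: missing becomes None (a null),
# and a second phone has nowhere to go.
flat = [(name, nums[0] if nums else None) for name, nums in phones.items()]
print(phones)
print(flat)
```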
17
XML is Semi-structured Data
  • Attributes with different types in different
    objects
  • Nested collections (no 1NF)
  • Heterogeneous collections
  • one element contains children of two different kinds

John Smith: a structured name (first and last name as separate parts), phone 1234
18
Document Type Definitions (DTD)
  • part of the original XML specification
  • an XML document may have a DTD
  • An XML document is:
  • well-formed: if tags are correctly closed
  • valid: if it has a DTD and conforms to it
  • validation is useful in data exchange

19
Very Simple DTD
  • root element content: ((person|product)*)
  • person content: (..., name, office, phone?)
  • text-only elements: (#PCDATA)
  • product content: (..., description?)

20
Very Simple DTD
Example of valid XML document (element values only):
  • person: 123456789, name John, office B432, phone 1234
  • person: 987654321, name Jim, office B123
  • ... ...
21
DTD: The Content Model
  • Content model:
  • Complex: a regular expression over other
    elements
  • Text-only: #PCDATA
  • Empty: EMPTY
  • Any: ANY
  • Mixed content: (#PCDATA | A | B | C)*

Declaration form: <!ELEMENT tag (content model)>
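Because a complex content model is a regular expression over child element names, validation of element structure can be sketched with an ordinary regex engine. The Python sketch below is not a DTD validator; the models are translated by hand, and the names company, ssn and pid are assumptions for the parts of the DTD that did not survive in the transcript.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated content models: each child name is followed by one space.
# company, ssn and pid are assumed names; undeclared elements count as text-only.
CONTENT_MODELS = {
    "company": r"(person |product )*",
    "person": r"ssn name office (phone )?",
    "product": r"pid name (description )?",
}


def children_match(element) -> bool:
    """Check one element's sequence of child tags against its content model."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        return len(element) == 0  # text-only: no child elements allowed
    child_names = "".join(child.tag + " " for child in element)
    return re.fullmatch(model, child_names) is not None


def is_valid(root) -> bool:
    return all(children_match(e) for e in root.iter())


good = ET.fromstring(
    "<company>"
    "<person><ssn>123456789</ssn><name>John</name><office>B432</office><phone>1234</phone></person>"
    "<person><ssn>987654321</ssn><name>Jim</name><office>B123</office></person>"
    "</company>"
)
bad = ET.fromstring(
    "<company><person><ssn>1</ssn><phone>1234</phone><name>John</name><office>B432</office></person></company>"
)
print(is_valid(good), is_valid(bad))
```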
22
DTD: Regular Expressions
[Table: DTD content model vs. matching XML]
  • sequence: (firstName, lastName)
  • optional: ?
  • Kleene star: *
  • alternation: |
23
Querying XML Data
  • XPath: simple navigation through the tree
  • XQuery: the SQL of XML
  • XSLT: recursive traversal

24
Sample Data for Queries
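  • book: publisher Addison-Wesley; authors Serge Abiteboul, Rick Hull (structured as first name Rick, last name Hull), Victor Vianu; title Foundations of Databases; year 1995
  • book: publisher Freeman; author Jeffrey D. Ullman; title Principles of Database and Knowledge Base Systems; year 1998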

25
Data Model for XPath
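[Figure: a tree with the root (document node) at the top and the root element (bib) below it. Under it are book elements, each with children such as publisher and author, ending in text leaves such as Addison-Wesley and Serge Abiteboul.]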
26
XPath: Simple Expressions
/bib/book/year
  • Result: 1995, 1998
/bib/paper/year
  • Result: empty (there were no papers)
27
XPath: Restricted Kleene Closure
//author
  • Result: Serge Abiteboul
  • Rick Hull (first name, last name)
  • Victor Vianu
  • Jeffrey D. Ullman
/bib//first-name
  • Result: Rick
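These path expressions can be tried out with Python's xml.etree.ElementTree, which supports a limited XPath subset. In this sketch paths are written relative to the bib root element, so /bib/book/year becomes book/year and //author becomes .//author. The markup of the sample data is a reconstruction: the tag names are taken from the queries on these slides, and placing price="55" on the second book is an assumption.

```python
import xml.etree.ElementTree as ET

# Reconstructed sample data; tag names follow the queries on the slides.
BIB = """
<bib>
  <book>
    <publisher>Addison-Wesley</publisher>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases</title>
    <year>1995</year>
  </book>
  <book price="55">
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>
"""

bib = ET.fromstring(BIB)

years = [y.text for y in bib.findall("book/year")]            # /bib/book/year
paper_years = bib.findall("paper/year")                        # /bib/paper/year
author_count = len(bib.findall(".//author"))                   # //author
first_names = [f.text for f in bib.findall(".//first-name")]   # /bib//first-name
print(years, paper_years, author_count, first_names)
```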
28
XPath: Text Nodes
/bib/book/author/text()
  • Result: Serge Abiteboul
  • Victor Vianu
  • Jeffrey D. Ullman
  • Rick Hull doesn't appear because he has
    firstname, lastname
  • Functions in XPath:
  • text() matches the text value
  • node() matches any node (* or @* or text())
  • name() returns the name of the current tag

29
XPath: Wildcard
//author/*
  • Result: Rick
  • Hull
  • * matches any element
30
XPath: Attribute Nodes
/bib/book/@price
  • Result: 55
  • @price means that price has to be an attribute

31
XPath: Predicates
/bib/book/author[firstname]
  • Result: the author element for Rick Hull
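Wildcards, attribute tests and predicates can be sketched the same way in Python. ElementTree supports [tag] and [@attr] predicates and the * wildcard, but it cannot select attribute or text() nodes, so the sketch reads those through get() and .text instead. The sample markup is an assumption, and it uses the hyphenated first-name and last-name tags where the predicate slide writes firstname.

```python
import xml.etree.ElementTree as ET

# Small assumed sample; price="55" on the second book is an assumption.
BIB = """
<bib>
  <book>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
  </book>
  <book price="55">
    <author>Jeffrey D. Ullman</author>
  </book>
</bib>
"""

bib = ET.fromstring(BIB)

# /bib/book/@price: select books that have the attribute, then read it.
prices = [b.get("price") for b in bib.findall("book[@price]")]

# /bib/book/author[first-name]: authors that have a first-name child.
structured = bib.findall("book/author[first-name]")
last_names = [a.findtext("last-name") for a in structured]

# //author/*: any element directly under an author.
child_tags = [c.tag for c in bib.findall(".//author/*")]

# /bib/book/author/text(): approximated by authors with no child elements.
plain_text = [a.text for a in bib.findall("book/author") if len(a) == 0]
print(prices, last_names, child_tags, plain_text)
```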
32
XPath: More Predicates
/bib/book/author[firstname][address[//zip][city]]/lastname
  • Result: the lastname elements of the matching authors
33
XPath: More Predicates
/bib/book[@price]
/bib/book[author/@age]
/bib/book[author/text()]
34
XPath: Summary
  • bib matches a bib element
  • * matches any element
  • / matches the root element
  • /bib matches a bib element under root
  • bib/paper matches a paper in bib
  • bib//paper matches a paper in bib, at any depth
  • //paper matches a paper at any depth
  • paper|book matches a paper or a book
  • @price matches a price attribute
  • bib/book/@price matches price attribute in book,
    in bib
  • bib/book/@price

35
Comments on XPath?
  • What's good about it?
  • What can't it do that you want it to do?
  • How does it compare, say, to SQL?

36
XQuery
  • Based on Quilt, which is based on XML-QL
  • Uses XPath to express more complex queries

37
FLWR (Flower) Expressions
  • FOR ...
  • LET...
  • WHERE...
  • RETURN...

38
XQuery
  • Find all book titles published after 1995

FOR $x IN document("bib.xml")/bib/book
WHERE $x/year > 1995
RETURN $x/title
Result: abc, def, ghi
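The FOR, WHERE and RETURN clauses map naturally onto a Python list comprehension. The sketch below is an analogy rather than XQuery; the inline document stands in for bib.xml, and its titles and years are made-up sample values.

```python
import xml.etree.ElementTree as ET

# Stand-in for bib.xml; titles and years are hypothetical sample values.
BIB = """
<bib>
  <book><title>abc</title><year>1994</year></book>
  <book><title>def</title><year>1998</year></book>
  <book><title>ghi</title><year>2001</year></book>
</bib>
"""

bib = ET.fromstring(BIB)

titles = [
    x.findtext("title")                 # RETURN x/title
    for x in bib.findall("book")        # FOR x IN /bib/book
    if int(x.findtext("year")) > 1995   # WHERE x/year > 1995
]
print(titles)
```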
39
XQuery
  • Find book titles by the coauthors of Database
    Theory

FOR $x IN bib/book[title/text() = "Database Theory"]/author,
    $y IN bib/book[author/text() = $x/text()]/title
RETURN $y/text()
Result: abc, def, ghi
The answer will contain duplicates!
40
XQuery
  • Same as before, but eliminate duplicates

FOR $x IN bib/book[title/text() = "Database Theory"]/author,
    $y IN distinct(bib/book[author/text() = $x/text()]/title)
RETURN $y/text()
Result: abc, def, ghi
distinct = a function that eliminates duplicates
41
XQuery Nesting
  • For each author of a book by Morgan Kaufmann,
    list all books she published

FOR $a IN distinct(document("bib.xml")/bib/book[publisher = "Morgan Kaufmann"]/author)
RETURN $a,
       FOR $t IN /bib/book[author = $a]/title
       RETURN $t

42
XQuery
Result:
  • Jones: abc, def
  • Smith: ghi
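The nested query can be sketched in Python as an outer loop over distinct authors with an inner lookup of each author's titles. The sample document is hypothetical and chosen so that it produces the Jones and Smith result; the second publisher name is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; "Other Press" is a made-up publisher.
BIB = """
<bib>
  <book><title>abc</title><author>Jones</author><publisher>Morgan Kaufmann</publisher></book>
  <book><title>def</title><author>Jones</author><publisher>Other Press</publisher></book>
  <book><title>ghi</title><author>Smith</author><publisher>Morgan Kaufmann</publisher></book>
</bib>
"""

bib = ET.fromstring(BIB)
books = bib.findall("book")

# Outer FOR: distinct authors of Morgan Kaufmann books (order preserved).
mk_authors = list(dict.fromkeys(
    a.text
    for b in books
    if b.findtext("publisher") == "Morgan Kaufmann"
    for a in b.findall("author")
))

# Inner FOR: all titles by that author, from any publisher.
result = {
    a: [b.findtext("title") for b in books
        if a in [x.text for x in b.findall("author")]]
    for a in mk_authors
}
print(result)
```

The inner loop ranges over all books, not only Morgan Kaufmann ones, which is why def shows up under Jones.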
43
XQuery
  • FOR $x IN expr -- binds $x to each value in the
    list expr
  • LET $x = expr -- binds $x to the entire list
    expr
  • Useful for common subexpressions and for
    aggregations

44
XQuery
FOR $p IN distinct(document("bib.xml")//publisher)
LET $b = document("bib.xml")//book[publisher = $p]
WHERE count($b) > 100
RETURN $p
count = an (aggregate) function that returns the
number of elements
45
XQuery
  • Find books whose price is larger than average

LET $a = avg(document("bib.xml")/bib/book/price)
FOR $b IN document("bib.xml")/bib/book
WHERE $b/price > $a
RETURN $b
Let's try to write this in SQL
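For comparison, the same query can be sketched in Python: the LET step computes one aggregate value over the whole collection, and the FOR step then iterates book by book. The sample prices are hypothetical, and price is assumed to be a child element as in the query's path.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; price is assumed to be a child element.
BIB = """
<bib>
  <book><title>abc</title><price>30</price></book>
  <book><title>def</title><price>50</price></book>
  <book><title>ghi</title><price>70</price></book>
</bib>
"""

bib = ET.fromstring(BIB)
books = bib.findall("book")

# LET a = avg(/bib/book/price): one value computed from the entire list.
prices = [float(b.findtext("price")) for b in books]
average = sum(prices) / len(prices)

# FOR b IN /bib/book WHERE b/price > a RETURN b
expensive = [b.findtext("title") for b in books
             if float(b.findtext("price")) > average]
print(average, expensive)
```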
46
XQuery
  • Summary
  • FOR-LET-WHERE-RETURN = FLWR

FOR/LET Clauses → List of tuples → WHERE Clause → List of tuples → RETURN Clause → Instance of XQuery data model
47
FOR vs. LET
  • FOR
  • Binds node variables → iteration
  • LET
  • Binds collection variables → one value

48
FOR vs. LET
FOR $x IN document("bib.xml")/bib/book
RETURN $x
Returns: ... ... ... (one result per book)
LET $x = document("bib.xml")/bib/book
RETURN $x
Returns: ... ... ... (a single result holding the whole collection of books)
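The difference between the two bindings can be sketched in Python as iterating over a list versus binding the list itself. The sample titles are hypothetical, and each inner list stands for one returned result.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for bib.xml.
BIB = """
<bib>
  <book><title>abc</title></book>
  <book><title>def</title></book>
  <book><title>ghi</title></book>
</bib>
"""

books = ET.fromstring(BIB).findall("book")

# FOR x IN /bib/book RETURN x: the RETURN runs once per book.
for_results = [[x.findtext("title")] for x in books]

# LET x = /bib/book RETURN x: the RETURN runs once, with x bound to all books.
x = books
let_results = [[b.findtext("title") for b in x]]

print(for_results)
print(let_results)
```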
49
Collections in XQuery
  • Ordered and unordered collections
  • /bib/book/author = an ordered collection
  • distinct(/bib/book/author) = an unordered
    collection
  • LET $a = /bib/book → $a is a collection
  • $b/author → a collection (several authors...)

RETURN $b/author
Returns: ... ... ... ...