Agenda from now on - PowerPoint PPT Presentation

Slides: 36
Provided by: Alon90

Transcript and Presenter's Notes

1
Agenda from now on
  • Done: SQL, views, transactions, conceptual
    modeling, E/R, relational algebra.
  • Starting: XML
  • To do: the database engine
      • Storage
      • Query execution
      • Query optimization

2
XML
3
XML
  • eXtensible Markup Language
  • XML 1.0: a W3C Recommendation, 1998
  • Roots: SGML (a very nasty language).
  • After the roots: a format for sharing data

4
Why XML is of Interest to Us
  • XML is just syntax for data
  • Note: we have no syntax for relational data
  • But XML is not relational: it is semistructured
  • This is exciting because:
      • Can translate any data to XML
      • Can ship XML over the Web (HTTP)
      • Can input XML into any application
      • Thus: data sharing and exchange on the Web

5
XML Data Sharing and Exchange
[Diagram: applications and data sources (relational data, object-relational, legacy data) share XML data over the Web (HTTP). Specific data management tasks: integrate, transform, warehouse.]
6
From HTML to XML
HTML describes the presentation
7
HTML
    <h1> Bibliography </h1>
    <p> <i> Foundations of Databases </i>
        Abiteboul, Hull, Vianu
        <br> Addison Wesley, 1995
    <p> <i> Data on the Web </i>
        Abiteboul, Buneman, Suciu
        <br> Morgan Kaufmann, 1999

8
XML
    <bibliography>
      <book> <title> Foundations </title>
             <author> Abiteboul </author>
             <author> Hull </author>
             <author> Vianu </author>
             <publisher> Addison Wesley </publisher>
             <year> 1995 </year>
      </book>
    </bibliography>

XML describes the content
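
To make "XML describes the content" concrete, here is a minimal sketch in Python using the standard-library xml.etree.ElementTree parser. The function name book_records and the dictionary layout are illustrative assumptions, not part of the slides; the point is only that a program can pull out authors and years by tag name, which the HTML version (presentation markup only) does not allow.

```python
import xml.etree.ElementTree as ET

# The bibliography from the slide, written as a well-formed XML string.
XML_DOC = """
<bibliography>
  <book>
    <title>Foundations</title>
    <author>Abiteboul</author>
    <author>Hull</author>
    <author>Vianu</author>
    <publisher>Addison Wesley</publisher>
    <year>1995</year>
  </book>
</bibliography>
"""


def book_records(xml_text):
    """Return one dict per <book>, keyed by what each tag means.

    book_records is a hypothetical helper name used only for this sketch.
    """
    root = ET.fromstring(xml_text)
    records = []
    for book in root.findall("book"):
        records.append({
            "title": book.findtext("title").strip(),
            "authors": [a.text.strip() for a in book.findall("author")],
            "publisher": book.findtext("publisher").strip(),
            "year": int(book.findtext("year")),
        })
    return records


if __name__ == "__main__":
    for record in book_records(XML_DOC):
        print(record)
```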
9
Web Services
  • A new paradigm for creating distributed
    applications?
  • Systems communicate via messages, contracts.
  • Example: order processing system.
  • MS .NET, J2EE: some of the platforms
  • XML: a part of the story, the data format.

10
XML Terminology
  • tags: book, title, author, ...
  • start tag: <book>, end tag: </book>
  • elements: <book>...</book>, <author>...</author>
  • elements are nested
  • empty element: <red></red>, abbreviated <red/>
  • an XML document: single root element

well-formed XML document: if it has matching tags
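
One way to test well-formedness in practice is simply to try parsing the text. The sketch below assumes Python's standard xml.etree.ElementTree parser, which raises ParseError on mismatched tags or on more than one root element; is_well_formed is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as a single-rooted XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    # Matching, properly nested tags.
    print(is_well_formed("<book><title>Foundations</title></book>"))
    # End tags in the wrong order.
    print(is_well_formed("<book><title>Foundations</book></title>"))
    # Two top-level elements, so no single root.
    print(is_well_formed("<book/><book/>"))
    # An empty element and its abbreviation are both accepted.
    print(is_well_formed("<red></red>"), is_well_formed("<red/>"))
```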
11
More XML: Attributes
    <book price = "55" currency = "USD">
      <title> Foundations of Databases </title>
      <author> Abiteboul </author>
      <year> 1995 </year>
    </book>

attributes are alternative ways to represent data
12
More XML: Oids and References
    <person id="o555"> <name> Jane </name> </person>
    <person id="o456"> <name> Mary </name>
                       <children idref="o123 o555"/>
    </person>
    <person id="o123" mother="o456"><name>John</name>
    </person>

oids and references in XML are just syntax
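
Because oids and references are just syntax, it is up to the application to follow them. A minimal sketch, assuming Python's xml.etree.ElementTree: build a dictionary from id values to elements, then look references up in it. The <data> wrapper element and the helper names build_oid_index and children_names are assumptions made for this example only.

```python
import xml.etree.ElementTree as ET

# The three persons from the slide. The <data> wrapper is an assumption,
# added only because a document needs a single root element.
XML_DOC = """
<data>
  <person id="o555"><name>Jane</name></person>
  <person id="o456"><name>Mary</name>
    <children idref="o123 o555"/>
  </person>
  <person id="o123" mother="o456"><name>John</name></person>
</data>
"""


def build_oid_index(root):
    """Map each id attribute value to the <person> element that carries it."""
    return {p.get("id"): p for p in root.iter("person") if p.get("id")}


def children_names(person, index):
    """Follow the space-separated idref list of <children> and return names."""
    names = []
    for ref_holder in person.findall("children"):
        for oid in ref_holder.get("idref", "").split():
            names.append(index[oid].findtext("name"))
    return names


if __name__ == "__main__":
    root = ET.fromstring(XML_DOC)
    index = build_oid_index(root)
    print(children_names(index["o456"], index))
    john = index["o123"]
    print(index[john.get("mother")].findtext("name"))
```

Nothing in the parser checks that o123 actually exists; a dangling reference would only show up here as a KeyError.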
13
XML Semantics: a Tree!
    <data>
      <person id="o555">
        <name> Mary </name>
        <address> <street> Maple </street>
                  <no> 345 </no>
                  <city> Seattle </city>
        </address>
      </person>
      <person>
        <name> John </name>
        <address> Thailand </address>
        <phone> 23456 </phone>
      </person>
    </data>
[Tree diagram: the root element node data has two person children. The first person has attribute id = o555 and children name (Mary) and address (street Maple, no 345, city Seattle); the second has children name (John), address (Thailand), and phone (23456). Inner nodes are element nodes; leaves are text nodes.]
Order matters!
14
XML Data
  • XML is self-describing
  • Schema elements become part of the data
  • Relational schema: persons(name, phone)
  • In XML, <persons>, <name>, <phone> are part of the
    data, and are repeated many times
  • Consequence: XML is much more flexible
  • XML = semistructured data

15
Relational Data as XML
Relational table person:

    name    phone
    John    3634
    Sue     6343
    Dick    6363

[Tree diagram: a person root with three row children, each with a name child and a phone child.]

XML:

    <person>
      <row> <name>John</name>
            <phone> 3634</phone></row>
      <row> <name>Sue</name>
            <phone> 6343</phone></row>
      <row> <name>Dick</name>
            <phone> 6363</phone></row>
    </person>
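
The table-to-XML encoding above is mechanical: one element for the table, one row element per tuple, one child per column. A minimal sketch in Python, assuming the standard xml.etree.ElementTree module; table_to_xml and PERSON_ROWS are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET


def table_to_xml(table_name, columns, rows):
    """Encode a relational table as <table><row><col>value</col>...</row>...</table>."""
    root = ET.Element(table_name)
    for row in rows:
        row_el = ET.SubElement(root, "row")
        for column, value in zip(columns, row):
            ET.SubElement(row_el, column).text = str(value)
    return root


# The person table from the slide.
PERSON_ROWS = [("John", 3634), ("Sue", 6343), ("Dick", 6363)]

if __name__ == "__main__":
    person = table_to_xml("person", ["name", "phone"], PERSON_ROWS)
    print(ET.tostring(person, encoding="unicode"))
```

Note that the column names are repeated in every row, which is the "schema becomes part of the data" point made earlier.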

16
XML is Semi-structured Data
  • Missing attributes
  • Could represent in a table with nulls

    <person> <name> John</name>
             <phone>1234</phone> </person>
    <person> <name>Joe</name> </person>       ← no phone!
17
XML is Semi-structured Data
  • Repeated attributes
  • Impossible in tables

    <person> <name> Mary</name>
             <phone>2345</phone>
             <phone>3456</phone> </person>    ← two phones!
???
18
XML is Semi-structured Data
  • Attributes with different types in different
    objects
  • Nested collections (no 1NF)
  • Heterogeneous collections
      • <db> contains both <book>s and <publisher>s

    <person> <name> <first> John </first>
                    <last> Smith </last>
             </name>                          ← structured name!
             <phone>1234</phone> </person>
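
The missing-phone and repeated-phone cases can be seen side by side by flattening such data into rows. The sketch below assumes Python's xml.etree.ElementTree; the <persons> wrapper and the helpers to_rows and to_flat_table are assumptions for illustration. A missing phone becomes None (a null), while a repeated phone does not fit in one cell and here forces one row per phone.

```python
import xml.etree.ElementTree as ET

# Persons from the preceding slides, wrapped in an assumed <persons> root.
XML_DOC = """
<persons>
  <person><name>John</name><phone>1234</phone></person>
  <person><name>Joe</name></person>
  <person><name>Mary</name><phone>2345</phone><phone>3456</phone></person>
</persons>
"""


def to_rows(xml_text):
    """Return (name, [phones]) per person; the phone list may be empty or long."""
    root = ET.fromstring(xml_text)
    return [
        (p.findtext("name"), [ph.text for ph in p.findall("phone")])
        for p in root.findall("person")
    ]


def to_flat_table(rows):
    """Force the data into a (name, phone) table: null if missing, one row per phone."""
    table = []
    for name, phones in rows:
        if not phones:
            table.append((name, None))
        for phone in phones:
            table.append((name, phone))
    return table


if __name__ == "__main__":
    for row in to_flat_table(to_rows(XML_DOC)):
        print(row)
```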
19
Document Type Definitions (DTD)
  • part of the original XML specification
  • an XML document may have a DTD
  • XML document:
      • well-formed: if tags are correctly closed
      • valid: if it has a DTD and conforms to it
  • validation is useful in data exchange

20
Very Simple DTD
    <!DOCTYPE company [
      <!ELEMENT company ((person|product)*)>
      <!ELEMENT person (ssn, name, office, phone?)>
      <!ELEMENT ssn (#PCDATA)>
      <!ELEMENT name (#PCDATA)>
      <!ELEMENT office (#PCDATA)>
      <!ELEMENT phone (#PCDATA)>
      <!ELEMENT product (pid, name, description?)>
      <!ELEMENT pid (#PCDATA)>
      <!ELEMENT description (#PCDATA)>
    ]>
21
Very Simple DTD
Example of valid XML document
    <company>
      <person> <ssn> 123456789 </ssn>
               <name> John </name>
               <office> B432 </office>
               <phone> 1234 </phone> </person>
      <person> <ssn> 987654321 </ssn>
               <name> Jim </name>
               <office> B123 </office> </person>
      <product> ... </product>
      ...
    </company>
22
DTD: The Content Model
    <!ELEMENT tag (CONTENT)>
CONTENT is the content model.
  • Content model:
      • Complex = a regular expression over other
        elements
      • Text-only = #PCDATA
      • Empty = EMPTY
      • Any = ANY
      • Mixed content = (#PCDATA | A | B | C)*
23
DTD Regular Expressions
sequence
    DTD: <!ELEMENT name (firstName, lastName)>
    XML: <name> <firstName> . . . . . </firstName>
                <lastName> . . . . . </lastName> </name>
optional
    DTD: <!ELEMENT name (firstName?, lastName)>
Kleene star
    DTD: <!ELEMENT person (name, phone*)>
    XML: <person> <name> . . . . . </name>
                  <phone> . . . . . </phone>
                  <phone> . . . . . </phone>
                  <phone> . . . . . </phone>
                  . . . . . .
         </person>
alternation
    DTD: <!ELEMENT person (name, (phone|email))>
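
Since a complex content model is a regular expression over child element names, checking one element against it can be sketched with an ordinary regex engine. The Python example below is an illustration only, not a DTD validator: the content models are translated to Python regexes by hand, and the encoding of children as a string of space-terminated tag names is an assumption of this sketch.

```python
import re
import xml.etree.ElementTree as ET

# The four content models above, hand-translated to Python regexes.
# Each child tag name is followed by one space (an assumed encoding).
CONTENT_MODELS = {
    "sequence":    r"(firstName )(lastName )",      # (firstName, lastName)
    "optional":    r"(firstName )?(lastName )",     # (firstName?, lastName)
    "kleene_star": r"(name )(phone )*",             # (name, phone*)
    "alternation": r"(name )((phone )|(email ))",   # (name, (phone|email))
}


def child_sequence(element):
    """Encode the element's children, in order, as 'tag1 tag2 ... '."""
    return "".join(child.tag + " " for child in element)


def conforms(element, model):
    """True if the whole child sequence matches the content-model regex."""
    return re.fullmatch(model, child_sequence(element)) is not None


if __name__ == "__main__":
    person = ET.fromstring("<person><name/><phone/><phone/></person>")
    print(conforms(person, CONTENT_MODELS["kleene_star"]))  # any number of phones
    print(conforms(person, CONTENT_MODELS["alternation"]))  # exactly one phone or email
```

Order matters in the check, which mirrors the earlier point that an XML tree is ordered.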
24
Querying XML Data
  • XPath: simple navigation through the tree
  • XQuery: the SQL of XML
  • XSLT: recursive traversal
      • will not discuss in class

25
Sample Data for Queries
    <bib>
      <book> <publisher> Addison-Wesley </publisher>
             <author> Serge Abiteboul </author>
             <author> <first-name> Rick </first-name>
                      <last-name> Hull </last-name>
             </author>
             <author> Victor Vianu </author>
             <title> Foundations of Databases </title>
             <year> 1995 </year>
      </book>
      <book price="55">
             <publisher> Freeman </publisher>
             <author> Jeffrey D. Ullman </author>
             <title> Principles of Database and Knowledge
                     Base Systems </title>
             <year> 1998 </year>
      </book>
    </bib>

26
Data Model for XPath
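[Tree diagram: the root has a single child, the root element (bib in the sample data), which has two book children. The first book's children include publisher (Addison-Wesley), author (Serge Abiteboul), . . . .]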
27
XPath: Simple Expressions
/bib/book/year
  • Result: <year> 1995 </year>
            <year> 1998 </year>

/bib/paper/year
  • Result: empty (there were no papers)
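
These two paths can be tried on the sample data with Python's xml.etree.ElementTree, which supports a subset of XPath. One caveat: ElementTree evaluates paths relative to the root element, so the hypothetical helper select below checks the first step against the root tag itself and passes the remaining steps to findall. It handles only plain child steps by tag name.

```python
import xml.etree.ElementTree as ET

# The sample data for queries, written compactly.
BIB = """
<bib>
  <book>
    <publisher>Addison-Wesley</publisher>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases</title>
    <year>1995</year>
  </book>
  <book price="55">
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>
"""


def select(root, absolute_path):
    """Evaluate a simple absolute path such as /bib/book/year.

    Sketch only: supports child steps by tag name, nothing else.
    """
    steps = absolute_path.strip("/").split("/")
    if steps[0] != root.tag:
        return []
    if len(steps) == 1:
        return [root]
    return root.findall("/".join(steps[1:]))


if __name__ == "__main__":
    root = ET.fromstring(BIB)
    print([y.text for y in select(root, "/bib/book/year")])
    print(select(root, "/bib/paper/year"))
```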
28
XPath: Restricted Kleene Closure
//author
  • Result: <author> Serge Abiteboul </author>
            <author> <first-name> Rick </first-name>
                     <last-name> Hull </last-name>
            </author>
            <author> Victor Vianu </author>
            <author> Jeffrey D. Ullman </author>

/bib//first-name
  • Result: <first-name> Rick </first-name>
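
A sketch of the same two queries in Python, assuming xml.etree.ElementTree: Element.iter(tag) walks the whole tree in document order, which plays the role of //tag, and the ".//" prefix in findall searches all descendants of the element it is called on (here the bib root element). The helper name descendants is hypothetical.

```python
import xml.etree.ElementTree as ET

# The sample data for queries, written compactly.
BIB = """
<bib>
  <book>
    <publisher>Addison-Wesley</publisher>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases</title>
    <year>1995</year>
  </book>
  <book price="55">
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>
"""


def descendants(root, tag):
    """//tag: every element with this tag, at any depth, in document order."""
    return list(root.iter(tag))


if __name__ == "__main__":
    root = ET.fromstring(BIB)
    # //author
    for author in descendants(root, "author"):
        print(ET.tostring(author, encoding="unicode").strip())
    # /bib//first-name, evaluated from the bib root element
    print([e.text for e in root.findall(".//first-name")])
```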
29
XPath: Text Nodes
/bib/book/author/text()
  • Result: Serge Abiteboul
            Victor Vianu
            Jeffrey D. Ullman
  • Rick Hull doesn't appear because his author element has
    first-name and last-name children instead of text
  • Functions in XPath:
      • text() = matches the text value
      • node() = matches any node (= * or @* or text())
      • name() = returns the name of the current tag
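
ElementTree has no text() step, so the sketch below emulates /bib/book/author/text() by hand. It relies on how ElementTree stores text: element.text holds the text before the first child, and each child's tail holds the text that follows that child. Dropping whitespace-only pieces is a simplification of this sketch, and text_nodes is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The sample data for queries, written compactly.
BIB = """
<bib>
  <book>
    <publisher>Addison-Wesley</publisher>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases</title>
    <year>1995</year>
  </book>
  <book price="55">
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>
"""


def text_nodes(element):
    """Text-node children of element, in order, ignoring whitespace-only text."""
    pieces = [element.text] + [child.tail for child in element]
    return [p.strip() for p in pieces if p and p.strip()]


def author_texts(root):
    """Emulate /bib/book/author/text() starting from the bib root element."""
    result = []
    for author in root.findall("book/author"):
        result.extend(text_nodes(author))
    return result


if __name__ == "__main__":
    root = ET.fromstring(BIB)
    print(author_texts(root))
    # The analogue of name() for an element is its tag.
    print([child.tag for child in root.find("book")])
```

The author element for Rick Hull contributes nothing because its only children are elements, not text.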

30
XPath: Wildcard
//author/*
  • Result: <first-name> Rick </first-name>
            <last-name> Hull </last-name>
  • * matches any element
31
XPath: Attribute Nodes
/bib/book/@price
  • Result: "55"
  • @price means that price has to be an attribute

32
XPath: Predicates
/bib/book/author[first-name]
  • Result: <author> <first-name> Rick </first-name>
                     <last-name> Hull </last-name>
            </author>

33
XPath: More Predicates
/bib/book/author[firstname][address[//zip][city]]/lastname
  • Result: <lastname> ... </lastname>
            <lastname> ... </lastname>
34
XPath: More Predicates
/bib/book[@price < "60"]
/bib/book[author/@age < "25"]
/bib/book[author/text()]
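
Predicates can be tried on the sample data with Python's xml.etree.ElementTree, with some hedging: its XPath subset accepts existence predicates such as [first-name] and [@price], but not the < comparison or text(), so the sketch below performs those two tests in plain Python. The helper names are hypothetical, and the sample data has no age attribute, so the second query is not shown.

```python
import xml.etree.ElementTree as ET

# The sample data for queries, written compactly.
BIB = """
<bib>
  <book>
    <publisher>Addison-Wesley</publisher>
    <author>Serge Abiteboul</author>
    <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
    <author>Victor Vianu</author>
    <title>Foundations of Databases</title>
    <year>1995</year>
  </book>
  <book price="55">
    <publisher>Freeman</publisher>
    <author>Jeffrey D. Ullman</author>
    <title>Principles of Database and Knowledge Base Systems</title>
    <year>1998</year>
  </book>
</bib>
"""


def authors_with_first_name(root):
    """/bib/book/author[first-name]: authors that have a first-name child."""
    return root.findall("book/author[first-name]")


def books_cheaper_than(root, limit):
    """/bib/book[@price < limit]: the comparison is done in Python."""
    return [b for b in root.findall("book[@price]")
            if float(b.get("price")) < limit]


def books_with_author_text(root):
    """/bib/book[author/text()]: books with an author that directly holds text."""
    return [b for b in root.findall("book")
            if any(a.text and a.text.strip() for a in b.findall("author"))]


if __name__ == "__main__":
    root = ET.fromstring(BIB)
    print([a.findtext("last-name") for a in authors_with_first_name(root)])
    print([b.findtext("title") for b in books_cheaper_than(root, 60)])
    print(len(books_with_author_text(root)))
    # /bib/book/@price: attribute values of the books that have one.
    print([b.get("price") for b in root.findall("book[@price]")])
```

A book without a price attribute is skipped by the price predicate rather than treated as an error, which matches how XPath handles a missing attribute in a comparison.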
35
XPath: Summary
  • bib matches a bib element
  • * matches any element
  • / matches the root
  • /bib matches a bib element under root
  • bib/paper matches a paper in bib
  • bib//paper matches a paper in bib, at any depth
  • //paper matches a paper at any depth
  • paper|book matches a paper or a book
  • @price matches a price attribute
  • bib/book/@price matches price attribute in book,
    in bib
  • bib/book[@price<"55"]/author/lastname matches ...