Title: Ling Wang
1XML XML Query
Ling Wang Luping Ding
2Introduction
- The Web opens new challenges in
- - information technology
- - database frameworks.
- Why?
- - Data sources on the Web typically do NOT
conform to any well-known structure.
- - Traditional database technology is not
adequate for dealing with rich data, e.g. audio, video, nested data structures
3Features of Web Data
- Web data has characteristics that are called semistructured
- Object-like
- a collection of complex objects, as in CODM.
- Schema-less
- does not typically conform to any traditional type
structure.
- Self-describing
- the meaning of the data is carried along with the
data itself.
- So, we need new database technologies to support
these Web-based applications.
4What is XML?
- XML ---- Extensible Markup Language
- - A markup language for documents containing
structured information.
- - A universal format for structured documents
and data on the Web.
- - An HTML-like language.
- The XML specification defines a standard way to add
markup to documents.
- Note: structured information, markup language
5What is XML ---- example
An XML example for customer information:
<customer-details id="AcPharm39156">
  <name>Acme Pharmaceuticals Co.</name>
  <address country="US">
    <street>7301 Smokey Boulevard</street>
    <city>Smallville</city>
    <state>Indiana</state>
    <postal>94571</postal>
  </address>
</customer-details>
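The customer record above can be read back with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name summarize_customer and the choice of fields to extract are illustrative assumptions, not part of the slides.

```python
import xml.etree.ElementTree as ET

CUSTOMER_XML = """
<customer-details id="AcPharm39156">
  <name>Acme Pharmaceuticals Co.</name>
  <address country="US">
    <street>7301 Smokey Boulevard</street>
    <city>Smallville</city>
    <state>Indiana</state>
    <postal>94571</postal>
  </address>
</customer-details>
"""

def summarize_customer(xml_text):
    """Return a few attribute and element values as a plain dict."""
    # fromstring raises ParseError if the text is not well formed.
    root = ET.fromstring(xml_text)
    address = root.find("address")
    return {
        "id": root.get("id"),                  # attribute of the root element
        "name": root.findtext("name"),         # text content of a child element
        "country": address.get("country"),     # attribute of a nested element
        "city": address.findtext("city"),
    }

summary = summarize_customer(CUSTOMER_XML)
print(summary)
```

Attributes (id, country) are read with get(), while element content (name, city) is read with findtext(), which mirrors the element/attribute distinction XML makes.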
6XML vs. HTML?
XML | HTML
Extensible: does NOT specify semantics or a tag set; it is just a facility for defining them. | Not extensible: fixed tag semantics and tag set, defined by the W3C (World Wide Web Consortium).
An XML document must be well formed: one root element; every opening tag is followed by a matching closing tag; elements are properly nested. | Well-formedness is not strictly required: tags need not be closed, browsers forgive errors, etc.
7Overview of XML
- Mechanisms for specifying document structure
- ---- a set of rules for structuring an XML
document.
- DTD ---- Document Type Definition language
- (a part of the XML standard)
- XML Schema ---- a more recent specification
- Query languages for XML
- XPath, XSLT, XQuery
8Basic concept in XML ---- element attributes
- XML element
- Any properly nested piece of text of the form
- <sometag>...</sometag>.
- e.g. <street>7301 Smokey Boulevard</street>
- XML attributes
- also a tool for data representation.
- e.g. <customer-details id="AcPharm39156">...</customer-details>
[Callout labels on the slide: content, name]
9Basic concept in XML ---- namespace
- Namespaces
- - Why?
- Element names in XML are not fixed, so names
can conflict.
- - How?
- Different authors use different namespace
identifiers for different domains.
- The general structure: namespace:local-name
- Namespace ---- a URI (uniform resource
identifier), i.e. a URL (uniform resource locator) or
a URN (uniform resource name).
- Local name ---- same form as regular XML tags.
- No ":" in it.
10Basic concept in XML ---- namespace
- An example of namespaces
<item xmlns="http://www.acmeinc.com/jpsupplies"
      xmlns:toy="http://www.acmeinc.com/jptoys">
  <name>African Coffee Table</name>
  <feature>
    <toy:item>
      <toy:name>cyberpet</toy:name>
    </toy:item>
  </feature>
</item>
[Callout on the slide: the first xmlns declaration is the default namespace]
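To see how a namespace-aware processor resolves these names, here is a small sketch with Python's xml.etree.ElementTree. The prefix "sup" in the lookup table is an arbitrary label chosen for this sketch; only the namespace URIs matter, not the prefixes used in the document.

```python
import xml.etree.ElementTree as ET

ITEM_XML = """
<item xmlns="http://www.acmeinc.com/jpsupplies"
      xmlns:toy="http://www.acmeinc.com/jptoys">
  <name>African Coffee Table</name>
  <feature>
    <toy:item>
      <toy:name>cyberpet</toy:name>
    </toy:item>
  </feature>
</item>
"""

# Prefix-to-URI table used only for lookups in this script.
# "sup" is a made-up prefix standing for the document's default namespace.
NS = {
    "sup": "http://www.acmeinc.com/jpsupplies",
    "toy": "http://www.acmeinc.com/jptoys",
}

root = ET.fromstring(ITEM_XML)

# Unprefixed elements belong to the default namespace.
outer_name = root.findtext("sup:name", namespaces=NS)

# The two "name" elements do not clash because they live in different namespaces.
toy_name = root.findtext("sup:feature/toy:item/toy:name", namespaces=NS)

print(root.tag)      # the parser expands the tag to {namespace-URI}local-name
print(outer_name, "/", toy_name)
```

The expanded tag printed for the root shows the namespace:local-name structure directly: the URI identifies the namespace and "item" is the local name.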
11DTD ---- Document Type Definitions
- Why DTD?
- - An XML file can carry a description of its own
format with it.
- - Independent groups of people can agree on a common DTD for
interchanging data.
- - An application can verify data received from the
outside world.
- - It can also verify its own data.
- How?
- - The DTD is included in your XML source file:
- <!DOCTYPE root-element [element-declarations]>
- - The DTD is external to your XML source file:
- <!DOCTYPE root-element SYSTEM "filename">
12DTD ---- example
- Example: XML document with a DTD
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ELEMENT note (to,from,heading,body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT from (#PCDATA)>
  <!ELEMENT heading (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
]>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend</body>
</note>
13DTD ---- example
XML document with an external DTD:
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "note.dtd">
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
"note.dtd" containing the DTD:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
14DTD ---- Inadequacy
- Inadequacy of DTD
- - Not designed with namespaces in mind.
- - Uses a syntax quite different from that of XML
documents.
- - A very limited set of basic types.
- - Provides only limited means for expressing
data consistency constraints.
- No keys.
- Referential integrity is weak:
- attributes can be of type ID, IDREF, IDREFS;
- there is no equivalent for elements.
15DTD ---- Inadequacy
- Inadequacy of DTD
- - No way of enforcing referential integrity
for elements.
- - Alternatives must be used to state that the order
of elements is immaterial. This becomes unwieldy as the
number of elements grows.
- - Element definitions are global to the
entire document.
16XML Schema
- XML Schemas
- An attempt to solve all those problems in DTD
- - Powerful data typing
- - Range checking
- - Namespace-aware validation based on
namespace URIs rather than on prefixes
- - Extensibility and scalability
17XML Schema ---- example
- Here is a simple example of an XML Schema
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="SONG" type="SongType"/>
  <xsd:complexType name="SongType">
    <xsd:sequence>
      <xsd:element name="TITLE" type="xsd:string"/>
      <xsd:element name="COMPOSER" type="xsd:string"/>
      <xsd:element name="PRODUCER" type="xsd:string"/>
      <xsd:element name="PUBLISHER" type="xsd:string"/>
      <xsd:element name="LENGTH" type="xsd:string"/>
      <xsd:element name="YEAR" type="xsd:string"/>
      <xsd:element name="ARTIST" type="xsd:string"/>
      <xsd:element name="PRICE" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
18XML Schema ---- example
- The root element ---- schema.
- Standard namespace ---- http://www.w3.org/2001/XMLSchema,
conventionally bound to the prefix xsd or xs.
- Elements ---- xsd:element.
- divided into simple types and complex types.
- A simple type element is one that can only
contain text and does not have any attributes. It
cannot contain any child elements.
- Syntax: <xs:element name="name" type="type"/>
- Example: <xs:element name="to" type="xs:string"/>
19XML Schema ---- example
A complex type defines a new type which can have
attributes and can have child elements. This is
very flexible.
Syntax:
<xs:element name="name">
  <xs:complexType>
    ... element content ...
  </xs:complexType>
</xs:element>
Example:
<xs:element name="note">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="to" type="xs:string"/>
      <xs:element name="from" type="xs:string"/>
      <xs:element name="heading" type="xs:string"/>
      <xs:element name="body" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>
20XML Schema ---- features
- Simple Types
- - 44 built-in simple types in the W3C XML Schema
language.
- - Divided into seven groups
- Numeric types
- Time types
- XML types
- String types
- The boolean type
- The URI reference type
- The binary types
21XML Schema ---- features
- Deriving Simple Types
- Not limited to the 44 simple types
- Create new data types by deriving from the
existing types
- restrict a type to a subset of its normal values.
- e.g. a schema that derives a Str255 data type
from xsd:string
<xsd:simpleType name="Str255">
  <xsd:restriction base="xsd:string">
    <xsd:minLength value="1"/>
    <xsd:maxLength value="255"/>
  </xsd:restriction>
</xsd:simpleType>
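The effect of the two facets in Str255 can be mimicked in a few lines. This is only an illustration of what the restriction accepts and rejects, written as a plain Python predicate (the name is_str255 is hypothetical); a real schema validator would perform the check from the schema itself.

```python
def is_str255(value: str) -> bool:
    """Mimic the minLength=1 and maxLength=255 facets of the Str255 type.

    Length is measured in characters, which is what Python's len() returns for str.
    """
    return 1 <= len(value) <= 255

for candidate in ["Hello", "", "x" * 255, "x" * 256]:
    print(len(candidate), is_str255(candidate))
```

The empty string fails the minLength facet and the 256-character string fails maxLength; everything in between is a legal Str255 value.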
22XML Schema ---- features
- create enumerated types
- Example
<xsd:simpleType name="PublisherType">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="Warner-Elektra-Atlantic"/>
    <xsd:enumeration value="Universal Music Group"/>
    <xsd:enumeration value="Sony Music Entertainment, Inc."/>
    <xsd:enumeration value="Capitol Records, Inc."/>
    <xsd:enumeration value="BMG Music"/>
  </xsd:restriction>
</xsd:simpleType>
23XML Schema ---- features
- create new types by joining existing types through
a union.
- Example
<xsd:simpleType name="MoneyOrDecimal">
  <xsd:union>
    <xsd:simpleType>
      <xsd:restriction base="xsd:decimal">
      </xsd:restriction>
    </xsd:simpleType>
    <xsd:simpleType>
      <xsd:restriction base="xsd:string">
        <xsd:pattern value="\p{Sc}\p{Nd}+(\.\p{Nd}\p{Nd})?"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:union>
</xsd:simpleType>
24XML Schema ---- features
- Namespaces
- http://www.w3.org/2001/XMLSchema
- the namespace that identifies the names of
tags and attributes used in a schema.
- The names are understood by all schema-aware XML
processors.
- http://www.w3.org/2001/XMLSchema-instance
- a small number of special names used in
instance documents, not in schemas.
- - target namespace
- the set of names defined by a particular
schema document
- the user-defined names that are to be used in
the instance documents.
25XML Schema ---- features
- Grouping
- - Does order really matter?
- - How?
- xsd:all group ---- each element in the group
must occur at most once, but the order is not
important.
- xsd:choice group ---- exactly one element from the
group should appear.
- xsd:sequence group ---- each element in the group
appears exactly once (by default), in the specified order.
26XML Schema ---- features
Example for xsd:all group:
<xsd:complexType name="PersonType">
  <xsd:sequence>
    <xsd:element name="NAME">
      <xsd:complexType>
        <xsd:all>
          <xsd:element name="GIVEN" type="xsd:string"
                       minOccurs="1" maxOccurs="1"/>
          <xsd:element name="FAMILY" type="xsd:string"
                       minOccurs="1" maxOccurs="1"/>
        </xsd:all>
      </xsd:complexType>
    </xsd:element>
  </xsd:sequence>
</xsd:complexType>
27XML Schema ---- features
Example for xsd:choice group:
<xsd:complexType name="SongType">
  <xsd:sequence>
    <xsd:element name="TITLE" type="xsd:string"/>
    <xsd:choice>
      <xsd:element name="COMPOSER" type="PersonType"/>
      <xsd:element name="PRODUCER" type="PersonType"/>
    </xsd:choice>
    <xsd:element name="PUBLISHER" type="xsd:string" minOccurs="0"/>
    <xsd:element name="LENGTH" type="xsd:string"/>
    <xsd:element name="YEAR" type="xsd:string"/>
    <xsd:element name="ARTIST" type="xsd:string" maxOccurs="unbounded"/>
    <xsd:element name="PRICE" type="xsd:string" minOccurs="0"/>
  </xsd:sequence>
</xsd:complexType>
28XML Schema ---- features
- Schemas address limitations of DTDs
- a strange, non-XML syntax
- namespace incompatibility
- lack of data typing
- limited extensibility and scalability.
- XML Schemas
- - Powerful data typing
- - Range checking
- - Namespace-aware validation based on
namespace URIs rather than on prefixes
- - Extensibility and scalability
29XML Constraints ---- DTD
- DTD
- No keys; its referential integrity is weak.
- Attributes: ID, IDREF, IDREFS.
- ID ---- a unique value
- IDREF ---- a valid ID declared in the same document
- IDREFS ---- a space-separated list of valid IDs
- But these are all based on type string.
- Elements: no corresponding mechanism.
30XML Constraints ---- Schema
- XML keys
- Similar to SQL keys, but more complicated:
- - complex structures
- - a key might be composed of a sequence of
values
- - located at different depths inside an
element.
- Two ways:
- - tag unique ---- UNIQUE constraint
- - tag key ---- PRIMARY KEY, not null
- e.g.
<key name="PrimaryKeyForClass">
  <selector xpath="Classes/Class"/>
  <field xpath="CrsCode"/>
  <field xpath="Semester"/>
</key>
31XML Constraints ---- Schema
- Foreign keys
- e.g.
<complexType>
  ...
  <keyref name="NoBogusTranscripts" refer="adm:PrimaryKeyForClass">
    <selector xpath="Students/Student/CrsTaken"/>
    <field xpath="@CrsCode"/>
    <field xpath="@Semester"/>
  </keyref>
  ...
</complexType>
- Powerful?
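What a validator checks for a key and a keyref can be sketched by hand. The instance document below is hypothetical (the slides do not show one); its element and attribute names simply follow the selector and field paths used in PrimaryKeyForClass and NoBogusTranscripts, and the wrapper element Report and the course values are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance data shaped to match the selector/field XPaths.
REPORT_XML = """
<Report>
  <Classes>
    <Class><CrsCode>CS308</CrsCode><Semester>F1997</Semester></Class>
    <Class><CrsCode>MAT123</CrsCode><Semester>F1997</Semester></Class>
  </Classes>
  <Students>
    <Student>
      <CrsTaken CrsCode="CS308" Semester="F1997"/>
      <CrsTaken CrsCode="EE101" Semester="F1997"/>
    </Student>
  </Students>
</Report>
"""

def check_constraints(xml_text):
    """Return (duplicate key values, keyref values with no matching key)."""
    root = ET.fromstring(xml_text)

    # key: selector Classes/Class, fields CrsCode and Semester (child elements).
    keys = [(c.findtext("CrsCode"), c.findtext("Semester"))
            for c in root.findall("Classes/Class")]
    duplicates = {k for k in keys if keys.count(k) > 1}

    # keyref: selector Students/Student/CrsTaken, fields @CrsCode and @Semester.
    key_set = set(keys)
    dangling = []
    for taken in root.findall("Students/Student/CrsTaken"):
        ref = (taken.get("CrsCode"), taken.get("Semester"))
        if ref not in key_set:
            dangling.append(ref)
    return duplicates, dangling

duplicates, dangling = check_constraints(REPORT_XML)
print("duplicate keys:", duplicates)
print("dangling references:", dangling)
```

The sketch covers uniqueness and referential integrity only; it does not check the "not null" part of a key, which would additionally require every field to be present.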
32Question
- Is the XML data model relational or
object-relational?
- Is XML a database?
33References
- [1] Chapter 17, XML and Web Data
- [2] Chapter 24, Schemas, XML Bible (2nd edition),
http://www.ibiblio.org/xml/books/bible2/index.html#toc
- [3] http://www.w3schools.com
- http://www.w3.org/
- http://www.xml.com/
34Part II
- XML Query Language
- Counterpart of SQL in XML World
35XML Query Language
- Desired Characteristics for XML Query Language
(also Requirements)
- Good candidate: XQuery Language
- Use Cases for XQuery Language
36Desired Characteristics
- XML Output
- Declarative - what has to be done?
- Query Operation
- No Schema Required
- Preserve Order and Association
- Mutually Embedding with XML
- Support for New Datatypes
- Suitable for Metadata
- Ability to add update capabilities in future
versions
37Details
- XML Output
- define derived databases (virtual views)
- provide transparency to applications (why?)
- The XML Query Language MUST be declarative, like
SQL
- it specifies what has to be done
- it MUST NOT enforce a particular evaluation
strategy
38Details (cont.)
- Query Operation
- Projection, selection, join, and restructuring
should all be possible in a single XML Query
(why?)
- for optimization reasons
39Query Operations
XML Query | Details | Relational Algebra
Projection | Extract particular sub-elements or attributes of an element | Projection
Selection | Select values that satisfy some predicate | Selection
Join | Join values from one or more documents | Join
Restructuring | Construct a new set of element instances to hold queried data | Create view
40Example - Sample Data
<bib>
  <book year="1999" isbn="1-55860-622-X">
    <title>Data on the Web</title>
    <author>Abiteboul</author>
    <author>Buneman</author>
    <author>Suciu</author>
  </book>
  <book year="2001" isbn="1-XXXXX-YYY-Z">
    <title>XML Query</title>
    <author>Fernandez</author>
    <author>Suciu</author>
  </book>
</bib>
41Example - XML Schema
<xs:group name="Bib">
  <xs:element name="bib">
    <xs:complexType>
      <xs:group ref="Book"
                minOccurs="0" maxOccurs="unbounded"/>
    </xs:complexType>
  </xs:element>
</xs:group>
42Example - XML Schema (Cont.)
<xs:group name="Book">
  <xs:element name="book">
    <xs:complexType>
      <xs:attribute name="year" type="xs:integer"/>
      <xs:attribute name="isbn" type="xs:string"/>
      <xs:element name="title" type="xs:string"/>
      <xs:element name="author" type="xs:string" maxOccurs="unbounded"/>
    </xs:complexType>
  </xs:element>
</xs:group>
43Variable Binding
LET $bib0 :=
<bib>
  <book year="1999" isbn="1-55860-622-X">
    <title>Data on the Web</title>
    <author>Abiteboul</author>
    <author>Buneman</author>
    <author>Suciu</author>
  </book>
  <book year="2001" isbn="1-XXXXX-YYY-Z">
    <title>XML Query</title>
    <author>Fernandez</author>
    <author>Suciu</author>
  </book>
</bib>
44Projection
$bib0/book/author
==> <author>Abiteboul</author>,
    <author>Buneman</author>,
    <author>Suciu</author>,
    <author>Fernandez</author>,
    <author>Suciu</author>
- Note: the document order of author elements is
preserved
45Selection
FOR $b IN $bib0/book
WHERE $b/@year/data() < 2000
RETURN $b
==> <book year="1999" isbn="1-55860-622-X">
      <title>Data on the Web</title>
      <author>Abiteboul</author>
      <author>Buneman</author>
      <author>Suciu</author>
    </book>
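The projection and selection queries can be imitated outside an XQuery engine. The sketch below uses Python's xml.etree.ElementTree on the same bibliography data; it is an analogy to the two queries, not an XQuery implementation, and it takes the selection predicate to be a strict "less than 2000" comparison on the year attribute.

```python
import xml.etree.ElementTree as ET

BIB_XML = """
<bib>
  <book year="1999" isbn="1-55860-622-X">
    <title>Data on the Web</title>
    <author>Abiteboul</author>
    <author>Buneman</author>
    <author>Suciu</author>
  </book>
  <book year="2001" isbn="1-XXXXX-YYY-Z">
    <title>XML Query</title>
    <author>Fernandez</author>
    <author>Suciu</author>
  </book>
</bib>
"""

bib0 = ET.fromstring(BIB_XML)

# Projection: the path book/author, evaluated relative to the bib element.
# findall returns matches in document order, so duplicates such as Suciu stay in place.
authors = [a.text for a in bib0.findall("book/author")]

# Selection: keep only the books whose year attribute is below 2000.
early_books = [b for b in bib0.findall("book") if int(b.get("year")) < 2000]
early_titles = [b.findtext("title") for b in early_books]

print(authors)
print(early_titles)
```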
46Join - Sample Data
LET $review0 :=
<reviews>
  <book>
    <title>XML Query</title>
    <review>A darn fine book.</review>
  </book>
  <book>
    <title>Data on the Web</title>
    <review>This is great!</review>
  </book>
</reviews> : Reviews
47Join
FOR $b IN $bib0/book, $r IN $review0/book
WHERE $b/title/data() = $r/title/data()
RETURN <book>{ $b/title, $b/author, $r/review }</book>
==> <book>
      <title>Data on the Web</title>
      <author>Abiteboul</author>
      <author>Buneman</author>
      <author>Suciu</author>
      <review>This is great!</review>
    </book>,
    <book>
      <title>XML Query</title>
      <author>Fernandez</author>
      <author>Suciu</author>
      <review>A darn fine book.</review>
    </book>
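The join can be traced by hand with a nested loop over the two documents. The following Python sketch (the function name join_books is an assumption made for illustration) pairs each bibliography entry with the review whose title matches, using the review texts as they appear in the review sample data.

```python
import copy
import xml.etree.ElementTree as ET

BIB_XML = """
<bib>
  <book year="1999" isbn="1-55860-622-X">
    <title>Data on the Web</title>
    <author>Abiteboul</author><author>Buneman</author><author>Suciu</author>
  </book>
  <book year="2001" isbn="1-XXXXX-YYY-Z">
    <title>XML Query</title>
    <author>Fernandez</author><author>Suciu</author>
  </book>
</bib>
"""

REVIEWS_XML = """
<reviews>
  <book><title>XML Query</title><review>A darn fine book.</review></book>
  <book><title>Data on the Web</title><review>This is great!</review></book>
</reviews>
"""

def join_books(bib_text, reviews_text):
    """Nested-loop join on title; returns newly built <book> elements."""
    bib0 = ET.fromstring(bib_text)
    review0 = ET.fromstring(reviews_text)
    result = []
    for b in bib0.findall("book"):              # FOR b IN bib0/book
        for r in review0.findall("book"):       #     r IN review0/book
            if b.findtext("title") == r.findtext("title"):   # WHERE
                book = ET.Element("book")       # RETURN <book> ... </book>
                book.append(copy.deepcopy(b.find("title")))
                book.extend(copy.deepcopy(b.findall("author")))
                book.append(copy.deepcopy(r.find("review")))
                result.append(book)
    return result

joined = join_books(BIB_XML, REVIEWS_XML)
pairs = [(bk.findtext("title"), bk.findtext("review")) for bk in joined]
print(pairs)
```

The output follows the order of the outer loop (the bibliography), which is why "Data on the Web" comes first even though its review is listed second in the reviews document.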
48Restructuring
FOR $a IN distinct-value($bib0/book/author/data())
RETURN
  <biblio>
    <author>{ $a }</author>
    { FOR $b IN $bib0/book, $a2 IN $b/author/data()
      WHERE $a = $a2 RETURN
        $b/title
    }
  </biblio>
49Restructuring (Cont.)
==> <biblio>
      <author>Abiteboul</author>
      <title>Data on the Web</title>
    </biblio>,
    <biblio>
      <author>Buneman</author>
      <title>Data on the Web</title>
    </biblio>,
    <biblio>
      <author>Suciu</author>
      <title>Data on the Web</title>
      <title>XML Query</title>
    </biblio>,
    <biblio>
      <author>Fernandez</author>
      <title>XML Query</title>
    </biblio>
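Restructuring inverts the hierarchy: books grouped under authors instead of authors under books. A rough Python equivalent is sketched below (titles_by_author is a hypothetical helper name). Distinct author names are taken in order of first appearance, which reproduces the order of the result shown above.

```python
import xml.etree.ElementTree as ET

BIB_XML = """
<bib>
  <book year="1999" isbn="1-55860-622-X">
    <title>Data on the Web</title>
    <author>Abiteboul</author><author>Buneman</author><author>Suciu</author>
  </book>
  <book year="2001" isbn="1-XXXXX-YYY-Z">
    <title>XML Query</title>
    <author>Fernandez</author><author>Suciu</author>
  </book>
</bib>
"""

def titles_by_author(bib_text):
    """Build one <biblio> element per distinct author, listing that author's titles."""
    bib0 = ET.fromstring(bib_text)
    # dict.fromkeys removes duplicates while keeping first-appearance order.
    names = list(dict.fromkeys(a.text for a in bib0.findall("book/author")))
    biblios = []
    for name in names:
        biblio = ET.Element("biblio")
        ET.SubElement(biblio, "author").text = name
        for b in bib0.findall("book"):
            if name in [a.text for a in b.findall("author")]:
                ET.SubElement(biblio, "title").text = b.findtext("title")
        biblios.append(biblio)
    return biblios

grouped = [(bl.findtext("author"), [t.text for t in bl.findall("title")])
           for bl in titles_by_author(BIB_XML)]
for author, titles in grouped:
    print(author, titles)
```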
50Details (cont.)
- No Schema Required
- XML Query should be usable on XML data when there
is no schema (DTD or XML Schema) known in
advance. But it should be able to exploit the
schema if the schema is available.
- Preserve Order and Association
- XML Query should preserve the order and association
of elements in XML data (why?)
51Details (cont.)
- Mutually Embedding with XML
- An XML Query should be able to contain arbitrary
XML data, and an XML document should be able to
hold arbitrary XML Queries
- Support for New Datatypes
- XML Query should have an extension mechanism for
conditions and operations specific to a
particular datatype (e.g. multimedia data).
52Details (cont.)
- Suitable for Metadata
- XML Query should be useful as a part of metadata
descriptions (how?)
- Question: how about metadata in a relational
database?
- The current version MUST NOT preclude the ability
to add update capabilities in future versions
53Your Idea?
- Any other characteristics you desire?
54XQuery Language
- Overview
- XPath
- XQuery 1.0 Semantics
- Future work for XQuery
55Overview
- Combine the best features of XPath, SQL and ideas
borrowed from object query language.
56XPath
- Language for navigation within tree-structured
documents
- XPath data model
- XML document: a tree
- Element node
- Attribute node
- Text node
- Comment node
57Navigation in XPath
- Operators
- Root: /
- Parent: ..
- Child (descendant): / or //
- Attribute value: @
- Comment: comment() function
- Text: text() function
- Element: <element name>
- Wildcards
- *: all element children of a node irrespective of type,
not including text nodes
- @*: all attributes
- //: all descendants of the current node
58XPath expression
- Combination of XPath operators
- Input: a document tree
- Output: a set of nodes
- Absolute path expression
- start from the root node
- Relative path expression
- start from the current node
59XPath query
- Selection conditions
- Built-in functions
- Aggregate functions
60Example XML file
<students>
  <student studid="996341111">
    <name><first>John</first><last>Doe</last></name>
    <status>U2</status>
    <crstaken crscode="CS503" semester="S2002"/>
    <crstaken crscode="CS561" semester="S2002"/>
  </student>
  <student studid="996342222">
    <name><first>Bart</first><last>Simpson</last></name>
    <status>U4</status>
    <crstaken crscode="CS504" semester="S2002"/>
  </student>
</students>
61XPath Document Tree
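[Figure: XPath document tree for the example file. The root has two comment nodes and the students element as children; students has two student children; the first student has the attribute studid and the children name (first: John, last: Doe), status (U2), and two crstaken elements, each with crscode and semester attributes.]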
62Example XPath Query
- //student[status="U2" and starts-with(.//last, "D") and not(.//last = .//first)]
- //student[count(crstaken) > 5]
- //student[crstaken/@crscode="CS561"]
[crstaken/@semester="S2002"]
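A few of these selection conditions can be tried on the example file with Python's xml.etree.ElementTree, which implements only a limited subset of XPath. In this sketch, simple child-text and attribute predicates are passed to the library as path expressions, while starts-with() and count(), which the subset does not provide, are expressed in plain Python instead.

```python
import xml.etree.ElementTree as ET

STUDENTS_XML = """
<students>
  <student studid="996341111">
    <name><first>John</first><last>Doe</last></name>
    <status>U2</status>
    <crstaken crscode="CS503" semester="S2002"/>
    <crstaken crscode="CS561" semester="S2002"/>
  </student>
  <student studid="996342222">
    <name><first>Bart</first><last>Simpson</last></name>
    <status>U4</status>
    <crstaken crscode="CS504" semester="S2002"/>
  </student>
</students>
"""

root = ET.fromstring(STUDENTS_XML)

# Students whose status child has the text U2 (a predicate the subset supports).
u2_ids = [s.get("studid") for s in root.findall(".//student[status='U2']")]

# Students whose last name starts with "D": starts-with() is done in Python.
d_ids = [s.get("studid") for s in root.findall(".//student")
         if s.findtext("name/last").startswith("D")]

# Number of crstaken children per student: count() is done with len().
course_counts = {s.get("studid"): len(s.findall("crstaken"))
                 for s in root.findall(".//student")}

# Students having some crstaken child with crscode CS561 (attribute predicate).
cs561_ids = [s.get("studid") for s in root.findall(".//student")
             if s.find("crstaken[@crscode='CS561']") is not None]

print(u2_ids, d_ids, course_counts, cs561_ids)
```

With this data no student has more than five courses, so a count-based query with that threshold selects nothing; the dictionary shows the counts it would be compared against.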
63Why XPath is not satisfying?
- It is just for navigation, so it can support only limited
queries
- Cannot express joins
- Cannot work on multiple XML documents
- Cannot filter unwanted elements
- Does not support user-defined functions
- Does not support importation and use of the types
defined in various XML schemas
- Any other limitations you can think of?
64A better candidate for XML Query?
- The XQuery language incorporates all the above
characteristics
- Any other characteristics you can think of?
- XQuery engine: Kweelt
- http://kweelt.sourceforge.net/
65XQuery expressions
- Path expressions
- FLWR expressions
- Element constructors
- Expressions involving operators and functions
- Conditional expressions
- Quantified expressions
- List constructors
- Expressions that test or modify datatypes
66XQuery FLWR Expressions
- A FLWR expression binds variables to expressions, applies
a predicate, and constructs a new result.
- FOR $var IN expr ...
- LET $var := expr ...
- WHERE expr ...
- RETURN expr ...
FOR and LET clauses generate a list of tuples of
bound variables, preserving document order
WHERE clause applies a predicate, eliminating
some of the tuples
RETURN clause is executed for each surviving
tuple, generating an ordered list of outputs
67Example - DTD
<!ELEMENT reviews (entry*)>
<!ELEMENT entry (title, price, review)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT price (#PCDATA)>
<!ELEMENT review (#PCDATA)>
68Example Sample Data
- http://www.amazon.com/reviews.xml
<reviews>
  <entry>
    <title>Data on the Web</title>
    <price>34.95</price>
    <review>
      a good discussion of database systems and XML.
    </review>
  </entry>
  <entry>
    <title>Advanced Unix Programming</title>
    <price>65.95</price>
    <review>
      a good discussion of UNIX programming.
    </review>
  </entry>
</reviews>
69Example - Request
- For each book found at both www.bn.com and
www.amazon.com, list the title of the book and
its price from each source
70Example - Query
<books-with-prices>
{
  for $b in document("www.bn.com/bib.xml")//book,
      $a in document("www.amazon.com/reviews.xml")//entry
  where $b/title = $a/title
  return
    <book-with-prices>
      { $b/title }
      <price-amazon>{ $a/price/data() }</price-amazon>
      <price-bn>{ $b/price/data() }</price-bn>
    </book-with-prices>
}
</books-with-prices>
71Example - Result
<books-with-prices>
  <book-with-prices>
    <title>Advanced Unix Programming</title>
    <price-amazon>65.95</price-amazon>
    <price-bn>65.95</price-bn>
  </book-with-prices>
  <book-with-prices>
    <title>Data on the Web</title>
    <price-amazon>34.95</price-amazon>
    <price-bn>39.95</price-bn>
  </book-with-prices>
</books-with-prices>
72Use Cases
- Use Case 1: Queries that preserve hierarchy
- Use Case 2: Access to relational data
73Use Case 1: Queries that preserve hierarchy
- An XML document has a flexible structure
- Text is mixed with elements
- Many elements are optional
- Wide variation in structure from one document to
another
- The ways in which elements are ordered and nested
are quite important (Can you give me an example?)
74Use Case 1 - DTD
<!DOCTYPE book [
  <!ELEMENT book (title, author+, section+)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT section (title, (p | figure | section)*)>
  <!ATTLIST section
      id          ID     #IMPLIED
      difficulty  CDATA  #IMPLIED>
  <!ELEMENT p (#PCDATA)>
  <!ELEMENT figure (title, image)>
  <!ATTLIST figure
      width   CDATA  #REQUIRED
      height  CDATA  #REQUIRED>
  <!ELEMENT image EMPTY>
  <!ATTLIST image
      source  CDATA  #REQUIRED>
]>
75Use Case 1 - Sample Data
<book>
  <title>Data on the Web</title>
  <author>Serge Abiteboul</author>
  <author>Peter Buneman</author>
  <section id="intro" difficulty="easy">
    <title>Introduction</title>
    <p>Text ... </p>
    <section>
      <title>Audience</title>
      <p>Text ... </p>
    </section>
    <section>
      <title>Web Data and the Two Cultures</title>
      <p>Text ... </p>
      <figure height="400" width="400">
        <title>Traditional client/server architecture</title>
        <image source="csarch.gif"/>
      </figure>
      <p>Text ... </p>
      ...
76Use Case 1 - Request
- List all the sections and their titles. Preserve
the original attributes of each <section>
element, if any.
- Questions
- Do we need all the elements?
- How could we eliminate unwanted elements?
- How could we preserve the original attributes?
77Use Case 1 - Solution
<toc>
{
  let $b := document("book1.xml")
  return
    filter($b//section | $b//section/title | $b//section/title/data())
}
</toc>
78Use Case 1 - Result
<toc>
  <section id="intro" difficulty="easy">
    <title>Introduction</title>
    <section>
      <title>Audience</title>
    </section>
    <section>
      <title>Web Data and the Two Cultures</title>
    </section>
  </section>
  <section id="syntax" difficulty="medium">
    ...
  </section>
</toc>
79Use Case 2 - Access to Relational Data
- Questions
- How to represent relational tables as XML
documents?
- Do we need multiple XML documents?
- How does XQuery work on multiple XML documents?
80Use Case 2 - Access to Relational Data
- Represent a database table as an XML document
- Document element <-> table
- Tuple <-> nested element
- Column <-> nested element inside the tuple element
- Columns that allow null values are represented by
optional elements, and a missing element denotes
a null value
81Use Case 2 - Online Auction
- Tables
- USERS (userid, name, rating)
- Contains info on registered users
- ITEMS (itemno, description, offered_by,
start_date, end_date, reserve_price)
- Lists items currently or recently for sale
- BIDS (userid, itemno, bid, bid_date)
- Contains all bids on record
82Simplified E-R Diagram
[Figure: simplified E-R diagram. USERS and ITEMS are related through BIDS; the labelled attributes are userid and itemno.]
83Use Case 2 - DTD
<!DOCTYPE users [
  <!ELEMENT users (user_tuple*)>
  <!ELEMENT user_tuple (userid, name, rating?)>
  <!ELEMENT userid (#PCDATA)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT rating (#PCDATA)>
]>
84Use Case 2 - DTD
<!DOCTYPE items [
  <!ELEMENT items (item_tuple*)>
  <!ELEMENT item_tuple (itemno, description,
      offered_by, start_date?, end_date?,
      reserve_price?)>
  <!ELEMENT itemno (#PCDATA)>
  <!ELEMENT description (#PCDATA)>
  <!ELEMENT offered_by (#PCDATA)>
  <!ELEMENT start_date (#PCDATA)>
  <!ELEMENT end_date (#PCDATA)>
  <!ELEMENT reserve_price (#PCDATA)>
]>
85Use Case 2 - DTD
<!DOCTYPE bids [
  <!ELEMENT bids (bid_tuple*)>
  <!ELEMENT bid_tuple (userid, itemno, bid,
      bid_date)>
  <!ELEMENT userid (#PCDATA)>
  <!ELEMENT itemno (#PCDATA)>
  <!ELEMENT bid (#PCDATA)>
  <!ELEMENT bid_date (#PCDATA)>
]>
86Use Case 2 - Sample Data
USERID | NAME | RATING
U01 | Tom Jones | B
U02 | Mary Doe | A
U04 | Roger Smith | C
U05 | Rip Sprat | B
87Use Case 2 - Sample Data
ITEMNO | DESCRIPTION | OFFERED_BY | START_DATE | END_DATE | RESERVE_PRICE
1001 | Red Bicycle | U01 | 01-01-05 | 01-01-20 | 40
1002 | Motorcycle | U02 | 01-02-11 | 01-03-15 | 500
1003 | Old Bicycle | U02 | 01-01-10 | 01-02-20 | 25
88Use Case 2 Sample Data
USERID | ITEMNO | BID | BID_DATE
U02 | 1001 | 35 | 01-01-07
U04 | 1001 | 40 | 01-01-08
U02 | 1001 | 45 | 01-01-11
U04 | 1001 | 55 | 01-01-15
U01 | 1002 | 400 | 01-02-14
U02 | 1002 | 600 | 01-02-16
U04 | 1002 | 1000 | 01-02-25
U02 | 1002 | 1200 | 01-03-02
U04 | 1003 | 15 | 01-01-22
U05 | 1003 | 20 | 01-02-03
89Use Case 2 - Request
- Request
- For all bicycles, list the item number,
description, and highest bid (if any), ordered by
item number.
90Use Case 2 Solution
<result>
{
  for $i in document("items.xml")//item_tuple
  let $b := document("bids.xml")//bid_tuple[itemno = $i/itemno]
  where contains($i/description, "Bicycle")
  return
    <item_tuple>
      { $i/itemno }
      { $i/description }
      <high_bid>{ max($b/bid) }</high_bid>
    </item_tuple>
  sortby(itemno)
}
</result>
91Use Case 2 Result (Bingo!)
<result>
  <item_tuple>
    <itemno>1001</itemno>
    <description>Red Bicycle</description>
    <high_bid>
      <bid>55</bid>
    </high_bid>
  </item_tuple>
  <item_tuple>
    <itemno>1003</itemno>
    <description>Old Bicycle</description>
    <high_bid>
      <bid>20</bid>
    </high_bid>
  </item_tuple>
</result>
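The whole of Use Case 2 can be walked through in a short script: first the table-to-XML mapping (table to document element, row to tuple element, column to child element), then the highest-bid query. This is a sketch in Python rather than XQuery; the helper names table_to_xml and highest_bicycle_bids are hypothetical, and only the columns the query needs are copied from the sample tables.

```python
import xml.etree.ElementTree as ET

# Subset of the sample tables: only the columns used by the query.
ITEMS = [("1001", "Red Bicycle"), ("1002", "Motorcycle"), ("1003", "Old Bicycle")]
BIDS = [("1001", 35), ("1001", 40), ("1001", 45), ("1001", 55),
        ("1002", 400), ("1002", 600), ("1002", 1000), ("1002", 1200),
        ("1003", 15), ("1003", 20)]

def table_to_xml(table_name, tuple_name, columns, rows):
    """Table -> document element, row -> tuple element, column -> child element."""
    root = ET.Element(table_name)
    for row in rows:
        tup = ET.SubElement(root, tuple_name)
        for col, value in zip(columns, row):
            ET.SubElement(tup, col).text = str(value)
    return root

items = table_to_xml("items", "item_tuple", ["itemno", "description"], ITEMS)
bids = table_to_xml("bids", "bid_tuple", ["itemno", "bid"], BIDS)

def highest_bicycle_bids(items, bids):
    """For every bicycle: (itemno, description, highest bid or None), sorted by itemno."""
    result = []
    for i in items.findall("item_tuple"):                 # for i in ...//item_tuple
        if "Bicycle" not in i.findtext("description"):    # where contains(...)
            continue
        # let b := the bids whose itemno matches this item
        b = [int(t.findtext("bid")) for t in bids.findall("bid_tuple")
             if t.findtext("itemno") == i.findtext("itemno")]
        result.append((i.findtext("itemno"), i.findtext("description"),
                       max(b) if b else None))
    # sortby(itemno); item numbers here are equal-length strings.
    return sorted(result, key=lambda row: row[0])

high_bids = highest_bicycle_bids(items, bids)
print(high_bids)
```

An item with no bids would get None as its highest bid, matching the "(if any)" wording of the request.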
92Future Work about XQuery
- Add support for new desired characteristics
- What are they?
- Any other future work?
93Bibliography
- Chapter 17, XML and Web Data
- XML Query Requirements,
http://www.w3.org/TR/2001/WD-xmlquery-req-20010215
- XML Query Use Cases, W3C Working Draft 20
December 2001,
http://www.w3.org/TR/2001/WD-xmlquery-use-cases-20011220
- Database Desiderata for an XML Query Language,
David Maier, Oregon Graduate Institute