Ling Wang - PowerPoint PPT Presentation

Learn more at: http://web.cs.wpi.edu
1
XML and XML Query
Ling Wang, Luping Ding
2
Introduction
  • The Web opens new challenges in
  • - information technology
  • - database frameworks.
  • Why?
  • - Data sources on the Web do NOT typically
    conform to any well-known structure.
  • - Traditional database technology is not
    adequate for dealing with rich data,
  • e.g. audio, video, nested data structures.

3
Features of Web Data
  • Web data is characterized as semistructured:
  • Object-like
  • a collection of complex objects from CODM.
  • Schema-less
  • does not typically conform to any traditional type
    structure.
  • Self-describing
  • the meaning of the data is carried along with the
    data itself.
  • So, we need new database technologies to support
    these Web-based applications.

4
What is XML?
  • XML ---- Extensible Markup Language
  • - A markup language for documents containing
    structured information.
  • - A universal format for structured documents
    and data on the Web.
  • - An HTML-like language.
  • The XML specification defines a standard way to add
    markup to documents.
  • Note: structured information, markup language

5
What is XML? ---- example
An XML example for customer information:
    <customer-details id="AcPharm39156">
      <name>Acme Pharmaceuticals Co.</name>
      <address country="US">
        <street>7301 Smokey Boulevard</street>
        <city>Smallville</city>
        <state>Indiana</state>
        <postal>94571</postal>
      </address>
    </customer-details>
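
The customer record above can be read programmatically. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module and the record with its markup characters restored; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The customer record from the slide, with the markup characters restored.
CUSTOMER_XML = """
<customer-details id="AcPharm39156">
  <name>Acme Pharmaceuticals Co.</name>
  <address country="US">
    <street>7301 Smokey Boulevard</street>
    <city>Smallville</city>
    <state>Indiana</state>
    <postal>94571</postal>
  </address>
</customer-details>
"""

root = ET.fromstring(CUSTOMER_XML)

# Attributes are read with get(); child elements are located by tag path.
customer_id = root.get("id")
name = root.findtext("name")
city = root.findtext("address/city")
country = root.find("address").get("country")

print(customer_id, name, city, country)
```

The tags themselves tell the program what each value means, which is the "self-describing" property mentioned earlier.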
6
XML vs. HTML?
XML:
  • XML is extensible
  • - Does NOT specify semantics or a tag set
  • - Is just a facility for defining tags
  • An XML document must be well formed
  • - A single root element.
  • - Every opening tag is followed by a matching closing tag.
  • - Elements are properly nested.
HTML:
  • Not extensible
  • - Fixed tag semantics and tag set
  • - Defined by the W3C (the World Wide Web Consortium).
  • Well-formedness is not strictly required
  • - Tags are not required to be closed.
  • - Browsers will forgive errors, etc.
7
Overview of XML
  • Mechanisms for specifying document structure
  • ---- a set of rules for structuring an XML
    document.
  • DTD ---- Document Type Definition language
  • (a part of the XML standard)
  • XML Schema ---- a more recent specification
  • Query languages for XML
  • XPath , XSLT, XQuery

8
Basic concepts in XML ---- elements and attributes
  • XML element
  • Any properly nested piece of text of the form
  • <sometag>...</sometag>.
  • e.g. <street>7301 Smokey Boulevard</street>
  • (name: street; content: 7301 Smokey Boulevard)
  • XML attributes
  • also a tool for data representation.
  • e.g. <customer-details id="AcPharm39156">
    </customer-details>

9
Basic concepts in XML ---- namespaces
  • Namespaces
  • - Why?
  • Element names in XML are not fixed, so names can
    conflict.
  • - How?
  • Different authors use different namespace
    identifiers for different domains.
  • The general structure: namespace:local-name
  • Namespace ---- a URI (uniform resource
    identifier): a URL (uniform resource locator) or a
    URN (uniform resource name).
  • Local name ---- same form as regular XML tags.
  • No ":" in it.

10
Basic concepts in XML ---- namespaces
  • An example of namespaces
    <item xmlns="http://www.acmeinc.com/jpsupplies"
          xmlns:toy="http://www.acmeinc.com/jptoys">
      <name>African Coffee Table</name>
      <feature>
        <toy:item>
          <toy:name>cyberpet</toy:name>
        </toy:item>
      </feature>
    </item>
  • The first xmlns attribute declares the default namespace.
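
To see how a parser resolves the default namespace and the `toy` prefix in the example above, here is a small sketch using Python's standard `xml.etree.ElementTree`. The two namespace URIs are taken from the slide as they survived extraction; the prefix `sup` in the lookup table is a local alias chosen for this sketch, not part of the document.

```python
import xml.etree.ElementTree as ET

ITEM_XML = """
<item xmlns="http://www.acmeinc.com/jpsupplies"
      xmlns:toy="http://www.acmeinc.com/jptoys">
  <name>African Coffee Table</name>
  <feature>
    <toy:item>
      <toy:name>cyberpet</toy:name>
    </toy:item>
  </feature>
</item>
"""

root = ET.fromstring(ITEM_XML)

# ElementTree expands every tag to "{namespace-URI}local-name".
root_tag = root.tag

# Prefixes used in queries are local aliases; only the URIs have to match.
ns = {
    "sup": "http://www.acmeinc.com/jpsupplies",  # the document's default namespace
    "toy": "http://www.acmeinc.com/jptoys",
}
furniture_name = root.findtext("sup:name", namespaces=ns)
toy_name = root.findtext("sup:feature/toy:item/toy:name", namespaces=ns)

print(root_tag, furniture_name, toy_name)
```

The two `name` elements no longer conflict, because each is qualified by a different namespace URI.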
11
DTD ---- Document Type Definitions
  • Why DTD?
  • - An XML file can carry a description of its own
    format with it.
  • - Independent groups of people can agree on a common
    DTD for interchanging data.
  • - An application can verify data received from the
    outside world.
  • - It can also verify its own data.
  • How?
  • - The DTD is included in your XML source file:
  • <!DOCTYPE root-element [element-declarations]>
  • - The DTD is external to your XML source file:
  • <!DOCTYPE root-element SYSTEM "filename">
12
DTD ---- example
  • Example XML document with a DTD
    <?xml version="1.0"?>
    <!DOCTYPE note [
      <!ELEMENT note (to,from,heading,body)>
      <!ELEMENT to (#PCDATA)>
      <!ELEMENT from (#PCDATA)>
      <!ELEMENT heading (#PCDATA)>
      <!ELEMENT body (#PCDATA)>
    ]>
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend</body>
    </note>
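
Python's standard library does not validate against DTDs, so the following is only a hand-written sketch of what the content model `(to,from,heading,body)` requires of a `note` element. The function name `matches_note_model` is hypothetical, and a real validating parser would check much more.

```python
import xml.etree.ElementTree as ET

NOTE_XML = """<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend</body>
</note>"""

# Content model of <note> from the DTD: (to,from,heading,body)
EXPECTED_CHILDREN = ["to", "from", "heading", "body"]


def matches_note_model(xml_text):
    """Check the note content model by hand (not a general DTD validator)."""
    root = ET.fromstring(xml_text)
    if root.tag != "note":
        return False
    # Each child must appear exactly once, in the declared order.
    if [child.tag for child in root] != EXPECTED_CHILDREN:
        return False
    # (#PCDATA) children may hold text only, no nested elements.
    return all(len(child) == 0 for child in root)


print(matches_note_model(NOTE_XML))
```

Swapping `<to>` and `<from>` makes the check fail, which is the kind of error a receiving application wants to catch before processing the data.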

13
DTD ---- example
XML document with an external DTD:
    <?xml version="1.0"?>
    <!DOCTYPE note SYSTEM "note.dtd">
    <note>
      <to>Tove</to>
      <from>Jani</from>
      <heading>Reminder</heading>
      <body>Don't forget me this weekend!</body>
    </note>
"note.dtd" containing the DTD:
    <!ELEMENT note (to,from,heading,body)>
    <!ELEMENT to (#PCDATA)>
    <!ELEMENT from (#PCDATA)>
    <!ELEMENT heading (#PCDATA)>
    <!ELEMENT body (#PCDATA)>
14
DTD ---- Inadequacy
  • Inadequacy of DTD
  • - Not designed with namespaces in mind.
  • - Uses a syntax quite different from that of XML
    documents.
  • - A very limited set of basic types.
  • - Provides only limited means for expressing
    data consistency constraints.
  • No keys.
  • Referential integrity is weak:
  • attributes can be of type ID, IDREF, IDREFS,
  • but there is no equivalent for elements.

15
DTD ---- Inadequacy
  • Inadequacy of DTD
  • - No way of enforcing referential integrity
    for elements.
  • - Alternatives must be used to state that the order
    of elements is immaterial, which becomes unwieldy as
    the number of elements grows.
  • - Element definitions are global to the
    entire document.

16
XML Schema
  • XML Schemas
  • An attempt to solve all those problems in DTD
  • - Powerful data typing
  • - Range checking
  • - Namespace-aware validation based on
    namespace URIs rather than on prefixes
  • - Extensibility and scalability

17
XML Schema ---- example
  • Here is a simple example of an XML Schema
    <?xml version="1.0"?>
    <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:element name="SONG" type="SongType"/>
      <xsd:complexType name="SongType">
        <xsd:sequence>
          <xsd:element name="TITLE" type="xsd:string"/>
          <xsd:element name="COMPOSER" type="xsd:string"/>
          <xsd:element name="PRODUCER" type="xsd:string"/>
          <xsd:element name="PUBLISHER" type="xsd:string"/>
          <xsd:element name="LENGTH" type="xsd:string"/>
          <xsd:element name="YEAR" type="xsd:string"/>
          <xsd:element name="ARTIST" type="xsd:string"/>
          <xsd:element name="PRICE" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:schema>

18
XML Schema ---- example
  • The root element ---- schema.
  • Schema namespace ---- http://www.w3.org/2001/XMLSchema,
    conventionally bound to the prefix xsd or xs.
  • Elements ---- xsd:element.
  • divided into simple types and complex types.
  • A simple type element can only
    contain text and does not have any attributes. It
    cannot contain any child elements.
  • Syntax: <xs:element name="name" type="type"/>
  • Example: <xs:element name="to" type="xs:string"/>

19
XML Schema ---- example
A complex type defines a new type whose elements can have
attributes and child elements. This is
very flexible.
Syntax:
    <xs:element name="name">
      <xs:complexType>
        ... element content ...
      </xs:complexType>
    </xs:element>
Example:
    <xs:element name="note">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="to" type="xs:string"/>
          <xs:element name="from" type="xs:string"/>
          <xs:element name="heading" type="xs:string"/>
          <xs:element name="body" type="xs:string"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
20
XML Schema ---- features
  • Simple Types
  • - 44 built-in simple types in the W3C XML Schema
    language.
  • - Divided into seven groups
  • Numeric types
  • Time types
  • XML types
  • String types
  • The boolean type
  • The URI reference type
  • The binary types

21
XML Schema ---- features
  • Deriving simple types
  • Not limited to the 44 built-in simple types
  • Create new data types by deriving from the
    existing types
  • Restrict a type to a subset of its normal values.
  • e.g. a schema that derives a Str255 data type
    from xsd:string
    <xsd:simpleType name="Str255">
      <xsd:restriction base="xsd:string">
        <xsd:minLength value="1"/>
        <xsd:maxLength value="255"/>
      </xsd:restriction>
    </xsd:simpleType>
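
What the Str255 restriction accepts can be pictured as a simple length check. This is only an illustrative sketch in Python (the function name `is_str255` is hypothetical); a schema-aware processor performs this check itself during validation.

```python
def is_str255(value: str) -> bool:
    """Mirror the Str255 facets: minLength 1 and maxLength 255 characters."""
    return 1 <= len(value) <= 255


print(is_str255("Data on the Web"))  # a typical title fits
print(is_str255(""))                 # the empty string violates minLength
print(is_str255("x" * 256))          # too long for maxLength
```

Deriving by restriction never adds new values: every Str255 value is still a valid xsd:string.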

22
XML Schema ---- features
  • Create enumerated types
  • Example
    <xsd:simpleType name="PublisherType">
      <xsd:restriction base="xsd:string">
        <xsd:enumeration value="Warner-Elektra-Atlantic"/>
        <xsd:enumeration value="Universal Music Group"/>
        <xsd:enumeration value="Sony Music Entertainment, Inc."/>
        <xsd:enumeration value="Capitol Records, Inc."/>
        <xsd:enumeration value="BMG Music"/>
      </xsd:restriction>
    </xsd:simpleType>

23
XML Schema ---- features
  • Create new types by joining existing types through
    a union.
  • Example
    <xsd:simpleType name="MoneyOrDecimal">
      <xsd:union>
        <xsd:simpleType>
          <xsd:restriction base="xsd:decimal">
          </xsd:restriction>
        </xsd:simpleType>
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:pattern value="\p{Sc}\p{Nd}+(\.\p{Nd}\p{Nd})?"/>
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:union>
    </xsd:simpleType>
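
A union type accepts a value if any one of its member types accepts it. The sketch below imitates MoneyOrDecimal in Python under two assumptions: that the garbled pattern on the slide reads as a currency symbol, one or more digits, and an optional two-digit fraction, and that a plain ASCII decimal is an adequate stand-in for xsd:decimal. All function names are hypothetical.

```python
import re
import unicodedata


def _all_decimal_digits(text):
    # Unicode category "Nd" corresponds to \p{Nd} in the schema pattern.
    return text != "" and all(unicodedata.category(ch) == "Nd" for ch in text)


def is_money(text):
    """Currency symbol (category Sc), digits, optional '.' plus two digits."""
    if len(text) < 2 or unicodedata.category(text[0]) != "Sc":
        return False
    whole, dot, cents = text[1:].partition(".")
    if not _all_decimal_digits(whole):
        return False
    if dot == "":
        return True
    return len(cents) == 2 and _all_decimal_digits(cents)


def is_decimal(text):
    """Simplified stand-in for xsd:decimal: optional sign, ASCII digits, optional fraction."""
    return re.fullmatch(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)", text) is not None


def is_money_or_decimal(text):
    # Union semantics: valid if at least one member type accepts the value.
    return is_decimal(text) or is_money(text)


for sample in ["19.99", "$19.99", "$19.9", "nineteen"]:
    print(sample, is_money_or_decimal(sample))
```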

24
XML Schema ---- features
  • Namespaces
  • http://www.w3.org/2001/XMLSchema
  • the namespace that identifies the names of
    tags and attributes used in a schema.
  • The names are understood by all schema-aware XML
    processors.
  • http://www.w3.org/2001/XMLSchema-instance
  • a small number of special names used in
    instance documents, not in schemas.
  • - target namespace
  • the set of names defined by a particular
    schema document
  • the user-defined names that are to be used in
    the instance documents.

25
XML Schema ---- features
  • Grouping
  • - Does order really matter?
  • - How?
  • xsd:all group ---- each element in the group
    must occur at most once, but the order is not
    important.
  • xsd:choice group ---- exactly one element from the
    group should appear.
  • xsd:sequence group ---- each element in the group
    appears exactly once, in the specified order.

26
XML Schema ---- features
Example of an xsd:all group:
    <xsd:complexType name="PersonType">
      <xsd:sequence>
        <xsd:element name="NAME">
          <xsd:complexType>
            <xsd:all>
              <xsd:element name="GIVEN" type="xsd:string"
                           minOccurs="1" maxOccurs="1"/>
              <xsd:element name="FAMILY" type="xsd:string"
                           minOccurs="1" maxOccurs="1"/>
            </xsd:all>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
27
XML Schema ---- features
Example of an xsd:choice group:
    <xsd:complexType name="SongType">
      <xsd:sequence>
        <xsd:element name="TITLE" type="xsd:string"/>
        <xsd:choice>
          <xsd:element name="COMPOSER" type="PersonType"/>
          <xsd:element name="PRODUCER" type="PersonType"/>
        </xsd:choice>
        <xsd:element name="PUBLISHER" type="xsd:string"
                     minOccurs="0"/>
        <xsd:element name="LENGTH" type="xsd:string"/>
        <xsd:element name="YEAR" type="xsd:string"/>
        <xsd:element name="ARTIST" type="xsd:string"
                     maxOccurs="unbounded"/>
        <xsd:element name="PRICE" type="xsd:string" minOccurs="0"/>
      </xsd:sequence>
    </xsd:complexType>
28
XML Schema ---- features
  • Schemas address limitations of DTDs
  • a strange, non-XML syntax
  • namespace incompatibility
  • lack of data typing
  • limited extensibility and scalability.
  • XML Schemas
  • - Powerful data typing
  • - Range checking
  • - Namespace-aware validation based on
    namespace URIs rather than on prefixes
  • - Extensibility and scalability

29
XML Constraints ---- DTD
  • DTD
  • No keys; referential integrity is weak
  • Attribute types ID, IDREF, IDREFS:
  • ID ---- a unique value
  • IDREF ---- a valid ID declared in the same document
  • IDREFS ---- a space-separated list of valid IDs
  • But these are all based on the string type.
  • Elements have no corresponding mechanism.

30
XML Constraints ---- Schema
  • XML keys
  • Similar to SQL keys, but more complicated:
  • - complex structures
  • - a key might be composed of a sequence of
    values
  • - located at different depths inside an
    element.
  • Two ways
  • - unique tag ---- UNIQUE constraint
  • - key tag ---- PRIMARY KEY, not null
  • e.g.
    <key name="PrimaryKeyForClass">
      <selector xpath="Classes/Class"/>
      <field xpath="CrsCode"/>
      <field xpath="Semester"/>
    </key>
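
The key above says: select every Class element, read its CrsCode and Semester fields, and require that the resulting pair exists and is unique. A rough Python sketch of that check follows; the instance data and the enclosing `Report` element are made up for illustration (only the element names Classes, Class, CrsCode and Semester come from the slide), and `key_violations` is a hypothetical helper, not a schema processor.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance data; only the element names come from the slide.
CLASSES_XML = """
<Report>
  <Classes>
    <Class><CrsCode>CS503</CrsCode><Semester>S2002</Semester></Class>
    <Class><CrsCode>CS561</CrsCode><Semester>S2002</Semester></Class>
    <Class><CrsCode>CS503</CrsCode><Semester>F2002</Semester></Class>
  </Classes>
</Report>
"""


def key_violations(root, selector, fields):
    """Return (reason, values) pairs for nodes that break the key."""
    seen = set()
    problems = []
    for node in root.findall(selector):
        values = tuple(node.findtext(field) for field in fields)
        if None in values:
            problems.append(("missing field", values))   # key fields may not be absent
        elif values in seen:
            problems.append(("duplicate key", values))   # key values must be unique
        else:
            seen.add(values)
    return problems


root = ET.fromstring(CLASSES_XML)
violations = key_violations(root, "Classes/Class", ["CrsCode", "Semester"])
print(violations)
```

Note that CS503 appears twice without violating the key, because the key is the pair (CrsCode, Semester) rather than CrsCode alone.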

31
XML Constraints ---- Schema
  • Foreign keys
  • e.g.
    <complexType>
      <keyref name="NoBogusTranscripts"
              refer="adm:PrimaryKeyForClass">
        <selector xpath="Students/Student/CrsTaken"/>
        <field xpath="@CrsCode"/>
        <field xpath="@Semester"/>
      </keyref>
    </complexType>
  • Powerful?
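
The foreign-key constraint can be pictured the same way: every (CrsCode, Semester) pair taken by a student must match some class key. The following Python sketch uses invented instance data (including one deliberately bogus course) and a hypothetical `Report` wrapper element; only the element and attribute names come from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance data with one dangling reference (CS999).
REPORT_XML = """
<Report>
  <Students>
    <Student><CrsTaken CrsCode="CS503" Semester="S2002"/></Student>
    <Student><CrsTaken CrsCode="CS999" Semester="S2002"/></Student>
  </Students>
  <Classes>
    <Class><CrsCode>CS503</CrsCode><Semester>S2002</Semester></Class>
  </Classes>
</Report>
"""

root = ET.fromstring(REPORT_XML)

# Values of the referenced key: one (CrsCode, Semester) pair per Class.
class_keys = {
    (cls.findtext("CrsCode"), cls.findtext("Semester"))
    for cls in root.findall("Classes/Class")
}

# Values of the keyref fields, read from attributes of each CrsTaken.
dangling = [
    (taken.get("CrsCode"), taken.get("Semester"))
    for taken in root.findall("Students/Student/CrsTaken")
    if (taken.get("CrsCode"), taken.get("Semester")) not in class_keys
]

print(dangling)
```

Unlike IDREF in a DTD, the referenced key here is a pair of typed fields, and the referring values may live in attributes while the key lives in elements.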

32
Question
  • Is XML data model relational or
    object-relational?
  • Is XML a database?

33
References
  • [1] Chapter 17, XML and Web Data
  • [2] Chapter 24, XML Bible (2nd edition): Schemas
  • http://www.ibiblio.org/xml/books/bible2/index.html#toc
  • [3] http://www.w3schools.com
  • http://www.w3.org/
  • http://www.xml.com/

34
Part II
  • XML Query Language
  • Counterpart of SQL in XML World

35
XML Query Language
  • Desired Characteristics for XML Query Language -
    also Requirements
  • Good candidate XQuery Language
  • Use Cases for XQuery Language

36
Desired Characteristics
  • XML Output
  • Declarative - what has to be done?
  • Query Operation
  • No Schema Required
  • Preserve Order and Association
  • Mutually Embedding with XML
  • Support for New Datatypes
  • Suitable for Metadata
  • Ability to add update capabilities in future
    versions

37
Details
  • XML Output
  • define derived database (virtual views)
  • provide transparency to application (why?)
  • The XML Query Language MUST be declarative - like
    SQL
  • specifies what has to be done
  • it MUST not enforce a particular evaluation
    strategy

38
Details (cont.)
  • Query Operation
  • Projection, selection, join, and restructuring
    should all be possible in a single XML Query
    (why?)
  • for optimization reasons

39
Query Operations
XML Query     | Details                                                              | Relational Algebra
Projection    | Extract particular sub-elements or attributes of an element          | Projection
Selection     | Select values that satisfy some predicate                            | Selection
Join          | Join values from one or more documents                               | Join
Restructuring | Construct a new set of element instances to hold queried data        | Create view
40
Example - Sample Data
    <bib>
      <book year="1999" isbn="1-55860-622-X">
        <title>Data on the Web</title>
        <author>Abiteboul</author>
        <author>Buneman</author>
        <author>Suciu</author>
      </book>
      <book year="2001" isbn="1-XXXXX-YYY-Z">
        <title>XML Query</title>
        <author>Fernandez</author>
        <author>Suciu</author>
      </book>
    </bib>

41
Example - XML Schema
    <xs:group name="Bib">
      <xs:element name="bib">
        <xs:complexType>
          <xs:group ref="Book"
                    minOccurs="0" maxOccurs="unbounded"/>
        </xs:complexType>
      </xs:element>
    </xs:group>

42
Example - XML Schema (Cont.)
    <xs:group name="Book">
      <xs:element name="book">
        <xs:complexType>
          <xs:attribute name="year" type="xs:integer"/>
          <xs:attribute name="isbn" type="xs:string"/>
          <xs:element name="title" type="xs:string"/>
          <xs:element name="author" type="xs:string"
                      maxOccurs="unbounded"/>
        </xs:complexType>
      </xs:element>
    </xs:group>

43
Variable Binding
    LET $bib0 :=
      <bib>
        <book year="1999" isbn="1-55860-622-X">
          <title>Data on the Web</title>
          <author>Abiteboul</author>
          <author>Buneman</author>
          <author>Suciu</author>
        </book>
        <book year="2001" isbn="1-XXXXX-YYY-Z">
          <title>XML Query</title>
          <author>Fernandez</author>
          <author>Suciu</author>
        </book>
      </bib>

44
Projection
    $bib0/book/author
    => <author>Abiteboul</author>,
       <author>Buneman</author>,
       <author>Suciu</author>,
       <author>Fernandez</author>,
       <author>Suciu</author>
  • Note: the document order of author elements is
    preserved

45
Selection
    FOR $b IN $bib0/book
    WHERE $b/@year/data() < 2000
    RETURN $b
    => <book year="1999" isbn="1-55860-622-X">
         <title>Data on the Web</title>
         <author>Abiteboul</author>
         <author>Buneman</author>
         <author>Suciu</author>
       </book>
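
For readers more familiar with a general-purpose language, the projection and selection queries can be mimicked in Python. This is a sketch using the standard `xml.etree.ElementTree` module over the sample bibliography, not an XQuery engine; the comparison assumes the selection predicate is "year below 2000".

```python
import xml.etree.ElementTree as ET

BIB_XML = """
<bib>
  <book year="1999" isbn="1-55860-622-X">
    <title>Data on the Web</title>
    <author>Abiteboul</author>
    <author>Buneman</author>
    <author>Suciu</author>
  </book>
  <book year="2001" isbn="1-XXXXX-YYY-Z">
    <title>XML Query</title>
    <author>Fernandez</author>
    <author>Suciu</author>
  </book>
</bib>
"""

bib0 = ET.fromstring(BIB_XML)

# Projection: every author under every book, returned in document order.
authors = [author.text for author in bib0.findall("book/author")]

# Selection: keep only the books whose year attribute satisfies the predicate.
early_titles = [
    book.findtext("title")
    for book in bib0.findall("book")
    if int(book.get("year")) < 2000
]

print(authors)
print(early_titles)
```

Suciu appears twice in the projection because projection does not remove duplicates.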

46
Join - Sample Data
    LET $review0 :=
      <reviews>
        <book>
          <title>XML Query</title>
          <review>A darn fine book.</review>
        </book>
        <book>
          <title>Data on the Web</title>
          <review>This is great!</review>
        </book>
      </reviews>

47
Join
    FOR $b IN $bib0/book, $r IN $review0/book
    WHERE $b/title/data() = $r/title/data()
    RETURN <book>{ $b/title, $b/author, $r/review }</book>
    => <book>
         <title>Data on the Web</title>
         <author>Abiteboul</author>
         <author>Buneman</author>
         <author>Suciu</author>
         <review>This is great!</review>
       </book>,
       <book>
         <title>XML Query</title>
         <author>Fernandez</author>
         <author>Suciu</author>
         <review>A darn fine book.</review>
       </book>
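
The join is a nested loop over the two documents with an equality test on the title. The Python sketch below reproduces it on the sample data with `xml.etree.ElementTree`, returning plain tuples instead of constructed `<book>` elements to keep it short; each title is paired with the review recorded for it in the reviews sample.

```python
import xml.etree.ElementTree as ET

bib0 = ET.fromstring("""
<bib>
  <book year="1999" isbn="1-55860-622-X">
    <title>Data on the Web</title>
    <author>Abiteboul</author><author>Buneman</author><author>Suciu</author>
  </book>
  <book year="2001" isbn="1-XXXXX-YYY-Z">
    <title>XML Query</title>
    <author>Fernandez</author><author>Suciu</author>
  </book>
</bib>
""")

review0 = ET.fromstring("""
<reviews>
  <book><title>XML Query</title><review>A darn fine book.</review></book>
  <book><title>Data on the Web</title><review>This is great!</review></book>
</reviews>
""")

# FOR $b ..., $r ...: the first variable is the outer loop, so results
# follow the order of books in the bibliography.
joined = []
for b in bib0.findall("book"):
    for r in review0.findall("book"):
        if b.findtext("title") == r.findtext("title"):   # WHERE clause
            joined.append((
                b.findtext("title"),
                [a.text for a in b.findall("author")],
                r.findtext("review"),
            ))

for row in joined:
    print(row)
```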

48
Restructuring
    FOR $a IN distinct-value($bib0/book/author/data())
    RETURN
      <biblio>
        <author>{ $a }</author>
        { FOR $b IN $bib0/book, $a2 IN $b/author/data()
          WHERE $a = $a2
          RETURN $b/title }
      </biblio>

49
Restructuring (Cont.)
    => <biblio>
         <author>Abiteboul</author>
         <title>Data on the Web</title>
       </biblio>,
       <biblio>
         <author>Buneman</author>
         <title>Data on the Web</title>
       </biblio>,
       <biblio>
         <author>Suciu</author>
         <title>Data on the Web</title>
         <title>XML Query</title>
       </biblio>,
       <biblio>
         <author>Fernandez</author>
         <title>XML Query</title>
       </biblio>
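
Restructuring inverts the hierarchy: instead of books containing authors, each distinct author gets a list of titles. A Python sketch of the same regrouping follows, again with `xml.etree.ElementTree` and tuples in place of constructed `<biblio>` elements; keeping authors in first-seen order is an assumption made here to match the result shown on the slide.

```python
import xml.etree.ElementTree as ET

bib0 = ET.fromstring("""
<bib>
  <book year="1999" isbn="1-55860-622-X">
    <title>Data on the Web</title>
    <author>Abiteboul</author><author>Buneman</author><author>Suciu</author>
  </book>
  <book year="2001" isbn="1-XXXXX-YYY-Z">
    <title>XML Query</title>
    <author>Fernandez</author><author>Suciu</author>
  </book>
</bib>
""")

# Distinct author names, kept in order of first appearance.
distinct_authors = list(dict.fromkeys(a.text for a in bib0.findall("book/author")))

# For each author, the inner query collects the titles of that author's books.
biblio = []
for name in distinct_authors:
    titles = [
        book.findtext("title")
        for book in bib0.findall("book")
        if name in [a.text for a in book.findall("author")]
    ]
    biblio.append((name, titles))

for entry in biblio:
    print(entry)
```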

50
Details (cont.)
  • No Schema Required
  • XML Query should be usable on XML data when there
    is no schema (DTD or XML Schema) known in
    advance. But it should be able to exploit the
    schema if the schema is available.
  • Preserve Order and Association
  • XML Query should preserve order and association
    of elements in XML data (why?)

51
Details (cont.)
  • Mutually Embedding with XML
  • An XML Query should be able to contain arbitrary
    XML data, and an XML document should be able to
    hold arbitrary XML Queries
  • Support for New Datatypes
  • XML Query should have an extension mechanism for
    conditions and operations specific to
    particular datatypes (e.g. multimedia data).

52
Details (cont.)
  • Suitable for Metadata
  • XML Query should be useful as a part of metadata
    descriptions (how?)
  • Question how about metadata in relational
    database?
  • The current version MUST not preclude the ability
    to add update capabilities in future versions

53
Your Idea?
  • Any other characteristics you desire?

54
XQuery Language
  • Overview
  • XPath
  • XQuery 1.0 Semantics
  • Future work for XQuery

55
Overview
  • Combines the best features of XPath and SQL with ideas
    borrowed from object query languages.

56
XPath
  • Language for navigating within tree-structured
    documents
  • XPath data model
  • An XML document is a tree of nodes:
  • Element
  • Attribute
  • Text
  • Comment

57
Navigation in XPath
  • Operators
  • Root: /
  • Parent: ..
  • Child (descendant): / or //
  • Attribute value: @
  • Comment: comment() function
  • Text: text() function
  • Element: <element name>
  • Wildcards
  • * : all element children of a node irrespective of type,
    not including text nodes
  • @* : all attributes
  • // : all descendants of the current node

58
XPath expression
  • Combination of XPath operators
  • Input a document tree
  • Output a set of nodes
  • Absolute path expression
  • start from the root node
  • Relative path expression
  • start from the current node

59
XPath query
  • Selection conditions
  • Built-in functions
  • Aggregate functions

60
Example XML file
    <students>
      <student studid="996341111">
        <name><first>John</first><last>Doe</last></name>
        <status>U2</status>
        <crstaken crscode="CS503" semester="S2002"/>
        <crstaken crscode="CS561" semester="S2002"/>
      </student>
      <student studid="996342222">
        <name><first>Bart</first><last>Simpson</last></name>
        <status>U4</status>
        <crstaken crscode="CS504" semester="S2002"/>
      </student>
    </students>

61
XPath Document Tree
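[Figure: document tree of the example file. The root has two comment nodes and the students element; each student has a studid attribute and name, status, and crstaken children; name has first and last children (John, Doe); status holds U2; each crstaken has crscode and semester attributes.]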
62
Example XPath Query
  • //student[status="U2" and starts-with(.//last,
    "D") and not(.//last = .//first)]
  • //student[count(crstaken) > 5]
  • //student[crstaken/@crscode="CS561"]
    [crstaken/@semester="S2002"]
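
To check what these queries select from the example file, here is a Python sketch. `xml.etree.ElementTree` implements only a subset of XPath (no `count()` or `starts-with()`), so the predicates are written as ordinary Python conditions; the reading of the three queries (status U2 with a last name starting with "D", more than five courses, and CS561 together with semester S2002) is an interpretation of the garbled slide text.

```python
import xml.etree.ElementTree as ET

STUDENTS_XML = """
<students>
  <student studid="996341111">
    <name><first>John</first><last>Doe</last></name>
    <status>U2</status>
    <crstaken crscode="CS503" semester="S2002"/>
    <crstaken crscode="CS561" semester="S2002"/>
  </student>
  <student studid="996342222">
    <name><first>Bart</first><last>Simpson</last></name>
    <status>U4</status>
    <crstaken crscode="CS504" semester="S2002"/>
  </student>
</students>
"""

root = ET.fromstring(STUDENTS_XML)
students = root.findall("student")


def ids(selected):
    return [s.get("studid") for s in selected]


# Query 1: status U2, last name starts with "D", last name differs from first name.
q1 = ids(
    s for s in students
    if s.findtext("status") == "U2"
    and s.findtext(".//last").startswith("D")
    and s.findtext(".//last") != s.findtext(".//first")
)

# Query 2: students with more than five crstaken children.
q2 = ids(s for s in students if len(s.findall("crstaken")) > 5)

# Query 3: some course with code CS561 and some course taken in S2002.
q3 = ids(
    s for s in students
    if any(c.get("crscode") == "CS561" for c in s.findall("crstaken"))
    and any(c.get("semester") == "S2002" for c in s.findall("crstaken"))
)

print(q1, q2, q3)
```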

63
Why is XPath not sufficient?
  • Designed just for navigation, so it supports only limited
    queries
  • Cannot express joins
  • Cannot work on multiple XML documents
  • Cannot filter unwanted elements
  • Does not support user-defined functions
  • Does not support importing and using the types
    defined in various XML schemas
  • Any other limitations you can think of?

64
A better candidate for XML Query?
  • XQuery Language incorporates all the above
    characteristics
  • Any other characteristics you can think of?
  • XQuery engine: Kweelt
  • http://kweelt.sourceforge.net/

65
XQuery expressions
  • Path expressions
  • FLWR expressions
  • Element constructors
  • Expressions involving operators and functions
  • Conditional expressions
  • Quantified expressions
  • List constructors
  • Expressions that test or modify datatypes

66
XQuery FLWR Expressions
  • A FLWR expression binds variables to expressions, applies
    a predicate, and constructs a new result.
  • FOR $var IN expr ...
  • LET $var := expr ...
  • WHERE expr ...
  • RETURN expr ...

FOR and LET clauses generate a list of tuples of
bound variables, preserving document order.
The WHERE clause applies a predicate, eliminating
some of the tuples.
The RETURN clause is executed for each surviving
tuple, generating an ordered list of outputs.
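
The evaluation order described above (generate tuples, bind, filter, construct) can be sketched as a small Python function. This is a conceptual model only: the function `flwr` and its parameters are hypothetical, the FOR sources are independent lists here (in XQuery a later source may depend on an earlier variable), and numbers stand in for XML nodes.

```python
from itertools import product


def flwr(for_clauses, let_clauses, where, ret):
    """Evaluate FOR / LET / WHERE / RETURN over plain Python values."""
    results = []
    names = list(for_clauses)
    # FOR: one tuple of bindings per combination, first variable outermost.
    for values in product(*for_clauses.values()):
        env = dict(zip(names, values))
        # LET: bind additional variables computed from the current tuple.
        for name, compute in let_clauses.items():
            env[name] = compute(env)
        # WHERE: drop tuples that fail the predicate.
        if where(env):
            # RETURN: construct one output per surviving tuple, in order.
            results.append(ret(env))
    return results


pairs = flwr(
    for_clauses={"x": [1, 2, 3], "y": [10, 20]},
    let_clauses={"s": lambda env: env["x"] + env["y"]},
    where=lambda env: env["s"] % 2 == 1,
    ret=lambda env: (env["x"], env["y"], env["s"]),
)
print(pairs)
```

Because the clauses only describe what each tuple must satisfy, an engine is free to evaluate them with a different strategy as long as the ordered result is the same.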
67
Example - DTD
    <!ELEMENT reviews (entry*)>
    <!ELEMENT entry (title, price, review)>
    <!ELEMENT title (#PCDATA)>
    <!ELEMENT price (#PCDATA)>
    <!ELEMENT review (#PCDATA)>

68
Example Sample Data
  • http://www.amazon.com/reviews.xml
    <reviews>
      <entry>
        <title>Data on the Web</title>
        <price>34.95</price>
        <review>
          a good discussion of database systems and XML.
        </review>
      </entry>
      <entry>
        <title>Advanced Unix Programming</title>
        <price>65.95</price>
        <review>
          a good discussion of UNIX programming.
        </review>
      </entry>
    </reviews>

69
Example - Request
  • For each book found at both www.bn.com and
    www.amazon.com, list the title of the book and
    its price from each source

70
Example - Query
    <books-with-prices>
    {
      for $b in document("www.bn.com/bib.xml")//book,
          $a in document("www.amazon.com/reviews.xml")//entry
      where $b/title = $a/title
      return
        <book-with-prices>
          { $b/title }
          <price-amazon>{ $a/price/data() }</price-amazon>
          <price-bn>{ $b/price/data() }</price-bn>
        </book-with-prices>
    }
    </books-with-prices>

71
Example - Result
    <books-with-prices>
      <book-with-prices>
        <title>Advanced Unix Programming</title>
        <price-amazon>65.95</price-amazon>
        <price-bn>65.95</price-bn>
      </book-with-prices>
      <book-with-prices>
        <title>Data on the Web</title>
        <price-amazon>34.95</price-amazon>
        <price-bn>39.95</price-bn>
      </book-with-prices>
    </books-with-prices>

72
Use Cases
  • Use Case 1: Queries that preserve hierarchy
  • Use Case 2: Access to relational data

73
Use Case 1: Queries that preserve hierarchy
  • XML document has flexible structure
  • Text is mixed with elements
  • Many elements are optional
  • Wide variation in structure from one document to
    another
  • The ways in which elements are ordered and nested
    are quite important (Can you give me an example?)

74
Use Case 1 - DTD
    <!DOCTYPE book [
      <!ELEMENT book (title, author+, section+)>
      <!ELEMENT title (#PCDATA)>
      <!ELEMENT author (#PCDATA)>
      <!ELEMENT section (title, (p | figure | section)*)>
      <!ATTLIST section
          id         ID    #IMPLIED
          difficulty CDATA #IMPLIED>
      <!ELEMENT p (#PCDATA)>
      <!ELEMENT figure (title, image)>
      <!ATTLIST figure
          width  CDATA #REQUIRED
          height CDATA #REQUIRED>
      <!ELEMENT image EMPTY>
      <!ATTLIST image
          source CDATA #REQUIRED>
    ]>

75
    <book>
      <title>Data on the Web</title>
      <author>Serge Abiteboul</author>
      <author>Peter Buneman</author>
      <section id="intro" difficulty="easy">
        <title>Introduction</title>
        <p>Text ... </p>
        <section>
          <title>Audience</title>
          <p>Text ... </p>
        </section>
        <section>
          <title>Web Data and the Two Cultures</title>
          <p>Text ... </p>
          <figure height="400" width="400">
            <title>Traditional client/server architecture</title>
            <image source="csarch.gif"/>
          </figure>
          <p>Text ... </p>

76
Use Case 1 - Request
  • List all the sections and their titles. Preserve
    the original attributes of each <section>
    element, if any.
  • Questions
  • Do we need all the elements?
  • How could we eliminate unwanted elements?
  • How could we preserve the original attributes?

77
Use Case 1 - Solution
    <toc>
    {
      let $b := document("book1.xml")
      return
        filter($b//section | $b//section/title |
               $b//section/title/data())
    }
    </toc>

78
Use Case 1 - Result
    <toc>
      <section id="intro" difficulty="easy">
        <title>Introduction</title>
        <section>
          <title>Audience</title>
        </section>
        <section>
          <title>Web Data and the Two Cultures</title>
        </section>
      </section>
      <section id="syntax" difficulty="medium">
        ...
      </section>
    </toc>
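
The effect of the filter in this use case (keep sections and their titles, drop everything else, preserve nesting and attributes) can be imitated with a short recursive function. The Python sketch below runs on an abbreviated copy of the book document; `filtered_children` is a hypothetical helper written for this illustration and is not how an XQuery engine implements filtering.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the sample book; paragraph text is elided as on the slide.
BOOK_XML = """
<book>
  <title>Data on the Web</title>
  <author>Serge Abiteboul</author>
  <section id="intro" difficulty="easy">
    <title>Introduction</title>
    <p>Text ... </p>
    <section>
      <title>Audience</title>
      <p>Text ... </p>
    </section>
  </section>
</book>
"""


def filtered_children(node):
    """Copy the sections and section titles below node, keeping their nesting."""
    kept = []
    for child in node:
        if child.tag == "section":
            copy = ET.Element("section", dict(child.attrib))  # attributes preserved
            copy.extend(filtered_children(child))
            kept.append(copy)
        elif child.tag == "title" and node.tag == "section":
            copy = ET.Element("title")
            copy.text = child.text
            kept.append(copy)
        else:
            # Unselected node: skip it, but keep looking below it.
            kept.extend(filtered_children(child))
    return kept


book = ET.fromstring(BOOK_XML)
toc = ET.Element("toc")
toc.extend(filtered_children(book))
print(ET.tostring(toc, encoding="unicode"))
```

The book's own title is dropped because it is not the title of a section, while the nested Audience section stays inside Introduction.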

79
Use Case 2 - Access to Relational Data
  • Questions
  • How to represent relational tables as XML
    document?
  • Do we need multiple XML documents?
  • How does XQuery work on multiple XML documents?

80
Use Case 2 - Access to Relational Data
  • Represent a database table as an XML document
  • Document element <-> table
  • Tuple <-> nested element
  • Column <-> nested element inside the tuple element
  • Columns that allow null values are represented by
    optional elements, and a missing element denotes
    a null value
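
The mapping rules above translate directly into code. Here is a minimal Python sketch using `xml.etree.ElementTree`; the function `table_to_xml` is hypothetical, the first row comes from the USERS sample data of this use case, and the second row (U99) is invented to show how a null rating is represented.

```python
import xml.etree.ElementTree as ET


def table_to_xml(table_name, tuple_name, columns, rows):
    """Map a table to a document element, each row to a nested tuple element."""
    root = ET.Element(table_name)
    for row in rows:
        tuple_el = ET.SubElement(root, tuple_name)
        for column, value in zip(columns, row):
            if value is None:
                continue  # NULL: the optional element is simply left out
            ET.SubElement(tuple_el, column).text = str(value)
    return root


users = table_to_xml(
    "users",
    "user_tuple",
    ["userid", "name", "rating"],
    [
        ("U01", "Tom Jones", "B"),
        ("U99", "Test User", None),  # hypothetical row with a NULL rating
    ],
)
xml_text = ET.tostring(users, encoding="unicode")
print(xml_text)
```

This is why the DTD for such a table marks nullable columns as optional, for example `rating?`.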

81
Use Case 2 - Online Auction
  • Tables
  • USERS (userid, name, rating)
  • Contains info on registered users
  • ITEMS (itemno, description, offered_by,
    start_date, end_date, reserve_price)
  • Lists items currently or recently for sale
  • BIDS (userid, itemno, bid, bid_date)
  • Contains all bids on record

82
Simplified E-R Diagram
[Figure: simplified E-R diagram relating USERS (userid) and ITEMS (itemno) through BIDS (userid, itemno).]
83
Use Case 2 - DTD
    <!DOCTYPE users [
      <!ELEMENT users (user_tuple*)>
      <!ELEMENT user_tuple (userid, name, rating?)>
      <!ELEMENT userid (#PCDATA)>
      <!ELEMENT name (#PCDATA)>
      <!ELEMENT rating (#PCDATA)>
    ]>

84
Use Case 2 - DTD
    <!DOCTYPE items [
      <!ELEMENT items (item_tuple*)>
      <!ELEMENT item_tuple (itemno, description,
                offered_by, start_date?, end_date?,
                reserve_price?)>
      <!ELEMENT itemno (#PCDATA)>
      <!ELEMENT description (#PCDATA)>
      <!ELEMENT offered_by (#PCDATA)>
      <!ELEMENT start_date (#PCDATA)>
      <!ELEMENT end_date (#PCDATA)>
      <!ELEMENT reserve_price (#PCDATA)>
    ]>

85
Use Case 2 - DTD
    <!DOCTYPE bids [
      <!ELEMENT bids (bid_tuple*)>
      <!ELEMENT bid_tuple (userid, itemno, bid,
                bid_date)>
      <!ELEMENT userid (#PCDATA)>
      <!ELEMENT itemno (#PCDATA)>
      <!ELEMENT bid (#PCDATA)>
      <!ELEMENT bid_date (#PCDATA)>
    ]>

86
Use Case 2 - Sample Data
  • USERS

    USERID  NAME         RATING
    U01     Tom Jones    B
    U02     Mary Doe     A
    U04     Roger Smith  C
    U05     Rip Sprat    B
87
Use Case 2 - Sample Data
  • ITEMS

    ITEMNO  DESCRIPTION  OFFERED_BY  START_DATE  END_DATE  RESERVE_PRICE
    1001    Red Bicycle  U01         01-01-05    01-01-20  40
    1002    Motorcycle   U02         01-02-11    01-03-15  500
    1003    Old Bicycle  U02         01-01-10    01-02-20  25
88
Use Case 2 Sample Data
  • BIDS

    USERID  ITEMNO  BID   BID_DATE
    U02     1001    35    01-01-07
    U04     1001    40    01-01-08
    U02     1001    45    01-01-11
    U04     1001    55    01-01-15
    U01     1002    400   01-02-14
    U02     1002    600   01-02-16
    U04     1002    1000  01-02-25
    U02     1002    1200  01-03-02
    U04     1003    15    01-01-22
    U05     1003    20    01-02-03
89
Use Case 2 - Request
  • Request
  • For all bicycles, list the item number,
    description, and highest bid (if any), ordered by
    item number.

90
Use Case 2 Solution
    <result>
    {
      for $i in document("items.xml")//item_tuple
      let $b := document("bids.xml")//bid_tuple[itemno = $i/itemno]
      where contains($i/description, "Bicycle")
      return
        <item_tuple>
          { $i/itemno }
          { $i/description }
          <high_bid>{ max($b/bid) }</high_bid>
        </item_tuple>
      sortby(itemno)
    }
    </result>

91
Use Case 2 Result (Bingo!)
    <result>
      <item_tuple>
        <itemno>1001</itemno>
        <description>Red Bicycle</description>
        <high_bid>
          <bid>55</bid>
        </high_bid>
      </item_tuple>
      <item_tuple>
        <itemno>1003</itemno>
        <description>Old Bicycle</description>
        <high_bid>
          <bid>20</bid>
        </high_bid>
      </item_tuple>
    </result>
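
As a cross-check of this result, the same request can be computed over the sample tables directly. The Python sketch below uses plain tuples copied from the ITEMS and BIDS sample data (only the columns the query needs) rather than XML documents, so it illustrates the logic of the query, not XQuery itself.

```python
# (itemno, description) from the ITEMS sample table
ITEMS = [
    (1001, "Red Bicycle"),
    (1002, "Motorcycle"),
    (1003, "Old Bicycle"),
]

# (userid, itemno, bid) from the BIDS sample table
BIDS = [
    ("U02", 1001, 35), ("U04", 1001, 40), ("U02", 1001, 45), ("U04", 1001, 55),
    ("U01", 1002, 400), ("U02", 1002, 600), ("U04", 1002, 1000), ("U02", 1002, 1200),
    ("U04", 1003, 15), ("U05", 1003, 20),
]

result = []
for itemno, description in sorted(ITEMS):            # ordered by item number
    if "Bicycle" not in description:                  # where contains(..., "Bicycle")
        continue
    # let $b := all bids on this item
    item_bids = [bid for _, bid_itemno, bid in BIDS if bid_itemno == itemno]
    high_bid = max(item_bids) if item_bids else None  # "highest bid (if any)"
    result.append((itemno, description, high_bid))

for row in result:
    print(row)
```

The motorcycle is excluded by the description test, and the two bicycles come out with high bids of 55 and 20, matching the slide.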

92
Future Work about XQuery
  • Add support for new desired characteristics
  • What are they?
  • Any other future work?

93
Bibliography
  • Chapter 17, XML and Web Data
  • XML Query Requirements
  • http://www.w3.org/TR/2001/WD-xmlquery-req-20010215
  • XML Query Use Cases, W3C Working Draft 20
    December 2001
  • http://www.w3.org/TR/2001/WD-xmlquery-use-cases-20011220
  • Database Desiderata for an XML Query Language,
    David Maier, Oregon Graduate Institute