XML queries and updates - PowerPoint PPT Presentation

Slides: 71
Provided by: dan133
Learn more at: https://dsf.berkeley.edu

1
XML queries and updates
  • Daniela Florescu

2
Outline
  • Introduction
  • XML Data Model and Type System
  • XML Queries
  • Technical discussions on XQuery design
  • XML Updates
  • Conclusions

3
What is XML?
  • The Extensible Markup Language (XML) is the
    universal format for structured documents and
    data on the Web.
  • Base specifications
  • XML 1.0, W3C Recommendation Feb '98
  • Namespaces, W3C Recommendation Jan '99

4
Simple XML Data Example
  • <book year="1967" xmlns:amz="www.amazon.com">
  • <title>The politics of experience</title>
  • <author>R.D. Laing</author>
  • <amz:ref amz:isbn="1341-1444-555"/>
  • <section>
  • The great and true Amphibian, whose
    nature is disposed to..
  • <title>Persons and experience</title>
    Even facts become...
  • </section>
  • </book>

5
The secrets of the XML success
  • XML is a data representation format
  • XML is universal
  • XML is human readable
  • XML is machine readable
  • XML is international
  • XML is platform independent
  • XML is vendor independent
  • XML is endorsed by the W3C
  • XML is not a new technology
  • XML is not only a data representation format

6
XML as a family of technologies
  • XML Information Set
  • XML Schema
  • XML Query
  • The Extensible Stylesheet Transformation Language
    (XSLT)
  • XML Forms
  • XML Protocol
  • XML Encryption
  • XML Signature
  • Others
  • almost all the pieces needed for a reasonably
    good Web Services puzzle

7
Major application domains for XML
  • Data exchange on the Web
  • e.g. Health Level Seven (HL7): http://www.hl7.org/
  • Application integration on the Web
  • e.g. ebXML: http://www.ebxml.org/
  • Document exchange on the Web
  • e.g. Encoded Archival Description Application:
    http://lcweb.loc.gov/ead/

8
The role of an XML query language
  • Why a query language for XML?
  • Preserve the logical/physical data independence
  • The semantics is described in terms of an
    abstract data model, independent of the physical
    data storage
  • Declarative programming
  • Such programs should describe the what, not the
    how
  • Why a native query language?
  • We need to deal with the specificities of XML
    (hierarchical, ordered, textual, potentially
    schema-less structure)

9
XML query languages: state of the art
  • Query languages for graph data
  • e.g. GOOD, GraphLog, Clean
  • Query languages/scripting languages for the Web
  • e.g. WebSQL, WebOQL, WebL
  • Query languages for semi-structured data
  • e.g. MSL, UnQL, StruQL, YATL

10
XML query languages: state of the art
  • Research query languages for XML
  • e.g. XML-QL, Lorel, XML-GL, Quilt, XDuce
  • Industry query languages for XML
  • e.g. XQL, OQL extensions to query SGML documents
  • W3C standard processing languages for XML
  • e.g. XPath, XSLT
  • Standard W3C XML query language: XQuery

11
W3C Query Working Group - History
  • Sept 1999: WG creation and first F2F
  • Currently 30 W3C member companies
  • Twelve F2F meetings and 80 telecons so far
  • Public WDs every three months
  • http://www.w3.org/XML/Query

12
W3C Query Working Group - Goal
  • "The goal of the XML Query WG is to produce
  • - an abstract data model for XML documents,
  • - a set of query operators on that data model,
  • - a query language based on these query
    operators"

13
XML: Many Environments
[Figure: diagram with the labels DOM, SAX, DBMS, XQuery, W3C XML Query Data Model, XML, Java, COBOL]

14
W3C XML Working Group - Status
  • June 2001: new or revised working drafts
  • XML Query Requirements
  • XML Query Use Cases
  • XQuery 1.0 and XPath 2.0 Data Model
  • XQuery 1.0 Formal Semantics
  • XQuery 1.0: An XML Query Language
  • XML Syntax for XQuery 1.0 (XQueryX)

15
General XML query requirements
  • Non-procedural, declarative query language
  • XML syntax for query language but also a human
    readable syntax
  • Protocol independent
  • Standard error conditions
  • Should not preclude updates

16
XML Query Use Cases
  • Use Case Organization
  • Description, DTD/Schema, Input Data, Queries and
    Results
  • Current Use Cases
  • "XMP" - Experiences and exemplars
  • "TREE" - Queries that preserve hierarchy
  • "SEQ" - Queries based on sequences
  • "R" - Access to relational data
  • "TEXT" - Full-text search
  • "NS" - Queries using namespaces
  • "PARTS" - Recursive computation
  • "REF" - Queries based on references

17
XML Abstract Data Model
  • Common to XPath 2.0 and XQuery 1.0
  • A logical model composed of
  • a set of logical entities
  • constructors and accessors for each entity
  • Based on the notion of an ordered tree
  • XML data cannot be modeled as a simple tree

18
XML Abstract Data Model Entities
  • Nodes
  • Node = Document | Element | Attribute | Text |
    Namespace | PI | Comment
  • Simple values (all XML Schema simple types)
  • string, boolean, ID, IDREF, decimal, QName, URI,
    ...
  • Sequences
  • Errors
  • Schema components

19
Document Nodes: Constructors and Accessors
  • Constructor
  • document-node:
  • URI x Sequence[1,*](Element | Text |
    PI | Comment)
  • -> DocumentNode
  • Accessors
  • base-uri: DocumentNode -> URI
  • children: DocumentNode ->
  • Sequence[1,*](ElementNode | TextNode | PI | Comment)
  • string-value: DocumentNode -> string

20
Attribute Nodes: Constructors and Accessors
  • Constructor
  • attribute-node: QName x string x
    SchemaComponent ->
  • AttributeNode
  • Accessors
  • name: AttributeNode -> QName
  • type: AttributeNode -> SchemaComponent
  • typed-value: AttributeNode ->
    Sequence(SimpleValue)
  • string-value: AttributeNode -> string
  • parent: AttributeNode -> Sequence[0,1](Node)

21
Sequences: Constructors and Accessors
  • Constructors
  • empty-sequence: () -> Sequence
  • append: Sequence x Sequence -> Sequence
  • Accessors
  • empty: Sequence -> boolean
  • head: Sequence -> UnitValue
  • tail: Sequence -> Sequence
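
The sequence constructors and accessors listed above can be sketched in Python. This is only an illustration: it assumes a sequence is a flat, immutable tuple, and the choice to raise ValueError for the head of an empty sequence is an assumption of this sketch, not something the slide specifies.

```python
from typing import Any, Tuple

Sequence = Tuple[Any, ...]  # assumption: a sequence is a flat tuple of items


def empty_sequence() -> Sequence:
    """Constructor: the sequence with no items."""
    return ()


def append(left: Sequence, right: Sequence) -> Sequence:
    """Constructor: concatenate two sequences; the result stays flat."""
    return left + right


def empty(seq: Sequence) -> bool:
    """Accessor: true when the sequence has no items."""
    return len(seq) == 0


def head(seq: Sequence) -> Any:
    """Accessor: first item (assumed to be an error on an empty sequence)."""
    if not seq:
        raise ValueError("head of an empty sequence")
    return seq[0]


def tail(seq: Sequence) -> Sequence:
    """Accessor: everything after the first item."""
    return seq[1:]
```

Because append concatenates rather than nests, a sequence built this way never contains another sequence, which matches the "lists of lists are flattened" behaviour mentioned later in the deck.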

22
XML data model - conclusion
  • Complete with respect to XML
  • Relatively simple design
  • ordered trees, node-labeled, with node identity
  • Semantics of the query language relies on the
    data model constructors and accessors
  • Relationship with the other W3C XML related
    standards
  • Clear mapping to/from the XML Infoset
  • Less clear relationship with Document Object
    Model (DOM)
  • Less clear relationship with the XML Schema and
    the type system

23
The XQuery type system
  • XQuery's original design had a powerful type
    system (based on XDuce)
  • The type system can
  • (1) statically detect errors in the queries
  • (2) infer the type of the result of valid
    queries
  • (3) ensure statically that the result of a given
    query is of a given (expected) type if the input
    dataset is guaranteed to be of a given type
  • Queries on types
  • Big debate: XML type system vs. XML Schema

24
XQuery in a nutshell
  • Functional language
  • A query is an expression
  • The result of the query is the result of the
    evaluation of the expression
  • Expressions are evaluated in a certain
    environment
  • Strongly typed
  • Every expression has a type
  • Statically typed
  • The type of the result of an expression can be
    detected statically
  • Formal semantics based on XML Abstract Data Model
  • Dual syntax: XML and non-XML
  • Influenced by SQL, OQL, XQL, XPath, Quilt

25
XQuery expressions
  • Constants and variables
  • expression1 operator expression2
  • function(expression1, ..., expressionN)
  • XPath expressions (for navigation)
  • FLWR expressions (for iteration)
  • SORTBY expressions
  • Quantified expressions
  • Conditional expressions
  • Type-related expressions
  • XML node constructors (elements, attributes, etc)
  • XQuery expressions can be nested with full
    generality!

26
First XML queries
  • 1+1
  • $x
  • $x/title
  • $x/price + 1
  • document("www.amazon.com/books.xml")

27
XQuery functions and operators
  • Arithmetic operators
  • +, -, *, div, =, !=, <, etc
  • Logical operators
  • and, not, or
  • Collection oriented operators
  • union, intersection, difference, empty, distinct,
    count, sum, avg, min, max, etc
  • Global topological order related operators
  • before, after, unordered
  • XML specific functions
  • document, name, string-value, typed-value, etc
  • Many open issues related to the semantics of
    these operators

28
XPath expressions
  • General syntax
  • expression / step
  • Two syntaxes: abbreviated or not
  • Step in the non-abbreviated syntax
  • axis::nodeTest
  • The axis controls the navigation direction in the tree
  • ancestor, ancestor-or-self, attribute, child,
    descendant, descendant-or-self, following,
    following-sibling, namespace, parent, preceding,
    preceding-sibling, self
  • Node test by
  • Name (e.g. publisher, myNS:publisher,
    *:publisher, myNS:*, *)
  • Type (e.g. node(), comment(), text() )

29
Examples of path expressions
  • document("bibliography.xml")/child::bib
  • $x/child::bib/child::book/attribute::year
  • $x/parent::*
  • $x/ancestor::*/descendant::comment()

30
Semantics of XPath expressions
  • Semantics of path expressions in XPath 1.0
  • (1) Ordered forests of nodes as input, ordered
    forests of nodes as output
  • (2) For each root node in the input forest,
    select the nodes in the same document that are
    reachable along the given axis
  • (3) Among those select and return the ones that
    satisfy the node test.
  • (4) No duplicates are allowed in the output
  • (5) Output nodes are ordered by the document
    order
  • (6) Nodes preserve their identity
  • No type error for book/nose
  • A list of lists is automatically flattened
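
The six semantic steps above can be sketched for a single path step using Python's standard xml.etree.ElementTree. This is a minimal illustration, not an XPath engine: the sample document, the helper names, and the restriction to the child and descendant axes with a name test are all assumptions of this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document used only for this sketch.
DOC = ET.fromstring(
    "<bib>"
    "<book><title>A</title><author>X</author></book>"
    "<book><title>B</title></book>"
    "</bib>"
)

# Document order: position of every element in a pre-order walk of the tree.
ORDER = {id(node): pos for pos, node in enumerate(DOC.iter())}


def axis_nodes(node, axis):
    """Step (2): nodes reachable from `node` along the given axis."""
    if axis == "child":
        return list(node)
    if axis == "descendant":
        return [n for n in node.iter() if n is not node]
    raise ValueError("axis not covered by this sketch: " + axis)


def step(inputs, axis, name):
    """Apply one path step to a list of input nodes."""
    seen, output = set(), []
    for node in inputs:
        for candidate in axis_nodes(node, axis):
            # Step (3): keep only nodes passing the name test ('*' matches all).
            if name != "*" and candidate.tag != name:
                continue
            # Step (4): no duplicates, decided by node identity.
            if id(candidate) not in seen:
                seen.add(id(candidate))
                output.append(candidate)
    # Step (5): document order. Step (6): the same node objects are returned.
    return sorted(output, key=lambda n: ORDER[id(n)])
```

Asking for a name that does not occur, as in book/nose, simply yields an empty result instead of an error, and the output is always one flat list.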

31
XPath abbreviated syntax (1)
  • Axis can be missing
  • By default the child axis
  • $x/child::person -> $x/person
  • Short-hands for common axes
  • Descendant-or-self
  • $x/descendant-or-self::comment() ->
    $x//comment()
  • Parent
  • $x/parent::* -> $x/..
  • Attribute
  • $x/attribute::year -> $x/@year
  • Self
  • $x/self::* -> $x/.

32
XPath abbreviated syntax (2)
  • Implicit root node
  • root/bib -> /bib
  • root -> /
  • Implicit current node (inside second-order
    functions)
  • self::*/title -> ./title
  • self::*/title -> title

33
Simple iteration expression
  • Syntax
  • for variable in expression1 return
    expression2
  • Example
  • for $x in document("bibliography.xml")/bib/book
  • return $x/title
  • Semantics
  • bind the variable to each root node of the forest
    returned by expression1
  • for each such binding evaluate expression2
  • concatenate the resulting forests
  • lists of lists are automatically flattened
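
The iteration semantics above (bind, evaluate, concatenate, flatten) can be sketched as a small Python helper. The helper name for_each and the sample bibliography are assumptions made for this illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data standing in for bibliography.xml.
BIB = ET.fromstring(
    "<bib>"
    "<book><title>T1</title><author>A</author><author>B</author></book>"
    "<book><title>T2</title><author>C</author></book>"
    "</bib>"
)


def for_each(sequence, body):
    """Sketch of: for $v in sequence return body($v), with flattening."""
    result = []
    for item in sequence:          # bind the variable to each item in turn
        value = body(item)         # evaluate expression2 under that binding
        if isinstance(value, list):
            result.extend(value)   # a list of lists is flattened
        else:
            result.append(value)
    return result


# for $x in document("bibliography.xml")/bib/book return $x/title
titles = for_each(BIB.findall("book"), lambda x: x.findall("title"))

# The same iteration returning $x/author: three authors in one flat list.
authors = for_each(BIB.findall("book"), lambda x: x.findall("author"))
```

Note that the per-book author lists are merged into one flat list, so the grouping by book is lost unless the query rebuilds it with constructors.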

34
Local variable declaration
  • Syntax
  • let variable := expression1 return
    expression2
  • Example
  • let $x := document("bibliography.xml")/bib/book
  • return count($x)
  • Semantics
  • bind the variable to the result of the
    expression1
  • add this binding to the current environment
  • evaluate expression2
  • remove the local variable from the environment.

35
Conditional expressions
  • Syntax
  • if ( expression1 ) then expression2 else
    expression3
  • Example
  • if ( book/@year < 1980 )
  • then "old book"
  • else "new book"
  • Semantics
  • If expression1 evaluates to true then return the
    result of the evaluation of expression2 else
    return the result of the evaluation of
    expression3.

36
FLWR expressions
  • Syntactic sugar that combines FOR, LET, IF
  • Example
  • for $x in //bib/book    /* like the FROM in SQL */
  • let $y := $x/author    /* no analog in SQL */
  • where $x/title = "The politics of experience"    /* like the WHERE in SQL */
  • return count($y)    /* like the SELECT in SQL */

37
FLWR expression semantics
  • FLWR expression
  • for $x in //bib/book
  • let $y := $x/author
  • where $x/title = "The politics of experience"
  • return count($y)
  • Semantically equivalent to
  • for $x in //bib/book
  • return (let $y := $x/author
  • return if ($x/title = "The politics of experience")
  • then count($y)
  • else ()
  • )
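
The equivalence shown on this slide can be checked on toy data in Python. This is a sketch under stated assumptions: books are plain dictionaries, the second book is invented for the example, and a Python list stands in for an XQuery sequence.

```python
# Hypothetical data: the first title comes from the slides, the second is made up.
BOOKS = [
    {"title": "The politics of experience", "authors": ["R.D. Laing"]},
    {"title": "Some other book", "authors": ["A", "B"]},
]
WANTED = "The politics of experience"


def flwr(books):
    """for / let / where / return written as one comprehension."""
    return [
        len(y)                          # return count($y)
        for x in books                  # for $x in //bib/book
        for y in [x["authors"]]         # let $y := $x/author
        if x["title"] == WANTED         # where $x/title = "..."
    ]


def desugared(books):
    """The same query using only for, let and if, as in the expansion."""
    result = []
    for x in books:                     # for $x in //bib/book return (
        y = x["authors"]                #   let $y := $x/author return
        part = [len(y)] if x["title"] == WANTED else []   # if ... else ()
        result.extend(part)             # empty sequences vanish on concatenation
    return result
```

The where clause becomes an if whose else branch is the empty sequence, so rejected bindings contribute nothing to the concatenated result.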

38
More FLWR expression examples
  • Selections
  • for $b in document("bib.xml")//book
  • where $b/publisher = "Springer Verlag" and
  • $b/@year = "1998"
  • return $b/title
  • Joins
  • for $b in document("bib.xml")//book,
  • $p in //publisher
  • where $b/publisher = $p/name
  • return $b/title, $p/address/title, $p/name

39
XPath filter predicates
  • Syntax
  • expression1 [ expression2 ]
  • [ ] is an overloaded operator
  • Filtering by predicate
  • //book[./author/firstname = "ronald"]
  • //book[@price < 25]
  • //book[count(author[@gender = "female"]) > 0]
  • Filtering by position
  • /book[3]
  • /book[3]/author[1]
  • /book[3]/author[1 to 2]
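
The overloading of the filter operator can be sketched in Python: a numeric result is treated as a position test, anything else as a truth test. The function filter_seq, the sample data, and modelling "1 to 2" as a range check on the position are assumptions of this sketch.

```python
def filter_seq(sequence, predicate):
    """Sketch of expression1[expression2].

    `predicate` is called with (item, position); positions start at 1.
    A numeric result is a position test, anything else is a truth test.
    """
    kept = []
    for position, item in enumerate(sequence, start=1):
        outcome = predicate(item, position)
        is_number = isinstance(outcome, (int, float)) and not isinstance(outcome, bool)
        if is_number:
            if outcome == position:      # filtering by position
                kept.append(item)
        elif outcome:                    # filtering by predicate
            kept.append(item)
    return kept


# Hypothetical books, listed in document order.
BOOKS = [
    {"title": "A", "price": 30},
    {"title": "B", "price": 20},
    {"title": "C", "price": 10},
]

cheap = filter_seq(BOOKS, lambda book, pos: book["price"] < 25)   # //book[@price < 25]
third = filter_seq(BOOKS, lambda book, pos: 3)                    # /book[3]
first_two = filter_seq(BOOKS, lambda book, pos: 1 <= pos <= 2)    # /book[1 to 2]
```

The same bracket syntax therefore behaves very differently depending on the type of the inner expression, which is the root of some of the problems discussed later in the deck.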

40
Quantified expressions
  • Syntax
  • some variable in expression1 satisfies
    expression2
  • every variable in expression1 satisfies
    expression2
  • Examples
  • some $x in //book satisfies $x/price > 200
  • //book[some $x in author satisfies
    $x/@gender = "female"]
  • for $x in //book
  • where every $y in $x/author satisfies
    $y/@gender = "female"
  • return $x/title
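
The three quantified examples above map closely onto Python's built-in any and all. The book records below are invented for illustration, with dictionaries standing in for XML nodes.

```python
# Hypothetical data: book C has no authors at all.
BOOKS = [
    {"title": "A", "price": 250,
     "authors": [{"gender": "female"}, {"gender": "male"}]},
    {"title": "B", "price": 40, "authors": [{"gender": "female"}]},
    {"title": "C", "price": 15, "authors": []},
]

# some $x in //book satisfies $x/price > 200
any_expensive = any(book["price"] > 200 for book in BOOKS)

# //book[some $x in author satisfies $x/@gender = "female"]
some_female = [b["title"] for b in BOOKS
               if any(a["gender"] == "female" for a in b["authors"])]

# for $x in //book
# where every $y in $x/author satisfies $y/@gender = "female"
# return $x/title
all_female = [b["title"] for b in BOOKS
              if all(a["gender"] == "female" for a in b["authors"])]
```

Python's all is true for an empty list, so the book without authors passes the "every" test; universal quantification over an empty sequence is usually defined the same way.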

41
SORTBY expressions
  • Syntax
  • expression0 SORTBY
  • ( expression1 [ASCENDING | DESCENDING]
    , ...,
  • expressionK [ASCENDING | DESCENDING] )
  • Examples
  • //book sortby ( @price )
  • //book[@year = 2001]/author sortby (lastname,
    firstname)
  • for $x in //book
  • where empty($x/author)
  • return $x
  • sortby (title)

42
Global document order queries
  • Syntax
  • expression1 ( before | after )
    expression2
  • Examples
  • //section before //section[title = "Persons and experiences"]
  • //paragraph after //section[name = "Introduction"]
  • before //paragraph[contains("XQuery")]

43
XQuery element constructors
  • Standard XML elements
  • <section title="Persons and experiences">
    This is a section of the book entitled <title>The
    politics of Experience</title> written by
    <author>Ronald Laing</author>. </section>
  • Dynamically constructed elements
  • <section title={ $s/title }>This is a section
    of the book entitled { $s/ancestor::book/title }
    written by { for $a in $s/ancestor::book/author
    return <author>{ concat($a/firstname,
    $a/lastname) }</author> }.</section>

44
Complex XQuery example
  • <bibliography>
  • { for $x in //book[@year = 2001]
  • return
  • <book title={ $x/title }>
  • { if (empty($x/author))
  • then
    $x/editor/affiliation
  • else $x/author }
  • </book> }
  • </bibliography>

45
XQuery operators on datatypes
  • INSTANCEOF
  • returns True if its first operand is an instance
    of the type named in its second operand
  • CAST
  • is used to convert a value from one datatype to
    another
  • TREAT
  • causes the query processor to treat an expression
    as though its datatype were a subtype of its
    static type
  • TYPESWITCH
  • branching based on the dynamic type of the input
    data

46
Dealing with node identity
  • All nodes in the data model have node identity
  • Node identity is preserved through queries
  • Two equality functions for nodes
  • Value based
  • Identity based

47
Local function declarations
  • Example
  • function number_paragraphs($x ns:section)
  • return xsd:integer {
  • count($x/paragraph) +
  • sum(for $y in $x/section
  • return number_paragraphs($y)) }
  • number_paragraphs(/bib/book[title = "The politics of experience"]/section[1])
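
The recursive function on this slide can be mirrored in Python with xml.etree.ElementTree. The nested sample section is invented for the example; only the function's shape follows the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical section with two levels of nested subsections.
SECTION = ET.fromstring(
    "<section>"
    "<paragraph/><paragraph/>"
    "<section><paragraph/>"
    "<section><paragraph/><paragraph/></section>"
    "</section>"
    "</section>"
)


def number_paragraphs(section):
    """Count paragraphs in a section and, recursively, in its subsections."""
    own = len(section.findall("paragraph"))          # count($x/paragraph)
    nested = sum(number_paragraphs(sub)              # sum(for $y in $x/section
                 for sub in section.findall("section"))   # return number_paragraphs($y))
    return own + nested
```

The recursion terminates here because every call descends one level in a finite tree; as the deck notes for recursive views, termination is not guaranteed for arbitrary recursive functions.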

48
Joins in XQuery
  • <books-with-prices>
  • { for $a in document("amazon.xml")/book,
  • $b in document("bn.xml")/book
  • where $b/isbn = $a/isbn
  • return
  • <book>
  • { $a/title }
  • <price-amazon>{ $a/price }</price-amazon>,
  • <price-bn>{ $b/price }</price-bn>
  • </book> }
  • </books-with-prices>

49
Left-outer joins in XQuery
  • <books-with-prices>
  • { for $a in document("amazon.xml")/book
  • return
  • <book>
  • { $a/title }
  • <price-amazon>{ $a/price }</price-amazon>,
  • { for $b in document("bn.xml")/book
  • where $b/isbn = $a/isbn
  • return <price-bn>{ $b/price }</price-bn> }
  • </book> }
  • </books-with-prices>

50
Full-outer joins in XQuery
  • let $allISBNs := distinct(document("amazon.xml")/book/isbn
    union
  • document("bn.xml")/book/isbn)
  • return
  • <books-with-prices>
  • { for $isbn in $allISBNs
  • return
  • <book>
  • { for $a in document("amazon.xml")/book[isbn = $isbn]
  • return <price-amazon>{ $a/price }</price-amazon> }
  • { for $b in document("bn.xml")/book[isbn = $isbn]
  • return <price-bn>{ $b/price }</price-bn> }
  • </book> }
  • </books-with-prices>
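
The three join shapes from the last three slides (inner, left-outer, full-outer) can be compared side by side on toy data. This Python sketch makes several assumptions: books are dictionaries, each ISBN occurs at most once per source, missing prices are shown as None, and the full-outer result is sorted by ISBN only to make the output deterministic.

```python
# Hypothetical catalogues standing in for amazon.xml and bn.xml.
AMAZON = [{"isbn": "1", "title": "A", "price": 10},
          {"isbn": "2", "title": "B", "price": 20}]
BN = [{"isbn": "2", "title": "B", "price": 18},
      {"isbn": "3", "title": "C", "price": 30}]


def inner_join(left, right):
    """Only ISBNs that occur in both sources."""
    return [(a["isbn"], a["price"], b["price"])
            for a in left for b in right if a["isbn"] == b["isbn"]]


def left_outer_join(left, right):
    """Every left book; the right price is None when there is no match."""
    rows = []
    for a in left:
        matches = [b["price"] for b in right if b["isbn"] == a["isbn"]]
        for price in matches or [None]:
            rows.append((a["isbn"], a["price"], price))
    return rows


def full_outer_join(left, right):
    """Every ISBN from either source, as in the distinct/union query."""
    isbns = sorted({book["isbn"] for book in left + right})
    rows = []
    for isbn in isbns:
        a_prices = [a["price"] for a in left if a["isbn"] == isbn] or [None]
        b_prices = [b["price"] for b in right if b["isbn"] == isbn] or [None]
        rows.append((isbn, a_prices[0], b_prices[0]))
    return rows
```

In the XQuery versions the outer joins fall out of nesting: an inner for that finds no match returns the empty sequence, so the enclosing book element is still constructed, just without that price.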

51
Group-by and Having
  • Example
  • for $a in distinct(//book/author/lastname)
  • let $books := //book[some $y in
    author/lastname satisfies $y = $a]
  • where count($books) > 10
  • return <result>
  • { $a/name } { $books[1 to 10] }
  • </result>

52
Views in XQuery
  • Views are supported in XQuery via functions
  • non-parameterized views via functions with no
    arguments
  • parameterized views via functions with at least
    one argument
  • XQuery supports recursive views
  • unrestricted form of recursion
  • Termination is not guaranteed automatically

53
Many open issues
  • Relationship with XPath
  • E.g. should we preserve the implicit casting
    operations of XPath 1.0?
  • Relationship with XML Schema
  • Bi-directional mapping between the XML Schema
    concepts and the XQuery type system concepts
  • Schema validation vs. type checking
  • Name-based subtyping vs. structural subtyping
  • Human readable (non-XML) syntax for types?
  • XQuery functions and operators built-in library
  • More sophisticated support for full text search
  • and many more

54
XQuery implementations
  • Microsoft
  • Software AG
  • Kweelt
  • Lucent
  • Univ. Darmstadt
  • HiFive.com
  • FatDog.com

55
XML query language summary
  • Expressive power
  • Major functionality of XML-QL, XQL, SQL, OQL -
    query the many kinds of data XML contains!
  • Use-case driven approach
  • Can be implemented in many environments
  • Traditional databases, XML repositories, XML
    programming libraries, etc.
  • Queries may combine data from many sources
  • Minimalist design
  • Small, easy to understand, clean semantics?
  • A quilt, not a camel

56
Conclusion
  • One language replaces DOM + XPath + XSLT
  • Expressive, concise, easy to learn?
  • Implementable, optimizable
  • Data integration for multiple sources
  • Several current implementations
  • Preliminary update proposal
  • Future work
  • Scripting language for XML
  • Workflow language for XML
  • For more information about the W3C XML Query
    Language WG activity please visit
  • W3C XML Query

57
Some of XQuery's debates

58
Procedural difficulties
  • Language designed by a committee
  • Hard to avoid the Camel
  • Strong interaction with other W3C WGs
  • Not too much coordination among the W3C WGs
  • No preexisting global vision or architecture
    (bottom up design, like the Web itself !)

59
Technical argument (1)
  • Problem 1: equality is neither transitive nor
    reflexive
  • $x = 3 and $x = 4 can evaluate to true
  • $x < 2 and $x > 4 can also evaluate to true
  • $x = 3 and $x != 3 can evaluate to true
  • $x = $y and $y = $z does not imply $x = $z
  • $x = $x can evaluate to false
  • Source
  • Equality (and all the other relational operators)
    has an implicit existential quantifier in XPath
    1.0
  • Nasty consequences
  • high probability of user errors and intense
    frustration
  • good old query evaluation algorithms don't work
    anymore
  • schema evolution is badly handled
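
Each of the surprising cases above follows directly from the existential reading of the comparison operators. The Python sketch below models a node-set as a list of numbers; the helper general_compare and the concrete values are assumptions chosen to reproduce each bullet.

```python
import operator


def general_compare(op, left, right):
    """True if SOME pair (l, r) drawn from the two sequences satisfies op."""
    return any(op(l, r) for l in left for r in right)


three_and_four = [3, 4]   # e.g. a node-set whose members have values 3 and 4
one_and_five = [1, 5]

# $x = 3 and $x = 4
both_equal = (general_compare(operator.eq, three_and_four, [3])
              and general_compare(operator.eq, three_and_four, [4]))

# $x < 2 and $x > 4
below_and_above = (general_compare(operator.lt, one_and_five, [2])
                   and general_compare(operator.gt, one_and_five, [4]))

# $x = 3 and $x != 3
equal_and_unequal = (general_compare(operator.eq, three_and_four, [3])
                     and general_compare(operator.ne, three_and_four, [3]))

# $x = $y and $y = $z does not imply $x = $z
x, y, z = [1], [1, 2], [2]
not_transitive = (general_compare(operator.eq, x, y)
                  and general_compare(operator.eq, y, z)
                  and not general_compare(operator.eq, x, z))

# $x = $x is false when $x is the empty sequence: there is no pair to compare
reflexive_on_empty = general_compare(operator.eq, [], [])
```

All of these disappear once both operands are guaranteed to be single values, which is why a schema change from one value to many can silently alter query results.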

60
Technical argument (2)
  • Problem 2: implicit data conversions
  • from an element to the element content
  • from an attribute to the attribute content
  • from a sequence to a value (by taking the first
    member)
  • from a typed value to string
  • from a string to a typed value
  • from any typed value to a Boolean
  • (e.g. from a node set to Boolean)
  • Examples of bad cases
  • //book[price] is not the same as //book[price + 0]
  • <book>{ @price }</book> is not the same as
    <book>{ @price + 0 }</book>

61
Technical argument (3)
  • Problem 2: implicit data conversions
  • Source
  • Backward compatibility with XPath 1.0
  • Dealing with the semi-structured aspect of the
    data
  • Trying to avoid static or dynamic errors as much
    as possible
  • Bad consequences
  • the result of the evaluation of an expression can
    depend on the context where the expression appears
  • high probability of user errors and intense
    frustration

62
Technical argument (4)
  • Problem 3: / is not a simple projection
  • (//book sortby @price)/title will be sorted by
    document order, not by price
  • Source
  • Backwards compatibility with XPath 1.0
  • Bad consequences
  • high probability of user errors and more
    frustration
  • / often requires materialization (for sorting
    and duplicate elimination)
  • difficult to parallelize and stream
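
The difference between a plain projection and a path step can be made concrete in Python. In this sketch the two-book document and both helper functions are assumptions; path_step imitates the "drop duplicates, then restore document order" behaviour described above.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the cheaper book comes second in document order.
BIB = ET.fromstring(
    "<bib>"
    "<book price='30'><title>First in document</title></book>"
    "<book price='10'><title>Second in document</title></book>"
    "</bib>"
)
ORDER = {id(node): pos for pos, node in enumerate(BIB.iter())}

# //book sortby @price
by_price = sorted(BIB.findall("book"), key=lambda b: float(b.get("price")))


def simple_projection(books):
    """What users tend to expect: keep the order of the input."""
    return [t.text for b in books for t in b.findall("title")]


def path_step(books):
    """What '/' does: collect, drop duplicates, re-sort in document order."""
    found = {id(t): t for b in books for t in b.findall("title")}
    ordered = sorted(found.values(), key=lambda t: ORDER[id(t)])
    return [t.text for t in ordered]
```

The re-sort in path_step needs the complete set of results before it can emit the first one, which illustrates why the step is hard to stream.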

63
You can help
  • Designing such a language is VERY hard!
  • Your opinion matters!
  • A year from now it will be too late
  • Please help review the specifications and send
    comments to
  • www-xml-query-comments@w3c.org

64
XML update language
  • Declarative update language
  • XML data model tree modification
  • E.g. nodes deletion, insertion, replacement
  • Metadata replacement
  • Built on top of the XML query language
  • Initial proposal from some of the XML Query WG
    members
  • Not an official working draft of the W3C !
  • Already supported by some XQuery implementations

65
XML update statements
  • Simple update statements
  • InsertStatement
  • DeleteStatement
  • RenameStatement
  • ReplaceStatement
  • MoveStatement
  • Complex update statements

66
INSERT statement
  • Syntax
  • insert expression1 ( into | after | before )
    expression2
  • Examples
  • insert <publisher>Morgan Kaufmann</publisher>
  • after //book[title = "The politics of experience"]/title
  • insert <comment>This is a great
    paragraph!</comment>
  • before //book[author/lastname = "Laing"]/section[1]/paragraph[2]

67
DELETE statement
  • Syntax
  • delete expression
  • Examples
  • delete //book[@price > 100]
  • delete //book[1]/section[1 to 3]/comment()
  • delete //comment()

68
RENAME statement
  • Syntax
  • rename expression as expression
  • Examples
  • rename //book as publication
  • rename //book/@price as amazon_price

69
REPLACE statement
  • Syntax
  • replace expression1 with expression2
  • Examples
  • replace //book[1]/title with <title>Some new
    title</title>
  • replace //book[1]/@price/data() with 25.50
  • replace //book[1]/@price/data() with
    //book[1]/@price/data() + 5

70
MOVE statement
  • Syntax
  • move expression1 ( before | after | into )
    expression2
  • Examples
  • move //book[1]/section[1]/paragraph[2] before
  • //book[1]/section[2]/paragraph[1]
  • move //book[1]/@price into //book[1]/publisher
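
The five simple update statements can be imitated as tree modifications with Python's xml.etree.ElementTree. This is a rough sketch, not the proposed update language: the sample book, the helper insert_after, and the final delete target are assumptions, and parents are passed explicitly because ElementTree nodes do not know their parent.

```python
import xml.etree.ElementTree as ET

# Hypothetical book used only for this sketch.
book = ET.fromstring(
    "<book price='20'>"
    "<title>The politics of experience</title>"
    "<section><paragraph>p1</paragraph><paragraph>p2</paragraph></section>"
    "<section><paragraph>p3</paragraph></section>"
    "</book>"
)


def insert_after(parent, anchor, new_node):
    """Place new_node immediately after anchor among parent's children."""
    parent.insert(list(parent).index(anchor) + 1, new_node)


# insert <publisher>Morgan Kaufmann</publisher> after //book/title
title = book.find("title")
publisher = ET.Element("publisher")
publisher.text = "Morgan Kaufmann"
insert_after(book, title, publisher)

# replace //book[1]/title with <title>Some new title</title>
new_title = ET.Element("title")
new_title.text = "Some new title"
book[list(book).index(title)] = new_title

# rename //book/@price as amazon_price
book.set("amazon_price", book.attrib.pop("price"))

# move //book[1]/section[1]/paragraph[2] before //book[1]/section[2]/paragraph[1]
sections = book.findall("section")
moved = sections[0].findall("paragraph")[1]
sections[0].remove(moved)
sections[1].insert(0, moved)

# delete //book[1]/section[1]  (target chosen for this example only)
book.remove(sections[0])
```

Each statement first evaluates a query to locate its target nodes and only then modifies the tree, which is why the update language is built on top of the query language.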