1
XML Programming Techniques
  • Daniela Florescu, Oracle; Donald Kossmann, ETH

2
Why this tutorial?
  • Has XML changed the way we build apps?
  • No! (just another layer; made things worse!)
  • Should XML change the way we build apps?
  • Yes! (our hypothesis)
  • So what are the options/tradeoffs?

3
Overview
  • Introduction
  • Applications and Architectures
  • Interfaces to existing languages (Java, .NET, ...)
  • XML APIs: SAX, DOM, StAX
  • Code generators: JAXB 2.0, XML Beans, SDO, EMF
  • Extensions to existing programming languages
  • JavaScript (ECMA), AJAX, PHP
  • SQL/XML
  • Microsoft's XLinq
  • Native XML programming languages
  • Domain-specific languages: BPEL
  • Pure XML type system: XQuery, XSLT, XQueryP
  • Research: Curl, XL, XDuce, Links, XQuery!, SIMKIN
  • Comparison of existing solutions

4
Killer Advantages of XML
  • Platform/vendor independent, international
    (UNICODE)
  • Human and machine readable
  • Serialization of data
  • Hype and people
  • Tools and human resources available
  • Standardization, secure investment
  • Family of technologies
  • XQuery, XML Schema, SOAP, WS-Security, ...
  • (all building blocks for SOA)
  • XML is not new!
  • Best of breed from OO, DB, documents, distributed
    systems, ...

5
Killer Advantages
  • Decouple Data from Application
  • Data lives longer than code (legacy problem)
  • Data first, schema later (pay as you go along)
  • Spectrum: unstructured to structured data
  • Potentially all data
  • Pay as you go along
  • Spectrum: data, metadata, code
  • Potentially all information
  • Avoid technology jungle: one size fits all

6
Some Problems of XML
  • Not complete: pieces of the puzzle are missing
  • RDF compatibility, programming, ...
  • Bottom-up standardization
  • Bottom-up product development
  • Too much fluff
  • Do you need processing instructions?
  • No references, no support for N:M relationships
  • No design methodology
  • ER / UML were not designed for XML
  • Some things are both good and bad
  • Lexical and binary representation of data
  • All data are context-sensitive (no cut & paste!)

7
Why is programming for XML different?
  • XML is not based on entities and relationships
  • XML decouples data from its interpretation
  • Data first, schema later
  • Spectrum: unstructured to structured data
  • Spectrum: data, metadata, code
  • Don't bury the killer advantages of XML in the
    programming language!

8
Typical XML Applications
  • Blogs: RSS, Atom
  • Why XML: platform-independent, serialization,
    structured-unstructured data
  • Unused potential: RSS as a building block of any
    streaming application
  • EAI: Web Services, REST
  • Why XML: family of standards, serialization,
    platform-independent, machine readable
  • Unused potential: performance, declarative
    programming, strong typing

9
Typical XML Applications (ctd.)
  • Office: OpenOffice, Microsoft Office
  • Why XML: structured-unstructured data, hype
  • Unused potential: ???
  • Scientific data
  • Why XML: data first/schema later, hype,
    structured-unstructured data
  • Unused potential: ???
  • Eclipse (XMI), configuration files
  • Why XML: XML is not new, human readable,
    data/code/metadata
  • Unused potential: data first/schema later

10
XML Architectures
[Figure: three layers labeled XML, Objects, SQL]
  • XML: another layer for communication and presentation
  • Leave everything else as before
  • XML makes things worse (another layer)
  • More marshalling, more logging, more complexity

11
XML Architectures
[Figure: components labeled SQL, XML, Objects]
  • Common runtime: ideally no marshalling
  • Exploit best of all worlds
  • Not clear how to do the cut
  • Example: Microsoft LINQ

12
XML Architectures
[Figure: XML]
  • XML used by different components at different
    layers for different purposes
  • Examples: Eclipse, PHP (most frameworks)

13
XML Architectures
[Figure: seven components, each labeled XML]
  • XML everywhere and nowhere
  • Example: WebLogic, WebSphere

14
XML Architectures
[Figure: XML]
  • XML everywhere
  • Only a little bit of native code
  • Jim Gray: extremist approach (ACM Queue)
  • Example: XQuery, XQueryP

15
What is right for me?
  • How deep does the XML go into architecture?
  • Wrap XML as an additional layer
  • How big is wrapper compared to rest of code?
  • Am I too lazy to learn a new language?
  • Cost to train people; how safe is that investment?
  • What tools support my SE process?
  • Do I have a methodology for the XML app?
  • What application? What computations?
  • What kind of XML data?
  • Persistent, data on the wire, typed, distributed,
    ...
  • What kind of XML data model?
  • Serialized XML, Infoset, PSVI, XDM, ...

16
What is right for me?
  • Optimizability, performance
  • Cost for data marshalling
  • Can I stream data? (no need to parse whole message)
  • Do things several times (e.g., logging, checking
    integrity)
  • Productivity of programmers
  • Technology jungle vs. one unified model
  • Optimization, logging, ... are all automatic:
    focus on application logic and not on mundane
    tasks
  • Static typing of programs
  • Programming style (declarative vs. imperative)
  • Standard compliance: W3C XML family
  • Other domain-specific goodies
  • Support for push / events, error handling,
    logging, asynchronous computation, ...
  • Exploits / exposes killer advantages of XML
  • XML syntax?

18
Overview of XML APIs
  • DOM
  • Any XML application; updates and navigational reads
  • SAX
  • Low-level XML processing; no updates, forward
    navigation only
  • StAX (JSR 173), XMLPullParser
  • Low level like SAX, but pull (instead of push)
  • TokenIterator (BEA XQuery processor)
  • Like JSR 173, but full support for XQuery data
    model
  • XQJ / JSR 225
  • Standard Java interface for XQuery results
  • Microsoft XmlReader streaming API
  • Microsoft's streaming XML interface
  • (Many more that I have omitted.)
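A minimal sketch of the push versus pull distinction, written in Python rather than Java for brevity. It uses the standard library's `xml.sax` (push) and `xml.dom.pulldom` (pull) as stand-ins for SAX and StAX; the sample document and the function names are illustrative assumptions, not part of the tutorial.

```python
import xml.sax
from xml.dom import pulldom

# Hypothetical sample data, used only for illustration.
DOC = (
    "<accounts>"
    "<account owner='D. Duck'>123.54</account>"
    "<account owner='M. Mouse'>10.00</account>"
    "</accounts>"
)


class OwnerHandler(xml.sax.ContentHandler):
    """Push style: the parser drives and calls back for every event."""

    def __init__(self):
        super().__init__()
        self.owners = []

    def startElement(self, name, attrs):
        if name == "account":
            self.owners.append(attrs.getValue("owner"))


def owners_push(text):
    handler = OwnerHandler()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.owners


def owners_pull(text):
    """Pull style: the application drives and asks for the next event."""
    owners = []
    for event, node in pulldom.parseString(text):
        if event == pulldom.START_ELEMENT and node.tagName == "account":
            owners.append(node.getAttribute("owner"))
    return owners
```

Both functions read the document forward only and never build the full tree; the difference is who controls the loop.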

19
Classification Criteria
  • Navigational access?
  • Random access (by node id)?
  • Decouple navigation from data reads?
  • Updates?
  • Infoset or XQuery Data Model?
  • Target programming language?
  • Target data consumer?

20
Decoupling
  • Idea
  • Methods to navigate through data (XML tree)
  • Methods to read properties at current position
    (node)
  • Example: DOM (tree-based model)
  • Navigation: firstChild, parentNode, nextSibling, ...
  • Properties: nodeName, getNamedItem, ...
  • (Updates: createElement, setNamedItem, ...)
  • Assessment
  • Good: read parts of document, integrate existing
    stores
  • Bad: materialize temp. query results,
    transformations
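The decoupling of navigation from property reads can be sketched with Python's `xml.dom.minidom`, which exposes the same DOM names the slide lists. The document and the helper name `describe_children` are assumptions made for this example.

```python
from xml.dom import minidom

# Hypothetical sample data, used only for illustration.
DOC = (
    "<BankAccount>"
    "<owner id='4711'>D. Duck</owner>"
    "<balance curr='EUR'>123.54</balance>"
    "</BankAccount>"
)


def describe_children(text):
    """Walk the children of the root and read properties at each stop."""
    doc = minidom.parseString(text)
    node = doc.documentElement.firstChild          # navigation
    result = []
    while node is not None:
        if node.nodeType == node.ELEMENT_NODE:
            name = node.nodeName                   # property read
            attrs = dict(node.attributes.items())  # property read
            parent = node.parentNode.nodeName      # navigation, then read
            result.append((name, attrs, parent))
        node = node.nextSibling                    # navigation
    return result
```

Moving the cursor and reading from it are separate calls, which is what makes it easy to read only part of a document.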

21
Non Decoupling
  • Idea
  • Combined navigation + read properties
  • Special methods for fast forward, reverse
    navigation
  • Example: TokenIterator (token stream)
  • Token getNext(), void skipToNextNode(), ...
  • Assessment
  • Good: fewer method calls, stream-based processing
  • Good: integration of data from multiple sources
  • Bad: difficult to wrap existing XML data sources
  • Bad: reverse navigation tricky, difficult
    programming model
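A rough Python sketch of the non-decoupled style. The class below is hypothetical and is not the BEA TokenIterator API; it only mirrors the two method names from the slide on top of the standard `xml.dom.pulldom` event stream, so that each call both advances and returns data.

```python
from xml.dom import pulldom


class TokenIterator:
    """Hypothetical token stream: one call navigates and reads at once."""

    def __init__(self, text):
        self._events = pulldom.parseString(text)
        self._depth = 0

    def get_next(self):
        """Advance one token; return (kind, value) or None at the end."""
        while True:
            item = self._events.getEvent()
            if item is None:
                return None
            event, node = item
            if event == pulldom.START_ELEMENT:
                self._depth += 1
                return ("start", node.tagName)
            if event == pulldom.END_ELEMENT:
                self._depth -= 1
                return ("end", node.tagName)
            if event == pulldom.CHARACTERS:
                return ("text", node.data)
            # Other events (document start/end, comments, ...) are skipped.

    def skip_to_next_node(self):
        """Fast forward past the rest of the element opened last."""
        target = self._depth - 1
        while self._depth > target:
            if self.get_next() is None:
                break


tokens = TokenIterator("<a><b><c>x</c></b><d>y</d></a>")
first = tokens.get_next()       # opens <a>
second = tokens.get_next()      # opens <b>
tokens.skip_to_next_node()      # drops everything inside <b>
third = tokens.get_next()       # opens <d>
```

There is no way back in this sketch, which illustrates why reverse navigation is listed as tricky.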

22
Classification of APIs

23
Summary: XML APIs
  • Good: programmers stay in their world
  • Bad: APIs are clumsy (not declarative)
  • Bad: no logical/physical data independence
  • Bad: APIs require data marshalling
  • Programming via XML APIs: extreme case
  • How deep XML goes into architecture
  • How lazy am I to learn a new language

25
Code Generators
  • Idea
  • Input: XML Schema (XSD)
  • Output: code in target language (mostly Java)
  • Examples
  • JAXB: XML <-> Java objects ((un-)marshalling)
  • Given XML and Java class, automatic translation
  • Many similar open source projects (e.g., Castor)
  • XML Beans: Java getters and setters for XML
  • Compiles Java interfaces based on XSD
  • Implements an XML store: XPath/XQuery access
  • Open source, but owned by BEA
  • SDO, EMF (see next slides)

26
Eclipse Modeling Framework (EMF)
  • Background: Model Driven Architecture
  • Idea: compile (Java) code from model
  • EMF supports the following models
  • UML 2.0 diagrams (e.g., IBM Rational Rose)
  • XMI (XML Metadata Interchange)
  • Annotated Java
  • XML Schema (but restricted!)
  • Reference: http://www.eclipse.org/emf

27
EMF ECore and EObject
  • ECore is a metamodel
  • Model to describe models
  • All models (UML, etc.) are described with ECore
  • Analogue: XML Schema
  • EObject is a model to represent instances
  • All instances (Java objects) implement EObject
  • Analogue: XML instance

28
XML Schema vs. ECore
[Figure: XML Schema and ECore; relation "describes"; files XMLSchema.xsd, ECore.ecore, ECore.xsd, XMLSchema.ecore]

29
EMF from UML Example
  • UML 2.0 class diagram
  • Generated Java code
  • public interface BankAccount extends EObject {
  • String getOwner();
  • void setOwner(String value);
  • double getBalance();
  • void setBalance(double value);
  • }
  • Generated code is annotated: can be manually
    extended, regenerated
  • Generates interfaces + implementation (i.e.,
    class)
  • Very big community (!)

30
EMF from XSD
  • <xsd:schema targetNamespace="..."
  • xmlns:xsd="...">
  • <xsd:complexType name="BankAccount">
  • <xsd:sequence>
  • <xsd:element name="owner" type="xsd:string"/>
  • <xsd:element name="balance" type="xsd:double"/>
  • </xsd:sequence>
  • </xsd:complexType>
  • </xsd:schema>
  • Creates the same Java (interface + class)
  • Works for simple cases
  • Does not work for complex XML Schemas
  • Generated Java not always equivalent to XML Schema
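To make the binding idea concrete, here is a hand-written Python analogue of the typed accessor class that a code generator would derive from the BankAccount schema. It is a sketch only: no generator is involved, and the `from_xml` / `to_xml` method names are assumptions chosen for this example.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class BankAccount:
    """Typed view of a <BankAccount> element (owner: string, balance: double)."""

    owner: str
    balance: float

    @classmethod
    def from_xml(cls, text):
        """Unmarshal: XML text to a typed object."""
        root = ET.fromstring(text)
        return cls(
            owner=root.findtext("owner"),
            balance=float(root.findtext("balance")),
        )

    def to_xml(self):
        """Marshal: typed object back to XML text."""
        root = ET.Element("BankAccount")
        ET.SubElement(root, "owner").text = self.owner
        ET.SubElement(root, "balance").text = repr(self.balance)
        return ET.tostring(root, encoding="unicode")


account = BankAccount.from_xml(
    "<BankAccount><owner>D. Duck</owner><balance>123.54</balance></BankAccount>"
)
```

The application sees only `owner` and `balance`; anything the schema allows but the class does not model is lost, which is one reason generated code is not always equivalent to the schema.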

31
Summary: EMF
  • Very popular in MDA community
  • If you believe in MDA, here you go
  • Technical advantages
  • References are part of ECore (fixes XML bug)
  • ECore shares some of the XML advantages
  • EObjects are strongly typed
  • Technical disadvantages (common to all code generators)
  • Does not support whole XML Schema
  • Does not support declarative programming
  • Optimizability à la DB is not likely to happen
  • Platform: Java + Eclipse
  • If you hate Microsoft, here you go
  • Code generators vs. XML APIs (productivity)
  • Schema-based static typing, data independence

32
SDO, ADO.NET
  • SDO = Service Data Objects (J2EE platform)
  • BEA, IBM, Oracle et al.
  • ADO = ActiveX Data Objects (.NET platform)
  • Microsoft
  • Uniform access to data from different sources
  • In particular: XML, Web sources
  • Java or C# interface to access any kind of data
  • Protocol for disconnected client/server access
  • Client propagates change lists to server
  • Implementation: IBM's SDO on top of EMF
  • Conceived by IBM as an extension of EMF
  • With respect to XML binding, similar tradeoffs as EMF

34
ECMAScript, JavaScript, JScript
  • History
  • Started 1995: Sun and Netscape
  • March 1996: Netscape Navigator 2.0
  • August 1996: Microsoft IE 3.0 (JScript)
  • June 1997, 1998: first standards (ECMAScript)
  • Dec. 1999: ECMA-262 (current version): regular
    expressions, formatting, try/catch, ...
  • June 2004, Dec. 2005: E4X (ECMAScript for XML)
  • Purpose
  • Enliven Web pages (dynamic Web-based GUIs)
  • Scripting language for experts and users
  • http://www.ecma-international.org

35
ECMAScript Overview
  • Object-based language (not fully OO)
  • Objects have properties (e.g., name, balance)
  • Properties contain objects, primitives, methods
  • Primitives: e.g., Boolean, String, null
  • Properties have attributes (e.g., ReadOnly)
  • Objects are created through constructors
  • Constructors use prototypes
  • Built-in objects: Object, Array, Function, ...
  • Example objects: pop-up, menu, text field, ...
  • Event-based language (there is no main)
  • Attach code to events (mouse, errors, aborts, ...)
  • Syntax resembles Java, C, Self

36
E4X (ECMA-357)
  • Simplify access and manipulation of XML
  • DOM perceived as too clumsy
  • XML is a primitive (like String, Boolean, ...)
  • var x = new XML();
  • x = <BankAccount> <owner id="4711">D. Duck</owner>
  • <balance curr="EUR">123.54</balance>
  • </BankAccount>;

37
E4X
  • Access to elements
  • Child access: .
  • x.balance
  • Attribute axis: .@
  • x.balance.@curr
  • Iteration
  • var total = 0;
  • for each (x in allBankAccounts.BankAccount)
  • total += x.balance;
  • Updates
  • Delete nodes
  • delete x.comment
  • Insert nodes
  • x.comment += <comment>blabla</comment>
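For readers without an E4X engine at hand, the same access, iteration, and update steps can be approximated with Python's `xml.etree.ElementTree`. This is an analogue under assumed sample data, not E4X itself; the comments name the E4X construct each line corresponds to.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, used only for illustration.
all_bank_accounts = ET.fromstring(
    "<allBankAccounts>"
    "<BankAccount><owner id='4711'>D. Duck</owner>"
    "<balance curr='EUR'>123.54</balance><comment>old</comment></BankAccount>"
    "<BankAccount><owner id='4712'>M. Mouse</owner>"
    "<balance curr='EUR'>10.00</balance></BankAccount>"
    "</allBankAccounts>"
)

x = all_bank_accounts[0]
balance = x.find("balance")                  # E4X: x.balance
currency = balance.get("curr")               # E4X: x.balance.@curr

total = 0.0
for account in all_bank_accounts.findall("BankAccount"):
    total += float(account.findtext("balance"))   # E4X: total += x.balance

x.remove(x.find("comment"))                  # E4X: delete x.comment
ET.SubElement(x, "comment").text = "blabla"  # E4X: x.comment += <comment>...
```

The API version needs explicit conversions and lookups at every step, which is the clumsiness E4X set out to remove.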

38
AJAX: Asynchronous JavaScript and XML
  • Goal: fine-grained interaction between Web
    browser and Web server
  • Faster, more interactive, user-friendly Web GUI
  • Web GUI should be as powerful as desktop GUI
  • Idea: exploit JavaScript, HTTP, and XML
  • JavaScript has methods to invoke HTTP requests
  • AJAX uses XML to ship data from/to server
  • Why so successful?
  • Nothing new: it is all there already
  • Just do it!

39
AJAX Example
  • HTML form
  • <form> Product:
  • <input type="text" id="pname"
    onkeyup="autoComp(this.value)"/>
  • </form>
  • JavaScript
  • function autoComp(str) {
  • var url = "www.myapp.com/pname.do?" + "p=" + str;
  • xmlHttp.open("GET", url, true);
  • xmlHttp.send(null); }

40
PHP
  • Compile first, execute later (interpreter)
  • Compiles into intermediate language
  • Executes opcodes (might contain a lot of
    functionality)
  • Dynamically typed language
  • Types include integer, float, boolean, string,
    array (hash), object, null
  • Type juggling is automatic at runtime based on
    context

41
PHP: Accessing XML
  • Treats XML values as if they were native PHP
    types
  • Takes advantage of the new Zend Engine II
    overloading API
  • Takes advantage of the dynamic nature of PHP
  • Uses the GNOME project's libxml2 library

42
Simple Access to XML

43
Proposal: XML Content Store
  • Goals
  • Process and manage XML data from many sources:
    web services, RSS feeds, messages, configuration
    files, user data
  • Create an API to abstract CRUD details
  • Results
  • Allow for rapid application design without
    worrying about tedious persistence details
  • Implementation example
  • API: PHP
  • Persistence layer: upcoming release of DB2,
    code-named Viper, with native XML support

44
Summary
  • JavaScript, AJAX, PHP are very popular
  • Essential building block of Web 2.0
  • Good: mature platforms, great community
  • Good: domain-specific goodies
  • E4X and PHP provide native support for XML
  • XML data type
  • Syntax to access and manipulate XML
  • E4X, PHP are not compatible with standards
  • They argue that this is a feature
  • Bad: but they do miss some of the XML advantages

46
Processing XML with SQL
  • Map XML data into tuples in a relational
    database, then use (a variant of) SQL
  • User-controlled shredding, then use classical SQL
  • Model-driven shredding (Florescu, Kossmann, 99)
  • Edge, binary approach + alternatives
  • Corresponds to generic APIs (e.g., DOM) for Java
  • Plus: very general, integrates well with
    relational data
  • Minus: poor performance
  • Schema-based shredding (Shanmugasundaram et al.
    99)
  • Map XML Schema / DTD to SQL DDL
  • Plus: integrates well with relational data
  • Minus: missing tools, complicated
  • Automatic shredding, then use SQL/XML
  • Plus: usability, logical/physical data
    independence
  • Minus: less user control
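A simplified sketch of model-driven shredding in the spirit of the edge approach: every parent-child edge of the XML tree becomes one row of a single generic table, regardless of the schema. The column layout, table name, and helper names are assumptions made for this example; SQLite in memory stands in for the relational database.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred_edges(text, conn):
    """Store each element as one row: (parent id, position, tag, own id, text)."""
    conn.execute(
        "CREATE TABLE edge (source INTEGER, ordinal INTEGER, "
        "name TEXT, target INTEGER, value TEXT)"
    )
    counter = 0

    def visit(elem, parent_id, ordinal):
        nonlocal counter
        counter += 1
        node_id = counter
        value = (elem.text or "").strip() or None
        conn.execute(
            "INSERT INTO edge VALUES (?, ?, ?, ?, ?)",
            (parent_id, ordinal, elem.tag, node_id, value),
        )
        for pos, child in enumerate(elem, start=1):
            visit(child, node_id, pos)

    visit(ET.fromstring(text), 0, 1)


def book_titles(conn):
    """The path book/title becomes a self-join on the edge table."""
    rows = conn.execute(
        "SELECT t.value FROM edge AS b JOIN edge AS t ON t.source = b.target "
        "WHERE b.name = 'book' AND t.name = 'title' ORDER BY b.ordinal"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
shred_edges(
    "<books><book><title>A</title></book><book><title>B</title></book></books>",
    conn,
)
```

Each additional path step costs one more self-join, which hints at why the slide lists poor performance as the downside of this very general mapping.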

47
History of SQL / XML
  • First edition: part of SQL:2003
  • Part 14 of the SQL standard
  • Pre-dates XQuery standard
  • Limited functionality: storage and publishing
  • Second edition: work in progress
  • More complete integration of XQuery + XQuery
    Data Model
  • Advanced query capabilities
  • Expected to be published in 2006

48
XML Type in SQL
  • A new type (like varchar, date, numeric)
  • SQL:2003: XML type restricted to
  • XML document or
  • XML element or
  • Sequence of XML elements
  • SQL / XML, 2nd edition
  • Full support of XQuery Data Model
  • XML(SEQUENCE), XML(ANY CONTENT), ...

49
Example (SQL:2003)
  • create table books(
  • title varchar(20),
  • authors XML)

No schema validation, no typing!

50
XML View on Relational Data
[Table: Phantasy-People]
SELECT XMLGEN('<Person id="{$Id}">{$Name}</Person>') AS Person
FROM Phantasy-People

51
XML View on XML Data
SELECT Title, XMLGEN('<pa>{$Authors[1]/text()}</pa>') AS PrimA
FROM MyAuthors

52
XMLAGG
[Table: SalesTable]
SELECT Product, XMLAGG(XMLELEMENT(NAME S, Sales)) AS AllSales
FROM SalesTable GROUP BY Product

53
SQL / XML 2nd Edition
  • XML data type will support the XQuery data model
  • XML(UNTYPED CONTENT): old XML Infoset model
  • XML(SEQUENCE): holds heterogeneous sequences
  • ... (other parameterized types: validated data
    possible! Non-well-formed XML data possible,
    too.)
  • Full XML Schema support and validation
  • XMLQuery() function
  • Create XML content using XQuery
  • XMLTable() function
  • Shred XML into relational data using XQuery
  • Mapping between SQL and the XQuery data model
  • XMLCAST between XML and SQL types

54
XMLExists
  • SELECT Title FROM books
  • WHERE
  • XMLEXISTS(Authors, //author et al.)
  • Explicit PASSING also possible (see XMLQuery)

55
XMLQuery expression
  • SQL expression: use in SELECT for constructing
    XML
  • select XMLQuery(
  • 'for $i in ./PurchaseOrder
  • where $i/PoNo = $j/val
  • return $i//Item'
  • passing p.pocol,
  • xmlelement("val", 2100) as "j"
  • returning content)
  • from purchaseorder p
  • <Item itemno="21"><Quantity>200</Quantity>..</Item>
  • <Item itemno="22"><Quantity>22</Quantity>..</Item>

pocol maps to the default (context) item
XMLElement value maps to $j

56
XMLTable construct
  • Used in FROM clause: translates XML into
    relational data
  • Splits up result into SQL columns; passing always
    BY REF
  • select items.pos, items.itemno, items.quantity
  • from purchaseorder p,
  • XMLTable('for $i in /PurchaseOrder//Items
  • where $i/Quantity > 200
  • return $i' passing p.pocol
  • columns pos for ordinality,
  • itemno number
    path 'ItemNo',
  • quantity number
    DEFAULT 0 path 'Quantity'
  • ) items
  • POS ITEMNO QUANTITY
  • ------ ----------- ------------
  • 1 21 21
  • 2 22 0

Relational columns are returned in the result
Ordinality returns the sequential position
The default value is used if the path does not
return a value

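What XMLTable does to one document can be imitated in a few lines of Python: iterate over the row-producing nodes, number them (FOR ORDINALITY), evaluate one path per column, and fall back to a default when a path yields nothing. The sample purchase order and the function name `xml_table` are assumptions for illustration, and the filter on quantity is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase order; the second item has no Quantity element.
PURCHASE_ORDER = (
    "<PurchaseOrder><Items>"
    "<Item><ItemNo>21</ItemNo><Quantity>21</Quantity></Item>"
    "<Item><ItemNo>22</ItemNo></Item>"
    "</Items></PurchaseOrder>"
)


def xml_table(text):
    """Return (pos, itemno, quantity) tuples, one per Item element."""
    rows = []
    for pos, item in enumerate(ET.fromstring(text).iter("Item"), start=1):
        itemno = int(item.findtext("ItemNo"))                    # path 'ItemNo'
        quantity = int(item.findtext("Quantity", default="0"))   # DEFAULT 0
        rows.append((pos, itemno, quantity))
    return rows
```

In the database the same mapping is declared once in the FROM clause and can then be joined, filtered, and optimized like any other table.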
57
SQL/XML
  • Good
  • Takes advantage of the entire SQL infrastructure
    (e.g. triggers, PL/SQL)
  • Transactional support
  • Scalability, clustering, reliability
  • Global optimization (XML and relational)
  • Standard implemented and supported by Microsoft,
    Oracle, IBM, DataDirect, etc
  • Bad
  • Requires data to be loaded in the database
  • not good for temporary XML data
  • not worth the effort for small volumes of data
  • Database: complex component, hard to fit in an
    architectural diagram
  • Blend of the two languages (SQL, XQuery) isn't
    natural or easy to use
  • XQuery not supported entirely by database engines
  • No XML updates à la XQuery yet

59
XLinq in .NET
  • http://msdn.microsoft.com/data/linq/

[Figure: LINQ stack. Languages: C#, Visual Basic; .NET Common Language Integration; Standard Query Operators; DLinq: declarative access to persistent relational data; XLinq: declarative access to transient XML data]

60
XLinq main concepts
  • XML type added as a basic type (C#, VB)
  • Infoset, no typed data
  • No support for the XML Data Model (XDM)
  • Temporary, not persistent XML data
  • Library of basic XML manipulation functions (e.g.
    navigation, construction)
  • Basic .NET Standard Query Operators
  • Collection-oriented set of operations
  • Second order
  • General, not XML specific
  • High level syntax similar to SELECT-FROM-WHERE
  • Natively integrated with the language, not
    through APIs
  • Goal: eliminate the need for DOM processing

61
.NET Standard Query Operators
  • Set of second order operators
  • similar to the relational algebra
  • Work on all ordered collections in .NET
  • In particular, they work on collections of XML
    elements
  • Build your own algebraic query execution plan by
    hand !

62
.NET Standard Query Operators
  • Where(selectFunction)
  • Items.Where(i => i.price < 100)
  • Select(mappingFunction)
  • Products.Select(p => new { p.name, p.price })
  • SelectMany(mappingFunction)
  • Customers.SelectMany(c => c.orders)
  • Take, Skip
  • Products.OrderByDescending(p => p.price).Take(3)
  • TakeWhile, SkipWhile(predicate)
  • Products.OrderByDescending(p =>
    p.price).TakeWhile(p => p.price < 100)

63
.NET Standard Query Operators
  • Join(outer, inner, outerKeySelection,
    innerKeySelection, resultSelector)
  • Customers.Join(orders, c => c.CustomerID, o =>
    o.CustomerID, (c, o) => new { c.name, o.Total })
  • GroupJoin(outer, inner, outerKeySelection,
    innerKeySelection, resultSelector)
  • Customers.GroupJoin(orders, c => c.CustomerID, o
    => o.CustomerID, (c, co) => new { c.name,
    Total = co.Sum(o => o.Total) })
  • OrderBy(comparisonFunct), ThenBy(comparisonFunct)
  • Collection.OrderBy().ThenBy().ThenBy()

64
.NET Standard Query Operators
  • GroupBy(collection, keySelector)
  • GroupBy(collection, equalityComparer)
  • Distinct, Union, Intersect, Except
  • Based on GetHashCode and Equals
  • ToDictionary(collection, keySelector)
  • Creates a one-to-one dictionary
  • ToLookup(collection, keySelector)
  • Creates a one-to-many dictionary
  • Any(collection, predicate), All(collection,
    predicate)
  • products.Any(p => p.price > 100)
  • Sum, Count, Min, Max, Average, Aggregate

65
Constructing XML data
  • C#, VB: (nested) functional notation
  • new XMLElement("person",
  • new XMLAttribute("age", 45),
  • new XMLElement("name", "Patrick Hines"),
  • new XMLElement("phone", "425-555-0144"))
  • VB 9.0: inlined XML with dynamic content
  • <contact>
  • <name><%= myName %></name>
  • </contact>

66
A more complex example
  • new XMLElement("contacts", contacts.
  • Where(c => c.address.city == "New York").
  • OrderBy(c => c.age).
  • Select(c => new XMLElement("contact",
  • new XMLElement("name",
    c.name),
  • new XMLElement("phone",
    c.phone))))

Linq works across data models (objects, tuples,
XML)

67
Navigation primitives in XLinq
  • Similar to the path axes in XPath 1.0
  • Nodes() retrieves all the children
  • Elements() retrieves all element children
  • Elements(name) selects child elements by name
  • Attributes()
  • Parent()
  • Descendants()
  • etc.

68
Updating primitives in XLinq
  • Add()
  • add new content to an existing XML tree
  • Remove()
  • Delete nodes from a tree
  • ReplaceContent()
  • Replaces the content of a node
  • SetElement()
  • Particular case of ReplaceContent
  • SetAttribute()

69
Declarative XML querying in XLinq
  • Select-From-Where style syntax directly supported
    in C# 3.0 (no API barrier)
  • Can be logically mapped into a combination of
    query operators (see above)
  • from c in contacts.Elements("contact"),
  • average = contacts.Elements("contact").
  • Average(x => (int)
    x.Element("netWorth"))
  • where (int) c.Element("netWorth") > average
  • orderby (string) c.Element("name")
  • select c
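The logic of this query (contacts whose net worth exceeds the average, ordered by name) can be traced in a short Python sketch over `xml.etree.ElementTree`. The sample contacts and the function name `above_average` are assumptions for illustration; the sketch shows the semantics, not the XLinq syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data.
CONTACTS = ET.fromstring(
    "<contacts>"
    "<contact><name>Zoe</name><netWorth>90</netWorth></contact>"
    "<contact><name>Al</name><netWorth>10</netWorth></contact>"
    "<contact><name>Bo</name><netWorth>80</netWorth></contact>"
    "</contacts>"
)


def above_average(root):
    """Names of contacts with net worth above the average, sorted by name."""
    people = root.findall("contact")                                  # from
    worths = [int(c.findtext("netWorth")) for c in people]
    average = sum(worths) / len(worths)                               # Average
    rich = [c for c in people if int(c.findtext("netWorth")) > average]  # where
    rich.sort(key=lambda c: c.findtext("name"))                       # orderby
    return [c.findtext("name") for c in rich]                         # select
```

Note the explicit casts from text to integer: with untyped (Infoset) data, both the XLinq query and this sketch must convert values by hand.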

70
Conclusion on XLinq
  • Good
  • Usability for .NET developers (simple tasks)
  • Integration with the rest of .NET's tools and
    libraries
  • Bad
  • No support for typed data
  • No static analysis
  • No schema-based static typing
  • No optimization based on static knowledge
  • Blend of imperative and declarative code is
    problematic
  • Semantics: lazy evaluation
  • Semantics: error handling
  • Semantics: imperative "and" and "or" are
    non-commutative
  • Optimization: global dataflow analysis is hard
  • Optimization: streaming and indexing are explicit

72
WS-BPEL
  • Web Service Business Process Execution Language
    (version 2.0)
  • OASIS, May 2006: working draft
  • Not a general purpose programming language
  • Designed for a specific task
  • Specification of the implementation of a Web
    Service created by the composition and
    orchestration of other Web Services
  • Created by logically merging two previous XML
    programming languages
  • WSFL (IBM)
  • XLANG (Microsoft)
  • Implemented by Microsoft, Oracle, IBM, SAP

73
WS-BPEL programs
[Figure: example order process; message labels: place order, ship order, pickup notification, receive invoice / send invoice response, payment confirmation, order confirmation]
Process steps:
  • receive place order
  • send ship order
  • if (shipCompleted) send order notice (completed)
    else send order notice (!completed)
  • receive update notification
  • update ship history
  • receive invoice
  • send invoice response
  • receive payment confirmation
  • send order confirmation

74
Main concepts
  • Traditional workflow concepts adapted to the
    reality of XML and Web Services
  • Ports, messages and operations (WSDL)
  • Describe the external interface of the process
  • Activities
  • Describe how various components are assembled
    into complex execution logic
  • Variables
  • Internal state of the program
  • Error and compensation handlers
  • Describe the behavior in case of dynamic faults
  • Correlation sets
  • To describe how various process instances
    participate in complex conversations
  • Scopes

75
WS-BPEL query and expression languages
  • XML data model, query language and expression
    language are black boxes for the main language
  • By default: Infoset (untyped data) and XPath 1.0
  • Uses XSLT 1.0 for data transformation
    (doXslTransform)
  • Allows other data models and languages
  • XDM (XQuery Data Model)
  • XPath 2.0
  • XQuery
  • <assign>
  • <copy>
  • <from> $po/lineItem[@prodCode=$myProd]/amt *
    $exchRate </from>
  • <to> $convertPO/lineItem[@prodCode=$myProd] </to>
  • </copy>
  • </assign>

76
WS-BPEL simple activities
  • assign and copy
  • invoke
  • receive
  • throw
  • wait
  • empty
  • exit
  • user defined activities (extensibility mechanism)

77
WS-BPEL structured activities
  • sequence
  • if
  • while
  • repeatUntil
  • pick
  • selectively choosing an activity
  • flow
  • for parallel and control dependency processing
  • forEach

78
WS-BPEL active behavior
  • Each scope can have event handlers
  • They execute concurrently
  • They start when the parent scope starts
  • OnEvent
  • Waiting for a particular type of message
  • OnAlarm
  • For (duration value), until (specific point in
    time)
  • repeatEvery

79
WS-BPEL error handling
  • Support for Long Running Transactions
  • Mechanism for specifying the compensation logic
    (sagas)
  • Compensation handlers associated with scopes

80
Compensation example
  • <scope>
  • <compensationHandler>
  • <invoke partnerLink="Seller" portType="Purchasing"
  • operation="CancelPurchase"
    inputVariable="getResponse"
  • outputVariable="getConfirmation">
  • <correlations>
  • <correlation set="PurchaseOrder"
    pattern="request"/>
  • </correlations>
  • </invoke>
  • </compensationHandler>
  • <invoke partnerLink="Seller" portType="Purchasing"
  • operation="Purchase"
    inputVariable="sendPurchaseOrder"
  • outputVariable="getResponse">
  • <correlations>
  • <correlation set="PurchaseOrder"
    pattern="request" initiate="yes"/>
  • </correlations>
  • </invoke>
  • </scope>
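The compensation idea (sagas) behind this scope can be sketched independently of BPEL: each completed step registers an undo action, and when a later step faults, the registered actions run in reverse order. The function and step names below are hypothetical and only echo the Purchase / CancelPurchase pair of the example.

```python
def run_with_compensation(steps):
    """Run (action, compensation) pairs; on a fault, undo completed steps.

    Returns True if every action succeeded, False if compensation ran.
    """
    completed = []
    try:
        for action, compensation in steps:
            action()
            completed.append(compensation)
    except Exception:
        for compensation in reversed(completed):
            compensation()
        return False
    return True


log = []


def purchase():
    log.append("Purchase")


def cancel_purchase():
    log.append("CancelPurchase")


def ship():
    raise RuntimeError("shipping failed")


def cancel_shipping():
    log.append("CancelShipping")


ok = run_with_compensation([(purchase, cancel_purchase), (ship, cancel_shipping)])
```

Only steps that actually completed are compensated, so the failed shipping step is never undone, while the earlier purchase is cancelled.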

81
WS-BPEL conclusion
  • Good
  • Easy specification of Web Services orchestration
  • High level
  • Useful constructs (parallelism, compensation,
    events, etc)
  • Bad
  • Separation between control flow and
    expression/query language
  • Impact on static typing, automatic optimization,
    usability

83
W3C XQuery, XPath, XSLT
[Figure: language family. 1999: XSLT 1.0 uses XPath 1.0 as a sublanguage. 2006: XPath 2.0 extends XPath 1.0 (almost backwards compatible); XSLT 2.0 uses XPath 2.0 as a sublanguage; XQuery 1.0 extends XPath 2.0 with FLWOR expressions, node constructors, validation]

84
XQuery 1.0 vs. XSLT 2.0
  • Equivalent expressive power
  • Same data model, type system, function library
  • Different programming paradigms
  • Iteration-based for XQuery
  • Recursive template-based for XSLT
  • Two different syntaxes for the same language
  • XQuery easier when shape of the data is known
  • XSLT easier to use when shape of the data is
    unknown
  • Implementations often use the same runtime for
    both
  • Oracle, Saxon
  • Better language integration in the future

[Figure: XQuery and XSLT 2.0 on top of XPath 2.0, the function library, the XML type system (XML Schema), and the XML Data Model (XDM)]

85
XML Data Model (XDM)
  • Abstract (i.e., logical) data model for XML data
  • Plays the same role for XPath 2.0, XQuery and
    XSLT 2.0 as the relational data model does for SQL
  • Purely logical: no standard storage or access
    model (on purpose)
  • XQuery, XPath 2.0 and XSLT 2.0 are closed with
    respect to XDM

Infoset or PSVI -> XML Data Model -> XQuery, XPath 2.0, XSLT 2.0
86
XML Data Model (XDM)
Remember Lisp ?
  • Instance of the data model
  • a sequence composed of zero or more items
  • The empty sequence is often treated as the null
    value
  • Items
  • nodes or atomic values
  • Nodes
  • document, element, attribute, text, namespace,
    processing instruction (PI), comment
  • Atomic values
  • Instances of all XML Schema atomic types
  • string, boolean, ID, IDREF, decimal, QName, URI,
    ...
  • untyped atomic values
  • Typed (i.e., schema validated) and untyped (i.e.,
    not schema validated) nodes and values
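
The "Remember Lisp?" hint deserves one caveat: unlike Lisp lists, XDM sequences are flat, and a single item is indistinguishable from a sequence of length one. A minimal Python sketch of that rule follows; the helper name `seq` is hypothetical and not part of any XQuery API.

```python
from typing import Any, List


def seq(*parts: Any) -> List[Any]:
    """Build a flat sequence: nested sequences are spliced in, never nested."""
    out: List[Any] = []
    for part in parts:
        if isinstance(part, list):
            out.extend(seq(*part))   # flatten recursively
        else:
            out.append(part)         # an item is appended as-is
    return out


empty = seq()                              # the empty sequence
flat = seq(1, seq(2, seq(3)), empty, "a")  # nesting disappears
single = seq(seq(42))                      # same as seq(42)
```

Concatenating with the empty sequence changes nothing, which is why it can stand in for a null value.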


87
XPath 2.0/XQuery/XSLT 2.0 type system
  • Types are imported from XML Schemas
  • Standard static typing for XQuery and XPath 2.0
  • Optional feature
  • Pessimistic/conservative
  • XSLT 2.0 has no standard static typing rules
  • Dynamic dispatch makes dataflow analysis very
    hard
  • The goals of the type system are to
  • statically detect errors in queries
  • infer the type of the result of valid queries
  • statically ensure that the result of a given
    query has a given (expected) type, provided the
    input dataset is guaranteed to have a given type

88
What is XQuery ?
  • A programming language that can express
    arbitrary XML to XML data transformations
  • Logical/physical data independence
  • Declarative
  • Side-effect free
  • Strongly typed language
  • An expression language for XML.
  • Such expressions are embeddable in a variety of
    environments (programming languages, APIs, etc)

89
XQuery vs. SQL
  • Both SQL and XQuery target persistent data, large
    volumes, transacted data and declarative
    processing
  • SQL works on the relational data model; XQuery
    works on the XML Data Model (XDM)
  • Is XQuery the XML replacement for SQL? No. XQuery
    is not a query language, but a declarative
    programming language.

90
XQuery programs
  • An XQuery program
  • a prolog + an expression
  • Role of the prolog
  • Populate the context in which the expression is
    compiled and evaluated
  • The prolog contains
  • namespace definitions
  • schema imports
  • default element and function namespaces
  • function definitions
  • collation declarations
  • function library imports
  • global and external variable definitions, etc.
  • The prolog is the link between the XQuery
    expression and the environment in which the
    expression is embedded

91
XQuery expressions
  • XQuery Expr ::= Constants | Variable |
    FunctionCalls | PathExpr |
  • ComparisonExpr | ArithmeticExpr | LogicExpr |
  • FLWORExpr | ConditionalExpr |
    QuantifiedExpr |
  • TypeSwitchExpr | InstanceofExpr | CastExpr |
  • UnionExpr | IntersectExceptExpr |
  • ConstructorExpr | ValidateExpr
  • Expressions can be nested with full generality!
  • Functional programming heritage.

92
Path expressions
  • document("bibliography.xml")/bib
  • $x/child::bib/child::book/@year
  • $x/parent::*
  • $x/child::*/descendant::comment()
  • $x/child::element(*, ns:PoType)
  • $x/attribute::attribute(*, xs:integer)
  • $x/ancestor::document-node(schema-element(ns:PO))
  • $x/(child::element(*, xs:date) |
    attribute::attribute(*, xs:date))
  • $x/f(.)
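
For readers without an XQuery engine at hand, the simplest of these paths can be approximated in Python. This is only a sketch: the standard library's ElementTree supports a small subset of XPath 1.0 (no type tests such as `element(*, xs:date)`), the sample document is invented for illustration, and `x` is bound to the `bib` element rather than to a document node.

```python
import xml.etree.ElementTree as ET

# Invented sample data, not taken from the slides.
BIB = """<bib>
  <book year="1994"><title>TCP/IP Illustrated</title></book>
  <book year="2000"><title>Data on the Web</title></book>
</bib>"""

x = ET.fromstring(BIB)  # x is the <bib> element

# Roughly $x/child::book/@year
years = [book.get("year") for book in x.findall("./book")]

# Roughly $x/descendant::title
titles = [t.text for t in x.findall(".//title")]

# Roughly $x/child::book[@year = "2000"]
recent = x.findall("./book[@year='2000']")
```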

93
FLWOR expressions
  • Similar to the Select-From-Where of SQL
  • Clauses: FOR, LET, WHERE, ORDER BY, RETURN
  • Example
  • for $x in //bib/book
    (: similar to FROM in SQL :)
  • let $y := $x/author
    (: no analogy in SQL :)
  • where $x/title = "The politics of experience"
    (: similar to WHERE in SQL :)
  • order by $x/year
    (: similar to the ORDER BY clause :)
  • return count($y)
    (: similar to SELECT in SQL :)
  • Syntax
  • (FOR $var IN expr | LET $var := expr)+
    (WHERE expr)? (ORDER BY expr)? RETURN expr

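
The clause-by-clause reading of the FLWOR example can be traced in ordinary Python. The sketch below is an analogue, not XQuery: the function name `author_counts` and the sample data are invented, and the tuple stream is modeled as a plain list.

```python
import xml.etree.ElementTree as ET

# Invented sample data.
DOC = ET.fromstring("""<bib>
  <book><title>B</title><year>2001</year><author>X</author><author>Y</author></book>
  <book><title>A</title><year>1999</year><author>Z</author></book>
  <book><title>B</title><year>1995</year><author>W</author></book>
</bib>""")


def author_counts(bib, wanted_title):
    tuples = []
    for x in bib.findall("./book"):                 # for $x in //bib/book
        y = x.findall("./author")                   # let $y := $x/author
        if x.findtext("title") == wanted_title:     # where $x/title = ...
            tuples.append((x, y))
    # order by $x/year
    tuples.sort(key=lambda t: int(t[0].findtext("year")))
    return [len(y) for _, y in tuples]              # return count($y)


result = author_counts(DOC, "B")
```

The binding of `y` has no SQL counterpart because it holds a whole sequence of author nodes per book, not a single column value.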
94
Node constructors
  • Constructing new nodes
  • Elements, attributes, documents, processing
    instructions, comments, text
  • Constant vs. Dynamically evaluated content
  • <result>
  • literal text content
  • </result>
  • <result>
  • { $x/name }
  • </result>
  • <result>
  • some content here { $x/text() } and some more here
  • </result>
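
The difference between constant and dynamically evaluated content can be sketched with Python's ElementTree. The function names and the sample element are hypothetical. One simplification to note: `x.text` only covers the text before the first child, whereas `$x/text()` selects every text node child.

```python
import copy
import xml.etree.ElementTree as ET


def result_with_names(x):
    """Rough analogue of <result>{ $x/name }</result>."""
    result = ET.Element("result")
    for name in x.findall("./name"):
        # Constructors copy the selected nodes; the source tree is untouched.
        result.append(copy.deepcopy(name))
    return result


def result_with_text(x):
    """Rough analogue of
    <result>some content here { $x/text() } and some more here</result>."""
    result = ET.Element("result")
    result.text = "some content here " + (x.text or "") + " and some more here"
    return result


x = ET.fromstring("<x>hello<name>Jim</name></x>")  # invented input
names_xml = ET.tostring(result_with_names(x), encoding="unicode")
text_xml = ET.tostring(result_with_text(x), encoding="unicode")
```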

95
Functions in XQuery
  • In-place XQuery functions
  • declare function ns:foo($x as xs:integer) as
    element()
  • { <a>{ $x + 1 }</a> };
  • Can be recursive and mutually recursive
  • Support for external functions
  • Support for library modules

XQuery functions play the role of database views
96
Dynamic dispatch in XSLT
  • The order in which templates are applied depends
    on the data
  • Very useful while dealing with irregular XML
    structures
  • <xsl:template match="/">
  • <axsl:stylesheet version="2.0">
  • <xsl:apply-templates/>
  • </axsl:stylesheet>
  • </xsl:template>
  • <xsl:template match="elements">
  • <axsl:template match="/">
  • <axsl:comment select="system-property('xsl:version')"/>
    <axsl:apply-templates/>
  • </axsl:template>
  • </xsl:template>
  • <xsl:template match="block">
  • <axsl:template match="{.}">
  • <fo:block> <axsl:apply-templates/> </fo:block>
  • </axsl:template>
  • </xsl:template>
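
The idea that the data, not the program, decides which template runs next can be sketched in a few lines of Python. This is a toy model under stated assumptions: rules are matched by element name only, the registry `TEMPLATES` and the decorator `template` are invented names, and the fallback imitates XSLT's built-in rule of recursing into children.

```python
import xml.etree.ElementTree as ET

TEMPLATES = {}  # element name -> rule function


def template(match):
    """Register a rule for elements named `match`."""
    def register(fn):
        TEMPLATES[match] = fn
        return fn
    return register


def apply_templates(node):
    """Walk text and children in document order, choosing a rule per child."""
    parts = [node.text or ""]
    for child in node:
        rule = TEMPLATES.get(child.tag, apply_templates)  # fallback: recurse
        parts.append(rule(child))
        parts.append(child.tail or "")
    return "".join(parts)


@template("em")
def em(node):
    return "*" + apply_templates(node) + "*"


@template("code")
def code(node):
    return "[" + apply_templates(node) + "]"


doc = ET.fromstring(
    "<p>Use <code>doc()</code> <em>very <code>carefully</code></em> here<x>!</x></p>"
)
text = apply_templates(doc)
```

No rule mentions how deeply `code` may be nested inside `em`; irregular input is handled because dispatch happens again at every node.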

97
XQuery/XPath 2.0 Full Text
  • XML data frequently contains text
  • XQuery/XPath 2.0 Full Text extension provides
    search capabilities
  • Use case example: RSS/blog filtering
  • FTSelections: a special kind of Boolean predicate
  • Operators
  • words, and, or, not, mild not, order, scope,
    distance, window, times
  • Match options
  • Case, diacritics, stemming, thesauri, stop words,
    language, wildcards
  • Scoring

98
XQuery Full Text Example
  • for $book in doc("http://bstore1.example.com/full-text.xml")/books/book
  • let $title := $book/metadata/title[. ftcontains
    "improving" && "usability" distance at most 2
    words ordered at start]
  • where count($title) > 0
  • return $title

99
XML Update facility
  • XML Update Facility: W3C Working Draft
  • Ability to modify nodes in an XDM instance in a
    declarative fashion
  • Primitive update operations
  • insert <age>24</age> into $person[name="Jim"]
  • delete $book[@year<2000]
  • rename $article as "publication"
  • replace ($books/book)[1] with <book>...</book>
  • replace value of $title with "New Title"

100
XML Update Facility (2)
  • Conditional updates
  • if ($book/year < 2000)
  • then delete $book/year
  • else rename $book/year as "publicationTime"
  • Collection-oriented updates
  • for $x in $book
  • where $x/year < 200
  • do rename $x as "oldBook"
  • XML transformations using the update syntax
  • Single snapshot query
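
"Single snapshot query" means the query only collects update requests while it runs; they are applied together once it finishes. A hedged Python sketch of that behavior, using a pending update list, is shown below. The function name, the `cutoff` parameter and the sample data are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def rename_old_books(root, cutoff):
    """Collect renames while querying, apply them only at the end."""
    pending = []                                      # pending update list
    for x in root.findall("./book"):                  # for $x in $book
        if int(x.findtext("year")) < cutoff:          # where $x/year < cutoff
            pending.append((x, "oldBook"))            # do rename $x as "oldBook"

    # Still inside the snapshot: no rename is visible yet.
    seen_during_query = len(root.findall("./oldBook"))

    for node, new_name in pending:                    # snapshot ends: apply all
        node.tag = new_name
    return seen_during_query


# Invented sample data.
library = ET.fromstring(
    "<lib><book><year>1990</year></book>"
    "<book><year>2005</year></book>"
    "<book><year>1985</year></book></lib>"
)
seen = rename_old_books(library, 2000)
```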

102
Procedural extensions to XQuery
  • Very controversial topic
  • Old research
  • XL project (Florescu, Kossmann, 2001)
  • New Research
  • XQuery! (Simeon, Ghelli)
  • XQueryP (Carey, Chamberlin, Kossmann, Florescu,
    Robie)
  • Industrial pressure
  • E.g., MarkLogic's XML application development
    platform
  • Long history of adding control flow logic to
    query languages
  • More than 15 years of success of PL/SQL and other
    procedural extensions for SQL
  • SQL might have failed otherwise!

103
What functionality is missing in XQuery (after
adding updates)?
  • The ability for a program to see the results of
    its own side-effects during the computation
  • The ability to invoke external computations that
    cannot participate in a snapshot semantics
  • The ability to preserve state during computation
  • The ability to recover (in a controlled way) from
    dynamic errors

104
XQueryP proposal
  • Submitted by several companies to W3C
  • Oracle, BEA, DataDirect, etc
  • Under consideration for standardization
  • Surprisingly, very small extensions to XQuery
    can satisfy many new use case scenarios
    (unfortunately not all)

105
The XQueryP technical proposal
  • A well-defined evaluation order for XQuery
    expressions (sequential order)
  • Paradigm shift for the database people
  • Does not mean that optimizability is reduced !
  • Reduce the granularity of the snapshot to each
    individual atomic update expression
  • Adds three new kinds of expressions
  • Block
  • Set
  • While

106
(1) Sequential evaluation order
  • Slight modification to existing rules
  • FLWOR: the FLWO clauses are evaluated first,
    resulting in a tuple stream; then the Return
    clause is evaluated in order for each tuple.
    Side-effects made by one tuple are visible to
    the subsequent tuples.
  • COMMA: subexpressions are evaluated in order
  • (UPDATING) FUNCTION CALL: arguments are evaluated
    first, before the body gets evaluated

Required (only) if we add side-effects that are
immediately visible to the program, e.g. variable
assignments or single-snapshot atomic updates;
otherwise the semantics is not deterministic.
107
(2) Reduce snapshot granularity
  • Today: update snapshot = entire query
  • Change
  • Every single atomic update expression (insert,
    delete, rename, replace) is executed and made
    effective immediately
  • Semantics is deterministic because of the
    sequential evaluation order (point 1)
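
The effect of shrinking the snapshot is easiest to see when a program deletes something and then looks at the data again. The following Python sketch models both modes on a plain list. The function `run` and its `immediate` flag are invented for illustration and do not correspond to any XQueryP syntax.

```python
def run(messages, immediate):
    """Delete the first message, then count what the program can see.

    immediate=False models today's behavior (one snapshot per query);
    immediate=True models updates that take effect at once.
    """
    data = list(messages)
    pending = []

    def delete(item):
        if immediate:
            data.remove(item)     # effective immediately
        else:
            pending.append(item)  # deferred until the query ends

    delete(data[0])
    observed = len(data)          # what the rest of the program observes

    for item in pending:          # end of snapshot: apply deferred updates
        data.remove(item)
    return observed, data


whole_query = run(["a", "b", "c"], immediate=False)
per_update = run(["a", "b", "c"], immediate=True)
```

Both runs end with the same data; they differ only in what the program observes midway, which is why a fixed evaluation order becomes necessary.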

108
(3) Adding new expressions
  • Block expressions
  • Assignment expressions
  • While expressions

109
Block expression
  • Syntax
  • { ( BlockDecl ; )* Expr ( ; Expr )* }
  • BlockDecl
  • declare $VarName TypeDecl? ( :=
    ExprSingle )?
  • ( , $VarName TypeDecl? ( :=
    ExprSingle )? )*
  • Semantics
  • Declare a set of updatable variables, whose scope
    is only the block expression (in order)
  • Evaluate each expression (in order) and make the
    effects visible immediately
  • Return the value of the last expression
  • Updating if body contains an updating expression

110
Assignment expression
  • Syntax
  • set $VarName := ExprSingle
  • Semantics
  • Change the value of the variable
  • Variable has to be external or declared in a
    block (no let, for or typeswitch)
  • Updating expression
  • Semantics is deterministic because of the
    sequential evaluation order

111
While expression
  • Syntax
  • while ( ExprSingle ) return Expr
  • Semantics
  • Evaluate the test condition
  • If true, evaluate the return clause, then repeat
  • If false, return the concatenation of the values
    returned by all previous evaluations of return
  • Syntactic sugar, mostly for convenience
  • Could be written using recursive functions
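
The claim that `while` is syntactic sugar can be checked with a small Python model: one version loops, the other recurses, and both return the concatenation of the body's results. All names here (`while_expr`, `while_rec`, `countdown`) are hypothetical, and sequences are modeled as Python lists.

```python
def as_seq(value):
    """Treat a single item as a sequence of length one."""
    return value if isinstance(value, list) else [value]


def while_expr(test, body):
    """Loop form: concatenate the results of every evaluation of the body."""
    result = []
    while test():
        result.extend(as_seq(body()))
    return result


def while_rec(test, body):
    """Same semantics written as a recursive function."""
    if not test():
        return []
    first = as_seq(body())            # evaluate the body before recursing
    return first + while_rec(test, body)


def countdown(n, run):
    """Drive either implementation with a mutable counter."""
    state = {"i": n}

    def test():
        return state["i"] > 0

    def body():
        state["i"] -= 1
        return state["i"] + 1

    return run(test, body)


looped = countdown(3, while_expr)
recursed = countdown(3, while_rec)
```

Both forms depend on state that changes between evaluations of the test, which is why the construct only makes sense once assignments and a sequential evaluation order exist.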

112
Atomic Blocks
  • Syntax
  • atomic { . . . }
  • Semantics
  • If the evaluation of Expr does not raise errors,
    then the result is returned
  • If the evaluation of Expr raises a dynamic error,
    then no partial side-effects are performed (all
    are rolled back) and the result is the error
  • Only the largest atomic scope is effective
  • Note: XQuery! has a similar construct
  • Snap vs. atomic
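
The all-or-nothing behavior of an atomic block can be imitated in Python by copying the state up front and restoring it on failure. This is a sketch under simplifying assumptions: the store is a plain dictionary, the helper `atomic` is an invented name, and a real engine would not need to copy the whole store.

```python
import copy


def atomic(store, body):
    """Run body(store); on error, undo every change and re-raise."""
    snapshot = copy.deepcopy(store)
    try:
        return body(store)
    except Exception:
        store.clear()
        store.update(snapshot)   # roll back all partial side-effects
        raise                    # the result of the block is the error


def good(s):
    s["n"] += 1
    return s["n"]


def bad(s):
    s["n"] += 10                 # partial side-effect
    raise ValueError("dynamic error")


store = {"n": 0}
ok_result = atomic(store, good)

failed = False
try:
    atomic(store, bad)
except ValueError:
    failed = True
```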

113
XQueryP example
  • declare updating function local:prune($d as
    xs:integer) as xs:integer {
  • declare $count as xs:integer := 0;
  • for $m in /mail/message[date lt $d]
  • return { do delete $m;
  • set $count := $count + 1 };
  • $count };
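
For comparison, here is roughly what the same pruning logic looks like in an imperative host language. This Python sketch assumes a `mail` element with `message` children that each carry an integer `date`; the sample data is invented.

```python
import xml.etree.ElementTree as ET


def prune(mail, d):
    """Delete every message dated before d and return how many were removed."""
    count = 0
    # Iterate over a copy of the list so removal does not disturb iteration.
    for m in list(mail.findall("./message")):
        if int(m.findtext("date")) < d:
            mail.remove(m)       # do delete $m
            count += 1           # set $count := $count + 1
    return count


mailbox = ET.fromstring(
    "<mail>"
    "<message><date>20050101</date></message>"
    "<message><date>20060301</date></message>"
    "<message><date>20041115</date></message>"
    "</mail>"
)
removed = prune(mailbox, 20060101)
```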

114
More complex example
  • declare updating function myNs:cumCost($projects)
    as element()* {
  • declare $total-cost as xs:decimal := 0;
  • for $p in $projects[year eq 2005]
  • return {
  • set $total-cost := $total-cost + $p/cost;
  • <project>
  • <name>{$p/name}</name>
  • <cost>{$p/cost}</cost>
  • <cumCost>{$total-cost}</cumCost>
  • </project> } };

Today: requires an additional self-join or a recursive function
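
The running total is the whole point of this example: each output row depends on state carried over from earlier rows. A Python analogue is sketched below. The function name `cum_cost`, the `year` parameter and the sample projects are assumptions, and the year is compared as text for simplicity.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal


def cum_cost(projects, year):
    """Emit one <project> per matching input, with a running cost total."""
    total = Decimal("0")                               # declare $total-cost := 0
    out = []
    for p in projects:
        if p.findtext("year") != str(year):            # [year eq 2005]
            continue
        total += Decimal(p.findtext("cost"))           # set $total-cost := ...
        e = ET.Element("project")
        ET.SubElement(e, "name").text = p.findtext("name")
        ET.SubElement(e, "cost").text = p.findtext("cost")
        ET.SubElement(e, "cumCost").text = str(total)
        out.append(e)
    return out


# Invented sample data.
SRC = ET.fromstring(
    "<projects>"
    "<project><name>a</name><year>2005</year><cost>10.5</cost></project>"
    "<project><name>b</name><year>2004</year><cost>99</cost></project>"
    "<project><name>c</name><year>2005</year><cost>4.5</cost></project>"
    "</projects>"
)
rows = cum_cost(SRC.findall("./project"), 2005)
cum_costs = [r.findtext("cumCost") for r in rows]
```

Without a mutable variable, every row would have to recompute the sum over all earlier rows, which is the self-join mentioned on the slide.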
115
XQueryP conclusion
  • If successful, can provide a platform for
    building XML-only applications
  • No more SQL, no more Java/C
  • Declarative programming and usability
  • Good: less code, higher level
  • Bad: fewer programmers can do it, harder debugging
  • Automatic optimization
  • Compilers will be very complex to build
  • Better chances of success

116
Research projects
  • XL
  • Web Services implementation
  • XDuce
  • Static typing, pattern matching
  • Links
  • XML programming without tiers
  • XQuery!
  • Make XQuery fully compositional with side-effects
  • User controlled granularity for snapshots

118
XML programming for what kind of application ?
  • Simple XML serialization for communication (XML
    at the end)
  • XLinq, Java APIs
  • Web distributed XML communication
  • Ajax
  • Complex XML computations (HealthCare7, XBRL)
  • XQuery, XQueryP, XLinq
  • Orchestration of Web Service messages
  • BPEL
  • Process a mix of relational and XML data
  • SQL/XML
  • Formatting XML content
  • XSLT
  • Unfortunately, many (most) applications have
    several of those needs at the same time!
  • Changing paradigms is very costly

119
Which community, which background?
  • XML is a unifying factor for the various CS
    communities
  • For the moment, each community wrongly believes
    it is solving the XML problem
  • The global XML picture is missing in each community

Communities meeting at XML
  • Programming languages
  • Databases
  • Content management
  • Workflow

120
XML programming where in the architecture ?
  • What tier in the architecture ?
  • Client, server, middle tier ?
  • Same language on all the tiers ?
  • XQuery can run on all tiers
  • ECMAScript and PHP weren't designed to scale on a
    large server, but for the middle tier
  • Which one will run on a mobile phone?
  • XML might have an impact on the existence of
    today's multi-tiered architectures

Tiers, from front to back
  • Client (XHTML, scripts)
  • Communication (XML)
  • Application logic (Java/C/PHP)
  • Storage (supports XML)

121
Programming style
  • All styles
  • Imperative programming APIs (Java DOM/SAX)
  • Declarative (XQuery, XQueryP)
  • Imperative + declarative (XLinq)
  • Workflow (BPEL)
  • Recursive template (XSLT)
  • Choice
  • Usability: based on what people are already used
    to
  • Performance: declarative is easier to optimize
  • None of these alternatives provides a
    complete XML programming solution
  • All will evolve in the future
  • Which one will provide all the functionality
    required ?

122
How much weight does XML have in the language ?
  • One of thousands of APIs
  • E.g. Java DOM
  • The language is agnostic to the existence of XML
  • More serious syntactic extension
  • XLinq, SQL/XML
  • XML is one feature among others in the language
  • Nothing but XML
  • XQuery, XPath, XSLT, XQueryP, BPEL
  • Try to process real XML (complex or not, good or
    bad), not to simplify it, or fix it
  • XML is a given

123
Compliance to the W3C family of standards
  • XML is not an orphan: it comes with an
    Italian-style family of W3C standards
  • Infoset, Namespaces, XML Schema, XLink, XForms,
    XHTML, binary XML, etc.
  • Forced to live well together by W3C rules