INT-2: XQuery Levels the Data Integration Playing Field

Author: CMP Media LLC. Last modified by: rotherma. Created Date: 11/16/2006 8:06:31 PM. Document presentation format: On-screen Show.

Slides: 46
Provided by: CMPM53

1
INT-2: XQuery Levels the Data Integration Playing Field
Carlo (Minollo) Innocenti
DataDirect XML Technologies, Program Manager
2
Agenda
  • The Problem
  • Why XQuery
  • XQuery for Java API (XQJ)
  • XQuery for Data Integration
  • XQuery Demos, Code Walk-throughs
  • Summary

3
A Typical Data Integration Problem
The user submits a request for a report about
their stock holdings
The web server needs to fetch the user's personal
data, stock holdings, and live stock data to
compile a report to send back to the user
Different repositories are used for different
parts of the information necessary to create a
stock holdings report
A public service offers live (delayed) stock
prices
4
Some implementation constraints
Java/JSP code accessing the various Java APIs
and generating the HTML report
HTML
SOAP through AXIS
Java Open Client or JDBC
dBASE IV APIs
JDBC
5
A dangerous approach
Data Access Layer
6
The XQuery Vision
XQuery
7
Agenda
  • The Problem
  • Why XQuery
  • XQuery for Java API (XQJ)
  • XQuery for Data Integration
  • XQuery Demos, Code Walk-throughs
  • Summary

8
What is XQuery?
  • W3C Query Language for XML
  • Native XML Programming Language
  • The SQL for XML
  • Designed to query, process, and create XML
  • High level functionality
  • Find anything in an XML structure
  • Querying and combining data
  • Creating XML structures
  • Functions
  • User-defined function libraries

9
XQuery Basics
  • Path Expressions: Finding Data
    doc("holdings.xml")/holdings/entry
  • FLWOR Expressions: Querying and Combining Data
    for $h in doc("holdings.xml")/holdings/holding,
        $c in doc("companies.xml")/companies/company
    where $h/userid = "Minollo"
      and $c/ticker = $h/stockticker
    return $c/name

10
XQuery Basics
  • Path Expressions, FLWOR Expressions, and XML
    Constructors
    for $h in doc("holdings.xml")/holdings/entry,
        $c in doc("companies.xml")/companies/company
    where $h/userid = "Minollo"
      and $c/ticker = $h/stockticker
    return
      <company ticker="{ $c/ticker }">
        { $c/companyname }
        { $c/annualrevenues }
      </company>

11
Functions and Modules
  • A Function in a Library Module
    module namespace stock = "http://tagsalad.com/stocks";
    declare function stock:companies($user as xs:string)
    {
      for $h in doc("holdings")/holdings/entry,
          $c in doc("companies")/companies/company
      where $h/userid = $user
        and $c/ticker = $h/stockticker
      return
        <company ticker="{ $c/ticker }">
        </company>
    };

12
Functions and Modules (2)
  • Importing and Using a Library Module
    import module namespace stock = "http://tagsalad.com/stocks";
    stock:companies("Minollo")

13
Why XQuery?
  • Native Support for XML
  • Conventional programming and query languages are
    not designed for XML
  • No more parse, navigate, cast, repeat: XML is
    the native datatype
  • Designed for Data Integration
  • Native XML and non-XML data can be used the same
    way
  • Vastly simplifies development when input includes
    XML, relational, EDI
  • Requires support from implementation for the data
    sources you need
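
For comparison, the "parse, navigate, cast, repeat" pattern mentioned above might look like the following in plain Java with the JDK's DOM API. This is only a sketch: the class name HandCodedJoin, its method names, and the element names (entry, userid, stockticker, company, ticker, companyname, borrowed from the XQuery Basics slides) are assumptions for illustration, not code from the presentation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: joins holdings and companies by hand with DOM.
final class HandCodedJoin {

    // "Parse": turn an XML string into a DOM tree.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // "Navigate": text of the first descendant element with this name, or "".
    static String firstText(Element parent, String name) {
        NodeList nodes = parent.getElementsByTagName(name);
        return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
    }

    // Names of the companies whose ticker appears in the user's holdings.
    static List<String> companyNamesFor(String user, String holdingsXml,
                                        String companiesXml) {
        NodeList entries = parse(holdingsXml).getElementsByTagName("entry");
        NodeList companies = parse(companiesXml).getElementsByTagName("company");
        List<String> names = new ArrayList<>();
        for (int i = 0; i < entries.getLength(); i++) {
            Element entry = (Element) entries.item(i);          // "cast"
            if (!user.equals(firstText(entry, "userid"))) {
                continue;
            }
            String ticker = firstText(entry, "stockticker");
            for (int j = 0; j < companies.getLength(); j++) {   // "repeat"
                Element company = (Element) companies.item(j);
                if (ticker.equals(firstText(company, "ticker"))) {
                    names.add(firstText(company, "companyname"));
                }
            }
        }
        return names;
    }
}
```

A call such as HandCodedJoin.companyNamesFor("Minollo", holdingsXml, companiesXml) performs roughly the same join as the FLWOR expression on the XQuery Basics slide, but with explicit loops and casts and without any XML construction for the result.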

14
Why XQuery?
  • XML Output is Directly Useful
  • XML is becoming the industry standard for data
    exchange
  • Dynamic Web Sites
  • Publishing Applications
  • Web Messages
  • We normally don't exchange SQL tables or present
    them to users!
  • Programmer Productivity
  • Readable, declarative code: transparent, easier
    to maintain
  • 7 to 20 times less code than Java + SQL + JDBC +
    XML APIs

15
Why XQuery?
  • Performance
  • Declarative code can be optimized by the XQuery
    Engine
  • Relational database vendors and experts very
    involved in the design
  • Actual performance depends on the
    implementation

16
Benefits of XQuery
  • Data Integration is harder without XQuery!
  • Every data source is different
  • Many applications use several languages and APIs
    to address data sources (e.g. Java+JDBC+DOM, SQL,
    Perl, XSLT)
  • Mediating among data sources accounts for a lot
    of code
  • XQuery treats all data sources as XML

17
Benefits of XQuery
  • Processing XML is harder without XQuery!
  • Most programming languages don't know XML
    structures
  • Parse, navigate, cast, repeat
  • XML is the native data structure for XQuery
  • XML Reporting is harder without XQuery!
  • XML input and output may have very complex
    structure
  • Many different desired XML outputs
  • Data Integration, Native XML Processing are
    needed
  • XQuery gives full query processing for any XML
    input and output

18
Agenda
  • The Problem
  • Why XQuery
  • XQuery for Java API (XQJ)
  • XQuery for Data Integration
  • XQuery Demos, Code Walk-throughs
  • Summary

19
What is XQJ?
  • XQuery API for Java (XQJ): JSR 225
  • The JDBC for XQuery

20
Benefits of XQJ
  • Industry Standard, similar to JDBC
  • No need to learn a new proprietary API for each
    product and each version
  • Can build on existing JDBC knowledge
  • Lets XQuery fit into any Java architecture
  • Queries can be created or parameterized at
    run-time
  • Example: A portfolio for a given user at a given
    date
  • Interfaces are designed for use in J2EE
    applications
  • Example: Results can be retrieved as DOM, SAX,
    StAX, or text

21
Agenda
  • The Problem
  • Why XQuery
  • XQuery for Java API (XQJ)
  • XQuery for Data Integration
  • XQuery Demos, Code Walk-throughs
  • Summary

22
An XQuery architecture
23
DataDirect XQuery
  • High performance
  • Scalable
  • Embeddable
  • Plugs into any Java architecture
  • Accesses almost any data source
  • No dependency on servers
  • Standards-based


24
XQuery can be fast for relational data!
<portfolio>
    <company ticker="AMZN">
        <companyname>Amazon.com, Inc.</companyname>
        <annualrevenues>7780</annualrevenues>
    </company>
    <company ticker="EBAY">
        <companyname>eBay Inc.</companyname>
        <annualrevenues>22600</annualrevenues>
    </company>
    <company ticker="IBM">
        <companyname>Int'l Business Machines C</companyname>
        <annualrevenues>128200</annualrevenues>
    </company>
    <company ticker="PRGS">
        <companyname>Progress Software</companyname>
        <annualrevenues>493.4</annualrevenues>
    </company>
</portfolio>
  • Highly optimized for relational sources
  • Minimizes retrieval of data
  • No more rows than needed
  • No more columns than needed
  • Uses database functionality
  • Joins
  • Sorting
  • Etc.
  • Optimizes for each SQL dialect
  • Efficient JDBC retrieval
  • Embeds DataDirect JDBC technology
  • Optimizations added to support XQuery
  • Supports incremental retrieval
  • Optimizes for XML hierarchies
  • Sort-merge algorithm
  • Minimal cost of XML construction
  • Leverages SQL library
  • Supports hints

HOLDINGS
USERID    TICKER  SHARES
Jonathan  PRGS    23
Minollo   PRGS    4000000
Jonathan  AMZN    3000
Minollo   AMZN    3000

COMPANIES
TICKER  NAME               REVENUES
AMZN    Amazon.com, Inc.   7780
EBAY    eBay Inc.          22600
PRGS    Progress Software  493.4
YHOO    Yahoo! Inc.        10700
25
XQuery can be fast for XML files!
    <?xml version="1.0" encoding="UTF-8"?>
    <soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
                   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                   xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <soap:Body>
        <GetQuotesResponse xmlns="http://swanandmokashi.com">
          <GetQuotesResult>
            <Quote>
              <CompanyName>APPLE COMPUTER</CompanyName>
              <StockTicker>AAPL</StockTicker>
              <StockQuote>74.17</StockQuote>
              <LastUpdated>9/14/2006 4:01pm</LastUpdated>
              <Change>1.17</Change>
              <PercentChange>1.82</PercentChange>
              <OpenPrice>N/A</OpenPrice>
              <DayHighPrice>N/A</DayHighPrice>
              <DayLowPrice>N/A</DayLowPrice>
              <Volume>0</Volume>
              <MarketCap>63.266B</MarketCap>
              <YearRange>47.87 - 86.40</YearRange>
              <ExDividendDate>21-Nov-95</ExDividendDate>
              <DividendYield>N/A</DividendYield>
              <DividendPerShare>0.00</DividendPerShare>
            </Quote>
          </GetQuotesResult>
        </GetQuotesResponse>
      </soap:Body>
    </soap:Envelope>
  • General XQuery rewrites
  • Constant-folding, elimination of common
    sub-expressions, loop rewrites, ordering
    rewrites, etc
  • Document projection
  • XML construction accounts for much of the cost
  • Don't build parts of the document that the query
    doesn't need!
  • Document streaming
  • Discard parts of the document when no longer
    needed
  • Makes memory usage near constant with size of
    file
  • Multiple Gigabytes can be queried
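
The streaming idea above can be illustrated outside XQuery with the JDK's StAX pull parser, which exposes one parse event at a time and so never builds the whole document in memory. The sketch below is not how DataDirect XQuery is implemented; the class name StreamingQuotes and its method are hypothetical, and it assumes each quote lists StockTicker before StockQuote, as in the sample response above.

```java
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: pulls ticker/quote pairs out of a quotes document
// while reading it as a stream of events.
final class StreamingQuotes {

    static Map<String, String> quotesByTicker(Reader xml) {
        Map<String, String> quotes = new LinkedHashMap<>();
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(xml);
            String ticker = null;  // most recent StockTicker seen, if any
            while (reader.hasNext()) {
                if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                    continue;      // every other event is skipped and discarded
                }
                String name = reader.getLocalName();
                if ("StockTicker".equals(name)) {
                    ticker = reader.getElementText().trim();
                } else if ("StockQuote".equals(name) && ticker != null) {
                    quotes.put(ticker, reader.getElementText().trim());
                    ticker = null;
                }
                // Elements the "query" does not need are never materialized,
                // which is also the intuition behind document projection.
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Malformed XML input", e);
        }
        return quotes;
    }
}
```

Only the result map grows with the number of quotes; the parser's working state stays roughly constant however large the input is, which is the property the slide attributes to document streaming.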

26
XQuery can use XML Converters
  • EDI File
  • ISA00DATADIRECT00STYLUS200601DATA DIRECT
  • 01STYLUS STUDIO 060504121200503200654321
    0I'
  • GSBFDATADIRECTSTYLUS200620060504121212256X
    005030'
  • ST1053389'
  • BGN28102420060504121212GM'
  • NM12L4Progress Software Corporation'
  • N314 Oak Park Drive'
  • N4BedfordMA01730USAA'
  • REF1ZPRGS'
  • NM12L4Apple Computer, Inc.'
  • N31 Infinite Loop'
  • N4CupertinoCA95014USAA'
  • REF1ZAAPL'
  • SE113389'
  • GE1256'
  • IEA1200654321'

doc("adapter://EDI?ticker-request.edi")
<X12>
    <ISA>
        <ISA01><!--I01 Authorization Information Qualifier-->00<!--No Authorization Information Present (No Meaningful Information in I02)--></ISA01>
        <ISA02><!--I02 Authorization Information-->DATADIRECT</ISA02>
        <ISA03><!--I03 Security Information Qualifier-->00<!--No Security Information Present (No Meaningful Information in I04)--></ISA03>
        <ISA04><!--I04 Security Information-->STYLUS2006</ISA04>
        <ISA05><!--I05 Interchange ID Qualifier-->01<!--Duns (Dun & Bradstreet)--></ISA05>
        <ISA06><!--I06 Interchange Sender ID-->DATA DIRECT</ISA06>
        <ISA07><!--I05 Interchange ID Qualifier-->01<!--Duns (Dun & Bradstreet)--></ISA07>
        <ISA08><!--I07 Interchange Receiver ID-->STYLUS STUDIO</ISA08>
        <ISA09><!--I08 Interchange Date-->060504<!--2006-05-04--></ISA09>
        <ISA10><!--I09 Interchange Time-->1212</ISA10>
        <ISA11><!--I65 Repetition Separator--></ISA11>
        <ISA12><!--I11 Interchange Control Version Number-->00503<!--Standards Approved for Publication by ASC X12 Procedures Review Board through October 2005--></ISA12>
        <ISA13><!--I12 Interchange Control Number-->200654321</ISA13>
        <ISA14><!--I13 Acknowledgment Requested-->0<!--No Interchange Acknowledgment Requested--></ISA14>
        <ISA15><!--I14 Interchange Usage Indicator-->I<!--Information--></ISA15>
        <ISA16><!--I15 Component Element Separator--></ISA16>
    </ISA>
    <GS>
        <GS01><!--479 Functional Identifier Code-->BF<!--Business Entity Filings (105)--></GS01>
        <GS02><!--142 Application Sender's Code-->DATADIRECT</GS02>
        <GS03><!--124 Application Receiver's Code-->STYLUS2006</GS03>
        <GS04><!--373 Date-->20060504<!--2006-05-04--></GS04>
        <GS05><!--337 Time-->121212</GS05>
        <GS06><!--28 Group Control Number-->256</GS06>
        <GS07><!--455 Responsible Agency Code-->X<!--Accredited Standards Committee X12--></GS07>
        <GS08><!--480 Version / Release / Industry Identifier Code-->005030<!--Standards Approved for Publication by ASC X12 Procedures Review Board through October 2005--></GS08>
    </GS>
  • Convert non-XML format to XML on-the-fly!
  • EDI message types
  • Comma-delimited or tab-delimited files
  • dBase
  • RTF
  • mbox
  • Batch conversions are supported
  • Custom conversions

27
XQuery can access Web Services
declare function local:amazon-listing($isbn)
{
    <tns:Request>
      <tns:Condition>All</tns:Condition>
      <tns:DeliveryMethod>Ship</tns:DeliveryMethod>
      <tns:FutureLaunchDate/>
      <tns:IdType>ASIN</tns:IdType>
      <tns:ItemId>{ $isbn }</tns:ItemId>
      <tns:ResponseGroup>Medium</tns:ResponseGroup>
    </tns:Request>
};
let $loc :=
    <location address="http://soap.amazon.com/onca/soap?Service=AWSECommerceService"
              soapaction="http://soap.amazon.com" />
let $payload := local:amazon-listing("0395518482")
return wscall($loc, $payload)
  • Leverage existing SOA architecture in queries!
  • Integrate queries with web services
  • Easily generate complex web service requests
  • Vastly increases the reach of your queries

28
Questions so far?
29
Agenda
  • The Problem
  • Why XQuery
  • XQuery for Java API (XQJ)
  • XQuery for Data Integration
  • XQuery Demos, Code Walk-throughs
  • Summary

30
A Data Integration problem
Java/JSP code accessing the various Java APIs and
generating the HTML report
HTML
SOAP through AXIS
Java Open Client or JDBC
dBASE IV APIs
JDBC
31
A dangerous approach
Data Access Layer
32
The XQuery Vision
XQuery
33
The DataDirect XQuery Solution
HTML
34
Step by step
  • XQuery to aggregate data from the multiple data
    sources
  • XQuery to publish an HTML or XSL-FO (PDF) report
    directly
  • Pipelining multiple XQueries with validation
    steps
  • Exposing an XQuery Web Service and consuming it
    from OpenEdge

35
Step by step
  • XQuery to aggregate data from the multiple data
    sources
  • XQuery to publish an HTML or XSL-FO (PDF) report
    directly
  • Pipelining multiple XQueries with validation
    steps
  • Exposing an XQuery Web Service and consuming it
    from OpenEdge

36
Step by step
  • XQuery to aggregate data from the multiple data
    sources
  • XQuery to publish an HTML or XSL-FO (PDF) report
    directly
  • Pipelining multiple XQueries with validation
    steps
  • Exposing an XQuery Web Service and consuming it
    from OpenEdge

37
Step by step
  • XQuery to aggregate data from the multiple data
    sources
  • XQuery to publish an HTML or XSL-FO (PDF) report
    directly
  • Pipelining multiple XQueries with validation
    steps
  • Exposing an XQuery Web Service and consuming it
    from OpenEdge

38
Agenda
  • The Problem
  • Why XQuery
  • XQuery for Java API (XQJ)
  • XQuery for Data Integration
  • XQuery Demos, Code Walk-throughs
  • Summary

39
Benefits of XQuery
  • Data Integration is harder without XQuery!
  • Every data source is different
  • Many applications use several languages and APIs
    to address data sources (e.g. Java+JDBC+DOM, SQL,
    Perl, XSLT)
  • Mediating among data sources accounts for a lot
    of code
  • XQuery treats all data sources as XML

40
Benefits of XQuery
  • Processing XML is harder without XQuery!
  • Most programming languages don't know XML
    structures
  • Parse, navigate, cast, repeat
  • XML is the native data structure for XQuery
  • XML Reporting is harder without XQuery!
  • XML input and output may have very complex
    structure
  • Many different desired XML outputs
  • Data Integration, Native XML Processing are
    needed
  • XQuery gives full query processing for any XML
    input and output

41
DataDirect XQuery
42
Getting Started
  • Examples and Tutorials: http://www.xquery.com
  • XQuery Tutorial
  • XQJ Tutorial
  • DataDirect XQuery Tutorial

43
Questions?
44
Thank you for your time