Raindrop: An Algebra-Automata Combined XQuery Engine over XML Streams


1
Raindrop: An Algebra-Automata Combined XQuery Engine over XML Streams
  • Hong Su, Elke Rundensteiner, Murali Mani, Ming Li
  • Worcester Polytechnic Institute
  • Worcester, MA
  • VLDB 2004

2
Stream Processing
data sources
Networks
data requesters
3
What's Special for XML Stream Processing?
Token-by-token access manner
Tokens arriving along the timeline: <auctions> <auction> <seller> <primary> <phone>
Pattern Retrieval on Token Streams
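Token-by-token access can be illustrated with a minimal Python sketch. It uses the standard library's `XMLPullParser` as a stand-in tokenizer (an assumption; the slides do not say how Raindrop tokenizes its input). The chunk boundaries and the helper name `tokens` are made up for illustration, and text tokens are omitted for brevity.

```python
import xml.etree.ElementTree as ET

def tokens(chunks):
    """Yield (kind, tag) tokens while XML text arrives chunk by chunk."""
    parser = ET.XMLPullParser(events=("start", "end"))
    for chunk in chunks:
        parser.feed(chunk)
        # Hand out whatever tokens the parser has recognized so far.
        yield from ((ev, el.tag) for ev, el in parser.read_events())
    parser.close()
    # Drain any tokens the parser was still holding back.
    yield from ((ev, el.tag) for ev, el in parser.read_events())

# Hypothetical stream, cut at arbitrary points as a network might deliver it.
chunks = [
    "<auctions><auction><sel",
    "ler><primary><phone>508</ph",
    "one></primary></seller></auction></auctions>",
]
stream = list(tokens(chunks))
for kind, tag in stream:
    print(kind, tag)
```

The consumer never sees a whole element at once, only start and end tokens in document order, which is why pattern retrieval has to be phrased over tokens.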
4
Two Computation Paradigms
  • Automata-based [YFilter, XScan, XSM, XSQ, XPush]
  • Algebraic [Niagara00, ...]

FOR $a in stream(bids)//auction,
    $b in $a/seller[homepage],
    $c in $a/bidder[sameAddr]
WHERE $b//phone = 508
RETURN <auction> $b, $c </auction>
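[Figure: the query in both paradigms. Automata: states 1 to 9 with transitions labeled auction, seller, homepage, phone, bid, bidder, sameAddr. Algebra (top operator first): Tagger; Navigate $a, /bidder -> $c; Navigate $a, /seller -> $b; Navigate stream(bids), //auction -> $a.]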
5
Comparison of Two Paradigms
Automata Paradigm | Algebra Paradigm
Good for pattern retrieval on tokens | Does not support token inputs
Needs patches for filtering and restructuring | Good for filtering and restructuring
Presents all details on the same low level | Supports multiple descriptive levels (e.g., logical plan, physical plan)
Little studied as a query processing paradigm | Well studied as a query processing paradigm
Either paradigm has deficiencies. Both paradigms complement each other.
6
Four-Level Algebraic Framework
  • The Raindrop framework aims to integrate
    both paradigms into one

Abstraction level, from high (declarative) to low (procedural):
  • Level I, Semantics-Focused Plan: express the semantics of the query
    regardless of input sources
  • Level II, Stream Logical Plan: accommodate tokenized streams and
    automata computation
  • Level III: describe implementation details of operators
  • Level IV: decide how an operator is invoked (scheduling)
7
Level I: Semantics-Focused Plan
  • Express query semantics regardless of stored or
    stream input sources [Rainbow-ZPR02]
  • Reuse existing general optimization techniques
      • Decorrelation
      • Cancel duplicate navigation operators

8
Example Semantics-Focused Plan
Stream Data:
<auctions> <auction>
<seller> <primary><phone>508</phone></primary>
<secondary><phone>613</phone></secondary>
</seller> <bid><bidder></bidder><bidder></bidder></bid>
</auction>
Query:
FOR $a in stream(bids)//auction,
    $b in $a/seller[homepage],
    $c in $a/bidder[sameAddr]
WHERE $b//phone = 508
RETURN <auction> $b, $c </auction>
Plan and Input/Output Data
Columns of the output tuples:
source: <auctions> </auctions>
$a: <auction> </auction>
$b: <seller> </seller>
$c: <bidder> </bidder>
Intermediate tuple: <auctions> </auctions>, <auction> ... </auction>
Plan (top operator first):
NavUnnest $a, /bid/bidder -> $c
NavUnnest $a, /seller -> $b
NavUnnest stream(bids), //auction -> $a
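The three NavUnnest operators can be read as a pipeline of tuple-producing steps. The Python sketch below runs them over an in-memory tree, which matches the point that this level ignores whether the input is stored or streamed. `nav_unnest` is a hypothetical helper rather than Raindrop's API, tuples are modeled as dicts, ElementTree's limited path syntax stands in for full navigation, and the query's predicates are left out.

```python
import xml.etree.ElementTree as ET

DOC = """<auctions><auction>
  <seller><primary><phone>508</phone></primary>
          <secondary><phone>613</phone></secondary></seller>
  <bid><bidder></bidder><bidder></bidder></bid>
</auction></auctions>"""

def nav_unnest(tuples, src, path, out):
    """For each input tuple, navigate `path` from column `src` and emit
    one output tuple per matching node, bound to column `out`."""
    for t in tuples:
        for node in t[src].findall(path):
            yield {**t, out: node}

root = ET.fromstring(DOC)
plan = [{"source": root}]
plan = nav_unnest(plan, "source", ".//auction", "a")   # //auction -> $a
plan = nav_unnest(plan, "a", "./seller", "b")          # /seller -> $b
plan = nav_unnest(plan, "a", "./bid/bidder", "c")      # /bid/bidder -> $c
result = list(plan)

for t in result:
    print(t["a"].tag, t["b"].tag, t["c"].tag)
```

Each unnest multiplies the tuple count by the number of matches, so one auction with one seller and two bidders yields two output tuples.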
9
Level II: Stream Logical Plan
  • Extend semantics-focused plan to accommodate
    tokenized stream inputs
  • New input data format
      • Tokens
  • New operators
      • StreamSource, TokenNavigate, ExtractUnnest,
        ExtractNest, StructuralJoin
  • New rewrite rules
      • Push-into/Pull-out-of Automata

10
One Uniform Algebraic View
[Figure: one algebraic stream logical plan in two parts. XML data stream -> token-based plan (automata plan) -> tuple stream -> tuple-based plan -> query answer.]
11
Modeling Automata in Algebraic Plan: Black Box [XScan01] vs. White Box
FOR $a in stream(bids)//auction,
    $b in $a/seller[homepage],
    $c in $a/bid/bidder[sameAddr]
WHERE $b//phone = 508
RETURN <auction> $b, $c </auction>
Black Box:
XScan, binding $a to stream(bids)//auction, $b to $a/seller, $c to $a/bid/bidder
White Box (top operator first):
StructuralJoin $a
ExtractUnnest $a, $b
ExtractUnnest $a, $c
TokenNavigate $a, /bid/bidder -> $c
TokenNavigate $a, /seller -> $b
TokenNavigate stream(bids), //auction -> $a
12
Data Model in Algebraic Plan Modeling Automata
[Figure: data flowing through the white-box plan, listed bottom operator first.]
StreamSource: tokens <auctions> <auction> <seller> <primary> <phone> ....
TokenNavigate stream(bids), //auction -> $a
TokenNavigate $a, /seller -> $b: tokens <seller> <primary> <phone> 508 </phone> </primary> ...
TokenNavigate $a, /bid/bidder -> $c: tokens <bidder> <bidderid> 0314 ...
ExtractUnnest $a, $b: <seller></seller>
ExtractUnnest $a, $c: <bidder>...</bidder>
StructuralJoin $a: <seller></seller> <bidder>...</bidder>
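How TokenNavigate, ExtractUnnest and StructuralJoin cooperate over a token stream can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not Raindrop's implementation: a stack of open tags stands in for the automaton, tokens are `(kind, value)` pairs, predicates are ignored, and the names `run_plan`, `SELLER` and `BIDDER` are hypothetical.

```python
from itertools import product

SELLER = ["auction", "seller"]          # $a/seller -> $b
BIDDER = ["auction", "bid", "bidder"]   # $a/bid/bidder -> $c

def run_plan(tokens):
    """Yield one (seller_tokens, bidder_tokens) pair per join result."""
    path = []                   # open tags; plays the role of automaton state
    sellers, bidders = [], []   # buffers filled by the two ExtractUnnest steps
    current = None              # tokens of the element being extracted, if any
    for kind, value in tokens:
        if kind == "start":
            path.append(value)
            # TokenNavigate: a pattern has just been recognized.
            if path[-2:] == SELLER or path[-3:] == BIDDER:
                current = []
        if current is not None:
            current.append((kind, value))   # ExtractUnnest: collect tokens
        if kind == "end":
            if path[-2:] == SELLER:
                sellers.append(current)
                current = None
            elif path[-3:] == BIDDER:
                bidders.append(current)
                current = None
            elif path[-1] == "auction":
                # StructuralJoin $a: everything inside this auction has arrived.
                yield from product(sellers, bidders)
                sellers, bidders = [], []
            path.pop()

tokens = [
    ("start", "auctions"), ("start", "auction"),
    ("start", "seller"), ("start", "primary"), ("start", "phone"),
    ("text", "508"), ("end", "phone"), ("end", "primary"), ("end", "seller"),
    ("start", "bid"), ("start", "bidder"), ("end", "bidder"),
    ("start", "bidder"), ("end", "bidder"), ("end", "bid"),
    ("end", "auction"), ("end", "auctions"),
]
results = list(run_plan(tokens))
print(len(results), "joined tuples")
```

The join needs no value comparison: sellers and bidders belong together simply because they were extracted between the same `<auction>` start and end tokens.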


13
  • For details of Levels III and IV, please refer to:
  • Automaton Meets Query Algebra: Towards a Unified
    Model for XQuery Evaluation over XML Data
    Streams, ER 2003
  • Raindrop: A Uniform and Layered Algebraic
    Framework for XQueries on XML Streams, CIKM 2003
  • Raindrop: A Uniform and Layered Algebraic
    Framework for XQueries on XML Streams, Journal
    Submission 2004

14
Optimization I: Computation Into or Out of Automata?

[Figure: alternative plans for the same query, operators grouped by plan, top operator first.]
Plan with NavUnnest operators only:
NavigateUnnest $a, /seller -> $b
NavigateUnnest $a, /bid/bidder -> $c
NavUnnest stream(bids), //auction -> $a
Out of Automata:
NavigateUnnest $a, /bid/bidder -> $c
NavigateUnnest $a, /seller -> $b
ExtractUnnest stream(bids), $a
Automata Plan: TokenNavigate stream(bids), //auction -> $a
Into Automata:
StructuralJoin $a
ExtractUnnest $a, $b
ExtractUnnest $a, $c
Automata Plan: TokenNavigate $a, /seller -> $b; TokenNavigate $a, /bid/bidder -> $c; TokenNavigate stream(bids), //auction -> $a
15
Experimentation Results
16
Optimization II: Semantic Query Optimization
  • General schema-based optimizations
      • Eliminate predicate/join, ...
      • Focus on operators manipulating flat values
  • XML-specific schema-based optimizations
      • Focus on pattern retrieval
      • Fall into two categories
          • General XML SQO: minimize query tree [YCL-ATT 01]
          • Stream XML SQO (our focus)

17
Stream-Specific XML SQO
  • Observations
      • Pattern retrieval over tokens relies solely on
        document-order traversal
      • Schema constraints help expedite document-order
        traversal
  • State of the art
      • [XPush03] covers only limited queries (boolean XPath
        matching) and one type of constraint
  • Our goals
      • Support more powerful queries (XQuery)
      • Support more types of constraints (XML Schema)

18
Step I: Construct Query Graph
FOR $a in stream(bids)//auction,
    $b in $a/seller[homepage],
    $c in $a/bid/bidder[sameAddr]
WHERE $b//phone = 508
RETURN <auction> $b, $c </auction>
(a) Example Query
(b) Query Tree
19
Example XML Schema
20
Step II: Apply Optimization Rules
  • Offer optimization rules utilizing
      • occurrence constraints
      • exclusive constraints
      • order constraints
  • Apply rules in an order ensuring that
      • no beneficial rule is missed
      • no redundant rule is introduced

21
Step III: Translate Rewritten Query Graph Back to Plan (I)
Utilize Occurrence Constraints
When </phone> is encountered twice, check //phone; if it fails the predicate, suspend states s2 and s3.
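A minimal Python sketch of this occurrence-constraint rule, assuming the schema allows at most two `phone` elements per `seller` (the bound of two comes from the slide; the schema itself is not in the transcript). The token format, the function `seller_matches`, and the `profile` element in the sample data are hypothetical, and returning early stands in for suspending automaton states.

```python
MAX_PHONES = 2  # assumed occurrence constraint: at most two <phone> per <seller>

def seller_matches(seller_tokens, wanted="508"):
    """Return (matched, tokens_examined) for one seller's token sequence."""
    phones_closed = 0
    in_phone = False
    for examined, (kind, value) in enumerate(seller_tokens, start=1):
        if kind == "start" and value == "phone":
            in_phone = True
        elif kind == "text" and in_phone and value == wanted:
            return True, examined            # predicate satisfied
        elif kind == "end" and value == "phone":
            in_phone = False
            phones_closed += 1
            if phones_closed == MAX_PHONES:
                # No further <phone> can follow, so the predicate has failed.
                return False, examined
    return False, len(seller_tokens)

seller = [
    ("start", "seller"),
    ("start", "primary"), ("start", "phone"), ("text", "613"),
    ("end", "phone"), ("end", "primary"),
    ("start", "secondary"), ("start", "phone"), ("text", "617"),
    ("end", "phone"), ("end", "secondary"),
    ("start", "profile"), ("end", "profile"),   # never examined
    ("end", "seller"),
]
outcome = seller_matches(seller)
print(outcome)
```

Without the constraint, the check would have to wait for `</seller>` before concluding that no phone matches.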
22
Step III: Translate Rewritten Query Graph Back to Plan (II)
Utilize Exclusive Constraints
When <billTo> or <shipTo> is encountered once, suspend states s2 and s9.
23
Step III: Translate Rewritten Query Graph Back to Plan (III)
Utilize Order Constraints
When <primary> is encountered once, check /homepage; if it is not present, suspend states s10, s3 and s2.
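The order-constraint rule can be sketched the same way in Python, assuming the schema places `homepage` before `primary` inside `seller` (inferred from the slide text; the schema is not in the transcript). The function `seller_has_homepage` and the token format are hypothetical, and for brevity the sketch does not check that `homepage` is a direct child of `seller`.

```python
def seller_has_homepage(seller_tokens):
    """Return (has_homepage, tokens_examined) for one seller's tokens,
    assuming <homepage> must come before <primary> in document order."""
    for examined, (kind, value) in enumerate(seller_tokens, start=1):
        if kind == "start" and value == "homepage":
            return True, examined
        if kind == "start" and value == "primary":
            # <primary> has started, so <homepage> can no longer appear.
            return False, examined
    return False, len(seller_tokens)

without_homepage = [
    ("start", "seller"),
    ("start", "primary"), ("start", "phone"), ("text", "508"),
    ("end", "phone"), ("end", "primary"),
    ("end", "seller"),
]
with_homepage = [("start", "seller"), ("start", "homepage"),
                 ("end", "homepage")] + without_homepage[1:]

print(seller_has_homepage(without_homepage))
print(seller_has_homepage(with_homepage))
```

The failure is detected at the `<primary>` start token, so the rest of the seller, and the bidder navigation that depends on it, can be skipped.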
24
  • http://davis.wpi.edu/dsrg/raindrop/

suhong_at_cs.wpi.edu
Thanks to the WPI DSRG Rainbow Team for XAT
algebra support