Query Optimization Over Web Services - PowerPoint PPT Presentation

Title: Query Optimization Over Web Services


1
Query Optimization Over Web Services
  • Utkarsh Srivastava, Kamesh Munagala, Jennifer Widom,
    Rajeev Motwani
  • Presented by Ajay Kumar Sarda

2
Motivation
  • Web services emerging as a popular standard for
    sharing data and functionality
  • Databases behind web services
  • DBMS-like capabilities when data sources are web
    services
  • Need for query optimization for queries spanning
    multiple web services

3
Motivating Example
  • A credit card company wants to send out mail for
    its new credit card offer.
  • I: potential recipient names
  • WS1: name (n) → credit rating (cr)
  • WS2: name (n) → credit card number (ccn)
  • WS3: card number (ccn) → payment history (ph)
  • One possible execution is WS1, WS2, WS3
  • Is it optimal?

4
Challenges
  • Different response times across web services
  • Precedence constraints
  • Tradeoff between a linear pipeline and parallelism
  • Overhead of parsing SOAP/XML headers

5
Related Work
  • Query optimization in the presence of limited
    access patterns
  • Binding pattern R(A^b, B^f): attribute A is bound,
    attribute B is free
  • Annotated query plans in the search space; prunes
    invalid and non-viable plans
  • Starts with initial set S of plans containing
    only atomic plans
  • S is iteratively updated by adding new plans
    obtained by combining plans from S using
    selection and join operations

6
Outline of the Talk
  • WSMS
  • Preliminaries
  • Query Optimization with and without precedence
    constraints
  • Data Chunking
  • Experimental Evaluation
  • Conclusion
  • Future work

7
WSMS Architecture
8
Query Model
  • Web service denoted as WSi(Xi^b, Yi^f)
  • Xi: bound attributes
  • Yi: free attributes

9
Query Model (Contd.)
10
Query Plans
11
Execution Model
  • A thread Ti is created for each web service WSi
  • Ti takes its input from a join thread Ji
  • Ji joins the outputs of the parents of WSi
  • Jout joins the outputs of all leaf web services
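
The thread-per-service idea above can be sketched for the simplest case, a linear pipeline, where each web service has exactly one parent and no join threads are needed. This is a minimal illustration, not the WSMS implementation: `service_thread`, `run_pipeline`, and the `DONE` sentinel are hypothetical names, and each "web service" is assumed to be a local callable that returns a list of output tuples.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of the tuple stream (assumption)

def service_thread(call, inbox, outbox):
    """Thread Ti: read tuples from inbox, invoke the service, forward results."""
    while True:
        item = inbox.get()
        if item is DONE:
            outbox.put(DONE)  # propagate end-of-stream downstream
            return
        for out in call(item):
            outbox.put(out)

def run_pipeline(services, inputs):
    """Run services as a linear pipeline, one thread per service."""
    queues = [queue.Queue() for _ in range(len(services) + 1)]
    threads = [
        threading.Thread(target=service_thread,
                         args=(svc, queues[i], queues[i + 1]))
        for i, svc in enumerate(services)
    ]
    for t in threads:
        t.start()
    for x in inputs:
        queues[0].put(x)
    queues[0].put(DONE)
    results = []
    while True:
        item = queues[-1].get()
        if item is DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

A general plan DAG would additionally need the join threads Ji and Jout described above, which this sketch leaves out.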

12
Execution Model (Contd.)
13
Statistics
  • Per-tuple response time (ci)
  • ci = 1/ri, where ri is the maximum rate at which
    results of invocations can be obtained from WSi
  • Depends on web service provisioning, network
    conditions, and load on the web service
  • Selectivity (si)
  • Average number of returned tuples, per input tuple,
    that remain unfiltered after applying predicates
  • si < 1 (selective) or si > 1 (proliferative)

14
Bottleneck Cost Metric
  • Query plan H
  • Pi(H): the set of predecessors of WSi in H
  • R[S]: the combined selectivity (the product of the
    selectivities) of all the web services in S
  • For every tuple in the input I to plan H, the average
    number of tuples that WSi needs to process is
    R[Pi(H)]
  • The average processing time required by WSi per
    original input tuple in I is R[Pi(H)] · ci
  • Cost of the query plan H:
  • cost(H) = max over i of (R[Pi(H)] · ci)
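
As a rough illustration of the metric above, the following sketch computes the bottleneck cost of a plan given as a DAG. The representation (a dict mapping each service to its set of direct parents) and all names and statistics are assumptions made for this example only.

```python
from math import prod

def all_predecessors(parents, ws):
    """Return every transitive predecessor of ws in the plan DAG."""
    seen = set()
    stack = list(parents[ws])
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(parents[p])
    return seen

def bottleneck_cost(parents, cost, sel):
    """Maximum over all services of R[Pi(H)] * ci."""
    return max(
        prod(sel[p] for p in all_predecessors(parents, ws)) * cost[ws]
        for ws in parents
    )

# Hypothetical statistics, chosen only for illustration.
cost = {"A": 2.0, "B": 10.0, "C": 5.0}
sel = {"A": 0.5, "B": 0.25, "C": 0.5}

chain = {"A": set(), "B": {"A"}, "C": {"B"}}     # A -> B -> C
parallel = {"A": set(), "B": set(), "C": set()}  # all fed directly from the input

print(bottleneck_cost(chain, cost, sel))
print(bottleneck_cost(parallel, cost, sel))
```

With these made-up numbers the chain is cheaper than parallel dispatch, because the expensive service B only sees the tuples that survive A.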

15
Bottleneck Cost Metric (Contd.)
  • Plan 1: max(2·I, 10·0.1·I, 5·0.5·I) = 2.5·I
  • Plan 2: max(2·I, 10·I, 5·5·I) = 25·I
  • Plan 2 is 10 times slower than Plan 1

16
Q.O without Precedence Constraints
  • Lemma: There exists an optimal plan that is a
    linear ordering of the selective web services,
    i.e., has no parallel dispatch of data.

17
Q.O without Precedence Constraints
  • Lemma: Let WS1, . . . , WSn be a plan with a
    linear ordering of the selective web services. If
    ci > ci+1, then WSi and WSi+1 can be swapped
    without increasing the cost of the plan.

18
Q.O without Precedence Constraints (Contd.)
  • Theorem: For selective web services with no
    precedence constraints, the optimal plan is a
    linear ordering of the web services by increasing
    response time, ignoring selectivities.
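
The theorem can be checked empirically on small inputs. The sketch below orders services by increasing cost and compares the result with a brute-force search over all linear orderings under the bottleneck metric. Function names are hypothetical, and the costs in the usage note are made up (the selectivities mirror those listed on the later experiment slide).

```python
from itertools import permutations

def linear_cost(order, cost, sel):
    """Bottleneck cost of a linear pipeline visiting services in `order`."""
    worst = 0.0
    passed = 1.0  # average tuples reaching the current service per input tuple
    for ws in order:
        worst = max(worst, passed * cost[ws])
        passed *= sel[ws]
    return worst

def order_by_cost(cost):
    """Order services by increasing per-tuple response time."""
    return sorted(cost, key=cost.get)

def brute_force_best(cost, sel):
    """Minimum bottleneck cost over every linear ordering (small n only)."""
    return min(linear_cost(p, cost, sel) for p in permutations(cost))
```

For example, with costs `{"WS1": 2.0, "WS2": 0.5, "WS3": 1.0, "WS4": 1.5}` and selectivities 0.4, 0.3, 0.2, 0.1, `order_by_cost` yields WS2, WS3, WS4, WS1, whose cost matches the brute-force minimum.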

19
Q.O with Precedence Constraints
  • Constructs the plan DAG H incrementally by
    greedily adding to it one web service at a time
  • Web service chosen should be the one that can be
    added to H with minimum cost, and all of whose
    prerequisite web services have already been added
    to H
  • Mi -- the set of all web services that are
    prerequisites for WSi

20
Adding a Web Service to the Plan
  • Given a partial plan H̄, add WSx
  • Compute the best cut Cx such that, on placing
    edges from the web services in Cx to WSx, the cost is
    minimized
  • PCx: the set of all the web services in Cx and all
    their predecessors in H̄
  • Cost incurred by adding WSx:
  • Cost(WSx) = R[PCx] · cx

21
Adding a Web Service (Contd.)
  • Associate a variable zi with every WSi, set to 1 if
    WSi belongs to PCx
  • The optimal set PCx is obtained by solving a linear
    programming (LP) problem
22
Greedy Algorithm
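
The slide's figure is not part of this transcript. As a stand-in, here is a minimal sketch of the greedy construction described on the preceding slides. It deliberately swaps the LP for a brute-force search over predecessor-closed subsets, which is exponential and only suitable for a handful of services. All function and parameter names are hypothetical.

```python
from itertools import combinations
from math import prod

def best_predecessor_set(x, preds, sel, prereqs):
    """Brute-force stand-in for the LP: pick the predecessor-closed set of
    already placed services, containing the prerequisites of x, that has
    the smallest combined selectivity."""
    placed = list(preds)
    required = prereqs.get(x, set())
    best_set, best_r = None, None
    for size in range(len(placed) + 1):
        for combo in combinations(placed, size):
            s = set(combo)
            if not required <= s:
                continue  # a prerequisite of x is missing
            if any(not preds[w] <= s for w in s):
                continue  # not closed under predecessors
            r_val = prod(sel[w] for w in combo)
            if best_r is None or r_val < best_r:
                best_set, best_r = frozenset(s), r_val
    return best_set, best_r

def greedy_plan(cost, sel, prereqs):
    """Add one service at a time, always the one that is cheapest to add.
    Returns the insertion order and each service's full predecessor set."""
    preds, order = {}, []
    remaining = set(cost)
    while remaining:
        best = None
        for x in sorted(remaining):
            if not prereqs.get(x, set()) <= set(preds):
                continue  # some prerequisite has not been placed yet
            pc, r_val = best_predecessor_set(x, preds, sel, prereqs)
            add_cost = r_val * cost[x]
            if best is None or add_cost < best[0]:
                best = (add_cost, x, pc)
        if best is None:
            raise ValueError("precedence constraints cannot be satisfied")
        _, x, pc = best
        preds[x] = pc
        order.append(x)
        remaining.remove(x)
    return order, preds
```

Calling `greedy_plan(cost, sel, {"WS3": {"WS1"}, "WS4": {"WS2"}})` with per-service cost and selectivity dicts returns an insertion order that respects both constraints, plus the predecessor set chosen for each service.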
23
Data Chunking
  • Each web service call incurs overhead: parsing
    SOAP/XML headers and network cost
  • Pass tuples to a web service in chunks
  • Response time of WSi depends on input chunk size
  • ci(k): response time of WSi on a chunk of size k
  • A limit ki^max exists on the maximum chunk size

24
Data Chunking (Contd.)
  • Query Optimizer must decide on optimal chunk size
    for each web service
  • The optimal chunk size to be used by WSi is ki
    such that ci(ki)/ki is minimized
  • Profiling is combined with query processing to
    try out various chunk sizes
  • Intermediate tuples between any two web services
    in the pipelined plan are buffered
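
The chunk-size rule above can be sketched as a selection over profiled measurements. The profile format (a dict from chunk size k to measured response time) and the function name are assumptions, and the numbers in the usage note are invented.

```python
def best_chunk_size(profile, k_max):
    """Pick the chunk size k <= k_max that minimizes per-tuple time c(k)/k.

    profile: dict mapping a tried chunk size k to its measured response time.
    """
    candidates = {k: t for k, t in profile.items() if 1 <= k <= k_max}
    if not candidates:
        raise ValueError("no profiled chunk size within the allowed limit")
    return min(candidates, key=lambda k: candidates[k] / k)
```

For a made-up profile `{1: 1.0, 10: 4.0, 50: 15.0, 100: 40.0}`, the per-tuple times are 1.0, 0.4, 0.3 and 0.4, so the function picks 50 when the limit is 100 and 10 when the limit is 20.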

25
Experimental Evaluation
  • Total running time as metric
  • Compare the plans produced by the optimizer against:
  • Parallel: dispatch data to the web services in parallel
  • SelOrder: choose the web service with lower selectivity first
  • Compare the running time with and without
    chunking
  • Compare the WSMS cost against the slowest web
    service

26
Experimental Setup
  • The WSMS prototype is a multithreaded system in Java
  • Apache Axis tools for communicating with web
    services
  • Java Reflection
  • Different costs obtained by varying delays
  • Different selectivities obtained by rejecting each
    tuple with probability 1 - si
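
The last two bullets describe how the experiments emulate cost and selectivity. A minimal sketch of that idea follows, in Python rather than the prototype's Java. `make_simulated_service` is a hypothetical helper, and it only covers selective services (si ≤ 1).

```python
import random
import time

def make_simulated_service(selectivity, delay=0.0, seed=None):
    """Return a callable that emulates a web service with a given
    selectivity (at most 1) and per-call delay in seconds."""
    rng = random.Random(seed)

    def call(tuple_):
        if delay:
            time.sleep(delay)  # emulate the response time
        # Keep the tuple with probability `selectivity`,
        # i.e. reject it with probability 1 - selectivity.
        return [tuple_] if rng.random() < selectivity else []

    return call
```

Over many calls, the fraction of tuples that pass through approaches the configured selectivity.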

27
No Precedence Constraints
  • WS1,WS2,WS3,WS4
  • Selectivities set as 0.4,0.3,0.2,0.1
  • Range of costs c varied from [0.2, 2] to [2, 2]
  • Parallel WS4
  • SelOrder WS4

28
Precedence Constraints
  • WS1,WS2,WS3,WS4
  • WS1 < WS3, WS2 < WS4 (WS1 must precede WS3, and WS2
    must precede WS4)
  • Selectivities: 2, 1, 0.1, 0.1
  • Uniform cost for WS1, WS2, WS3; cost of WS4 varied
    from 0.4 to 2

29
Data Chunking
  • WS1,WS2,WS3,WS4
  • No precedence constraints
  • Uniform cost
  • Selectivity set to 0.5
  • Web Services are arranged in linear pipeline
    (Optimizer)
  • Equal chunk size

30
WSMS Cost Vs Bottleneck Cost
  • No precedence constraints
  • Uniform web service costs
  • Selectivity set to 0.5
  • Web Services arranged in linear pipeline

31
Future Work
  • Different input tuples to follow different plans
  • Adaptive plans that change with response times
  • Web Services with monetary costs
  • Multiple web services for same data
  • Profiling techniques that track response time and
    selectivities
  • Caching Techniques at WSMS

32
Conclusion
  • Web Service Management System
  • Bottleneck cost: the cost metric for a pipelined plan
  • Optimal pipelined plan respecting precedence
    constraints
  • Optimal chunk size

33
References
  • U. Srivastava, J. Widom, K. Munagala, and R. Motwani.
    Query Optimization over Web Services.
  • Query optimization in the presence of limited
    access patterns. In Proc. of ACM SIGMOD Conf. on
    Management of Data

34
  • Thank You!