The Raindrop Engine: Continuous Query Processing (PowerPoint transcript)
1
The Raindrop Engine: Continuous Query Processing
  • Elke A. Rundensteiner
  • Database Systems Research Lab, WPI
  • 2003

2
Monitoring Applications
  • Monitor troop movements during combat and warn
    when soldiers veer off course
  • Send an alert when a patient's vital signs begin to deteriorate
  • Monitor incoming news feeds to see stories on
    Iraq
  • Scour network traffic logs looking for intruders

3
Properties of Monitoring Applications
  • Queries and monitors run continuously, possibly without end
  • Applications have varying service preferences
  • Patient monitoring wants only the freshest data
  • Remote sensors have limited memory
  • A news service wants maximal throughput
  • Taking 60 seconds to process vital signs and sound an alert may be too long

4
Properties of Streaming Data
  • Possibly never ending stream of data
  • Unpredictable arrival patterns
  • Network congestion
  • Weather (for external sensors)
  • Sensor moves out of range

5
DBMS Approach to Continuous Queries
  • Insert each new tuple into the database, encode
    queries as triggers MWA03
  • Problems
  • High overhead with inserts CCC02
  • Triggers do not scale well CCC02
  • Uses static optimization and execution strategies
    that cannot adapt to unpredictable streams
  • System is less utilized if data streams arrive
    slowly
  • No means to input application service
    requirements

6
New Class of Query Systems
  • CQ Systems emerged recently (Aurora, Stream,
    NiagaraCQ, Telegraph, et al.)
  • Generally work as follows
  • System subscribes to some streams
  • End users issue continuous queries against the streams
  • System returns the results to the user as a
    stream
  • All CQ systems use some adaptive techniques to
    cope with unpredictable streams

7
Overview of Adaptive Techniques in CQ Systems
  • Aurora CCC02: load shedding and batch tuple processing to reduce context switching. Goal: maintain high quality of service.
  • STREAM MWA03: adaptive scheduling algorithm (Chain) BBM03 and load shedding. Goal: minimize memory requirements during periods of bursty arrival.
  • NiagaraCQ CDT00: generates near-optimal query plans for multiple queries. Goal: efficiently share computation between multiple queries, a highly scalable system, maximize output rate.
  • Eddies AH00 (Telegraph): dynamically routes tuples among joins. Goal: keep the system constantly busy, improve throughput.
  • XJoin UF00: breaks the join into 3 stages and makes use of memory and disk storage. Goal: keep the join and the system running at full capacity at all times.
  • UF01: schedules the streams with the highest rate. Goal: maximize throughput to clients.
  • Tukwila IFF99, UF98: reorganizes query plans on the fly by using synchronization packets to tell operators to finish up their current work. Goal: improve ill-performing query plans.
8
The WPI Stream Project Raindrop
[Architecture diagram: the CAPE Runtime Engine contains a QoS Inspector, Operator Configurator, Operator Scheduler, Plan Migrator, Distribution Manager, Query Plan Generator, Execution Engine, Storage Manager and Stream Receiver. A Stream/Query Registration GUI accepts queries, a Stream Provider feeds the Stream Receiver, and results stream back to the user.]
9
Topics Studied in Raindrop Project
  • Bring XML into Stream Engine
  • Scalable Query Operators (Punctuations)
  • Cooperative Plan Optimization
  • Adaptive Operator Scheduling
  • On-line Query Plan Migration
  • Distributed Plan Execution

10
PART I XQueries on XML Streams
(Automaton Meets Algebra)
  • Based on CIKM03
  • Joint work with Hong Su and Jinhui Jian

11
What's Special about XML Stream Processing?

<Biditems>
  <book year="2001">
    <title>Dream Catcher</title>
    <author><last>King</last><first>S.</first></author>
    <publisher>Bt Bound</publisher>
    <price> 30 </price>
  </book>

Pattern Retrieval on Token Streams
12
Two Computation Paradigms
  • Automata-based: YFilter02, XScan01, XSM02, XSQ03, XPush03
  • Algebraic: Niagara00
  • The Raindrop framework intends to integrate both paradigms into one

13
Automata-Based Paradigm
  • Auxiliary structures for
  • Buffering data
  • Filtering
  • Restructuring

FOR $b IN stream("biditems.xml")//book
LET $p := $b/price, $t := $b/title
WHERE $p < 20
RETURN <Inexpensive> {$t} </Inexpensive>

[Automaton figure: states 1-4, with transitions on book, price and title tokens recognizing the patterns //book, //book/price and //book/title]
14
Algebraic Computation
FOR $b IN stream("biditems.xml")//book
LET $p := $b/price, $t := $b/title
WHERE $p < 20
RETURN <Inexpensive> {$t} </Inexpensive>

[Figure: XML tree of book elements (title, author with last and first, publisher, price, each with text leaves), and the operator Navigate $b, /title -> $t mapping <book>...</book> bindings $b to <title>...</title> bindings $t]
15
Observations
Automata paradigm vs. algebra paradigm:
  • Good for pattern retrieval on tokens / does not support token inputs
  • Needs patches for filtering and restructuring / good for filtering and restructuring
  • Presents all details at the same low level / supports multiple descriptive levels (declarative -> procedural)
  • Little studied as a query processing paradigm / well studied as a query processing paradigm
Either paradigm alone has deficiencies; the two paradigms
16
How to Integrate Two Paradigms
17
How to Integrate Two Models?
  • Design choices:
  • Extend the algebraic paradigm to support automata?
  • Extend the automata paradigm to support algebra?
  • Come up with a completely new paradigm?
  • We extend the algebraic paradigm to support automata:
  • Practical: reuse and extend existing algebraic query processing engines
  • Natural: present the details of automata computation at a low level, and the semantics of automata computation (target patterns) at a high level

18
Raindrop Four-Level Framework
[Figure: the four plan levels ordered by abstraction, from high (declarative) down to low (procedural): semantics-focused plan, stream logical plan, stream physical plan, stream execution plan]
19
Level I Semantics-focused Plan Rainbow-ZPR02
  • Express query semantics regardless of stored or
    stream input sources
  • Reuse existing techniques for stored XML
    processing
  • Query parser
  • Initial plan constructor
  • Rewriting optimization
  • Decorrelation
  • Selection push down

20
Example Semantics-focused Plan
FOR $b IN stream("biditems.xml")//book
LET $p := $b/price, $t := $b/title
WHERE $p < 20
RETURN <Inexpensive> {$t} </Inexpensive>

<Biditems>
  <book year="2001">
    <title>Dream Catcher</title>
    <author><last>King</last><first>S.</first></author>
    <publisher>Bt Bound</publisher>
    <price> 30 </price>
  </book>

21
Level II Stream Logical Plan
  • Extend semantics-focused plan to accommodate
    tokenized stream inputs
  • New input data format
  • contextualized tokens
  • New operators
  • StreamSource, Nav, ExtractUnnest, ExtractNest,
    StructuralJoin
  • New rewrite rules
  • Push-into-Automata

22
One Uniform Algebraic View
[Figure: one uniform algebraic view: the XML data stream feeds a token-based plan (automata plan), whose tuple stream feeds a tuple-based plan, which produces the query answer]
23
Modeling the Automata in the Algebraic Plan: Black Box (XScan01) vs. White Box
FOR $b IN stream("biditems.xml")//book
LET $p := $b/price, $t := $b/title
WHERE $p < 20
RETURN <Inexpensive> {$t} </Inexpensive>

[Figure: XScan as a black box computing $b = //book, $p = $b/price, $t = $b/title in one opaque operator]
24
Example Uniform Algebraic Plan
FOR $b IN stream("biditems.xml")//book
LET $p := $b/price, $t := $b/title
WHERE $p < 30
RETURN <Inexpensive> {$t} </Inexpensive>
Tuple-based plan
Token-based plan (automata plan)
25
Example Uniform Algebraic Plan
FOR $b IN stream("biditems.xml")//book
LET $p := $b/price, $t := $b/title
WHERE $p < 30
RETURN <Inexpensive> {$t} </Inexpensive>
Tuple-based plan
StructuralJoin $b
ExtractNest $b, $p
ExtractNest $b, $t
Navigate $b, /title -> $t
Navigate $b, /price -> $p
Navigate S1, //book -> $b
26
Example Uniform Algebraic Plan
FOR $b IN stream("biditems.xml")//book
LET $p := $b/price, $t := $b/title
WHERE $p < 30
RETURN <Inexpensive> {$t} </Inexpensive>
Tagger <Inexpensive>, $t -> $r
Select $p < 30
StructuralJoin $b
ExtractNest $b, $p
ExtractNest $b, $t
Navigate $b, /title -> $t
Navigate $b, /price -> $p
Navigate S1, //book -> $b
27
From Semantics-focused Plan to Stream Logical Plan
28
Level III Stream Physical Plan
  • For each stream logical operator, define how to
    generate outputs when given some inputs
  • Multiple physical implementations may be provided
    for a single logical operator
  • Automata details of some physical implementation
    are exposed at this level
  • Nav, ExtractNest, ExtractUnnest, Structural Join

29
One Implementation of Extract/Structural Join
[Figure: the automaton (states 1-4; Nav ., //book -> $b; Nav $b, /title -> $t; Nav $b, /price -> $p) runs over the token stream <biditems><book><title>Dream Catcher</title>...</book> and feeds the extracted <title>...</title> and <price>...</price> elements to ExtractNest $b, $t and ExtractNest $b, $p, which feed SJoin //book]
30
Level IV Stream Execution Plan
  • Describes the coordination between operators regarding when to fetch inputs:
  • When the input operator generates one output tuple
  • When the input operator generates a batch
  • When a time period has elapsed
  • The potentially unstable data arrival rates of streams make a fixed scheduling strategy unsuitable:
  • Delayed data under a fixed schedule may stall the engine
  • Bursty data outside the schedule may cause queue overflow

31
Raindrop Four-Level Framework (Recap)
  • Semantics-focused plan: expresses the semantics of the query regardless of input sources
  • Stream logical plan: accommodates tokenized input streams
  • Stream physical plan: describes how operators manipulate given data
  • Stream execution plan: decides the coordination among operators
32
Optimization Opportunities
33
Optimization Opportunities
  • Semantics-focused plan: general rewriting (e.g., selection push-down)
  • Stream logical plan: break-linear-navigation rewriting
  • Stream physical plan: choice among physical implementations
  • Stream execution plan: choice among execution strategies
34
From Semantics-focused to Stream Logical Plan In
or Out?
[Figure: pattern retrieval can remain in the tuple-based, semantics-focused plan or be pushed into the token-based automata plan by applying the push-into-automata rewrite; either way the XML data stream enters at the bottom and the query answer leaves at the top]
35
Plan Alternatives
36
Experimentation Results
37
Contributions Thus Far
  • Combined automata and algebra based paradigms
    into one uniform algebraic paradigm
  • Provided four layers in algebraic paradigm
  • Query semantics expressed at high layer
  • Automata computation on streams hidden at low
    layer
  • Supported optimization in an iterative manner (from high
    abstraction level to low abstraction level)
  • Illustrated enriched optimization opportunities
    by experiments

38
On-Going Issues To be Tackled
  • Exploit XML schema constraints for query
    optimization
  • Costing/query optimization of plans
  • On-the-fly migration into/out of automaton
  • Physical implementation strategies of operators
  • Load-shedding from an automaton

39
PART II On-line Query Plan Migration
  • Joint work with Yali Zhu

40
Motivation for Migration
  • An initially good query plan may become less effective over time, due to:
  • Changes in stream data distributions
    (selectivity)
  • Changes in data arrival rates (operator overload)
  • Addition of new queries/de-registering of
    existing queries
  • Availability of resources allocated to query
    evaluation
  • Changes in quality of service requirements

41
A Simple Motivating Example
42
On-line Plan Optimization
  • Detection of Suboptimality in Plan
  • Query Optimization via Plan Rewriting
  • On-line Migration of Subplan

43
Related Work
  • Efficient mid-query re-optimization of sub-optimal query
    execution plans, Kabra and DeWitt, SIGMOD 1998. (Only
    re-optimizes the part of the query not yet started executing)
  • On reconfiguring query execution plans in distributed
    object-relational DBMS, K. Ng, 1998 ICPADS. (plan cloning)
  • Continuously Adaptive Continuous Queries over
    Streams, S. Madden, J. Hellerstein, UC Berkeley.
    ACM SIGMOD 2002.

44
Focus on Dynamic Migration
  • Given a better plan or sub-plan
  • Dynamically migrate running plan to given plan.
  • Guarantee correctness of results:
  • No missing results
  • No duplicate results
  • No incorrect results


45
Join Algorithm: a Stateful Operator
  • Symmetric NLJ (nested-loop join)
  • For each new A tuple:
  • Purge State B using the time-based window constraint W
  • Join the new tuple with the tuples in State B
  • Output the results to the output queue
  • Put the new tuple into State A

[Figure: join node AB with input queues A and B, State A = {a1, a2, a3}, State B = {b1, b2}, and output queue AB = {a1b1, a1b2, a2b1, a2b2, a3b2}]
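The per-tuple join steps listed above (purge, probe, output, insert) can be sketched in a few lines of Python. This is a minimal illustration assuming timestamped tuples and a time-based window, not Raindrop's actual implementation:

```python
from collections import deque

def process(new_val, now, own_state, other_state, window, out):
    # Purge the opposite state using the time-based window constraint W:
    # drop tuples older than `window` relative to the current time.
    while other_state and now - other_state[0][1] > window:
        other_state.popleft()
    # Join the new tuple with every tuple surviving in the other state.
    for other_val, _ in other_state:
        out.append((new_val, other_val))
    # Insert the new tuple, tagged with its timestamp, into its own state.
    own_state.append((new_val, now))

state_a, state_b, out = deque(), deque(), []
process("a1", 0, state_a, state_b, window=5, out=out)
process("b1", 1, state_b, state_a, window=5, out=out)  # joins with a1
process("a2", 7, state_a, state_b, window=5, out=out)  # b1 has expired
```

Note that purging happens lazily, driven by arrivals on the opposite stream; this is why old states cannot simply be discarded during migration, as the next slide explains.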
46
So what's the problem with migration?
  • Old states in the old plan still need to join with future
    incoming tuples, so they cannot be discarded.
  • New tuples arrive randomly and continuously in a streaming
    system.

[Figure: old query plan: node AB (State A = {a1, a2}, State B = {b1, b2}) feeds node ABC (State AB = {a1b1, a2b1}, State C = {c1, c2, c3}); intermediate tuples a2b1 and a2b2 are still in flight]
47
Box Concept
  • Migration Unit Box
  • Old box contains an old plan or sub-plan
  • New box contains a new plan or sub-plan
  • Two equivalent boxes
  • Have the same input queues
  • Have the same output queues
  • Contain semantically equivalent sub-plans
  • Can be migrated from one to another

48
But How?
  • Proposal of two Migration Strategies
  • Moving state strategy
  • Parallel track strategy
  • Comparison via Cost models
  • Experimental Evaluation

49
Parallel Track Strategy
  • The new plan and the old plan co-exist during migration
  • They run in parallel
  • They share input queues and output queues
  • Window constraints are used to eventually time out the old states
  • When that happens, the migration stage is over
  • Discard the old plan and run only the new plan

50
A Running Example
[Figure: parallel track migration example: the old box (AB then ABC, with States A, B, AB and C) and the new box (BC then ABC, with States B, C, BC and A) run side by side, sharing input queues A, B, C and output queue ABC; outputs such as a1b1c1, a2b1c1, a2b2c2, a3b1c1 and a3b2c2 are produced while the window constraint W times out old tuples]
51
Pros and Cons
  • We don't need to halt the system in order to migrate
  • Low delay in generating results
  • Overhead: in the old plan part, an all-new tuple pair is
    discarded only at the last node

52
Moving State Strategy
  • First, freeze the inputs and drain out the old plan
  • Then, establish and connect the new plan
  • Next, move all old states over to the new states
  • Lastly, let all new input data go to the new plan only
  • Resume processing

53
Moving State Strategy
  • State matching: compare the states of the old and new plans
  • State moving: if two states match, move the tuples over
  • State re-computation: if there is no match, recompute the
    state
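A tiny sketch of the three steps, under the simplifying assumption that each state is keyed by the set of stream names it covers; the keys and the recompute helper are illustrative, not Raindrop's API:

```python
def migrate_states(old_states, new_keys, recompute):
    """Moving state strategy: match states by key, move matches,
    and re-compute the rest."""
    new_states = {}
    for key in new_keys:
        if key in old_states:
            # State matching succeeded: move the matched state as-is.
            new_states[key] = old_states[key]
        else:
            # No match: re-compute the state from the old plan's contents.
            new_states[key] = recompute(key, old_states)
    return new_states

# Old plan (A join B) join C kept states A, B, AB and C.
old = {"A": ["a1", "a2"], "B": ["b1"], "AB": ["a1b1"], "C": ["c1"]}
# New plan A join (B join C) needs states A, B, C and BC; only BC is missing.
new = migrate_states(old, ["A", "B", "C", "BC"],
                     lambda key, s: [b + c for b in s["B"] for c in s["C"]])
```

Here only the intermediate state BC has to be re-computed; all leaf states are matched and moved, which is exactly where the extra cost of this strategy comes from.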

54
Abstract Description
[Figure: matching states between the old plan (node AB with States A and B; node ABC with States AB and C) and the new plan (node BC with States B and C; node ABC with States A and BC)]
55
Intermediate States Sharing
  • We can share the intermediate state BC only if:
  • The inputs for both plans are exactly the same
  • Tuples arriving at the same state have passed exactly the
    same predicates
  • Both conditions must hold for any sharing to be possible

56
Moving State Strategy
[Figure: moving state migration example: matched states (A, B, C) are moved from the old boxes to the new ones; the unmatched State BC is re-computed from States B and C (b1c1, b2c2); the old boxes and their partial results (marked X) are discarded; outputs a3b1c1, a2b2c2 and a3b2c2 go to output queue ABC]
57
Why We Need Two Pointers
  • Two nodes sharing the same state may have different contents
  • Each node keeps two pointers into each associated state:
  • First points to the first tuple in the state for that node
  • Last points to the last tuple in the state for that node

[Figure: nodes AB and BC of the two plans point into a shared State B = {b1, b2, b3, b4} with their own first/last pointers]
58
Compare the Two Migration Algorithms
  • Different distribution of tasks between old and
    new plans

A B C   Plan used in strategy 1   Plan used in strategy 2
O O O   Old                       Old
O O N   Old                       New
O N O   Old                       New
N O O   Old                       New
N N O   Old                       New
N O N   Old                       New
O N N   Old                       New
N N N   New                       New

N = new tuple, arriving after the migration start time.
O = old tuple, arriving before the migration start time.
New = the new query plan computes the result.
Old = the old query plan computes the result.
59
Which algorithm performs better?
  • Compare the performances: which one is faster? More cost-saving?
  • The new query plan should outperform the old query plan
  • In algorithm 1 (parallel track), the old plan part deals with
    7 out of 8 cases
  • In algorithm 2 (moving state), the new plan part deals with
    7 out of 8 cases
  • Algorithm 2 seems to be winning
  • However, algorithm 2 needs extra cost to re-compute
    intermediate states
  • So cost models are needed!

60
Cost Model Assumptions
  • All joins are binary NL joins
  • Assume we already know the statistics of each node in the
    query plan:
  • Input arrival rate
  • Join node selectivity
  • We compute the processing power needed in the period of time
    during which migration happens,
  • not the real power that the system used.
  • The two differ because of resource limitations in a real
    system.

61
Cost Model Assumptions (cont.)
  • Assume tuple processing time is the same for tuples of
    different sizes.
  • Assume that when migration starts, the old query plan has
    passed its start-up stage and is fully running:
  • States are at their maximum size, controlled by the window
    constraints.
  • Window constraints are
  • Time-based.
  • The same over all streams in a join.

62
Running Example Revisited
[Figure: old query plan (A join B) join C vs. new query plan A join (B join C)]
63
Symbol Definitions
  • λa, λb, λc: the average arrival rates (tuples/time unit) on
    streams A, B and C.
  • σab, σabc_old: the selectivities of nodes AB and ABC in the
    old query plan.
  • σbc, σabc_new: the selectivities of nodes BC and ABC in the
    new query plan.
  • W: the window constraint over all joins, time-based, for
    example 5 time units.
  • t: any t time units after migration has started.
  • Cj: cost of joining a pair of tuples, including the cost of
    accessing the 2 tuples, comparing their values, and so on.
  • Cs: cost of inserting/deleting a tuple to/from a state.
  • Ct: cost of accessing a tuple, for example checking a
    tuple's timestamp.
  • |input_name|: the size of the state for one input of a node.
    For example, |A| represents the size of State A in node AB,
    and |A| = λa·W.
64
Cost Model for Algorithm I, old plan part
  • Cost for node AB in the old plan
  • The state starts full
  • C_AB = (cost of purge + cost of insert + cost of join) · t
  •      = [Cs·(λa + λb) + Cs·(λa + λb) + Cj·(λa·|B| + λb·|A|)] · t
  •      = [2·Cj·λa·λb·W + 2·Cs·(λa + λb)] · t    --- formula (1)
  • λ_AB = (λa·|B| + λb·|A|)·σab = 2·λa·λb·W·σab    --- formula (2)
  • N_AB = 2·λa·λb·W·σab·t
  • Apply the same formulas to each node in the old query plan.
  • Adding up the cost of each node gives the total cost.
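As a sanity check, formulas (1) and (2) can be evaluated directly; the statistics below are made-up sample values, chosen only to show the arithmetic:

```python
la, lb = 2.0, 3.0        # λa, λb: arrival rates (tuples / time unit)
W = 5.0                  # time-based window constraint
Cj, Cs = 0.01, 0.001     # per-pair join cost, per-tuple state maintenance cost
sigma_ab = 0.1           # selectivity of node AB
t = 10.0                 # time units since migration started

# Formula (1): cost of node AB in the old plan over t time units.
C_AB = (2 * Cj * la * lb * W + 2 * Cs * (la + lb)) * t   # = 6.1
# Formula (2): output rate of node AB, and tuples produced by time t.
lambda_AB = 2 * la * lb * W * sigma_ab                   # = 6.0 tuples / time unit
N_AB = lambda_AB * t                                     # = 60 tuples
```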

[Figure: old plan nodes AB (State A = {a1, a2}, State B = {b1, b2}) and ABC (State AB = {a1b1, a2b1, a2b2}, State C = {c1, c2, c3})]
65
Cost Model for Algorithm I, new plan part
  • Cost for node BC in the new plan
  • The state starts empty, so no purge is needed while t < W.
  • C_BC = join cost + insert cost
  •      = t²·λb·λc·Cj + (λb·t + λc·t)·Cs, where t < W    --- formula (3)
  • In the i-th time unit after the migration start time, the
    output rate is
  •      λi = (2i − 1)·λb·λc·σbc
  • The total number of tuples generated in any time t (t < W) is
  •      N_BC = Σ_{i=1..t} λi = t²·λb·λc·σbc    --- formula (4)
  • Apply the same formulas to each node in the new plan.

[Figure: new plan nodes BC (States B and C) and ABC (States A and BC)]
66
Cost Model for Algorithm II
  • Extra cost is needed for computing the new states.
  • For the running example, we need to compute state BC. The
    extra cost is
  •      C_stateBC = Cj·|B|·|C| = Cj·(λb·W)·(λc·W) = W²·λb·λc·Cj
  • We can apply formulas (1) and (2) to compute the cost of each
    node in the old plan,
  • because all states are full after the new states are
    re-computed.
  • Adding the two together gives the total cost of algorithm II.

67
Analysis on Cost Models
  • Several parameters control the performance of the two
    migration algorithms:
  • Arrival rate
  • Join node selectivity
  • Window size
  • Time
  • Costs may not be linear in time,
  • as in the new plan part of algorithm I
  • Total migration time largely depends on the window size
  • Experiments are designed by varying those parameters.

68
(No Transcript)
69
(No Transcript)
70
Some remaining challenges
  • Alternate Migration Strategies
  • Selection of Box Sizes
  • Dynamic Optimization and Migration
  • Comparison Study to Eddies

71
PART III Adaptive Scheduler Selection
Framework
  • Joint work with Brad Pielech and Tim Sutherland

72
Idea
  • Propose a novel adaptive selection of scheduling strategies
  • Observations:
  • The scheduling algorithm has a large impact on the behavior
    of the system
  • Utilizing a single scheduling algorithm to execute a
    continuous query is not sufficient, because all scheduling
    algorithms have inherent flaws or tradeoffs
  • Hypothesis:
  • Adaptively choosing the next scheduling strategy can leverage
    the strengths and weaknesses of each and outperform any
    single strategy

73
Continuous Query Issues
  • The arrival rate of data is unpredictable.
  • The volume of data may be extremely high.
  • Different domains may have different service requirements.
  • A scheduling strategy such as Round Robin or FIFO is designed
    to solve one particular scheduling problem (minimize memory,
    maximize throughput, etc.).
  • What happens if we have multiple problems to solve?

74
Scheduling Example
s = tuples output / tuples input (operator selectivity)
C = operator processing cost, in time units
  • Operator 2 is the quickest and most selective
  • Operator 1 is the slowest and least selective
  • Every 1 time unit, a new tuple arrives in Q1, starting at t = 0
  • When told to run, an operator processes at most 1 tuple; any
    extra is left in its queue for a later run
  • An operator takes C time units to process its input,
    regardless of the size of the input
  • An operator's output size = s × number of input tuples; if an
    operator inputs 1 tuple and s = 0.9, it outputs 0.9 tuples
  • Assume zero time for context switches
  • Assume all tuples are the same size

75
Scheduling Example II
  • Two scheduling strategies:
  • FIFO: start at the leaf and process the newest tuple until
    completion: 1, 2, 3, 1, 2, 3, etc.
  • Greedy: schedule the operator with the most tuples in its
    input buffer: 1, 1, 1, 2, 1, 1, 1, 2, 3, ...

We will compare throughput and total queue sizes for both
algorithms over the first few time units.
76
Scheduling Example: FIFO
FIFO: start at the leaf and process the newest tuple until
completion.

[Figure: chain Stream -> Operator 1 (s = 0.9, C = 1) -> Operator 2 (s = 0.1, C = 0.25) -> Operator 3 (s = 1, C = 0.75) -> End User]

  • FIFO's queue size grows very quickly.
  • It spends 1 time unit processing Operator 1, then 1 time unit
    processing Operators 2 and 3.
  • During these 2 time units, 2 new tuples arrive in Operator
    1's queue.

FIFO outputs 0.09 tuples to the end user every 2 time units.
77
Scheduling Example: Greedy
  • Greedy: schedule the operator with the largest input queue.

[Figure: same chain as before: Stream -> Operator 1 (s = 0.9, C = 1) -> Operator 2 (s = 0.1, C = 0.25) -> Operator 3 (s = 1, C = 0.75) -> End User]

  • Greedy's queue size grows at a slower rate because Operator 1
    is run more often
  • But tuples remain queued for long periods in Operators 2 and
    3, until their queue sizes become larger than Operator 1's
  • Greedy finally outputs a tuple at about t = 16
  • At t = 16, Greedy outputs 1 tuple; by then, FIFO has output
    0.72 (0.09 × 8) tuples
78
Scheduling Example: Wrap-up
  • FIFO
  • + Outputs tuples at regular intervals
  • - Q1 grows very quickly
  • - Output rate is low (0.045 tuples / time unit)
  • - Does not utilize operators fully: O1 runs with 1 tuple per
    run, O2 with 0.9, O3 with 0.09; the max is 1 tuple
  • Greedy
  • + Queue sizes grow less quickly than FIFO's
  • + Output rate is higher (0.0625 tuples / time unit)
  • + More fully utilizes operators: each operator runs with 1
    tuple each time
  • - Some tuples stay in the system for a long time
  • - Long delay before any tuples are output
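The FIFO figures above can be reproduced in a few lines (Greedy's 0.0625 rate depends on simulating the full schedule, so only FIFO is checked here):

```python
# Operator chain from the example: (selectivity s, processing cost C).
ops = [(0.9, 1.0), (0.1, 0.25), (1.0, 0.75)]

# FIFO pushes one input tuple through the whole chain per cycle.
cycle_time = sum(cost for _, cost in ops)  # 1 + 0.25 + 0.75 = 2 time units
out_per_cycle = 1.0
for s, _ in ops:
    out_per_cycle *= s                     # 1 * 0.9 * 0.1 * 1 = 0.09 tuples
fifo_rate = out_per_cycle / cycle_time     # 0.045 tuples per time unit
```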

79
So? What is our point?
A single scheduling strategy is NOT sufficient when dealing with
varying input stream rates, data volumes and service
requirements!
80
New Adaptive Framework
  • In response to this need, we propose a novel technique that
    selects among several scheduling strategies based on current
    system conditions and quality-of-service requirements

81
Adaptive Framework
  • Choosing among several scheduling strategies can leverage
    the strengths of each strategy and minimize the use of a
    strategy when it would not perform as well.
  • Allowing a user to input service requirements means that the
    CQ system can adapt to the user's needs, not to a static set
    of needs fixed by the CQ system.

82
Quality of Service Preferences
  • Each application can specify its service requirements as a
    weighted list of behaviors to be maximized or minimized as
    desired.
  • Assumptions / restrictions:
  • One (global) preference set at a time
  • Preferences can change during execution.
  • Only relative behavior can be specified.

83
Service Preferences II
  • Input: three parameters
  • Metric: any statistic that is calculated by the system.
  • Quantifier: maximize or minimize the given metric.
  • Weight: the relative weight / importance of this metric. The
    sum of all weights is exactly 1.
  • Currently supported metrics:
  • Throughput
  • Queue sizes
  • Delay

84
Adaptive Selection Overview
  • Given a table of service preferences and a set of candidate
    scheduling algorithms:
  • Initially run each candidate algorithm once, in order to
    gather some statistics about its performance
  • Assign a score to each algorithm based on how well it has met
    the preferences relative to the other algorithms (score
    formulas on the next slide)
  • Choose the scheduling algorithm that will best meet the
    preferences, based on how the algorithms have performed thus
    far
  • Run that algorithm for a period of time and record statistics
  • Repeat steps 2-4 until the query or the streams are over

85
Adaptive Formulas
Zi: the normalized statistic for preference i
I: the number of preferences
H: the historical category
decay: the decay factor
A scheduler's score is computed by summing, over all the
statistics defined by the user, the normalized statistic score
times the weight of that statistic.
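In other words, score = Σi wi·Zi over the user's preferences. A minimal sketch; the handling of "minimize" by flipping the normalized value, and the blending with history via the decay factor, are assumptions for illustration:

```python
def score(stats, prefs, history=0.0, decay=0.0):
    """Weighted sum of normalized statistics, optionally blended
    with a historical score via the decay factor (assumed rule)."""
    s = 0.0
    for metric, (quantifier, weight) in prefs.items():
        z = stats[metric]                 # Zi, normalized to [0, 1]
        if quantifier == "minimize":      # assumption: invert for minimize
            z = 1.0 - z
        s += weight * z
    return decay * history + (1.0 - decay) * s

prefs = {"throughput": ("maximize", 0.6), "delay": ("minimize", 0.4)}
stats = {"throughput": 0.8, "delay": 0.25}   # normalized observations
s = score(stats, prefs)                      # 0.6*0.8 + 0.4*0.75 = 0.78
```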
86
Choosing Next Scheduling Strategy
  • Once the schedulers' scores have been calculated, the next
    strategy has to be chosen.
  • All strategies should be explored initially, so the framework
    can learn how each will perform
  • Strategies should be rotated periodically, because a strategy
    that did poorly before could be viable now
  • Remember: the score of the last-run algorithm is not updated;
    only the other candidates have their scores updated
  • Roulette wheel MIT99:
  • Chooses the next algorithm with a probability proportional to
    its score
  • Assigns an initial score to each strategy so that each will
    have a chance to run
  • Favors the better-scoring algorithms, but will
    still pick others.
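Roulette wheel selection can be sketched as follows; the strategy names and scores are illustrative:

```python
import random

def roulette_pick(scores, rng=random):
    """Roulette wheel selection: pick a strategy with probability
    proportional to its score."""
    total = sum(scores.values())
    r = rng.uniform(0.0, total)
    cumulative = 0.0
    for name, s in scores.items():
        cumulative += s
        if r <= cumulative:
            return name
    return name  # guard against floating-point round-off at the top end

# Scores produced by the scoring step (illustrative values).
scores = {"FIFO": 0.2, "Greedy": 0.5, "RoundRobin": 0.3}
choice = roulette_pick(scores)
```

With these scores, "Greedy" is picked about half the time, yet "FIFO" still gets a chance, which is exactly the explore/exploit balance described above.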

87
Experiments
  • 3 parameters
  • Number of streams
  • Arrival Pattern
  • Number of service preferences

88
Experiment Setup
  • 5 different query plans
  • Select and window-join operators
  • Incoming streams use simulated data with a Poisson arrival
    pattern; the mean arrival time is altered to control
    burstiness
  • The goal is to show that the adaptive strategy meets the
    preferences better than a single algorithm; if not, the
    technique is not worthwhile

89
Single Stream Result (2 Requirements)
The adaptive strategy performs as well as PTT in
this environment
90
Single Stream Result (3 Requirements)
91
Multi Stream Result (2 Requirements)
The Adaptive Framework performs as well as, if not better than,
both individual scheduling algorithms under differing service
requirements.
92
Multi Stream Result (3 Requirements)
93
Related Work Comparison
  • Aurora CCC02: more complex QoS model; makes use of alternate adaptive techniques
  • STREAM MWA03: only meets the memory requirement
  • NiagaraCQ CDT00: only adapts prior to query execution; concerned more with generating optimal query plans
  • Eddies AH00 (Telegraph): finer-grained adaptive strategy; we look to incorporate it in the future
  • XJoin UF00: finer-grained technique; we look to incorporate it in the future
  • UF01: only focuses on maximizing rate; uses a single adaptive strategy
  • Tukwila IFF99, UF98: reorganizes plans on the fly; we look to incorporate this in the future
94
Conclusions
  • Identified a gap in existing CQ research and proposed a
    novel adaptive technique to address it.
  • Draws on genetic algorithms and AI research
  • Alters the scheduling algorithm based on how well the
    execution is meeting the service preferences
  • The adaptive strategy is showing promising experimental
    results:
  • Never performs worse than any single strategy
  • Often performs as well as the best strategy, and sometimes
    outperforms it
  • Adapts to varying user environments without manually changing
    scheduling strategies

95
Overall Blizz
  • Many interesting problems arise in this new
    stream context
  • There is room for lots of fun research

96
http://davis.wpi.edu/dsrg/raindrop/
Project Overview, Publications, Talks
Email: raindrop@cs.wpi.edu