Title: XJoin: A Reactively-Scheduled Pipelined Join Operator
1. XJoin: A Reactively-Scheduled Pipelined Join Operator
- IEEE Data Engineering Bulletin, 2000
- by Tolga Urhan and Michael J. Franklin
- Based on a talk prepared by Asima Silva and Leena Razzaq
2. Goal of XJoin
- Efficiently evaluate equi-joins in online query processing over distributed data sources
- Optimization objectives:
  - Small memory footprint
  - Fast initial result delivery
  - Hiding intermittent delays in data arrival
3. Outline
- Hash Join History
- Motivation of XJoin
- Challenges in Developing XJoin
- Three Stages of XJoin
- Preventing Duplicates
- Experimental Results
- Conclusion
4. Classic Hash Join
- Two phases: build, then probe
- Only one table is hashed in memory
[Figure: 1. Build, 2. Probe]
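The two-phase build/probe scheme can be sketched in a few lines of Python (a simplified illustration; function and key names are hypothetical):

```python
def classic_hash_join(build_rows, probe_rows, build_key, probe_key):
    """Two-phase hash join: build an in-memory hash table on one input,
    then probe it with each tuple of the other input."""
    # Phase 1: build -- hash every tuple of the build input on its key.
    table = {}
    for row in build_rows:
        table.setdefault(row[build_key], []).append(row)
    # Phase 2: probe -- look up each probe tuple; emit matching pairs.
    out = []
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            out.append((match, row))
    return out
```

Note that no output can be produced until the build phase has consumed its entire input, which is exactly what the pipelined variants below avoid.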
5. Hybrid Hash Join
- One table is hashed both to memory and to disk (in partitions)
- G. Graefe, Query Evaluation Techniques for Large Databases. ACM Computing Surveys, 1993.
6. Symmetric Hash Join (Pipelined)
- Both tables are hashed (both kept in main memory only)
- A. Wilschut, P. M. G. Apers, Dataflow Query Execution in a Parallel Main-Memory Environment, DPD 1991.
[Figure: sources R and S each feed an in-memory hash table; each arriving tuple probes the other table and matches go to OUTPUT]
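The pipelined behavior can be sketched as follows (a minimal illustration, not the paper's implementation; the interleaving of the two inputs simulates tuples arriving from either source):

```python
from itertools import zip_longest

def symmetric_hash_join(stream_r, stream_s, key):
    """Pipelined symmetric hash join sketch: both inputs are hashed in
    memory; each arriving tuple first probes the other side's table,
    then is inserted into its own, so matches stream out immediately."""
    tables = {"R": {}, "S": {}}
    other = {"R": "S", "S": "R"}
    results = []
    # Interleave the two inputs to mimic tuples arriving from either side.
    arrivals = []
    for r, s in zip_longest(stream_r, stream_s):
        if r is not None:
            arrivals.append(("R", r))
        if s is not None:
            arrivals.append(("S", s))
    for side, row in arrivals:
        k = row[key]
        for match in tables[other[side]].get(k, []):
            # Emit pairs ordered as (R-tuple, S-tuple).
            results.append((row, match) if side == "R" else (match, row))
        tables[side].setdefault(k, []).append(row)
    return results
```

Because every tuple of both inputs is retained in memory, state grows without bound, which is the problem the next slide raises.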
7. Problems of SHJ
- Memory intensive:
  - Won't work for large input streams
  - Won't allow many joins to be processed in a pipeline (or even in parallel)
8. New Problem in Online Query Processing over Distributed Data Sources
- Unpredictable data access due to link congestion, load imbalance, etc.
- Three classes of delays:
  - Initial delay: the first tuple arrives from a remote source more slowly than usual
  - Slow delivery: data arrives at a constant, but slower than expected, rate
  - Bursty arrival: data arrives in a fluctuating manner
9. Question
- Why are delays undesirable?
  - They prolong the time to first output
  - They slow processing if we wait for data to arrive before acting
  - If data arrives too fast, we want to avoid losing any of it
  - We waste time sitting idle while no data is coming
  - Delays are unpredictable, so one single strategy won't work
10. Motivation of XJoin
- Produce results incrementally when available
  - Tuples are returned as soon as they are produced
- Allow progress to be made when one or more sources experience delays
  - Background processing is performed on previously received tuples, so results are produced even when both inputs are stalled
11. XJoin Design
- Tuples are stored in partitions (as in hash join), each with:
  - A memory-resident (m-r) portion
  - A disk-resident (d-r) portion
13. Challenges in Developing XJoin
- Manage the flow of tuples between memory and secondary storage (when and how to do it)
- Control background processing when inputs are delayed (the reactive scheduling idea)
- Ensure the full answer is produced
- Ensure duplicate tuples are not produced
- Provide both a quick initial result and good overall throughput
14. XJoin Stages
- XJoin proceeds in 3 stages (separate threads):
  - Stage 1: memory-to-memory (M-M)
  - Stage 2: memory-to-disk (M-D)
  - Stage 3: disk-to-disk (D-D)
15. 1st Stage: Memory-to-Memory Join
[Figure: tuples arriving from SOURCE-A and SOURCE-B are joined through their memory-resident partitions]
16. 1st Stage: Memory-to-Memory Join
- Join processing continues as long as:
  - Memory permits, and
  - One of the inputs is producing tuples
- If memory is full, one partition is picked to be flushed to disk and appended to the end of its disk-resident portion
- If there is no new input, stage 1 blocks and stage 2 starts
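The per-tuple work of the first stage can be sketched as below. This is a simplified model, not the paper's code: the partition count, the tuple-count memory budget, and the "flush the fullest partition" victim choice are all stand-in assumptions.

```python
NUM_PARTITIONS = 4  # hypothetical; the real operator tunes this

class Partition:
    """One XJoin partition: a memory-resident and a disk-resident portion."""
    def __init__(self):
        self.memory = []  # m-r portion
        self.disk = []    # d-r portion (disk simulated as a list)

def stage1_insert(tup, key, own, other, results, budget):
    """Stage-1 step for one arriving tuple: probe the other source's
    m-r partition, insert into our own, and flush the fullest partition
    to 'disk' when the memory budget is exceeded."""
    p = hash(tup[key]) % NUM_PARTITIONS
    for m in other[p].memory:
        if m[key] == tup[key]:
            results.append((tup, m))
    own[p].memory.append(tup)
    if sum(len(q.memory) for q in own + other) > budget:
        victim = max(own + other, key=lambda q: len(q.memory))
        victim.disk.extend(victim.memory)  # append to end of d-r portion
        victim.memory.clear()
```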
17. Why Stage 1?
- In-memory operations are much faster and cheaper than on-disk operations, guaranteeing that results are produced as soon as possible.
18. Question
- What does the 2nd stage do? When does it start?
- Hint:
  - What happens when the data input (tuples) is too large for memory?
- Answer:
  - The 2nd stage joins memory to disk
  - It starts when both inputs are blocked
19. Stage 2
20. 2nd Stage: Memory-to-Disk Join
- Activated when the 1st stage is blocked
- Performs 3 steps:
  1. Choose a partition from one source, according to its throughput and size
  2. Use tuples from its d-r portion to probe the m-r portion of the other source, outputting matches, until the d-r portion is completely processed
  3. Check whether either input has resumed producing tuples. If yes, resume the 1st stage; if no, choose another d-r portion and continue the 2nd stage.
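Steps 1 and 2 can be sketched as follows. This is a sketch under simplifying assumptions: d-r tuple count stands in for the paper's throughput/size criterion, and the timestamp-based duplicate checks are omitted.

```python
def choose_partition(partitions):
    """Pick the d-r portion expected to be most productive right now;
    tuple count stands in for XJoin's throughput/size criterion."""
    return max(partitions, key=lambda p: len(p["disk"]))

def stage2_probe(partition, other_memory, key):
    """Probe the other source's m-r portion with every tuple of the
    chosen d-r portion, emitting matches (duplicate checks omitted)."""
    return [(d, m) for d in partition["disk"] for m in other_memory
            if d[key] == m[key]]
```

Only after `stage2_probe` has exhausted the chosen d-r portion does the operator check whether either input has resumed, which is the overhead discussed on the next slide.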
21. Controlling the 2nd Stage
- The cost of the 2nd stage is hidden when both inputs experience delays
- Tradeoff?
  - Benefits of using the second stage:
    - Produces results while the input sources are stalled
    - Tolerates varying input rates
  - Disadvantage:
    - The second stage must complete a d-r portion before checking for new input (overhead)
- To address the tradeoff, use an activation threshold:
  - Pick a partition likely to produce many tuples right now
22. 3rd Stage: Disk-to-Disk Join
- The clean-up stage
- Assumes that all data for both inputs has arrived
- Assumes that the 1st and 2nd stages have completed
- Why is this stage necessary?
  - Completeness of the answer: make sure all result tuples are produced
  - Reason: some tuples in the disk-resident portions may never have had a chance to join each other
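The clean-up pass amounts to joining the two full disk-resident portions while skipping pairs the earlier stages already emitted. In this sketch a set of tuple-id pairs stands in for XJoin's timestamp-based duplicate test (described on the following slides); the `id` field is a hypothetical tuple identifier.

```python
def stage3_cleanup(disk_a, disk_b, key, already_joined):
    """Clean-up join over both full disk-resident portions, skipping
    pairs the earlier stages produced; a set of (id, id) pairs stands in
    for XJoin's timestamp-based duplicate test."""
    return [(a, b) for a in disk_a for b in disk_b
            if a[key] == b[key] and (a["id"], b["id"]) not in already_joined]
```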
23. Preventing Duplicates
- When could duplicates be produced?
  - In both the 2nd and 3rd stages, which may perform overlapping work
- How does XJoin address this?
  - It prevents duplicates with timestamps
- When is this checked?
  - During processing, when trying to join two tuples
24. Time Stamping, Part 1
- 2 fields are added to each tuple:
  - Arrival TimeStamp (ATS): the time when the tuple first arrived in memory
  - Departure TimeStamp (DTS): the time when the tuple was flushed to disk
  - [ATS, DTS] indicates when the tuple was in memory
- When were two tuples joined in the 1st stage?
  - If Tuple A's DTS is within Tuple B's [ATS, DTS]
  - Tuples that meet this overlap condition are not considered for joining in the 2nd or 3rd stage
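The overlap condition translates directly into code (tuples modeled as bare (ATS, DTS) pairs for illustration):

```python
def joined_in_stage1(a, b):
    """XJoin's overlap test: two tuples met in memory during stage 1
    iff one tuple's DTS falls inside the other's [ATS, DTS] window.
    Each tuple is represented here as an (ATS, DTS) pair."""
    (a_ats, a_dts), (b_ats, b_dts) = a, b
    return b_ats <= a_dts <= b_dts or a_ats <= b_dts <= a_dts
```

For example, a tuple A resident in memory during [100, 200] did meet a tuple B that arrived at 150 and was flushed at 300, but not one that only arrived at 250.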
25. Detecting Tuples Joined in the 1st Stage
- Tuples joined in the first stage:
  - B1 arrived after A and before A was flushed to disk
- Tuples not joined in the first stage:
  - B2 arrived after A and after A was flushed to disk
26. Time Stamping, Part 2
- For each partition, keep track of:
  - ProbeTS: the time when a 2nd-stage probe was done
  - DTSlast: the DTS of the last tuple of the disk-resident portion at that probe
- Several such probes may occur
  - Keep an ordered history of such probe descriptors
- Usage:
  - All tuples flushed at or before DTSlast were joined in stage 2 with all tuples in main memory at time ProbeTS
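The usage rule can be sketched as a membership test over the probe history (a simplified model; inclusive bounds are an assumption here):

```python
def joined_in_stage2(a_dts, b_ats, b_dts, history):
    """Stage-2 duplicate test: `history` is the ordered list of
    (DTSlast, ProbeTS) probe descriptors for the d-r partition holding
    tuple A.  A already met B in some stage-2 probe iff A was on disk
    by that probe's DTSlast and B was memory-resident at its ProbeTS."""
    return any(a_dts <= dts_last and b_ats <= probe_ts <= b_dts
               for dts_last, probe_ts in history)
```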
27. Detecting Tuples Joined in the 2nd Stage
[Figure: Tuple A (ATS 100, DTS 200) belongs to Partition 2, whose history list holds (DTSlast, ProbeTS) probe descriptors (20, 340), (350, 550), (700, 900); Tuple B (ATS 500, DTS 600) overlaps the probe with ProbeTS 550.]
- History list for the corresponding partition: all A tuples in Partition 2 up to DTSlast = 350 were joined with the m-r tuples that arrived before Partition 2's ProbeTS.
28. Experiments
- Compared strategies:
  - HHJ (Hybrid Hash Join)
  - XJoin (with 2nd stage and with caching)
  - XJoin (without 2nd stage)
  - XJoin (with aggressive use of the 2nd stage)
29. Case 1: Slow Network, Both Sources Are Slow
30. Case 1: Slow Network, Both Sources Are Slow (Bursty)
- XJoin improves delivery time of initial answers → interactive performance
- Reactive background processing is an effective way to exploit intermittent delays and keep up the output rate
- Shows that the 2nd stage is very useful if there is time for it
31. Case 2: Fast Network, Both Sources Are Fast
32. Case 2: Fast Network, Both Sources Are Fast
- All XJoin variants deliver initial results earlier
- XJoin can also deliver the overall result in time equal to HHJ
- HHJ delivers the 2nd half of the result faster than XJoin
- The 2nd stage cannot be used too aggressively if new data is arriving continuously
33. Conclusion
- Conservative on space (small memory footprint)
- Produces initial results as early as possible
- Hides intermittent data delays
- Can be used in conjunction with online query processing to manage data streams (to a limited extent)
34. How to Further Optimize XJoin?
- Resume stage 1 as soon as data arrives
- Remove no-longer-joining tuples in a timely manner
- More...
35. References
- Urhan, Tolga and Franklin, Michael J. XJoin: Getting Fast Answers from Slow and Bursty Networks.
- Urhan, Tolga and Franklin, Michael J. XJoin: A Reactively-Scheduled Pipelined Join Operator.
- Hellerstein, Franklin, Chandrasekaran, Deshpande, Hildrum, Madden, Raman, and Shah. Adaptive Query Processing: Technology in Evolution. IEEE Data Engineering Bulletin, 2000.
- Avnur, Ron and Hellerstein, Joseph M. Eddies: Continuously Adaptive Query Processing.
- Babu, Shivnath and Widom, Jennifer. Continuous Queries over Data Streams.
36. Streams: A New Query Context
- Challenges faced by XJoin:
  - Potentially unbounded, growing join state
  - Indefinite delay of some join results
- Solutions:
  - Exploit semantic constraints to remove no-longer-joining data in a timely manner
  - Constraints: sliding windows, punctuations
37. Punctuation
- A punctuation is a predicate on stream elements that evaluates to false for every element following the punctuation.
- Example: the punctuation (0, 18] announces "no more tuples for students whose age is less than or equal to 18!"

  ID       Name    Age
  9961234  Edward  17
  9961235  Justin  19
  9961238  Janet   18
  9961256  Anna    20
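A punctuation can be modeled as a predicate, and its main use is purging join state. This is a simplified sketch of the students example; the assumption that every tuple covered by the punctuation can be purged holds here because the punctuation is on the same attribute the state is kept for.

```python
# "No more tuples for students whose age is <= 18": the stream promises
# this predicate is false for every element arriving after it.
punct = lambda student: student["age"] <= 18

def purge(state, punct):
    """Drop state tuples covered by the punctuation: in this simplified
    sketch they can never contribute to another result, so keeping them
    only grows the join state."""
    return [t for t in state if not punct(t)]
```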
38. An Example
Open stream (item_id, seller_id, open_price, timestamp):
  1080  jsmith   130.00  Nov-10-03 9:03:00   followed by punctuation <1080, , , >
  1082  melissa   20.00  Nov-10-03 9:10:00   followed by punctuation <1082, , , >

Bid stream (item_id, bidder_id, bid_price, timestamp):
  1080  pclover   175.00  Nov-14-03 8:27:00
  1082  smartguy   30.00  Nov-14-03 8:30:00
  1080  richman   177.00  Nov-14-03 8:52:00   followed by punctuation <1080, , , >

Query: For each item that has at least one bid, return its bid-increase value.

  Select   O.item_id, Sum(B.bid_price - O.open_price)
  From     Open O, Bid B
  Where    O.item_id = B.item_id
  Group by O.item_id

Plan: the Open and Bid streams feed Join on item_id (Out1: item_id; Out2: item_id, sum), followed by Group-by item_id (sum()). The Bid-stream punctuation <1080, , , > announces: no more bids for item 1080!
39. PJoin Execution Logic
[Figure: an arriving tuple ta from Stream A is hashed (Hash(ta) = 1) into the memory-resident join state, which keeps for each stream its state (Sa, Sb) with a hash table, a purge candidate pool, and a punctuation set (PSa, PSb); a disk-resident portion holds flushed hash-table partitions.]
40. PJoin Execution Logic
[Figure: an arriving punctuation pa from Stream A is hashed (Hash(pa) = 1), added to punctuation set PSa, and used to identify purge candidates in the join state.]
41. PJoin vs. XJoin: Memory Overhead
- Tuple inter-arrival: 2 milliseconds; punctuation inter-arrival: 40 tuples/punctuation
42. PJoin vs. XJoin: Tuple Output Rate
- Tuple inter-arrival: 2 milliseconds; punctuation inter-arrival: 30 tuples/punctuation
43. Conclusion
- The memory requirement for PJoin's state is almost insignificant compared to XJoin's.
- The growing join state of XJoin leads to increasing probe cost, which hurts the tuple output rate.
- Eager purge is the best strategy for minimizing join state.
- Lazy purge with an appropriate purge threshold provides a significant advantage in increasing tuple output rate.