Event Stream Processing with Out-of-Order Data Arrival - PowerPoint PPT Presentation

Learn more at: https://davis.wpi.edu

1
Event Stream Processing with Out-of-Order Data Arrival
  • Presenter: Mo Liu
  • Presentation based on work by Ming Li, Mo Liu, Luping Ding,
    Elke A. Rundensteiner, and Murali Mani
  • Worcester Polytechnic Institute, Worcester, MA, USA
  • DEPSA at ICDCS 2007, June 29, 2007, Toronto, ON, Canada

2
Outline
  • Introduction
  • Preliminary
  • Problem with Out-of-Order Event Arrival
  • Solution
  • Experiment
  • Conclusion
  • Related Work

3
Introduction: Event Stream Processing
  • Rising interest in the database community
  • Wide-ranging and growing applications

Example of Event Stream Processing: Shoplifting in Retail Management
4
Introduction: Complex Event Processing (CEP)
  • Event Stream Processing Engine
  • A stream engine specialized for event stream queries,
    yet generic in detecting and extracting expected
    pattern sequences
  • Performance gain compared to stream systems that use
    joins to handle event sequence queries

SASE Approach
5
Introduction: Limitations
  • Total Order Assumption in event arrivals
  • The order in which events are received by the
    query system is the same as their timestamp order
  • Under this assumption, later arrival means larger
    timestamp
  • What if events arrive out of order?
  • Out-of-order data arrival is common in
    distributed computing environments (e.g., due to
    network traffic)
  • Systems based on the total order assumption (e.g.,
    SASE) miss qualified results and produce spurious
    results

7
Preliminary: Query Language
  • EVENT <event pattern>
  • WHERE <qualification>
  • WITHIN <window>

Example: EVENT SEQ(A, B, D) WITHIN 10 seconds
Queries in SASE follow the language structure above.
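
To make the query structure concrete, here is a minimal sketch that parses the simple SEQ form of the example into a pattern and a window. The names `SeqQuery` and `parse_seq_query` are hypothetical and not part of SASE, and the sketch ignores the WHERE clause and assumes the window is a whole number of time units.

```python
import re
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SeqQuery:
    """Hypothetical container for a parsed sequence query."""
    event_types: Tuple[str, ...]  # E1, ..., Em in pattern order
    window: int                   # W, in time units


# Matches "EVENT SEQ(A, B, D) ... WITHIN 10"; any WHERE clause is skipped.
_QUERY_RE = re.compile(r"EVENT\s+SEQ\s*\(([^)]*)\).*?WITHIN\s+(\d+)",
                       re.IGNORECASE | re.DOTALL)


def parse_seq_query(text: str) -> SeqQuery:
    match = _QUERY_RE.search(text)
    if match is None:
        raise ValueError("not a SEQ query: " + text)
    types = tuple(t.strip() for t in match.group(1).split(",") if t.strip())
    return SeqQuery(types, int(match.group(2)))


query = parse_seq_query("EVENT SEQ(A, B, D) WITHIN 10 seconds")
```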
8
Preliminary: Finding Result Sequences
  • SSC (Sequence Scan and Construction)
  • Sequence Scan employs an NFA to detect matches
  • Sequence Construction constructs the expected results
  • NFA with AIS (Active Instance Stack)
  • AIS associates a stack with each state of the NFA,
    storing the events that triggered the NFA
    transition to this state
  • RIP (Most Recent Instance in Previous Stack) field
  • The field records the temporal order relevant
    to the query
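
The following is a minimal sketch of SSC under the total order assumption, not the SASE implementation. It assumes each event type appears once in the pattern, approximates "NFA state i is active" by "the previous stack is non-empty", and stores RIP as an index into the previous stack. The input stream is an illustrative in-order stream built from the instances used in the slides.

```python
from typing import List, Tuple

Event = Tuple[str, int]        # (event type, timestamp)
Entry = Tuple[str, int, int]   # (event type, timestamp, RIP index)


def ssc_in_order(pattern: List[str], stream: List[Event]) -> List[Tuple[Event, ...]]:
    """Sequence scan and construction over a timestamp-ordered stream."""
    stacks: List[List[Entry]] = [[] for _ in pattern]
    results: List[Tuple[Event, ...]] = []
    for etype, ts in stream:
        if etype not in pattern:
            continue                        # event type not in the query
        i = pattern.index(etype)
        if i > 0 and not stacks[i - 1]:
            continue                        # state i not active yet: event dropped
        rip = len(stacks[i - 1]) - 1 if i > 0 else -1
        stacks[i].append((etype, ts, rip))  # append semantics
        if i == len(pattern) - 1:           # last state: construct sequences
            results.extend(_construct(stacks, i, len(stacks[i]) - 1))
    return results


def _construct(stacks, i, idx):
    """All sequences ending at stacks[i][idx], walking RIP pointers backwards."""
    etype, ts, rip = stacks[i][idx]
    if i == 0:
        return [((etype, ts),)]
    return [prefix + ((etype, ts),)
            for j in range(rip + 1)
            for prefix in _construct(stacks, i - 1, j)]


def within_window(sequences, window):
    """Window check applied after SSC: last timestamp minus first is at most W."""
    return [s for s in sequences if s[-1][1] - s[0][1] <= window]


stream = [("B", 1), ("A", 3), ("C", 5), ("B", 6), ("A", 7), ("D", 10),
          ("B", 11), ("F", 12), ("C", 13), ("D", 15), ("A", 16)]
matches = ssc_in_order(["A", "B", "D"], stream)
windowed = within_window(matches, 10)
```

Here `matches` holds the four sequences built on d10 and d15, and `windowed` keeps only those spanning at most 10 time units. Note that b1 is dropped because no A instance precedes it, which is harmless only while the stream is in timestamp order.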

9
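Preliminary: Finding Result Sequences (Cont.)
  • Example: EVENT SEQ(A, B, D) WITHIN 10 Seconds

[Figure: NFA with states 0 to 3 over event types A, B, D, with AIS stacks S1 (a3, a7, a16), S2 (b6, b11), and S3 (d10, d15) built from an input stream ordered by timestamp. SSC outputs (a3 b6 d10) on d10 and (a3 b6 d15), (a3 b11 d15), (a7 b11 d15) on d15; the output then passes through the window step (WD).]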
10
Preliminary: Purging Operator States
  • Example: EVENT SEQ(A, B, D) WITHIN 10 Seconds
  • PSSC: on seeing d15, purge a3 and so on

[Figure: the same NFA and timestamp-ordered input stream, with each AIS instance shown with its RIP in parentheses: S1 holds () a3 and () a7; S2 holds (a3) b6 and (a7) b11; S3 holds (b6) d10 and (b11) d15.]
12
Problem with Out-of-Order at SSC: Incomplete Event Retrieval
EVENT SEQ(A, B, D) WITHIN 10 Seconds

[Figure: received order with out-of-order event arrival, including events with timestamps 0, 1, and 2 that are received late. AIS contents: S1 holds () a3 and () a7; S2 holds (a3) b6 and (a7) b11; S3 holds (b6) d10 and (b11) d15.]
  • Produced result: (a3 b6 d10), (a7 b11 d15)
  • Correct result: (a0 b1 d2), (a3 b6 d10), (a7 b11 d15)
  • SSC misses the result (a0 b1 d2)
13
Problem with Out-of-Order at SSC: Event Misplacement

[Figure: received order with out-of-order event arrival, including a late d8. With append semantics, d8 is added to S3 after d10 and d15 with RIP b11 (labeled "Incorrect AIS Appending"). The figure marks the entry (b11) d8 as "Wrong!" and also carries a "Missing!" label.]
  • Produced result: (a3 b6 d8), (a3 b11 d8)
  • Correct result: (a3 b6 d8)
14
Problem with Out-of-Order at PSSC
  • Purge in SS: on seeing d15, purge a3 and so on
  • After that, out-of-order d8 arrives: missing result (a3 b6 d8)!
  • This is an unauthorized AIS purge
  • CLAIM: Any data purge of the active instance stack (AIS) is
    unauthorized unless total order on the data
    arrival holds for the input stream

EVENT SEQ(A, B, D) WITHIN 10 Seconds

[Figure: Out-of-Order Event Arrival, Example 3. Received order in which d8 arrives after d15. AIS contents: S1 holds () a3 and () a7; S2 holds (a3) b6 and (a7) b11; S3 holds (b6) d10 and (b11) d15. The affected result is (a3 b6 d8).]

If precise query results are required and memory
resources are limited, WD in SS is not
sufficient for handling out-of-order event
arrival!
16
Solution in SSC
  • Event Retrieval Mechanism
  • To avoid incomplete retrieval, all states of
    the NFA need to be set active before the
    retrieval over the event stream.

[Figure: received order with out-of-order event arrival, including events with timestamps 0, 1, and 2 that are received late. With all NFA states active, the AIS holds () a0, () a3, () a7 in S1; (a0) b1, (a3) b6, (a7) b11 in S2; and (b1) d2, (b6) d10, (b11) d15 in S3.]
  • Produced result: (a0 b1 d2), (a3 b6 d10), (a7 b11 d15)
17
Solution in SSC (Cont.)
  • AIS Construction Mechanism
  • To avoid event misplacement, use sort
    semantics instead of append semantics

[Figure: received order with out-of-order event arrival, including a late b8. With sort semantics (labeled "Correct AIS Appending"), S1 holds a3 and a7; S2 holds (a3) b6, (a7) b8, (a7) b11; S3 holds (b8) d10 and (b11) d15. Result: (a3 b8 d10), (a7 b8 d10), (a3 b8 d15), (a7 b8 d15).]
18
SSC Algorithm with Out-of-Order Handling
  • Out-of-Order Handling Incorporated SSC
  • Input
  • (1) Sequence query EVENT SEQ(E1, E2, ..., Em) WITHIN W
  • (2) AIS constructed from previously input events
  • (3) Newly received event ei (of event type Ei)
  • Output
  • (1) Updated AIS
  • (2) Sequence output of SSC
  • 1. IF event type Ei is among E1, E2, ..., Em
  • 2.   insert ei into stack Si (using sort semantics)
  • 3.   set ei's RIP
  • 4.   check the RIP values of the instances in
         stack Si+1 and reset the ones affected by ei
  • 5.   produce event sequences containing ei, if any
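
The five steps above can be sketched as follows. This is an illustrative reading of the algorithm, not the authors' code: the class name `OutOfOrderSSC` is hypothetical, RIP is stored as an index into the previous stack, each event type is assumed to appear once in the pattern, and window checking is left out.

```python
import bisect
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Inst:
    etype: str
    ts: int
    rip: int = -1   # index of the most recent earlier instance in the previous stack


class OutOfOrderSSC:
    def __init__(self, pattern: List[str]):
        self.pattern = list(pattern)
        self.stacks: List[List[Inst]] = [[] for _ in self.pattern]

    def receive(self, etype: str, ts: int) -> List[Tuple[str, ...]]:
        if etype not in self.pattern:                      # step 1
            return []
        i = self.pattern.index(etype)
        own_ts = [x.ts for x in self.stacks[i]]
        pos = bisect.bisect_right(own_ts, ts)              # step 2: sort semantics
        rip = -1
        if i > 0:                                          # step 3: set RIP
            prev_ts = [x.ts for x in self.stacks[i - 1]]
            rip = bisect.bisect_left(prev_ts, ts) - 1
        self.stacks[i].insert(pos, Inst(etype, ts, rip))
        if i + 1 < len(self.stacks):                       # step 4: reset affected RIPs
            for later in self.stacks[i + 1]:
                if later.ts > ts:
                    later.rip += 1
        return [p + s[1:]                                  # step 5: sequences with ei
                for p in self._prefixes(i, pos)
                for s in self._suffixes(i, pos)]

    def _label(self, inst: Inst) -> str:
        return f"{inst.etype.lower()}{inst.ts}"

    def _prefixes(self, i: int, idx: int):
        inst = self.stacks[i][idx]
        if i == 0:
            return [(self._label(inst),)]
        return [p + (self._label(inst),)
                for j in range(inst.rip + 1)
                for p in self._prefixes(i - 1, j)]

    def _suffixes(self, i: int, idx: int):
        inst = self.stacks[i][idx]
        if i == len(self.stacks) - 1:
            return [(self._label(inst),)]
        return [(self._label(inst),) + s
                for j, nxt in enumerate(self.stacks[i + 1]) if nxt.rip >= idx
                for s in self._suffixes(i + 1, j)]


ssc = OutOfOrderSSC(["A", "B", "D"])
for etype, ts in [("A", 3), ("B", 6), ("A", 7), ("D", 10), ("B", 11), ("D", 15)]:
    ssc.receive(etype, ts)
late = ssc.receive("B", 8)   # out-of-order b8 arrives after d15
```

With this arrival order, `late` contains the four sequences through b8. Because every event of a queried type is stored regardless of what has arrived before it, the same sketch also recovers (a0 b1 d2) when b1 arrives ahead of a0.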

19
Optimization
  • Out-of-Order Handling Incorporated SSC with AIS_CLOCK
  • Input and output: same as Algorithm 1
  • 1. IF event type Ei is among E1, E2, ..., Em
  • 2.   IF ei.timestamp < AIS_CLOCK
  • 3.     buffer ei
  • 4.     insert ei into stack Si (using sort semantics)
  • 5.     set ei's RIP
  • 6.     check the RIP values of the instances in
           stack Si+1 and reset the ones being affected
  • 7.     produce event sequences containing ei, if any
  • 8.   ELSE
  • 9.     buffer ei
  • 10.    insert ei into stack Si (using append semantics)
  • 11.    set ei's RIP
  • 12.    IF Ei = Em
  • 13.      produce event sequences containing ei, if any
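
A small sketch of the branch on AIS_CLOCK follows. It assumes AIS_CLOCK is the largest timestamp inserted into the AIS so far (the slides do not define it), models a single stack as a sorted list of timestamps, and omits buffering, RIP handling, and sequence construction. The function name `insert_with_ais_clock` is hypothetical.

```python
import bisect
from typing import List, Tuple


def insert_with_ais_clock(stack: List[int], ts: int,
                          ais_clock: float) -> Tuple[float, str]:
    """Insert ts into a sorted stack; return the new clock and the path taken."""
    if ts < ais_clock:
        # Out-of-order event: pay for a sorted insert (RIP resets would follow).
        bisect.insort(stack, ts)
        return ais_clock, "sort"
    # In-order event: nothing stored so far is later, so a plain append
    # keeps the stack sorted and no RIP in the next stack needs resetting.
    stack.append(ts)
    return ts, "append"


stack: List[int] = []
clock = float("-inf")
paths = []
for ts in [3, 6, 11, 8, 15]:
    clock, path = insert_with_ais_clock(stack, ts, clock)
    paths.append(path)
```

Only the late event with timestamp 8 takes the sort path; the other four use the cheaper append path, which is where the CPU gain would come from when most events arrive in order.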

20
Solution for PSSC
  • Using K-Slack
  • We apply K-Slack based on time units. It
    assumes that the out-of-ordering in event
    arrivals is within a range of k time units. That
    is, an event can be delayed for at most k time
    units.

[Figure: received order with out-of-order event arrival, including a late d8; the figure shows the result (a3 b6 d8).]
21
  • Purge condition
  • ei.timestamp + W + K < CLOCK
  • (After waiting for K time units, no
    out-of-order event with timestamp less than
    ei.timestamp + W can arrive. Thus ei can no longer
    contribute to forming a new candidate event
    sequence.)
  • CLOCK
  • The largest timestamp seen so far among the
    received events is maintained as CLOCK.

22
PSSC Algorithm with Out-of-Order Handling
  • Out-of-Order Incorporated SSC Purge (PSSC)
  • Input: (1) current AIS; (2) CLOCK trigger from SSC
  • Output: updated AIS
  • 1. On receiving a CLOCK trigger
  • 2.   FOR each event instance e in AIS
  • 3.     IF e.timestamp + W + K < CLOCK
  • 4.       purge e
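
The purge loop can be sketched as below, assuming the purge condition reads e.timestamp + W + K < CLOCK. The function name `purge_ais` and the dictionary layout (event type mapped to a list of timestamps) are illustrative choices, not taken from the slides. The example reuses the earlier AIS contents with W = 10 and CLOCK = 15.

```python
from typing import Dict, List


def purge_ais(ais: Dict[str, List[int]], clock: int,
              window: int, k: int) -> Dict[str, List[int]]:
    """Return the AIS without instances that satisfy ts + W + K < CLOCK."""
    return {etype: [ts for ts in stack if not (ts + window + k < clock)]
            for etype, stack in ais.items()}


ais = {"A": [3, 7], "B": [6, 11], "D": [10, 15]}

# K = 0 behaves like the purge under the total order assumption:
# a3 is removed once d15 is seen (3 + 10 < 15).
no_slack = purge_ais(ais, clock=15, window=10, k=0)

# With K = 7, a3 survives (3 + 10 + 7 = 20 is not below 15),
# so a late d8 could still be matched against it.
with_slack = purge_ais(ais, clock=15, window=10, k=7)
```

The choice K = 7 is only an example value; the slack has to be at least as large as the longest delay expected in the input for the purge to be safe.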

23
Optimization 1: AIS Partition
We can divide each stack in AIS into two parts:
outdated event instances (e.timestamp + W + K >
CLOCK) and up-to-date event instances (e.timestamp
+ W > CLOCK).

[Figure: AIS for SEQ(A, B, D) with W = 7 and K = 10 (large), under out-of-order event arrival. A divider splits the stacks S1, S2, and S3 into the two parts. The figure is annotated "SSC output when d13 comes" and "Cost!" and lists SSC output sequences ending in d18.]
24
Optimization 2: Lazy Purge
For each CLOCK update, only the instances in the
last AIS stack are checked for purging.
When an instance is purged from there, we can
purge instances in the other AIS stacks by following the
RIP path.

[Figure: AIS with RIP links: a3, (a3) b6, (b6) d10 and a7, (a7) b11, (b11) d15.]
26
Experiment 1: Sequence Scan and Construction (SSC)
Query: SEQ(A, B, C, D, E, F)
CPU gain from applying AIS_CLOCK
Out-of-order data percentage: 90%
Y axis: cost of inserting events and resetting RIP
27
Experiment 2: Applying AIS Partition during the SSC Purge
  • Performance gain on memory
  • Performance gain on CPU cost
29
Conclusion
  • In this work, we address the problem of
    processing event streams with out-of-order data
    arrival
  • We analyze the problems that state-of-the-art event
    stream processing technology experiences
    when faced with out-of-order data arrival
  • We propose new implementation and optimization
    strategies for the core stream algebra operators
  • We conduct an experimental study that clearly
    demonstrates the effectiveness of our proposed
    approach over existing solutions

31
Related Work
  • Some initial work uses K-slack to investigate the
    out-of-order problem for homogeneous-input stream
    systems
  • Aurora deals with out-of-order data at the
    operator level: order-sensitive operators wait a
    certain period of time before closing each window
  • The Cayuga system deals with out-of-order data by waiting
    K time units before all the processing, which has
    higher latency than ours
  • Stream punctuation confirms that a certain value
    or timestamp will no longer appear in the future
    input streams. It requires a certain service to
    first be created and appropriately associated

32
Thank you!