Title: Continuous Intersection Joins Over Moving Objects
1Continuous Intersection Joins Over Moving Objects
- Rui Zhang
- University of Melbourne
- Dan Lin
- Purdue University
- Kotagiri Ramamohanarao
- University of Melbourne
- Elisa Bertino
- Purdue University
2Outline
- Background
- Intersection Joins on moving objects
- Indexes for moving objects
- Algorithms
- Adapting existing algorithms
- Our approach
- Time constrained processing
- Improvement techniques
- Experiments
3Motivation
- (Traditional) Intersection join
- Given two sets of spatial objects A and B, find all object pairs (i, j), where i ∈ A, j ∈ B, such that i intersects j.
- Intersection join on moving objects
- Moving
- Continuous
4Join Algorithms
- Nested loops join
- Basic
- Expensive
- Block nested loops join
- Efficient
- Dependent on buffer size
- Index nested loops join
- Efficient and robust
- Sort-merge join
- Efficient
- Difficult for spatial objects
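As a point of reference for the algorithms listed above, a basic nested loops intersection join on static rectangles can be sketched as follows. The `Rect` type and the function names are illustrative assumptions, not taken from the talk.

```python
from typing import NamedTuple

class Rect(NamedTuple):
    """Axis-aligned rectangle (hypothetical helper type for this sketch)."""
    name: str
    x_lo: float
    y_lo: float
    x_hi: float
    y_hi: float

def intersects(r: Rect, s: Rect) -> bool:
    # Two rectangles intersect iff their extents overlap in every dimension.
    return (r.x_lo <= s.x_hi and s.x_lo <= r.x_hi and
            r.y_lo <= s.y_hi and s.y_lo <= r.y_hi)

def nested_loops_join(set_a, set_b):
    # Basic nested loops: compare every object of A with every object of B.
    return [(a.name, b.name) for a in set_a for b in set_b if intersects(a, b)]

set_a = [Rect("a1", 0, 0, 2, 2), Rect("a2", 5, 5, 6, 6)]
set_b = [Rect("b1", 1, 1, 3, 3), Rect("b2", 7, 7, 8, 8)]
print(nested_loops_join(set_a, set_b))
```

The cost is |A| x |B| comparisons, which is why the index-based variants are preferred for large spatial datasets.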
5Indexing Moving Objects
- Monitoring moving objects
- Sampling-based
- Trajectory-based
- $p(t) = p(t_{ref}) + v \, (t - t_{ref})$
- $T_M$: maximum update interval
- R-tree [SIGMOD84]
- Minimum bounding rectangle (MBR)
- TPR-tree [SIGMOD00]
- Add time parameters to the R-tree
- Other indexes: Bx-tree [VLDB04], STRIPES [SIGMOD04]
- Only for points
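The trajectory model and the idea of adding time parameters to a bounding rectangle can be illustrated with a small sketch. The function names are assumptions made for this example; the bounding rule (lower edge moves with the minimum velocity, upper edge with the maximum velocity of the enclosed objects) is the conservative bound used by time-parameterized rectangles, shown here in one dimension only.

```python
def position(p_ref, v, t_ref, t):
    """Linear motion: p(t) = p(t_ref) + v * (t - t_ref), per dimension."""
    return tuple(p + vel * (t - t_ref) for p, vel in zip(p_ref, v))

def bounding_interval(points, velocities, t_ref, t):
    """1-D time-parameterized bound, valid for t >= t_ref.

    The lower edge moves with the smallest velocity and the upper edge
    with the largest, so every enclosed point stays inside the bound.
    """
    lo = min(points) + min(velocities) * (t - t_ref)
    hi = max(points) + max(velocities) * (t - t_ref)
    return lo, hi

print(position((0, 0), (1, 2), t_ref=0, t=3))
print(bounding_interval([0, 4], [1, -1], t_ref=0, t=2))
```

The bound is conservative rather than tight: in the second call both points are at position 2 at time 2, while the bound has grown to cover a wider interval.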
6Naive Algorithm (NaiveJoin)
- Join nodes from two TPR-trees recursively
- If two nodes intersect, check their children
- Otherwise, disregard the pair
- For an update, compute the join pairs of the updated object and update the answer
Join result (object pair, time interval)
(a1, b1), [0, 3]
(a2, b2), [1, 4]
(a3, b4), [6, 8]
Node access (IO): roots, N1, N2, N3, N4
Comparison (CPU): root A vs root B, N1 vs N3, N2 vs N4
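The recursive traversal described on this slide can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: `Node`, `clip_leq`, `intersection_interval` and `naive_join` are hypothetical names, boxes are given per dimension as (lo, hi, v_lo, v_hi) at reference time 0, and the trees are built by hand rather than by a TPR-tree.

```python
from dataclasses import dataclass, field
from math import inf

def clip_leq(const, slope, lo, hi):
    """Clip [lo, hi] to the times t where const + slope * t <= 0 (None if empty)."""
    if slope == 0:
        return (lo, hi) if const <= 0 else None
    root = -const / slope
    if slope > 0:
        hi = min(hi, root)
    else:
        lo = max(lo, root)
    return (lo, hi) if lo <= hi else None

def intersection_interval(box_a, box_b, t_start, t_end):
    """Interval within [t_start, t_end] during which two moving boxes overlap."""
    lo, hi = t_start, t_end
    for (alo, ahi, avlo, avhi), (blo, bhi, bvlo, bvhi) in zip(box_a, box_b):
        # Overlap in one dimension: a.lo(t) <= b.hi(t) and b.lo(t) <= a.hi(t).
        for const, slope in ((alo - bhi, avlo - bvhi), (blo - ahi, bvlo - avhi)):
            clipped = clip_leq(const, slope, lo, hi)
            if clipped is None:
                return None
            lo, hi = clipped
    return (lo, hi)

@dataclass
class Node:
    name: str
    box: tuple
    children: list = field(default_factory=list)  # empty for an object entry

def naive_join(node_a, node_b, t_start, t_end, result):
    iv = intersection_interval(node_a.box, node_b.box, t_start, t_end)
    if iv is None:
        return  # the pair never intersects in the range: disregard it
    if not node_a.children and not node_b.children:
        result.append((node_a.name, node_b.name, iv))
        return
    # Otherwise descend into the children (a leaf entry is paired as itself).
    for child_a in node_a.children or [node_a]:
        for child_b in node_b.children or [node_b]:
            naive_join(child_a, child_b, t_start, t_end, result)

# One-dimensional example: a1 moves right at speed 1, all other objects are static.
a1, a2 = Node("a1", ((0, 1, 1, 1),)), Node("a2", ((10, 11, 0, 0),))
b1, b2 = Node("b1", ((3, 4, 0, 0),)), Node("b2", ((20, 21, 0, 0),))
root_a = Node("root A", ((0, 11, 0, 1),), [a1, a2])
root_b = Node("root B", ((3, 21, 0, 0),), [b1, b2])
pairs = []
naive_join(root_a, root_b, 0, inf, pairs)
print(pairs)
```

Because the time range is unbounded here, the traversal also reports the pair (a1, b2), which only starts to intersect at time 19. An object that is updated long before then makes that work wasted, which is the cost the later slides address.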
7Extended TP-Join Algorithm (ETP-Join)
- Time Parameterized Join (TP-Join) [SIGMOD02]
- Current result: (a1, b1)
- Expiry time: 1
- Event that causes the change: (a2, b2)
Join result (object pair, time interval)
(a1, b1), [0, 3]
(a2, b2), [1, 4]
(a3, b4), [6, 8]
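The three components of a time-parameterized answer (current result, expiry time, event) can be illustrated on the join result listed on this slide. The sketch below derives them from precomputed intervals, which is a simplification: an actual TP-Join obtains them by traversing the indexes. The function name `tp_answer` and the half-open treatment of intervals are assumptions made for this example.

```python
from math import inf

def tp_answer(join_result, t):
    """Return (current result, expiry time, events) at time t.

    join_result: list of (a, b, (start, end)); a pair is treated as
    active while start <= t < end.
    """
    current, changes = [], []
    for a, b, (start, end) in join_result:
        if start <= t < end:
            current.append((a, b))
            changes.append((end, "remove", (a, b)))   # active pair will expire
        elif start > t:
            changes.append((start, "add", (a, b)))    # future pair will appear
    if not changes:
        return current, inf, []
    expiry = min(time for time, _, _ in changes)
    events = [(kind, pair) for time, kind, pair in changes if time == expiry]
    return current, expiry, events

join_result = [("a1", "b1", (0, 3)), ("a2", "b2", (1, 4)), ("a3", "b4", (6, 8))]
print(tp_answer(join_result, 0))
```

At time 0 this reproduces the values on the slide: the current result is (a1, b1), it expires at time 1, and the event is the appearance of (a2, b2). Extending this to a continuous query means recomputing the answer at every expiry time.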
8Summary
- NaiveJoin
- One tree traversal per update, but each traversal is expensive
- Node access (IO): roots, N1, N2, N3, N4
- Comparison (CPU): root A vs root B, N1 vs N3, N2 vs N4
- Processing time range: too long
- ETP-Join
- Cheaper traversals, but too many of them
- For the 1st TP-Join
- Node access (IO): roots, N1, N3
- Comparison (CPU): root A vs root B, N1 vs N3
- Processing time range: too short
9Key Problem
- Find a good time range for computing the join pairs
- Observation
- Consider objects a and b
- Let their next update times be $t_a$ and $t_b$
- The perfect time range for computing their join result is $[t_c, \min(t_a, t_b)]$, where $t_c$ is the current time
- How do we know $t_a$ or $t_b$?
- $T_M$ gives a bound for them
- The time range is cut from $[t_c, \infty)$ to $[t_c, t_c + T_M]$
- Is this correct for all objects?
- Yes. Proof in technical report
http://www.cs.mu.oz.au/rui/publication/TR_mj.pdf
10Time Constrained Processing (TC-Join)
- NaiveJoin with constrained processing time range $[t_c, t_c + T_M]$
Join result (object pair, time interval)
(a1, b1), [0, 3]
(a2, b2), [1, 4]
(a3, b4), [6, 8]
Node access (IO): roots, N1, N3
Comparison (CPU): root A vs root B, N1 vs N3
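The effect of constraining the processing time range can be shown on the join result from the slides. The sketch below filters precomputed intervals for clarity, whereas TC-Join applies the range during the tree traversal so that out-of-range node pairs are never visited. The function name and the value 5 for the maximum update interval are hypothetical, chosen only for illustration.

```python
def time_constrained(join_result, t_c, t_m):
    """Keep the part of each join interval inside [t_c, t_c + t_m]."""
    window_end = t_c + t_m
    kept = []
    for a, b, (start, end) in join_result:
        lo, hi = max(start, t_c), min(end, window_end)
        if lo <= hi:
            kept.append((a, b, (lo, hi)))
    return kept

join_result = [("a1", "b1", (0, 3)), ("a2", "b2", (1, 4)), ("a3", "b4", (6, 8))]
# Hypothetical setting: current time 0, maximum update interval 5.
print(time_constrained(join_result, t_c=0, t_m=5))
```

Under this assumed setting the pair (a3, b4) falls outside the range and is dropped. Nothing is lost, because both objects must send an update before time 5, and the join pairs of an updated object are computed again at that point.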
11Further Optimization (MTB-Join)
- Many objects will not update at the time bound
- Put objects in time buckets
- Each time bucket has an associated TPR-tree
- An object is inserted into the tree whose time bucket contains the object's latest update time
$t_c$ is in $[T_M, \frac{3}{2} T_M]$
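The bucketing idea can be sketched as follows. Everything here beyond "an object goes into the bucket that contains its latest update time" is an assumption for illustration: the fixed bucket length, the function names, and the rule used to derive a time range for joining two buckets. That rule follows from the definition of the maximum update interval: an object last updated before the end of its bucket must update again within one maximum update interval after that end.

```python
import math

def bucket_index(t_update, bucket_len):
    """Index of the time bucket containing an object's latest update time."""
    return math.floor(t_update / bucket_len)

def bucket_end(index, bucket_len):
    """End time of a bucket (exclusive)."""
    return (index + 1) * bucket_len

def join_time_range(t_c, index_a, index_b, bucket_len, t_m):
    """Time range sufficient for joining objects of two buckets at time t_c."""
    # Objects of a bucket were last updated before the bucket ended, so their
    # next update comes no later than bucket end + t_m. The generic bound
    # t_c + t_m still applies, so the tighter of the two is used.
    earliest_end = min(bucket_end(index_a, bucket_len),
                       bucket_end(index_b, bucket_len))
    return (t_c, min(t_c + t_m, earliest_end + t_m))

# Hypothetical numbers: maximum update interval 60, bucket length 30.
ia, ib = bucket_index(25, 30), bucket_index(70, 30)
print(ia, ib, join_time_range(80, ia, ib, 30, 60))
```

In this example the object updated at time 25 sits in an old bucket, so the pair only needs to be processed up to time 90 instead of 140.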
12Improvement on the Basic Join Algorithm
- Plane Sweep
- Sorting based on the lower left corner in dimension x
- Two sequences: Sa: a3, a4, a5; Sb: b1, b2, b3, b4
- Two essential components for PS
- Lower bound
- Upper bound
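A plane sweep over two sorted sequences can be sketched as below. The way the lower and upper bounds are derived for moving objects (the extreme positions of the x-extent over the processing time range) is an assumption made for this illustration, as are the function names and the sample coordinates; the sweep itself is the standard technique for rectangle joins.

```python
def sweep_bounds(lo0, hi0, v, t_start, t_end):
    """Lower and upper bound of a moving x-extent over [t_start, t_end]."""
    # Motion is linear, so the extremes occur at the ends of the time range.
    lower = min(lo0 + v * t_start, lo0 + v * t_end)
    upper = max(hi0 + v * t_start, hi0 + v * t_end)
    return lower, upper

def plane_sweep(seq_a, seq_b):
    """Candidate pairs whose [lower, upper] extents overlap in dimension x.

    seq_a, seq_b: lists of (name, lower, upper).
    """
    seq_a = sorted(seq_a, key=lambda r: r[1])
    seq_b = sorted(seq_b, key=lambda r: r[1])
    i = j = 0
    pairs = []
    while i < len(seq_a) and j < len(seq_b):
        if seq_a[i][1] <= seq_b[j][1]:
            name, _, upper = seq_a[i]   # pivot from A, scan forward in B
            k = j
            while k < len(seq_b) and seq_b[k][1] <= upper:
                pairs.append((name, seq_b[k][0]))
                k += 1
            i += 1
        else:
            name, _, upper = seq_b[j]   # pivot from B, scan forward in A
            k = i
            while k < len(seq_a) and seq_a[k][1] <= upper:
                pairs.append((seq_a[k][0], name))
                k += 1
            j += 1
    return pairs

seq_a = [("a3", 0, 2), ("a4", 3, 5), ("a5", 8, 9)]
seq_b = [("b1", 1, 1.5), ("b2", 4, 6), ("b3", 5.5, 7), ("b4", 10, 11)]
print(plane_sweep(seq_a, seq_b))
```

The pairs returned are only candidates: they overlap in the sorting dimension and still need a full intersection check in the remaining dimensions.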
13Other Improvements
- Sorting dimension selection
- Smaller average speed
- Intersection check
- First intersection check and then plane sweep
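Selecting the sorting dimension by smaller average speed can be sketched in a few lines. The function name is hypothetical, and the rationale given in the comment (slower movement gives tighter sweep bounds and hence fewer candidate pairs) is an interpretation of the slide rather than a statement from it.

```python
def choose_sort_dimension(velocities):
    """Index of the dimension with the smallest average absolute speed.

    velocities: non-empty list of per-object velocity tuples.
    """
    dims = len(velocities[0])
    avg_speed = [sum(abs(v[d]) for v in velocities) / len(velocities)
                 for d in range(dims)]
    # Presumably a slower dimension keeps the sweep bounds tighter.
    return min(range(dims), key=avg_speed.__getitem__)

velocities = [(3, 0.5), (-2, 1), (4, -0.5)]
print(choose_sort_dimension(velocities))
```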
14Experiments setting
- Computer: 2.6 GHz Pentium IV CPU, 1 GB RAM
- Datasets: Uniform, Gaussian, Battlefield
- Measures: IO and time
Parameter                        Value
Node capacity                    113
Maximum update interval (TM)     60, 120, 240
Maximum object speed             1, 2, 3, 4, 5
Object size (% of space)         0.5, 0.1, 0.2, 0.4, 0.8
Dataset size                     1K, 10K, 50K, 100K
Dataset                          Uniform, Gaussian, Battlefield
15Experiments TC processing
Up to 15 times improvement
16Experiments Improvement techniques
Up to 6 times improvement
17Comparison Initial Join
MTB-Join outperforms others
Half an hour for NaiveJoin
18Comparison Maintenance
Up to 10^4 times improvement
Time for processing the join for one second
            1K                10K          100K
MTB-Join    0.03 millisecs    0.05 secs    6 secs
ETP-Join    6.3 secs          15 mins      hours
19Conclusion and future work
- Conclusion
- Time Constrained processing
- Further optimization by bucketing in time
- Improvement techniques
- Several orders of magnitude performance improvement
- Future work
- Applying TC processing to other queries
20References
- R-tree [SIGMOD84]
- Antonin Guttman. R-Trees: A Dynamic Index Structure for Spatial Searching. ACM SIGMOD Conference, 1984.
- TPR-tree [SIGMOD00]
- S. Saltenis, C. S. Jensen, S. T. Leutenegger, and M. A. Lopez. Indexing the positions of continuously moving objects. ACM SIGMOD Conference, 2000.
- Bx-tree [VLDB04]
- C. Jensen, D. Lin, and B. C. Ooi. Query and update efficient B+-tree based indexing of moving objects. International Conference on Very Large Databases, 2004.
- STRIPES [SIGMOD04]
- J. M. Patel, Y. Chen, and V. P. Chakka. STRIPES: An efficient index for predicted trajectories. ACM SIGMOD Conference, 2004.
- TP-Join [SIGMOD02]
- Y. Tao and D. Papadias. Time-parameterized queries in spatio-temporal databases. ACM SIGMOD Conference, 2002.
21Questions
- Please send your questions to
- Rui Zhang
- rui_at_csse.unimelb.edu.au