Title: SCUBA: Scalable Cluster-Based Algorithm for Evaluating Continuous Spatio-Temporal Queries on Moving Objects
1. SCUBA: Scalable Cluster-Based Algorithm for
Evaluating Continuous Spatio-Temporal Queries on
Moving Objects
- Rimma V. Nehme
- Department of Computer Sciences,
- Purdue University,
- W. Lafayette, IN 47906 USA
- rnehme_at_cs.purdue.edu
- Elke A. Rundensteiner
- Department of Computer Sciences,
- Worcester Polytechnic Institute,
- Worcester, MA 01609 USA
- rundenst_at_cs.wpi.edu
2. Outline
- Motivation
- Related Work
- Our Approach: SCUBA
- Experimental Study
- Conclusion
3. Challenges for Continuous Query Processing on
Spatio-Temporal Data Streams
- moving objects
- dynamic range query
- dynamic kNN query
- Scalability
  - Large number of objects
  - Large number of queries
- Limited Resources
  - Memory
  - CPU
- Real-time Response Requirement
Novel Idea: Exploit the fact that objects
naturally move in groups (i.e., clusters) to
optimize query evaluation.
The challenge is to provide fast query response
in update-intensive environments.
4. Motivation
Monitor the traffic in the red areas
Continuously return the area covered by the herd
during the migration
5. Big Picture
- SINA [MXA04]
- SEA-CNN [XMA05]
- Q-Index [PXK02]
- PSoup [CF03]
- NiagaraCQ [CDT00]
Novel Idea: We use clustering as a means to improve
the execution of spatio-temporal queries on moving
objects.
6. Our Idea: Moving Clusters
Continuously retrieve the police car closest to
me
- Main Idea: Abstracting individual entities into a
  cluster based on common attributes
  - Direction
  - Speed
  - Spatial Position
- With cluster abstractions, we want to minimize
  the number of unnecessary individual object/query
  joins, thus optimizing query evaluation
Police Car
Scalable Cluster-Based Algorithm for Evaluating
Continuous Spatio-Temporal Queries on Moving
Objects (SCUBA)
7. Advantage of Moving Clusters
If two abstractions do not overlap, then we can
discard negative candidates and avoid individual
joins.
- When clusters don't overlap, we avoid many joins
  of individual objects within those clusters
m1
m2
No need to join objects/queries in m1 with
queries/objects in m2
- Moving object
- Spatio-temporal range query
We present SCUBA in the context of continuous
spatio-temporal range queries
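The pruning idea above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes clusters are circles described by a centroid and a radius (the Future Work slide's "non-circular clusters" suggests the current ones are circular), and the names `Cluster`, `clusters_overlap`, and `candidate_pairs` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Cluster:
    cx: float      # centroid x
    cy: float      # centroid y
    radius: float  # actual cluster size (assumed circular)


def clusters_overlap(a: Cluster, b: Cluster) -> bool:
    """Two circles overlap when the centroid distance is at most the sum of radii."""
    return math.hypot(a.cx - b.cx, a.cy - b.cy) <= a.radius + b.radius


def candidate_pairs(clusters):
    """Return index pairs of clusters whose members still need individual joins."""
    pairs = []
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            if clusters_overlap(clusters[i], clusters[j]):
                pairs.append((i, j))
    return pairs


m1 = Cluster(0.0, 0.0, 10.0)
m2 = Cluster(50.0, 0.0, 10.0)   # far from m1: no member-level join needed
m3 = Cluster(15.0, 0.0, 10.0)   # overlaps m1
pairs = candidate_pairs([m1, m2, m3])
```

Only the pair (m1, m3) survives the cluster-level test, so members of m2 are never compared with members of the other clusters.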
8. Advantage of Moving Clusters
- Objects/Queries continuously move
- Grid cells are static
- If objects are put in a grid, we have to
  continuously move them into and out of grid cells.
- Instead we want to make "flexible cells", i.e.,
  moving clusters
9. Architecture Overview
- SCUBA-enabled motion operator execution
- Answers produced periodically (every Δ)
Figure (architecture diagram): inside CAPE, the
Moving Objects Data Stream and the Moving Queries
Data Stream feed the SCUBA motion operator, which
maintains Moving Clusters, runs a grid-based join
between/within clusters when the time interval Δ
expires, and emits the Results Data Stream.
Legend: moving object, range query.
10. Network-Constrained Movement
Movement is constrained to the road network.
Roads = edges; intersections = connection nodes.
Connection Node (CNLoc)
New York City
SCUBA supports both constrained and unconstrained
movement.
11. Moving Cluster Representation in SCUBA
Cluster members: moving objects
Max Cluster Size
Centroid
Actual Cluster Size
TD
Moving clusters expire after some time.
Cluster members: moving queries
Direction Vector
Cluster Member Representation Inside Cluster
Cluster member (moving object)
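The attributes named on this slide can be gathered into a small record. The sketch below is one possible layout under stated assumptions: the class name `MovingCluster`, its field names, and the choice to store each member as an offset from the centroid are all hypothetical; the slide only names the attributes and does not say how members are encoded.

```python
import math
from dataclasses import dataclass, field


@dataclass
class MovingCluster:
    centroid: tuple[float, float]
    direction: tuple[float, float]   # direction vector
    avg_speed: float
    max_size: float                  # maximum radius, bounded by TD
    expires_at: float                # clusters expire after some time
    actual_size: float = 0.0         # current radius
    objects: dict = field(default_factory=dict)  # id -> offset from centroid (assumed)
    queries: dict = field(default_factory=dict)  # id -> offset from centroid (assumed)

    def add_member(self, member_id, position, is_query=False):
        """Add a member if it lies within max_size of the centroid."""
        dx = position[0] - self.centroid[0]
        dy = position[1] - self.centroid[1]
        dist = math.hypot(dx, dy)
        if dist > self.max_size:
            return False
        target = self.queries if is_query else self.objects
        target[member_id] = (dx, dy)
        self.actual_size = max(self.actual_size, dist)
        return True

    def is_expired(self, now):
        return now >= self.expires_at


c = MovingCluster(centroid=(0.0, 0.0), direction=(1.0, 0.0),
                  avg_speed=10.0, max_size=100.0, expires_at=50.0)
added = c.add_member("o1", (30.0, 40.0))                    # 50 units from centroid
rejected = c.add_member("q1", (200.0, 0.0), is_query=True)  # beyond max_size
```

A single cluster holds both moving objects and moving queries, which is what lets one cluster-level test stand in for many object/query comparisons.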
12. SCUBA Execution
- SCUBA produces results periodically (every Δ time
  units)
- Phase I: Cluster Pre-Join Maintenance
  - Formation of new clusters
  - Dissolving empty clusters
  - Expanding existing clusters
- Phase II: Cluster-Based Joining
  - Joining clusters
  - Joining objects and queries inside clusters
- Phase III: Cluster Post-Join Maintenance
  - Dissolving expiring clusters
  - Relocating non-expiring clusters based on
    velocity vector
Flowchart labels: Cluster Pre-Join Maintenance,
Cluster-Based Joining, Cluster Post-Join
Maintenance; transitions: Δ Timeout, DONE.
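The three-phase cycle can be pictured as a small driver loop. This is a sketch under assumptions, not the CAPE operator itself: updates are taken to be a time-ordered list of `(timestamp, update)` pairs, the evaluation interval is called `delta`, and the three callbacks (`pre_join`, `join`, `post_join`) are hypothetical stand-ins for the phases.

```python
def run_scuba(updates, delta, pre_join, join, post_join):
    """Drive the three phases over a time-ordered stream of (timestamp, update).

    Phase I runs on every arriving update; when the interval expires,
    Phase II produces an answer and Phase III cleans up.
    An interval is only closed when a later update shows it has elapsed.
    """
    results = []
    next_eval = delta
    for t, update in updates:
        while t >= next_eval:                          # timeout reached
            results.append((next_eval, join()))        # Phase II
            post_join(next_eval)                       # Phase III
            next_eval += delta
        pre_join(update)                               # Phase I (incremental)
    return results


seen = []
answers = run_scuba(
    updates=[(1, "a"), (2, "b"), (6, "c"), (11, "d")],
    delta=5,
    pre_join=seen.append,            # stand-in: just remember the update
    join=lambda: len(seen),          # stand-in: answer = updates seen so far
    post_join=lambda now: None,      # stand-in: nothing to clean up
)
```

With these stand-ins, evaluations happen at times 5 and 10, and each answer reflects only the updates that arrived before the deadline.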
13. Phase I: Cluster Pre-Join Maintenance
- Clustering is done incrementally (upon the
  arrival of updates)
- Location update format
  - (ID, Loc_t, t, Speed, CNLoc, ...)
- Use 2 thresholds and the destination
  - TD: distance threshold
  - TS: speed threshold
  - Destination
Clustering a New Object: Example
(1) New moving object arrives
(2) Hash object into grid
(3) Add object to cluster and update cluster
attributes
  - centroid position
  - radius
  - average speed
  - member count
(4) If cluster has expanded, check for overlap
with neighboring cells (make new entries if
necessary)
(5) If the object left an existing cluster for a
new cluster and the old cluster is empty, dissolve
the old cluster.
The clustering algorithm is based on the
Leader-Follower Clustering Algorithm (J.A. Hartigan.
Clustering Algorithms, John Wiley and Sons, 1975).
Figure labels: M1, M2, M3, Parent Cluster
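A simplified leader-follower-style pass over location updates might look like the following. It is a sketch, not the authors' procedure: the grid steps (2) and (4) are replaced by a linear scan over clusters, the centroid is recomputed as the mean of member positions, and `Cluster`, `process_update`, and the update fields are hypothetical names. The thresholds use the TD and TS values from the experimental settings.

```python
import math
from dataclasses import dataclass, field

TD = 100.0  # distance threshold (spatial units)
TS = 10.0   # speed threshold (spatial units / time unit)


@dataclass(eq=False)  # compare clusters by identity, not by field values
class Cluster:
    cx: float
    cy: float
    avg_speed: float
    destination: str                              # e.g. the update's CNLoc
    radius: float = 0.0
    members: dict = field(default_factory=dict)   # id -> (x, y, speed)

    def accepts(self, x, y, speed, dest):
        return (dest == self.destination
                and math.hypot(x - self.cx, y - self.cy) <= TD
                and abs(speed - self.avg_speed) <= TS)

    def refresh(self):
        """Recompute centroid, average speed, and radius from the members."""
        pts = list(self.members.values())
        n = len(pts)
        self.cx = sum(p[0] for p in pts) / n
        self.cy = sum(p[1] for p in pts) / n
        self.avg_speed = sum(p[2] for p in pts) / n
        self.radius = max(math.hypot(p[0] - self.cx, p[1] - self.cy) for p in pts)


def process_update(clusters, oid, x, y, speed, dest):
    # Step 5: drop the previous membership; dissolve a cluster left empty.
    for c in list(clusters):
        if oid in c.members:
            del c.members[oid]
            if c.members:
                c.refresh()
            else:
                clusters.remove(c)
    # Step 3: join the first cluster that satisfies both thresholds and destination.
    for c in clusters:
        if c.accepts(x, y, speed, dest):
            c.members[oid] = (x, y, speed)
            c.refresh()
            return c
    # Otherwise the object becomes the leader of a new cluster.
    new = Cluster(x, y, speed, dest, members={oid: (x, y, speed)})
    clusters.append(new)
    return new


clusters = []
process_update(clusters, "o1", 0.0, 0.0, 30.0, "A")
process_update(clusters, "o2", 20.0, 0.0, 34.0, "A")   # joins o1's cluster
process_update(clusters, "o3", 25.0, 0.0, 60.0, "A")   # too fast: new cluster
process_update(clusters, "o3", 12.0, 0.0, 33.0, "A")   # slows down: joins first cluster
```

After the last update, the cluster that o3 left is empty and is dissolved, leaving a single cluster with three members.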
14. Phase II: Cluster-Based Joining
Incremental Clustering
Phase II: Cluster-Based Join
1. Join-Between
2. Join-Within
15. Phase II: Cluster-Based Joining
- Join-Between
  - Between two clusters
- Join-Within
  - For each cluster (joining objects and queries
    inside)
  - For two overlapping clusters (cross-join between
    objects and queries from the two clusters)
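The two join steps can be sketched as follows. The layout is assumed for illustration only: clusters are plain dictionaries, range queries are circles `(x, y, r)`, each cluster's radius is taken to enclose its members' query ranges, and the function names are hypothetical.

```python
import math


def join_between(a, b):
    """Cluster-level filter: do the two cluster circles overlap?"""
    (ax, ay), (bx, by) = a["centroid"], b["centroid"]
    return math.hypot(ax - bx, ay - by) <= a["radius"] + b["radius"]


def match(queries, objects):
    """Member-level join: objects that fall inside each circular range query."""
    return {(qid, oid)
            for qid, (qx, qy, r) in queries.items()
            for oid, (ox, oy) in objects.items()
            if math.hypot(ox - qx, oy - qy) <= r}


def join_within(a, b=None):
    """Join members inside one cluster, or cross-join an overlapping pair."""
    if b is None:
        return match(a["queries"], a["objects"])
    return match(a["queries"], b["objects"]) | match(b["queries"], a["objects"])


def cluster_based_join(clusters):
    results = set()
    for i, c in enumerate(clusters):
        results |= join_within(c)
        for other in clusters[i + 1:]:
            if join_between(c, other):          # skip non-overlapping pairs
                results |= join_within(c, other)
    return results


c1 = {"centroid": (0.0, 0.0), "radius": 10.0,
      "objects": {"o1": (1.0, 1.0)}, "queries": {"q1": (0.0, 0.0, 5.0)}}
c2 = {"centroid": (15.0, 0.0), "radius": 12.0,
      "objects": {"o2": (4.0, 0.0)}, "queries": {}}
c3 = {"centroid": (100.0, 100.0), "radius": 5.0,
      "objects": {"o3": (100.0, 100.0)}, "queries": {"q3": (101.0, 100.0, 2.0)}}
answers = cluster_based_join([c1, c2, c3])
```

Here c3 overlaps neither c1 nor c2, so its members are only joined with each other.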
16. Phase III: Cluster Post-Join Maintenance
- Clear the grid
- Dissolve expiring clusters
- Relocate non-expiring clusters based on their
  velocity vectors and insert them back into the grid
Insert into the grid
New Cluster Position Updated
Dissolved
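The post-join steps can be sketched as a single pass over the clusters. The details below are assumptions made for illustration: a cluster carries an explicit `expires_at` time, its new position is the old centroid plus velocity times the interval `delta`, and the grid is a dictionary keyed by cell coordinates with an assumed cell width `CELL`.

```python
from dataclasses import dataclass

CELL = 10.0  # assumed grid cell width (spatial units)


@dataclass
class Cluster:
    cid: str
    x: float           # centroid x
    y: float           # centroid y
    vx: float          # velocity vector, x component
    vy: float          # velocity vector, y component
    radius: float
    expires_at: float


def cells_covered(c):
    """Grid cells touched by the cluster's bounding box."""
    x0, x1 = int((c.x - c.radius) // CELL), int((c.x + c.radius) // CELL)
    y0, y1 = int((c.y - c.radius) // CELL), int((c.y + c.radius) // CELL)
    return [(i, j) for i in range(x0, x1 + 1) for j in range(y0, y1 + 1)]


def post_join_maintenance(clusters, now, delta):
    grid = {}                                   # clear the grid
    survivors = []
    for c in clusters:
        if c.expires_at <= now:                 # dissolve expiring clusters
            continue
        c.x += c.vx * delta                     # relocate along the velocity vector
        c.y += c.vy * delta
        survivors.append(c)
        for cell in cells_covered(c):           # insert back into the grid
            grid.setdefault(cell, []).append(c.cid)
    return survivors, grid


a = Cluster("a", 5.0, 5.0, 1.0, 0.0, 2.0, 100.0)
b = Cluster("b", 50.0, 50.0, 0.0, 0.0, 1.0, 10.0)
survivors, grid = post_join_maintenance([a, b], now=10.0, delta=5.0)
```

Cluster b has expired and is dropped; cluster a moves five units to the right and now straddles two grid cells.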
17. Moving Cluster-Based Load Shedding
- Load shedding: the process of dropping excess load
  from the system when the demand on resources is
  above the system capacity [TCZ03].
- Load shedding reduces resource requirements by
  dropping data, thereby sacrificing the accuracy
  of the query answers.
- The main goal is to minimize the degradation in
  accuracy.
Focus: Discarding data inside moving clusters
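The slides say only that shedding discards data inside moving clusters and that TN = 0 means no load shedding. One plausible reading, shown here purely as an assumption, is that TN is a radius around the centroid within which individual member positions are dropped and the members are approximated by the cluster itself. The function name `shed_cluster` is hypothetical.

```python
import math


def shed_cluster(centroid, members, tn):
    """Split members into those kept exactly and those approximated by the cluster.

    members: id -> (x, y). With tn == 0 nothing is shed (assumed semantics).
    """
    kept, shed = {}, []
    for mid, (x, y) in members.items():
        dist = math.hypot(x - centroid[0], y - centroid[1])
        if tn > 0 and dist <= tn:
            shed.append(mid)          # position discarded
        else:
            kept[mid] = (x, y)        # position retained for exact joins
    return kept, shed


members = {"a": (1.0, 0.0), "b": (30.0, 0.0), "c": (0.0, 2.0)}
kept, shed = shed_cluster((0.0, 0.0), members, tn=5.0)
kept_all, shed_none = shed_cluster((0.0, 0.0), members, tn=0.0)
```

A larger threshold drops more positions and saves more memory, at the cost of less accurate answers for the members that were approximated.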
18. Experimental Settings
- Implemented inside the Java-based CAPE streaming
  system [RDZ05]
- Used the Network-based Generator of Moving Objects
  [BR02] to generate a set of moving objects and
  moving queries in Worcester County (TIGER/Line
  files)
- Unless mentioned otherwise, the following
  parameters are used:
  - 10,000 moving objects and 10,000 moving queries
  - Clustering thresholds
    - TD = 100 (spatial units)
    - TS = 10 (spatial units/time unit)
    - TN = 0 (no load shedding)
  - Grid: 100x100
19. Experimental Results
- From Dissimilar to Similar Motion
  - A higher skew factor means denser objects and
    queries (i.e., more clusterable)
  - Compared against regular grid-based execution
    (termed REGULAR)
20. Experimental Results
- Incremental vs. Non-incremental
  - Join time slightly improves with non-incremental
    clustering
  - But the clustering wait time outweighs the
    advantage of a faster join
21. Experimental Results
- Performance of regular grid-based execution
  improves with finer granularity of grid cells
  (but memory requirements increase as well)
22. Experimental Results
- Cluster Maintenance
  - Cluster maintenance time is small relative to
    the join time
23. Conclusions
- Designed SCUBA, a novel cluster-based algorithm
  for continuously evaluating spatio-temporal
  queries.
- Scalability in SCUBA is achieved through shared
  cluster-based execution.
- Implemented SCUBA in the CAPE streaming database.
- Experimental results show that SCUBA outperforms
  a regular grid-based indexing scheme when executing
  on densely moving objects.
- Clustering significantly improves performance
  when processing densely moving objects.
- The overhead of maintaining clusters is very small.
24. Future Work
- Non-circular clusters
- Extend to other types of spatio-temporal queries
  - CKNN
  - Aggregate
- Hierarchical clustering (merge and break-down
  clusters)
25. Thank you.
Mass Pike in Boston Satellite Image, Google Maps
2006
26. Additional Slides
27. References
- [BR02] Brinkhoff, T. A Framework for Generating
  Network-Based Moving Objects. GeoInformatica,
  Vol. 6, No. 2, Kluwer, 2002, 153-180.
- [SDK02] D. Stojanovic and S. Djordjevic-Kajan.
  Location-based Web services for tracking and
  visual route analysis of mobile objects. In
  Proceedings of Yu INFO Conference, Kopaonik,
  2002, CD ROM (Serbian).
- [GL04] Gedik, B., Liu, L. MobiEyes: Distributed
  Processing of Continuously Moving Queries on
  Moving Objects in a Mobile System. EDBT, 2004.
- [MXA04] Mokbel, M., Xiong, X., Aref, W. SINA:
  Scalable Incremental Processing of Continuous
  Queries in Spatio-temporal Databases. SIGMOD,
  2004.
- [PXK02] Prabhakar, S., Xia, Y., Kalashnikov,
  D., Aref, W., Hambrusch, S. Query Indexing and
  Velocity Constrained Indexing: Scalable
  Techniques for Continuous Queries on Moving
  Objects. IEEE Transactions on Computers,
  51(10):1124-1140, 2002.
- [XMA05] Xiong, X., Mokbel, M., Aref, W. SEA-CNN:
  Scalable Processing of Continuous K-Nearest
  Neighbor Queries in Spatio-temporal Databases.
  ICDE, 2005.
- [WCL02] Ouri Wolfson, Hu Cao, Hai Lin, Goce
  Trajcevski, Fengli Zhang, Naphtali Rishe.
  Management of Dynamic Location Information in
  DOMINO. EDBT, 2002, 769-771.
- [BBH04] L. Becker, H. Blunck, K. Hinrichs, J.
  Vahrenhold. A Framework for Representing Moving
  Objects. Proceedings of the 14th International
  Conference on Database and Expert Systems
  Applications (DEXA 2004), Berlin, 2004, 854-863.
- [AG04] V. T. Almeida and R. H. Güting. Indexing
  the trajectories of moving objects in networks.
  Technical Report 309, Fernuniversität Hagen,
  Fachbereich Informatik, 2004.
- [PJT00] D. Pfoser, C. S. Jensen, and Y.
  Theodoridis. Novel approaches to the indexing of
  moving object trajectories. In Proceedings of the
  26th International Conference on Very Large
  Databases, pages 395-406, 2000.
- [TPS02] Yufei Tao, Dimitris Papadias, and
  Qiongmao Shen. Continuous Nearest Neighbor
  Search. In VLDB, 2002.
- [LPM02] Iosif Lazaridis, Kriengkrai Porkaew, and
  Sharad Mehrotra. Dynamic Queries over Mobile
  Objects. In EDBT, 2002.
- [SR01] Zhexuan Song and Nick Roussopoulos.
  K-Nearest Neighbor Search for Moving Query Point.
  In SSTD, 2001.
- [SJL00] Simonas Saltenis, Christian S. Jensen,
  Scott T. Leutenegger, and Mario A. Lopez.
  Indexing the Positions of Continuously Moving
  Objects. In SIGMOD, 2000.
- [RDZ05] Elke A. Rundensteiner, Luping Ding, Yali
  Zhu, Timothy Sutherland, and Bradford Pielech.
  CAPE: A Constraint-Aware Adaptive Stream
  Processing Engine. Invited Book Chapter, in
  Stream Data Management (Advances in Database
  Systems Series), 2005, chapter 5, Springer
  Verlag, pp. 83-111.
28. Data Structures
- Objects Table
- Queries Table
- ClusterHome Table
- ClusterStorage Table
- ClusterGrid