On Discovering Moving Clusters in Spatio-temporal Data


1
On Discovering Moving Clusters in Spatio-temporal Data
  • Panos Kalnis
  • National University of Singapore
  • Nikos Mamoulis
  • University of Hong Kong
  • Spiridon Bakiras
  • Hong Kong University of Science and Technology

2
What is a Moving Cluster?
  • Dense clusters of objects that move similarly for
    a long time period
  • Not necessarily the same objects during the
    lifetime of the cluster
  • Examples
  • Migrating animals
  • Convoy of cars
  • Military applications
  • Solutions
  • Efficient exact and approximate algorithms

3
Problem Formulation
  • Example
  • Moving cluster

4
Related Work (Static)
  • Partition-based clustering (k-medoids)
  • Hierarchical clustering (BIRCH, CURE)
  • Density-based clustering (DBSCAN)

5
Related Work (Moving Objects)
  • Grouping trajectories [Vlachos et al., ICDE 02]
  • Trajectory cluster: constant set of objects
    through its lifetime
  • Only similar movement, no space proximity
  • Dense areas over time [Hadjieleftheriou et al.,
    SSTD 03]
  • Static dense regions
  • No common objects between regions in sequence
  • Incremental DBSCAN/OPTICS [Ester et al., VLDB 98]
  • Only a small percentage of objects moves
  • Maintaining Data Bubbles [Nassar et al., SIGMOD
    04]
  • Redistributes updated objects in existing bubbles

6
MC1: The Straightforward Approach
  • G: set of moving clusters
  • Apply clustering to next timeslice Si
  • Expand moving clusters in G
  • Add new moving clusters to G
  • Report ending clusters
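
The loop on this slide can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes that a snapshot cluster c extends a moving cluster when the overlap ratio |c_prev ∩ c| / |c_prev ∪ c| with the moving cluster's last snapshot reaches a threshold `theta`, and that a cluster which extends nothing starts a new candidate. The names `jaccard`, `mc1` and `cluster_fn` are hypothetical; `cluster_fn` stands in for whatever per-timeslice clustering is used.

```python
def jaccard(a, b):
    """Overlap ratio |a ∩ b| / |a ∪ b| between two sets of object ids."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mc1(timeslices, cluster_fn, theta):
    """Return moving clusters as lists of consecutive snapshot clusters.

    timeslices: iterable of snapshots, each passed unchanged to cluster_fn
    cluster_fn: hypothetical stand-in for the per-timeslice clustering step;
                must return a list of sets of object ids
    theta:      assumed overlap threshold in (0, 1]
    """
    active = []    # G: moving clusters that may still be extended
    finished = []  # moving clusters that have ended and are reported
    for snapshot in timeslices:
        clusters = [frozenset(c) for c in cluster_fn(snapshot)]
        next_active = []
        extended = set()  # snapshot clusters that continued a moving cluster
        for g in active:
            grew = False
            for c in clusters:  # every (g, c) combination is checked
                if jaccard(g[-1], c) >= theta:
                    next_active.append(g + [c])
                    extended.add(c)
                    grew = True
            if not grew and len(g) > 1:
                finished.append(g)  # report an ending moving cluster
        # Snapshot clusters that continued nothing start new candidates.
        for c in clusters:
            if c not in extended:
                next_active.append([c])
        active = next_active
    finished.extend(g for g in active if len(g) > 1)
    return finished
```

For example, with pre-clustered snapshots and `cluster_fn=lambda s: s`, the sequence {1,2,3}, {2,3,4}, {3,4,5} is reported as one moving cluster for `theta=0.5`, even though no object is present in all three timeslices.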

7
Hash-based DBSCAN
  • Memory
  • 10M objects with 1GB RAM
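
The slide gives no detail on the hashing scheme. One common way to make DBSCAN's range queries hash-based is a uniform grid keyed by cell coordinates; the sketch below assumes that design, with a cell side equal to the DBSCAN radius `eps`. The names `build_grid` and `neighbors` are hypothetical and the code is illustrative only.

```python
from collections import defaultdict
from math import floor


def build_grid(points, eps):
    """Hash each point index into the square cell of side eps containing it."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(floor(x / eps), floor(y / eps))].append(idx)
    return grid


def neighbors(points, grid, idx, eps):
    """Indices of points within distance eps of points[idx], itself included."""
    x, y = points[idx]
    cx, cy = floor(x / eps), floor(y / eps)
    found = []
    # With cell side eps, every neighbour lies in the 3x3 block around the cell.
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for j in grid.get((cx + dx, cy + dy), ()):
                px, py = points[j]
                if (px - x) ** 2 + (py - y) ** 2 <= eps * eps:
                    found.append(j)
    return found
```

Only occupied cells are stored, so memory grows with the number of objects rather than with the extent of the space.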

8
MC1 is inefficient!
  1. Checks all possible combinations of clusters in
    consecutive timeslices
  2. Performs clustering for every timeslice

9
MC2: Minimizing Redundant Checks
  • Clustering in every timeslice
  • Select a random object in c1
  • Search for the object in S2
  • Repeat for remaining objects
  • Max (1-θ)|ci| objects

c1c2 is a moving cluster
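
The probing idea can be sketched as below. This is a hedged illustration, not the authors' code: it assumes the overlap criterion |c1 ∩ c2| / |c1 ∪ c2| ≥ theta, and `find_continuation`, `object_to_cluster` and `clusters_next` are hypothetical names. If c1c2 is a moving cluster, at least theta·|c1| objects of c1 also lie in c2, so the sketch stops after probing one object more than the number that could lie outside c2.

```python
import random
from math import ceil


def find_continuation(c1, object_to_cluster, clusters_next, theta, rng=random):
    """Look for a cluster in the next timeslice that continues c1.

    c1:                set of object ids, a cluster of the current timeslice
    object_to_cluster: dict object id -> index into clusters_next
                       (hypothetical lookup structure for timeslice S2)
    clusters_next:     list of sets of object ids, the clusters of S2
    theta:             assumed overlap threshold in (0, 1]
    Returns the index of a matching cluster, or None.
    """
    candidates = list(c1)
    rng.shuffle(candidates)  # probe objects in random order
    # At least ceil(theta * |c1|) objects must be shared with a match, so
    # this many probes are guaranteed to hit it if it exists.
    min_common = ceil(theta * len(c1) - 1e-9)  # small slack for float error
    max_probes = len(c1) - min_common + 1
    tested = set()  # clusters of S2 already compared against c1
    for obj in candidates[:max_probes]:
        k = object_to_cluster.get(obj)
        if k is None or k in tested:
            continue
        tested.add(k)
        c2 = clusters_next[k]
        if len(c1 & c2) / len(c1 | c2) >= theta:
            return k
    return None
```

Each cluster of S2 is compared with c1 at most once, and clusters that share no probed object with c1 are never compared at all.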

10
Ambiguity Cases: θ < 0.5
  • c0c1, c2
  • c0c2, c1

11
MC3: Approximate Moving Clusters
  • Intuition: many clusters will remain the same
    even if objects move
  • Avoid performing clustering in every timeslice
  • For an object o
  • If o belongs to cluster c in timeslice Si
  • Assume that o also belongs to c in the next
    timeslice (notice objects may have moved)

12
Refine clusters
  • Hash new clusters in a grid
  • Legal cluster
  • Does not meet/intersect with other clusters
  • It is connected (cells meet)
  • Objects in legal clusters are not considered
    further
  • For the rest of the objects, perform clustering
  • Possible inaccuracies!!!
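
The legality test can be sketched as follows. The slide does not define "meet" precisely, so this sketch assumes that two grid cells meet when they are identical or adjacent, diagonals included. `legal_clusters`, `cells_of`, `is_connected` and the `cell` side length are hypothetical names and parameters.

```python
from collections import deque
from math import floor

# Offsets of the eight cells surrounding a grid cell.
NEIGHBOURS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
              if (dx, dy) != (0, 0)]


def cells_of(points, cell):
    """Grid cells occupied by a cluster's objects."""
    return {(floor(x / cell), floor(y / cell)) for x, y in points}


def is_connected(cells):
    """True if the cells form a single group of mutually meeting cells."""
    if not cells:
        return False
    start = next(iter(cells))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search over adjacent occupied cells
        cx, cy = queue.popleft()
        for dx, dy in NEIGHBOURS:
            nxt = (cx + dx, cy + dy)
            if nxt in cells and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) == len(cells)


def legal_clusters(clusters, cell):
    """Return ids of clusters that are connected and touch no other cluster.

    clusters: dict cluster id -> list of (x, y) positions in the new timeslice
    cell:     grid cell side (an assumed parameter)
    """
    cell_sets = {cid: cells_of(pts, cell) for cid, pts in clusters.items()}
    owner = {}  # grid cell -> ids of the clusters occupying it
    for cid, cs in cell_sets.items():
        for c in cs:
            owner.setdefault(c, set()).add(cid)
    legal = []
    for cid, cs in cell_sets.items():
        touches_other = any(
            owner.get((cx + dx, cy + dy), set()) - {cid}
            for cx, cy in cs
            for dx, dy in NEIGHBOURS + [(0, 0)]
        )
        if not touches_other and is_connected(cs):
            legal.append(cid)
    return legal
```

Objects of the clusters returned here would keep their labels; all remaining objects would be handed to the regular clustering step.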

13
Minimize Error
  • Perform exact clustering to absorb (may not
    eliminate) the accumulated error
  • Period for exact clustering: grows linearly,
    drops exponentially
  • Exact clustering: if more than a·|G| clusters have
    been added/removed
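
One way to read the period rule is sketched below. The slide gives neither the increment nor the drop factor, so adding `step` timeslices and halving are assumptions made for illustration, and `next_period` is a hypothetical name.

```python
def next_period(period, changed, total, a, step=1):
    """Update the number of timeslices between two exact clustering runs.

    period:  current period, in timeslices
    changed: clusters added or removed since the last exact run
    total:   number of moving clusters currently in G
    a:       tolerated fraction of changed clusters
    step:    linear increment (assumed value, not given on the slide)
    """
    if changed > a * total:
        return max(1, period // 2)  # exponential drop: halve the period
    return period + step            # linear growth


# Example: the period grows while few clusters change and shrinks quickly
# once the number of changes exceeds a * |G|.
period = 1
history = []
for changed in (0, 2, 3, 40, 40, 1):
    period = next_period(period, changed, total=100, a=0.1)
    history.append(period)
```

With these assumed settings the period sequence is 2, 3, 4, 2, 1, 2: three quiet timeslices lengthen it, two bursts of changes halve it twice, and it starts growing again afterwards.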

14
Experimental Evaluation
  • 10K-50K objects per timeslice
  • 50-100 timeslices, up to 5M objects
  • Linux, C, 1.3GHz CPU, 1.2GB RAM
  • Generator: clusters move/rotate, objects
    appear/disappear

15
Varying data size (10K-50K per timeslice)
  • Avg 87 (plot annotation)
  • θ = 0.9, a = 0.1
  • Larger dataset: larger clusters, more interactions

16
Varying number of clusters (100-800 per timeslice)
  • Plot annotations: 96, 87, 73
  • 5M objects, θ = 0.9, a = 0.1
  • Many clusters: reaches error threshold fast

17
Varying a
  • 5M objects, θ = 0.9, 800 clusters
  • Small a: may not recover!!!

18
Varying a for different agilities
  • Low agility: fewer errors → faster

19
MC3 for varying θ
  • 5M objects, a = 0.1, 800 clusters
  • Large θ: incorrect clusters are pruned for not
    satisfying the θ criterion

20
Conclusions
  • Moving clusters
  • Objects may move/change
  • Exact and approximate solutions
  • Future work
  • Automatic setting of parameter a
  • Better error estimation
  • Constraints (e.g., moving cluster must span at
    least k timeslices)

21
Questions?