Title: Modeling Spatial and Spatio-temporal Co-occurrence Patterns
1Modeling Spatial and Spatio-temporal
Co-occurrence Patterns
- Mete Celik
- Spatial Database / Data Mining Group
- Department of Computer Science
- University of Minnesota
- mcelik_at_cs.umn.edu
- Advisor Shashi Shekhar
2Biography
- Education
  - Ph.D. Student, Dept. of Computer Science, U. of Minnesota, MN, 2002 - Present.
  - M.S., Ph.D. Student, Dept. of Electronic Eng., Erciyes University, Turkey, 1999 - 2002.
  - B.S., Dept. of Control Computer Eng., Erciyes University, Turkey, 1995 - 1999.
- Major Projects
  - US DoD/ERDC/TEC - Modeling and Mining Spatio-Temporal Co-occurrence Patterns
    - Defined interest measures, and designed models and algorithms to analyze moving object datasets.
  - JGI/DTC - A Digital Library to Archive Research Material from Jane Goodall's Gombe Chimpanzee Project
    - Designed a database schema, developed digital archive databases and a web search engine to query visual materials of the Gombe project.
  - NGA - Spatio-Temporal Pattern Mining for Multi-Jurisdiction Multi-Temporal Activity Datasets
    - Designed models and algorithms to discover graph-based hotspots (high-crime activity streets).
  - AHPCRC - High Performance Spatial Data Mining
    - Formulated scalable solutions and designed new heuristics for the spatial auto-regression model.
3Thesis Related Publications
- Chapter 2
  - Zonal Co-location Pattern Discovery with Dynamic Parameters, w/ J. M. Kang, S. Shekhar, In Proc. of 7th IEEE Int'l Conf. on Data Mining (ICDM), NE, 2007.
- Chapter 3
  - Mixed-drove Spatio-temporal Co-occurrence Pattern Mining, w/ S. Shekhar, J. P. Rogers, J. A. Shine, Accepted to the IEEE TKDE 2008.
  - Mixed-drove Spatio-temporal Co-occurrence Pattern Mining: A Summary of Results, w/ S. Shekhar, J. P. Rogers, J. A. Shine, and J. S. Yoo, In Proc. of 6th IEEE Int'l Conf. on Data Mining (ICDM), Hong Kong, 2006.
- Chapter 4
  - Sustained Emerging Spatio-temporal Co-occurrence Pattern Mining: A Summary of Results, w/ S. Shekhar, J. P. Rogers, and J. A. Shine, In Proc. of IEEE Int'l Conf. on Tools with Artificial Intelligence (ICTAI), Washington, D.C., 2006.
4Other Publications
- Spatio-temporal Co-occurrence Pattern Mining
  - Mining At Most Top-K Mixed-drove Spatio-temporal Co-occurrence Patterns: A Summary of Results, w/ S. Shekhar, J. P. Rogers, J. A. Shine, and J. M. Kang, In Proc. of the Workshop on Spatio-Temporal Data Mining (IEEE ICDE 2007), Turkey, 2007.
  - Discovery of Co-evolving Spatial Event Sets, w/ J. S. Yoo, S. Shekhar, S. Kim, In Proc. of the SIAM Int'l Conf. on Data Mining (SDM), Bethesda, Maryland, 2006.
- Spatial Co-location Pattern Mining
  - Zonal Co-location Pattern Discovery with Dynamic Parameters, w/ J. M. Kang, S. Shekhar, In Proc. of 7th IEEE Int'l Conf. on Data Mining (ICDM), NE, 2007.
  - A Join-less Approach for Co-location Pattern Mining: A Summary of Results, w/ J. S. Yoo, S. Shekhar, In Proc. of the 5th IEEE Int'l Conf. on Data Mining (ICDM), Houston, Texas, 2005.
- Misc. Spatial Data Mining
  - Spatial Dependency Modeling Using Spatial Auto-regression, w/ B. M. Kazar, S. Shekhar, D. Boley, D. J. Lilja, In Proc. of the ISPRS/ICA Workshop on Geospatial Analysis and Modeling as part of Int'l Conf. GICON, Austria, 2006.
  - Parameter Estimation for the Spatial Auto-Regression Model: A Rigorous Approach, w/ B. M. Kazar, S. Shekhar, and D. Boley, The Second NASA Data Mining Workshop: Issues and Applications in Earth Science, California, 2006.
5Outline
- Motivation and Related Work
- Example Mixed-drove Co-occurrence Pattern
- Limitation of Related Work
- Contributions
- Related Work
- MDCOP Mining Problem
- Proposed MDCOP Mining Algorithms
- Evaluation
- Conclusion and Future Work
6MDCOP Motivating Example Input
- Manpack stinger
- (2 Objects)
- M1A1_tank
- (3 Objects)
- M2_IFV
- (3 Objects)
- Field_Marker
- (6 Objects)
- T80_tank
- (2 Objects)
- BRDM_AT5 (enemy) (1 Object)
7MDCOP Motivating Example Output
- Manpack stinger
- (2 Objects)
- M1A1_tank
- (3 Objects)
- M2_IFV
- (3 Objects)
- Field_Marker
- (6 Objects)
- T80_tank
- (2 Objects)
- BRDM_AT5 (enemy) (1 Object)
8Why are mixed-drove patterns important?
- Improving capabilities of information processing: Earth Science, environmental management, government services, and transportation.
- Helping explanatory or descriptive use for intelligence, resource allocation, confirmatory.
- Public health (infectious emerging diseases)
- Ecology (tracking species and pollutant movements)
- Homeland defense (looking for growing events, biodefense)
- Military
  - Identifying patterns or critical elements
  - Predicting near-future locations of enemy units
http://www.nytimes.com/library/tech/00/01/circuits/articles/20giss.html
http://www.my-etest.co.uk/authors/gillmac1/
9Challenges
- Current interest measures (i.e., participation index) are not sufficient to quantify such patterns
  - A new composite interest measure must be created and formalized.
- The set of candidate patterns grows exponentially with the number of object-types.
- Spatio-temporal datasets are huge
  - Computationally efficient algorithms must be developed.
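A quick way to see the exponential growth is to count the candidate patterns of every size k >= 2 for a given number of object-types. The sketch below is illustrative only; the function name `candidate_counts` is an assumption and is not taken from the talk.

```python
from math import comb


def candidate_counts(num_types: int) -> dict[int, int]:
    """Number of candidate patterns of each size k >= 2 over num_types object-types."""
    return {k: comb(num_types, k) for k in range(2, num_types + 1)}


# Five object-types (for instance W, C, L, S, Q in the football example)
# already give 10 candidate pairs and 26 candidates overall.
counts = candidate_counts(5)
total_for_5 = sum(counts.values())  # equals 2**5 - 5 - 1

# The real vehicle dataset has 22 object-types.
total_for_22 = sum(candidate_counts(22).values())  # equals 2**22 - 22 - 1
```

The totals follow from counting all subsets of the object-types except the empty set and the singletons, which is why pruning with a monotonic interest measure matters.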
10Related Work 1 -Mining of uniform group of moving
objects
Flock pattern [Gudmundsson05]
Moving clusters [Kalnis05]
- Does not recognize groups of mixed object-types
- Does not recognize patterns if time intervals are discrete
- Treats different types of objects as the same
- Patterns should be in consecutive time slots
11Related Work 2 -Mining of mixed group of moving
objects
- Generalize co-location patterns to the spatio-temporal domain
- Collocation episodes [Cao et al., ICDM06]
  - Reference centric model
- Topological patterns [Wang et al., CIKM05]
  - Semantics are not well-defined for moving objects
12Example American Football
Sketch of the game
time slot t0
time slot t1
time slot t2
time slot t3
- Flock pattern: There are no flock patterns
  - Moving objects are not of the same type.
- Broken blitz play: The objective of the offensive wide receivers (W) is to outrun any linebackers (L) and defensive backs (C) and get behind them, catching an undefended pass while running untouched for a touchdown.
- Moving clusters: There are no moving clusters
  - There is no pattern in consecutive time slots
  - Moving clusters do not take into account patterns of different object types
- t0: Ws and Cs, Ws and Ls are co-located, e.g., (W.1, C.1) and (W.4, C.2).
- t1: Ws begin their run, while Cs remain in their original position, possibly due to a fake handoff from the Q to the running back.
- Co-location episodes: There are no co-location episodes
  - There is no reference object type in consecutive time slots.
- t2: Ws cross over each other and try to drift further away from their respective Cs.
- t3: Q shows signs of throwing the football, Cs run to their respective Ws. S guards Ws.
- Topological patterns: There is no topological pattern
  - Will discover {W,C}, {W,L}, {W,S}, and {W,C,S}.
  - However, {W,L}, {W,S}, {W,C,S} (1/4) are not time persistent.
- Output: MDCOP (wide receiver, cornerback)
13Related Work Summary
Category | Spatio-Temporal Pattern | Level: Object | Level: Object-type | Time Interval: Consecutive | Time Interval: Non-consecutive
Mining of uniform group of moving objects | Flock Pattern [1,2] | X | | X |
Mining of uniform group of moving objects | Moving Clusters [3] | X | | X |
Mining of mixed group of moving objects | Collocation Episodes | | X (reference centric) | X |
Mining of mixed group of moving objects | Topological Patterns | | X | Semantics are not well defined | Semantics are not well defined
Mining of mixed group of moving objects | Mixed-drove Pattern (MDCOP) | | X | X | X
The proposed approach will catch MDCOP (W, C).
14Proposed Approach Key Ideas
- Defined a new monotonic composite interest measure
  - Generalizes the Participation Index, a monotonic interest measure for spatial co-location patterns
  - Temporal persistence over an interval
- Developed novel and computationally efficient MDCOP mining algorithms
  - Exploit the monotonic interest measure to prune candidates
  - Alternative designs for temporal persistence
    - i) post-processing after reuse of co-location-miner
    - ii) temporal pruning after pattern size 1, 2, 3, ...
    - iii) temporal pruning as soon as possible
15Outline
- Introduction
- Related Work
- MDCOP Mining Problem and Algorithms
- Key Concepts Interest Measure
- Formal Problem Definition
- Algorithms
- Naïve Approach
- MDCOP-Miner
- FastMDCOP-Miner
- Evaluation
- Conclusion and Future Work
16Key Concepts-1
Sketch of the game
time slot t3
time slot t2
time slot t1
time slot t0
- Spatial co-location: Col is a set of object-types frequently co-located.
  - Col = {W, C}: co-locations in t0 and t3.
- Participation ratio: PR(oi, Col) = (# instances of oi in co-location Col) / (# instances of oi)
  - t0: PR(W,Col) = 2/4, PR(C,Col) = 2/2
  - t1: PR(W,Col) = 0, PR(C,Col) = 0
  - t2: PR(W,Col) = 0, PR(C,Col) = 0
  - t3: PR(W,Col) = 3/4, PR(C,Col) = 2/2
- Spatial prevalence measure: participation index PI(Col) = min over oi in Col of PR(oi, Col)
  - Col = {W, C} => PI(Col) = min(PR(W,Col), PR(C,Col)) = 2/4 for t0
  - => PI(Col) = min(PR(W,Col), PR(C,Col)) = 3/4 for t3
- A co-location is called spatially prevalent if its PI is not less than a given threshold θp.
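The participation ratio and participation index can be sketched in a few lines. This is a minimal illustration, not the talk's implementation: the function names and the dictionary layout are assumptions, and the instance lists for t0 and t3 are the ones shown in the FastMDCOP-Miner execution trace later in the talk.

```python
from fractions import Fraction


def participation_ratios(pattern, instance_counts, colocation_instances):
    """PR(o, Col) for every object-type o in the pattern.

    instance_counts: total number of instances per object-type.
    colocation_instances: one dict {object_type: object_id} per co-location instance.
    """
    ratios = {}
    for obj_type in pattern:
        # Distinct objects of this type that take part in at least one instance.
        participating = {inst[obj_type] for inst in colocation_instances}
        ratios[obj_type] = Fraction(len(participating), instance_counts[obj_type])
    return ratios


def participation_index(pattern, instance_counts, colocation_instances):
    """PI(Col) = minimum participation ratio over the object-types in Col."""
    return min(participation_ratios(pattern, instance_counts, colocation_instances).values())


instance_counts = {"W": 4, "C": 2}
t0 = [{"W": "W.1", "C": "C.1"}, {"W": "W.4", "C": "C.2"}]
t3 = [{"W": "W.2", "C": "C.1"}, {"W": "W.4", "C": "C.1"}, {"W": "W.1", "C": "C.2"}]

pi_t0 = participation_index(("W", "C"), instance_counts, t0)  # min(2/4, 2/2)
pi_t3 = participation_index(("W", "C"), instance_counts, t3)  # min(3/4, 2/2)
```

Exact fractions are used so the values can be compared against a threshold such as 0.5 without rounding issues.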
17Key Concepts-2
- Definition 1: The time prevalence or persistence measure of the pattern P
  - TP(P,T) = (# of time slots where the pattern occurs) / (total # of time slots)
  - time interval T = {T0, T1, ..., Tn}
- Example
  - {W, C} pairs are co-located in time slots t0 and t3
  - Time prevalence of pattern {W, C} = 2/4
A pattern is time prevalent if its time prevalence measure is not less than a given threshold θtime. For example, if θtime = 0.5 then {W, C} is time prevalent, because TP({W, C}) = 2/4 ≥ θtime.
Spatial prevalence measure per time slot, and time prevalence index:
Pattern | t0 | t1 | t2 | t3 | Time prevalence index
{W, C} | 2/4 | 0 | 0 | 3/4 | 2/4
{W, L} | 2/4 | 0 | 0 | 0 | 1/4
{W, S} | 0 | 0 | 0 | 3/4 | 1/4
18Key Concepts-3
- Definition 2: The mixed-drove prevalence measure of a pattern Pi
  - a composition of the spatial prevalence and time prevalence measures.
  - A mixed-drove prevalence measure is monotonically non-increasing with respect to MDCOP size.
- Example
  - if θp = 0.5, co-location {W,C} is spatially prevalent (PI ≥ θp) in time slots t0 and t3
  - Prob({W,C}) = 2/4
A pattern is mixed-drove prevalent if its mixed-drove prevalence satisfies thresholds θp and θtime. For example, if θp = 0.5 and θtime = 0.5, then {W,C} is prevalent since Prob({W,C}) = 2/4 ≥ 0.5.
Spatial prevalence measure per time slot, and time prevalence index:
Pattern | t0 | t1 | t2 | t3 | Time prevalence index
{W, C} | 2/4 | 0 | 0 | 3/4 | 2/4
{W, L} | 2/4 | 0 | 0 | 0 | 1/4
{W, S} | 0 | 0 | 0 | 3/4 | 1/4
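The two definitions compose naturally: count the time slots in which a pattern is spatially prevalent, divide by the number of slots, and compare with the time threshold. The sketch below applies this to the table above. It assumes that "the pattern occurs" in a slot means its participation index reaches the spatial threshold; the function names are illustrative.

```python
from fractions import Fraction


def time_prevalence(pi_per_slot, theta_p):
    """Fraction of time slots whose participation index is at least theta_p."""
    hits = sum(1 for pi in pi_per_slot if pi >= theta_p)
    return Fraction(hits, len(pi_per_slot))


def is_mixed_drove_prevalent(pi_per_slot, theta_p, theta_time):
    """True if the pattern satisfies both the spatial and the time threshold."""
    return time_prevalence(pi_per_slot, theta_p) >= theta_time


half = Fraction(1, 2)
# Spatial prevalence per time slot t0..t3, copied from the table.
table = {
    ("W", "C"): [Fraction(2, 4), 0, 0, Fraction(3, 4)],
    ("W", "L"): [Fraction(2, 4), 0, 0, 0],
    ("W", "S"): [0, 0, 0, Fraction(3, 4)],
}

tp = {pattern: time_prevalence(row, half) for pattern, row in table.items()}
prevalent = [p for p, row in table.items() if is_mixed_drove_prevalent(row, half, half)]
```

With both thresholds at 0.5, only {W, C} passes, which matches the output pattern of the football example.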
19Problem Definition
- Given
  - A set P of Boolean ST object-types over a common ST framework
  - A neighbor relation R over locations
  - A spatial prevalence index threshold, θp
  - A time prevalence index threshold, θtime
- Find
  - Mixed-drove spatio-temporal co-occurrence patterns whose spatial prevalence ≥ θp and time prevalence ≥ θtime
- Objective
  - Minimize computation cost
- Constraints
  - Correctness
  - Completeness
  - Monotonic composite interest measure
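As one possible reading of the neighbor relation R, the sketch below treats two objects of different types as neighbors when their Euclidean distance within a time slot is at most a given bound. The distance-based relation, the function name, and the coordinates are assumptions chosen for illustration; the talk only states that R is defined over locations.

```python
from itertools import combinations
from math import dist


def neighbor_pairs(objects, max_distance):
    """Pairs of objects of different types that lie within max_distance.

    objects: {object_id: (object_type, (x, y))} for a single time slot.
    """
    pairs = []
    for (id_a, (type_a, loc_a)), (id_b, (type_b, loc_b)) in combinations(
        sorted(objects.items()), 2
    ):
        if type_a != type_b and dist(loc_a, loc_b) <= max_distance:
            pairs.append((id_a, id_b))
    return pairs


# Hypothetical positions for one time slot.
slot = {
    "W.1": ("W", (0.0, 0.0)),
    "C.1": ("C", (0.5, 0.0)),
    "W.2": ("W", (5.0, 5.0)),
    "L.1": ("L", (9.0, 9.0)),
}
pairs = neighbor_pairs(slot, max_distance=1.0)
```

The all-pairs scan is quadratic in the number of objects; a spatial index would be the natural replacement for larger slots.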
20Example An American Football Play
time slot t0
time slot t1
time slot t2
time slot t3
Sketch of the game
- Input
  - Each play is a spatio-temporal dataset
  - Boolean object-types are the roles of players (e.g., wide receiver, cornerback, linebacker)
  - Objects are instances of object-types (instances of W are W.1, W.2, W.3, W.4)
  - The duration of the play is 4 time units
  - Neighborhood relation may be less than 1 meter or an average arm distance
  - Spatial prevalence threshold θp = 0.5
  - Time prevalence threshold θtime = 0.5
- Output
  - Pattern {W, C}
21Proposed Approach Key Ideas
Key decision: timing of temporal pruning
- i) Naïve Approach: post-processing after reuse of co-location-miner
- ii) MDCOP-Miner: after pattern size 1, 2, 3, ...
- iii) FastMDCOP-Miner: as soon as possible
22Execution Trace Summary of MDCOPs
time slot t0
time slot t1
time slot t2
time slot t3
Pattern t0 t1 t2 t3 Time prevalence index after sizek
Pattern t0 t1 t2 t3 Time prevalence index FastMDCOP MDCOP-Miner Naive
W C 2/4 0 0 3/4 2/4 Survived Survived Survived
Generate W L 2/4 0 0 0 1/4 Already Pruned Survived
Generate W L 2/4 0 FastMDCOP not calculated pruned
size 2 W S 0 0 0 2/4 1/4 Already Pruned Survived
size 2 W S 0 0 FastMDCOP no generation not calculated pruned
C S 0 0 0 1 1/4 Already Pruned Survived
C S 0 0 FastMDCOP no generation not calculated pruned
Pruning FastMDCOP
MDCOP-Miner
Generate size 3 W S C 0 0 0 2/4 1/4 not generated not generated Survived
Pruning at the post processing Pruning at the post processing Pruning at the post processing Pruning at the post processing Naïve
23 Naïve Approach
MDCOP-Miner
[Figure: side-by-side execution traces of the Naïve Approach and MDCOP-Miner over time slots t0-t3. Input: spatio-temporal dataset, θp = 0.5, θtime = 0.5. Candidate pairs in each time slot: WC, WL, WS, WQ, CS, CL, CQ, SL, SQ, LQ. Naïve Approach: triple WSC is generated; temporal pruning in post-processing prunes WL, WS, CS, WSC. MDCOP-Miner: temporal pruning of WL, WS, CS after the pairs; no triples are generated. Output MDCOPs for both: WC.]
24Proposed Approaches
- Naïve Approach
  - Step 1) Find spatial co-locations for all time slots
  - Step 2) Post-processing: Prune time non-prevalent MDCOPs.
  - Limitation: Redundant generation of non-prevalent MDCOPs before post-processing
  - Pseudo-code
    - Initialization
    - while k in (1, 2, 3, ..., K)
      - For each time slot
        - generate co-locations
        - generate co-location instances
        - prune_spatial_non-prevalent co-loc
    - Post-processing step
      - For size k spatial prevalent co-locations
        - calculate_time_prevalence_index
        - prune_temporal_non-prevalent_co-occur
- MDCOP-Miner
  - Idea: Push the post-processing step inside the loop.
  - Step 1) Find MDCOP patterns by applying MDCOP prevalence interest measure.
  - Advantage: MDCOP-Miner eliminates redundant candidate generation of non-prevalent MDCOPs
  - Pseudo-code
    - Initialization
    - while k in (1, 2, 3, ..., K)
      - For each time slot
        - generate co-occur
        - generate co-occur instances
        - prune_spatial_non-prevalent co-occur
      - calculate_time_prevalence_index
      - prune_temporal_non-prevalent_co-occur
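The difference between the two pseudo-codes can be sketched in Python. This is a simplified illustration under stated assumptions, not the algorithms as implemented for the talk: instance generation is replaced by a lookup table of participation indices taken from the football example, patterns start at size 2, and all function and variable names are hypothetical.

```python
from itertools import combinations

# Participation index per time slot t0..t3, copied from the football example.
# A real miner would derive these from co-location instances in each slot.
PI = {
    frozenset("WC"): [0.5, 0, 0, 0.75],
    frozenset("WL"): [0.5, 0, 0, 0],
    frozenset("WS"): [0, 0, 0, 0.75],
    frozenset("CS"): [0, 0, 0, 1.0],
    frozenset("WCS"): [0, 0, 0, 0.5],
}
TYPES = "WCLSQ"
SLOTS = 4


def candidates(prev, k):
    """Size-k candidates whose (k-1)-subsets all survived the previous level."""
    if k == 2:
        return [frozenset(c) for c in combinations(TYPES, 2)]
    items = sorted(set().union(*prev)) if prev else []
    return [frozenset(c) for c in combinations(items, k)
            if all(frozenset(s) in prev for s in combinations(c, k - 1))]


def spatially_prevalent(pattern, slot, theta_p):
    return PI.get(pattern, [0] * SLOTS)[slot] >= theta_p


def time_prevalent(pattern, theta_p, theta_time):
    hits = sum(spatially_prevalent(pattern, t, theta_p) for t in range(SLOTS))
    return hits / SLOTS >= theta_time


def naive(theta_p, theta_time, max_k=3):
    """Mine each time slot independently, then prune temporally at the end."""
    spatial, triples = set(), set()
    for t in range(SLOTS):
        prev = set()
        for k in range(2, max_k + 1):
            cands = candidates(prev, k)
            if k == 3:
                triples |= set(cands)
            prev = {c for c in cands if spatially_prevalent(c, t, theta_p)}
            spatial |= prev
    # Post-processing step: temporal pruning over everything that was found.
    return {p for p in spatial if time_prevalent(p, theta_p, theta_time)}, triples


def mdcop_miner(theta_p, theta_time, max_k=3):
    """Prune temporally after each pattern size, before generating the next size."""
    result, prev, triples = set(), set(), set()
    for k in range(2, max_k + 1):
        cands = candidates(prev, k)
        if k == 3:
            triples |= set(cands)
        prev = {c for c in cands if time_prevalent(c, theta_p, theta_time)}
        result |= prev
    return result, triples


naive_result, naive_triples = naive(0.5, 0.5)
mdcop_result, mdcop_triples = mdcop_miner(0.5, 0.5)
```

Both functions return {W, C}, but only the naive version generates the triple {W, C, S} (in slot t3) before discarding it in post-processing.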
25 FastMDCOP-Miner
MDCOP-Miner
[Figure: side-by-side execution traces of FastMDCOP-Miner and MDCOP-Miner over time slots t0-t3. Input: spatio-temporal dataset, θp = 0.5, θtime = 0.5. Candidate pairs: WC, WL, WS, WQ, CS, CL, CQ, SL, SQ, LQ. MDCOP-Miner considers all pairs in every time slot and applies temporal pruning (WL, WS, CS) after the pairs. FastMDCOP-Miner applies temporal pruning while scanning the time slots, so that in one time slot only WC and WL remain as candidates. No triples are generated. Output MDCOPs for both: WC.]
26Proposed Approaches
- FastMDCOP-Miner
  - Idea: Prune time non-prevalent patterns as early as possible.
  - Step 1) Find MDCOP patterns by applying MDCOP prevalence interest measure.
  - Advantage: FastMDCOP-Miner eliminates redundant candidate generation of time non-prevalent MDCOPs
  - Pseudo-code
    - Initialization
    - while k in (1, 2, 3, ..., K)
      - For each time slot
        - generate co-occur
        - generate co-occur instances
        - prune_spatial_non-prevalent co-occur
        - calculate_time_prevalence_index
        - prune_temporal_non-prevalent_co-occur
- MDCOP-Miner
  - Idea: Push the post-processing step inside the loop.
  - Step 1) Find MDCOP patterns by applying MDCOP prevalence interest measure.
  - Pseudo-code
    - Initialization
    - while k in (1, 2, 3, ..., K)
      - For each time slot
        - generate co-occur
        - generate co-occur instances
        - prune_spatial_non-prevalent co-occur
      - calculate_time_prevalence_index
      - prune_temporal_non-prevalent_co-occur
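One way to realize "as early as possible" is to stop scanning a pattern once the time slots still to come can no longer lift it to the time threshold. The bound used below is an assumption made for illustration; the talk does not spell out the exact pruning condition, and the function name is hypothetical.

```python
def fast_time_prune(prevalent_per_slot, theta_time):
    """Scan time slots in order and stop once theta_time is out of reach.

    prevalent_per_slot: 1/0 flags, one per slot, for spatial prevalence.
    Returns (is_time_prevalent, number_of_slots_examined).
    """
    total = len(prevalent_per_slot)
    hits = 0
    for seen, prevalent in enumerate(prevalent_per_slot, start=1):
        hits += bool(prevalent)
        remaining = total - seen
        # Even if the pattern were prevalent in every remaining slot,
        # it could not reach the threshold: prune now.
        if (hits + remaining) / total < theta_time:
            return False, seen
    return hits / total >= theta_time, total


# Flags for the football example with θp = 0.5 (slots t0..t3).
wc = fast_time_prune([1, 0, 0, 1], 0.5)
wl = fast_time_prune([1, 0, 0, 0], 0.5)
ws = fast_time_prune([0, 0, 0, 1], 0.5)
```

Under this bound {W, S} is dropped after t2 without looking at t3, while {W, L} can only be rejected once t3 has been examined.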
27Execution Trace (FastMDCOP-Miner) t0
Step 1: spatial prevalence at time slot t0 (θp = 2/4)
Pattern | Instances | P. Ratio | P. Index | Result
WC | (W.1, C.1), (W.4, C.2) | 2/4, 2/2 | 2/4 | -
WL | (W.2, L.1), (W.3, L.2) | 2/4, 2/2 | 2/4 | -
WS, WQ, CL, CS, CQ, LS, LQ, SQ | - | - | - | PRUNED
Step 2: time prevalence after t0
Pattern | t0 | Time prevalence
WC | 1 | 1/4
WL | 1 | 1/4
WS | 0 | 0
WQ | 0 | 0
CL | 0 | 0
CS | 0 | 0
CQ | 0 | 0
LS | 0 | 0
LQ | 0 | 0
SQ | 0 | 0
28Execution Trace (FastMDCOP-Miner) t1
Step 1: spatial prevalence at time slot t1 (θp = 2/4)
No co-location instances are found; all ten candidate pairs (WC, WL, WS, WQ, CL, CS, CQ, LS, LQ, SQ) are PRUNED.
Step 2: time prevalence after t1
Pattern | t0 | t1 | Time prevalence
WC | 1 | 0 | 1/4
WL | 1 | 0 | 1/4
WS | 0 | 0 | 0
WQ | 0 | 0 | 0
CL | 0 | 0 | 0
CS | 0 | 0 | 0
CQ | 0 | 0 | 0
LS | 0 | 0 | 0
LQ | 0 | 0 | 0
SQ | 0 | 0 | 0
29Execution Trace (FastMDCOP-Miner) t2
Step 1: spatial prevalence at time slot t2 (θp = 2/4)
No co-location instances are found; all ten candidate pairs (WC, WL, WS, WQ, CL, CS, CQ, LS, LQ, SQ) are PRUNED.
Step 2: time prevalence after t2
Pattern | t0 | t1 | t2 | Time prevalence
WC | 1 | 0 | 0 | 1/4
WL | 1 | 0 | 0 | 1/4
WS | 0 | 0 | 0 | 0
WQ | 0 | 0 | 0 | 0
CL | 0 | 0 | 0 | 0
CS | 0 | 0 | 0 | 0
CQ | 0 | 0 | 0 | 0
LS | 0 | 0 | 0 | 0
LQ | 0 | 0 | 0 | 0
SQ | 0 | 0 | 0 | 0
30Execution Trace (FastMDCOP-Miner) t3
Step 1: spatial prevalence at time slot t3 (θp = 2/4)
Pattern | Instances | P. Ratio | P. Index
WC | (W.2, C.1), (W.4, C.1), (W.1, C.2) | 3/4, 2/2 | 3/4
WL, WS, WQ, CL, CS, CQ, LS, LQ, SQ | - | - | -
Step 2: time prevalence after t3
Pattern | t0 | t1 | t2 | t3 | Time prevalence
WC | 1 | 0 | 0 | 1 | 2/4
WL | 1 | 0 | 0 | 0 | 1/4
WS | 0 | 0 | 0 | - | 0
WQ | 0 | 0 | 0 | - | 0
CL | 0 | 0 | 0 | - | 0
CS | 0 | 0 | 0 | - | 0
CQ | 0 | 0 | 0 | - | 0
LS | 0 | 0 | 0 | - | 0
LQ | 0 | 0 | 0 | - | 0
SQ | 0 | 0 | 0 | - | 0
31Execution Trace (FastMDCOP-Miner) time prev.
index
- Calculate time prevalence indices of spatial prevalent co-locations (step 8)
- Find mixed-drove prevalent MDCOPs (step 9)
  - {W,C}
Pattern | t0 | t1 | t2 | t3 | Time prevalence
WC | 1 | 0 | 0 | 1 | 2/4
WL | 1 | 0 | 0 | 0 | Pruned
WS | 0 | 0 | 0 | - | Already pruned
WQ | 0 | 0 | 0 | - | Already pruned
CL | 0 | 0 | 0 | - | Already pruned
CS | 0 | 0 | 0 | - | Already pruned
CQ | 0 | 0 | 0 | - | Already pruned
LS | 0 | 0 | 0 | - | Already pruned
LQ | 0 | 0 | 0 | - | Already pruned
SQ | 0 | 0 | 0 | - | Already pruned
32Analytical Evaluation
- Lemma 1: A spatial prevalence measure, e.g., participation index, is monotonically non-increasing in the size of the MDCOPs at each time slot.
- Lemma 2: A mixed-drove prevalence index measure is monotonically non-increasing with the size of MDCOP over space and time.
Spatial prevalence index per time slot, and time prevalence index:
Pattern | t0 | t1 | t2 | t3 | Time prevalence index
WC | 2/4 | - | - | 3/4 | 2/4
WS | - | - | - | 3/4 | 1/4
CS | - | - | - | 1 | 1/4
WCS | - | - | - | 2/4 | 1/4
Lemma 1: The spatial prevalence of pattern {W,C,S} is less than or equal to the spatial prevalence of its sub-patterns {W,C}, {W,S}, {C,S}.
Lemma 2: The mixed-drove prevalence index measure of pattern {W,C,S} is less than or equal to the mixed-drove prevalence index measure of its sub-patterns {W,C}, {W,S}, {C,S}.
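A short sketch of why Lemma 1 holds, written with the participation ratio PR and participation index PI from the Key Concepts slides. The symbols $Col$ and $Col'$ for a pattern and one of its supersets are introduced here for illustration only.

```latex
\begin{align*}
Col \subseteq Col' \;&\Rightarrow\; PR(o_i, Col') \le PR(o_i, Col)
  \quad \text{for every } o_i \in Col, \\
PI(Col') = \min_{o_j \in Col'} PR(o_j, Col')
  \;&\le\; \min_{o_i \in Col} PR(o_i, Col')
  \;\le\; \min_{o_i \in Col} PR(o_i, Col) = PI(Col).
\end{align*}
```

The first line holds because every instance of $o_i$ that takes part in an instance of $Col'$ also takes part in an instance of $Col$. The first inequality in the second line holds because a minimum over a larger set cannot be larger. In the table, this is the t3 column: PI({W,C,S}) = 2/4, which does not exceed 3/4, 3/4, and 1 for {W,C}, {W,S}, and {C,S}.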
33Analytical Evaluation
- Theorem 1: The MDCOP-Miner and FastMDCOP-Miner are complete.
  - Proof: The algorithms find all MDCOPs whose spatial prevalence ≥ θp and time prevalence ≥ θtime
    - Any subset of an MDCOP prevalent pattern is MDCOP prevalent (Lemma 2)
    - None of the functions of the algorithms misses any prevalent MDCOP: prune_spatial_non-prevalent_co-occur and prune_temporal_non-prevalent_co-occur
- Theorem 2: The MDCOP-Miner and FastMDCOP-Miner are correct. If an MDCOP pattern P is returned by the algorithms, then P is a prevalent MDCOP.
  - Proof: The pruning steps prune_spatial_non-prevalent_co-occur and prune_temporal_non-prevalent_co-occur prune out candidates not meeting the given thresholds.
34Outline
- Introduction
- Related Work
- MDCOP Mining Problem and Algorithm
- Evaluation
- Analytical Evaluation
- Performance Evaluation
- Conclusion and Future Work
35Performance Evaluation Experiment Design
- Experiment goals: Compare FastMDCOP-Miner and MDCOP-Miner with the Naïve Approach
  - What is the effect of the number of time slots?
  - What is the effect of the number of object-types?
  - What is the effect of the spatial prevalence threshold θp?
  - What is the effect of the time prevalence threshold θtime?
- Metric of comparison: Computational complexity
- Workload: Vehicle movement dataset and synthetic dataset
- Hardware: Intel Centrino PIV 1.60 GHz, 512 MB of RAM
36Real Dataset Description
- Vehicle movement dataset
  - 15 time slots; x and y coordinates are in meters
  - 22 distinct vehicle types and their instances
  - Minimum instance number 2, maximum instance number 78
  - Average instance number 19
Output: Spatio-temporal co-occurrence pattern (Manpack_stinger <M1, M2>, fire cover (e.g., Bradley tank <T1, T2>))
Example input from spatio-temporal dataset
37Real Dataset What is the effect of number of
time slots?
- Fixed parameters
  - Spatial threshold θp = 0.2
  - Time threshold θtime = 0.8
  - Distance = 150 m
  - # of object types = 22
- Execution times of the algorithms increase as the number of time slots is increased.
- MDCOP-Miner and the Naïve approach generate the same number of size 2 candidates.
- FastMDCOP-Miner generates fewer candidates than the other algorithms due to early pruning.
- FastMDCOP-Miner outperforms the other algorithms.
38Real Dataset What is the effect of number of
object-types?
- Fixed parameters
  - Spatial threshold θp = 0.2
  - Time threshold θtime = 0.8
  - # of time slots = 15
  - Distance = 150 m
- Execution times of the algorithms increase as the number of object-types is increased.
- MDCOP-Miner and the Naïve approach generate the same number of size 2 candidates.
- FastMDCOP-Miner outperforms the other algorithms by generating fewer candidates.
39Real Dataset What is the effect of time
prevalence index threshold?
- Fixed parameters
  - Spatial threshold θp = 0.5
  - # of time slots = 15
  - Distance = 150 m
  - # of object types = 22
- Execution times of the algorithms decrease as the time prevalence threshold is increased.
- FastMDCOP-Miner outperforms the other algorithms. Its advantage increases as the threshold increases.
40Real Dataset What is the effect of spatial
prevalence index threshold?
- Fixed parameters
  - Time threshold θtime = 0.5
  - # of time slots = 15
  - Distance = 150 m
  - # of object types = 22
- Execution times of the algorithms decrease as the spatial prevalence threshold is increased.
- FastMDCOP-Miner outperforms the other algorithms.
41Synthetic Dataset Generation
42Synthetic Dataset What is the effect of number
of time slots?
- Fixed parameters
  - Spatial threshold θp = 0.3
  - Time threshold θtime = 0.9
  - Distance = 10
  - # of object types = 200
- Execution times of the algorithms increase as the number of time slots is increased.
- MDCOP-Miner and the Naïve approach generate the same number of size 2 candidates.
- FastMDCOP-Miner generates fewer candidates than the other algorithms.
43Synthetic Dataset What is the effect of number
of object-types?
- Fixed parameters
  - Spatial threshold θp = 0.3
  - Time threshold θtime = 0.8
  - # of time slots = 20
  - Distance = 10 m
- MDCOP-Miner and the Naïve approach generate the same number of size 2 candidates.
- The execution time of the Naïve approach increases faster than that of the other algorithms as the number of object-types increases.
  - The Naïve approach generates redundant non-persistent co-locations.
- FastMDCOP-Miner outperforms the other algorithms by generating fewer candidates.
44Synthetic Dataset What is the effect of time
prevalence index threshold?
- Fixed parameters
  - Spatial threshold θp = 0.4
  - # of time slots = 50
  - Distance = 10 m
  - # of object types = 200
- Execution times of the algorithms decrease as the time prevalence threshold is increased.
- FastMDCOP-Miner outperforms the other algorithms. Its advantage increases as the threshold increases.
- The execution time of the Naïve approach increases faster than that of the other algorithms as the number of object-types increases.
45Synthetic Dataset What is the effect of spatial
prevalence index threshold?
- Fixed parameters
  - Time threshold θtime = 0.8
  - # of time slots = 50
  - Distance = 10 m
  - # of object types = 200
- Execution times of the algorithms decrease as the spatial prevalence threshold is increased.
- FastMDCOP-Miner outperforms the other algorithms.
46Synthetic Dataset What is the effect of noise
instances and average number of co-occurrence
instances?
- Fixed parameters
  - Spatial threshold θp = 0.3
  - Time threshold θtime = 0.8
  - # of time slots = 20
  - Distance = 10 m
- Execution times of the algorithms increase as the spatial prevalence threshold is increased.
- FastMDCOP-Miner is more robust than the other algorithms.
- FastMDCOP-Miner outperforms the other algorithms.
47Summary of Experimental Results
- What is the effect of the number of time slots?
  - The cost of the Naïve approach and of non-persistent candidate generation increases as the number of time slots increases.
  - FastMDCOP-Miner outperforms the other algorithms.
- What is the effect of the number of object-types?
  - The execution time of the Naïve approach increases faster than that of the other algorithms as the number of object-types increases.
  - FastMDCOP-Miner is more robust than the other algorithms. It detects non-persistent patterns as early as possible.
- What is the effect of the spatial prevalence threshold?
  - The cost of the Naïve approach is higher than that of the other algorithms for low values of the spatial prevalence threshold.
- What is the effect of the time prevalence threshold?
  - The Naïve approach is not sensitive to the time prevalence threshold since it is used only in the post-processing step.
  - FastMDCOP-Miner is the most sensitive algorithm to the time threshold. It performs well especially for high time thresholds.
- In all experiments, the Naïve approach and MDCOP-Miner generate the same number of size 2 candidates.
48Outline
- Introduction
- Related Work
- MDCOP Mining Problem
- Proposed MDCOP Mining Algorithms
- Evaluation
- Conclusion and Future Work
49Contributions described today
- Mixed-drove spatio-temporal co-occurrence patterns (MDCOPs) and the MDCOP mining problem are defined.
- A new monotonic composite interest measure is defined.
- Novel and computationally efficient MDCOP mining algorithms are developed.
- The proposed algorithms are correct and complete in finding mixed-drove prevalent patterns.
- Performance evaluation using real and synthetic datasets
50Spatio-temporal Co-occurrence Pattern Taxonomy
- 1. Spatial co-location
  - Global and zonal co-location patterns, etc.
- 2. Co-occurrence patterns of moving objects
  - Flock pattern, mixed-drove pattern, follow pattern, moving clusters, etc.
- 3. Emerging or vanishing co-occurrence patterns
  - Emerging pattern: Interest measure getting stronger over time
  - Vanishing pattern: Interest measure getting weaker over time
- 4. Co-evolving patterns
- 5. Periodic co-occurrence patterns
- 6. Spatio-temporal cascade patterns
- ...
- ICDM05 - Discovering co-evolving spatio-temporal event sets
- TKDE08 and ICDM06 - Mixed-Drove Spatio-Temporal Co-occurrence Pattern Mining
- ICDE-STDM07 - Mining At Most Top-K Mixed-drove Spatio-temporal Co-occurrence Patterns
- ICDM07 - Zonal Co-location Pattern Mining
- ICDM05 - Joinless Approach for Co-location Pattern Mining
- ICTAI06 - Sustained Emerging Spatio-temporal Co-occurrence Pattern Mining
51Chapter 2- Zonal Co-location Pattern Discovery
- Given: Different object types of spatial events and zone boundaries
- Find: Co-located subsets of event types specific to zones
- Method: A novel algorithm using an indexing structure.
[Figure: map divided into zones 1-4, with labels "Zones 2,4" and "Zone 3"]
52Chapter 4 - Sustained Emerging ST Co-occurrence
Pattern Discovery
- Given: A set P of Boolean ST object-types over a common ST framework
- Find: Sustained emerging spatio-temporal co-occurrence patterns whose prevalence measure increases over time.
- Method: Developing novel algorithms by defining monotonic interest measures.
53Future Work Short Term
- Spatial co-location
  - Interest measure: participation index
  - Global and zonal co-location patterns, etc.
- Co-occurrence patterns of moving objects
  - Flock pattern, mixed-drove pattern, follow pattern, cross pattern, moving clusters, etc.
- Emerging or vanishing co-occurrence patterns
  - Emerging pattern: Interest measure getting stronger over time
  - Vanishing pattern: Interest measure getting weaker over time
- Co-evolving patterns
- Periodic co-occurrence patterns
- Spatio-temporal cascade patterns
- Efficient methods
- Comparison of int. measures with statistical int. measures
54Future Work Long Term
- Spatial and Spatio-temporal Pattern Mining Design
  - Crime Analysis, GIS, Epidemiology
- Challenges
  - discovering patterns and anomalies from enormous, frequently updated spatial and spatio-temporal datasets,
  - developing an ontological framework for spatial and spatio-temporal analysis,
  - integrating spatial and spatio-temporal data from multiple agencies, distributed data, and multi-scale data
55Acknowledgements
- Adviser: Prof. Shashi Shekhar
- Committee: Prof. Jaideep Srivastava, Prof. Arindam Banerjee, and Prof. Sudipto Banerjee
- Spatial Databases and Data Mining Group
- TEC collaborators: James P. Rogers, James A. Shine
- Dept. of Computer Science
56References
- [1] J. Gudmundsson, M. v. Kreveld, and B. Speckmann, Efficient Detection of Motion Patterns in Spatio-Temporal Data Sets, ACM-GIS, pp. 250-257, 2004.
- [2] P. Laube and S. Imfeld, Analyzing relative motion within groups of trackable moving point objects, in GIScience, number 2478 in Lecture Notes in Computer Science, Berlin: Springer, pp. 132-144, 2002.
- [3] P. Kalnis, N. Mamoulis, and S. Bakiras, On Discovering Moving Clusters in Spatio-temporal Data, 9th Int'l Symp. on Spatial and Temporal Databases (SSTD), Angra dos Reis, Brazil, 2005.
- [4] Y. Huang, S. Shekhar, and H. Xiong, Discovering Co-location Patterns from Spatial Datasets: A General Approach, IEEE Trans. on Knowledge and Data Eng. (TKDE), vol. 16(12), pp. 1472-1485, 2004.
- [5] M. Hadjieleftheriou, G. Kollios, P. Bakalov, and V. J. Tsotras, Complex Spatio-Temporal Pattern Queries, VLDB, pp. 877-888, 2005.
- [6] C. du Mouza and P. Rigaux, Mobility Patterns, GeoInformatica, 9(4), pp. 297-319, 2005.
- [7] J. S. Yoo and S. Shekhar, A Join-less Approach for Mining Spatial Co-location Patterns, IEEE Trans. on Knowledge and Data Eng. (TKDE), vol. 18, no. 10, 2006.