
Title: Modeling Spatial and Spatio-temporal Co-occurrence Patterns


1
Modeling Spatial and Spatio-temporal
Co-occurrence Patterns
  • Mete Celik
  • Spatial Database / Data Mining Group
  • Department of Computer Science
  • University of Minnesota
  • mcelik_at_cs.umn.edu
  • Advisor: Shashi Shekhar

2
Biography
  • Education
  • Ph.D. Student, Dept. of Computer Science, U. of
    Minnesota, MN, 2002 - Present.
  • M.S., Ph.D., Student, Dept. of Electronic Eng.
    Erciyes University, Turkey, 1999 - 2002.
  • B. S., Dept. of Control Computer Eng., Erciyes
    University, Turkey, 1995 - 1999.
  • Major Projects
  • US DoD/ERDC/TEC - Modeling and Mining
    Spatio-Temporal Co-occurrence Patterns
  • Defined interest measures, and designed models
    and algorithms to analyze moving object datasets.
  • JGI/DTC - A Digital Library to Archive Research
    Material from Jane Goodall's Gombe Chimpanzee
    Project
  • Designed a database schema, developed digital
    archive databases
  • and a web search engine to query visual
    materials of Gombe project.
  • NGA - Spatio-Temporal Pattern Mining for
    Multi-Jurisdiction Multi-Temporal Activity
    Datasets
  • Designed models and algorithms to discover
    graph-based hotspots (high-crime activity
    streets)
  • AHPCRC - High Performance Spatial Data Mining
  • Formulated scalable solutions and designed new
    heuristics for the spatial auto-regression model.

3
Thesis Related Publications
  • Chapter 2
  • Zonal Co-location Pattern Discovery with Dynamic
    Parameters, w/ J. M. Kang, S. Shekhar, In Proc.
    of 7th IEEE Int'l Conf. on Data Mining (ICDM),
    NE, 2007.
  • Chapter 3
  • Mixed-drove Spatio-temporal Co-occurrence Pattern
    Mining, w/ S. Shekhar, J. P. Rogers, J. A.
    Shine, Accepted to the IEEE TKDE 2008.
  • Mixed-drove Spatio-temporal Co-occurrence Pattern
    Mining: A Summary of Results, w/ S. Shekhar, J. P.
    Rogers, J. A. Shine, and J. S. Yoo, In Proc. of 6th
    IEEE Int'l Conf. on Data Mining (ICDM), Hong
    Kong, 2006.
  • Chapter 4
  • Sustained Emerging Spatio-temporal Co-occurrence
    Pattern Mining: A Summary of Results, w/ S.
    Shekhar, J. P. Rogers, and J. A. Shine, In Proc. of
    IEEE Int'l Conf. on Tools on Artificial
    Intelligence (ICTAI), Washington, D.C., 2006.

4
Other Publications
  • Spatio-temporal Co-occurrence Pattern Mining
  • Mining At Most Top-K Mixed-drove Spatio-temporal
    Co-occurrence Patterns: A Summary of Results, w/
    S. Shekhar, J. P. Rogers, J. A. Shine, and J. M. Kang,
    In Proc. of the Workshop on Spatio-Temporal Data
    Mining (IEEE ICDE 2007), Turkey, 2007.
  • Discovery of Co-evolving Spatial Event Sets, w/
    J. S. Yoo, S. Shekhar, S. Kim, In Proc. of the
    SIAM Int'l Conf. on Data Mining (SDM), Bethesda,
    Maryland, 2006.
  • Spatial Co-location Pattern Mining
  • Zonal Co-location Pattern Discovery with Dynamic
    Parameters, w/ J. M. Kang, S. Shekhar, In Proc.
    of 7th IEEE Int'l Conf. on Data Mining (ICDM),
    NE, 2007.
  • A Join-less Approach for Co-location Pattern
    Mining: A Summary of Results, w/ J. S. Yoo, S.
    Shekhar, In Proc. of the 5th IEEE Int'l Conf. on
    Data Mining (ICDM), Houston, Texas, 2005.
  • Misc. Spatial Data Mining
  • Spatial Dependency Modeling Using Spatial
    Auto-regression, w/ B. M. Kazar, S. Shekhar, D.
    Boley, D. J. Lilja, In Proc. of the ISPRS/ICA
    Workshop on Geospatial Analysis and Modeling as
    part of Int'l Conf. GICON, Austria, 2006.
  • Parameter Estimation for the Spatial
    Auto-Regression Model: A Rigorous Approach, w/
    B. M. Kazar, S. Shekhar, and D. Boley, The Second
    NASA Data Mining Workshop: Issues and
    Applications in Earth Science, California, 2006.

5
Outline
  • Motivation and Related Work
  • Example Mixed-drove Co-occurrence Pattern
  • Limitation of Related Work
  • Contributions
  • Related Work
  • MDCOP Mining Problem
  • Proposed MDCOP Mining Algorithms
  • Evaluation
  • Conclusion and Future Work

6
MDCOP Motivating Example Input
  • Manpack stinger
  • (2 Objects)
  • M1A1_tank
  • (3 Objects)
  • M2_IFV
  • (3 Objects)
  • Field_Marker
  • (6 Objects)
  • T80_tank
  • (2 Objects)
  • BRDM_AT5 (enemy) (1 Object)

7
MDCOP Motivating Example Output
  • Manpack stinger
  • (2 Objects)
  • M1A1_tank
  • (3 Objects)
  • M2_IFV
  • (3 Objects)
  • Field_Marker
  • (6 Objects)
  • T80_tank
  • (2 Objects)
  • BRDM_AT5 (enemy) (1 Object)

8
Why are mixed-drove patterns important?
  • Improving information-processing capabilities in
    Earth science, environmental management,
    government services, and transportation.
  • Supporting explanatory, descriptive, and confirmatory
    analysis for intelligence and resource allocation.
  • Public health (emerging infectious diseases)
  • Ecology (tracking species and pollutant
    movements)
  • Homeland defense (looking for growing events,
    biodefense)
  • Military
  • Identifying patterns or critical elements
  • Predicting near-future locations of enemy units

http://www.nytimes.com/library/tech/00/01/circuits/articles/20giss.html
http://www.my-etest.co.uk/authors/gillmac1/
9
Challenges
  • Current interest measures (e.g., the participation
    index) are not sufficient to quantify such
    patterns.
  • A new composite interest measure must be created
    and formalized.
  • The set of candidate patterns grows exponentially
    with the number of object-types.
  • Spatio-temporal datasets are huge.
  • Computationally efficient algorithms must be
    developed.
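The exponential growth of the candidate set can be made concrete with a quick count. The sketch below assumes that a candidate pattern is any subset of at least two object-types; the function name `candidate_count` is illustrative and does not come from the slides.

```python
from math import comb

def candidate_count(n_types: int) -> int:
    """Number of object-type subsets of size >= 2 (assumed candidate space)."""
    return sum(comb(n_types, k) for k in range(2, n_types + 1))

# 5 object-types (W, C, L, S, Q in the football example) and
# 22 object-types (the vehicle dataset used in the evaluation).
small = candidate_count(5)
large = candidate_count(22)
print(small, large)
```

The count equals 2^n - n - 1, which is why pruning candidates early matters even for a few dozen object-types.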

10
Related Work 1 - Mining of uniform groups of moving
objects
Flock pattern [Gudmundsson05]
Moving clusters [Kalnis05]
  • Do not recognize groups of mixed object-types
  • Do not recognize patterns over discrete (non-consecutive) time intervals
  • Treat different types of objects as the same
  • Require patterns to occur in consecutive time slots

11
Related Work 2 - Mining of mixed groups of moving
objects
  • Generalize co-location patterns to the
    spatio-temporal domain
  • Collocation episodes [Cao et al., ICDM06]
  • Reference-centric model
  • Topological patterns [Wang et al., CIKM05]
  • Semantics are not well defined for moving objects

12
Example: American Football
[Figure: sketch of the game at time slots t0, t1, t2, t3]
  • Broken blitz play: The objective of the offensive
    wide receivers (W) is to outrun any linebackers
    (L) and defensive backs (C) and get behind them,
    catching an undefended pass while running
    untouched for a touchdown.
  • t0: Ws and Cs, and Ws and Ls, are co-located,
    e.g., (W.1, C.1) and (W.4, C.2).
  • t1: Ws begin their run, while Cs remain in
    their original positions, possibly due to a fake
    handoff from the Q to the running back.
  • t2: Ws cross over each other and try to drift
    further away from their respective Cs.
  • t3: Q shows signs of throwing the football; Cs
    run to their respective Ws. S guards Ws.
  • Flock pattern: There are no flock patterns.
  • Moving objects are not of the same type.
  • Moving clusters: There are no moving clusters.
  • There is no pattern in consecutive time slots.
  • Patterns of different object types are not taken
    into account.
  • Co-location episodes: There are no co-location
    episodes.
  • There is no reference object type in consecutive
    time slots.
  • Topological patterns: There is no topological
    pattern.
  • Will discover {W,C}, {W,L}, {W,S}, and {W,C,S}.
  • However, {W,L}, {W,S}, and {W,C,S} (1/4) are not time
    persistent.
  • Output: MDCOP (wide receiver, cornerback)

13
Related Work Summary
Spatio-Temporal Pattern Level Level Time Interval Time Interval
Spatio-Temporal Pattern Object Object-type Consecutive Non-consecutive
Mining of uniform group of moving objects Flock Pattern 1,2 X X
Mining of uniform group of moving objects Moving Clusters 3 X X
Mining of mixed group of moving objects Collocation Episodes X reference centric X
Mining of mixed group of moving objects Topological Patterns X Semantics are not well defined Semantics are not well defined
Mining of mixed group of moving objects Mixed-drove Pattern (MDCOP) X X X
Proposed approach will catch MDCOP (W, C).
14
Proposed Approach Key Ideas
  • Defined a new monotonic composite interest
    measure
  • Generalizes the participation index, a monotonic
    interest measure for spatial co-location patterns
  • Adds temporal persistence over an interval
  • Developed novel and computationally efficient
    MDCOP mining algorithms
  • Exploit the monotonic interest measure to prune
    candidates
  • Alternative designs for temporal persistence
  • i) post-processing after reuse of
    co-location-miner
  • ii) temporal pruning after pattern size 1, 2, 3, ...
  • iii) temporal pruning as soon as possible

15
Outline
  • Introduction
  • Related Work
  • MDCOP Mining Problem and Algorithms
  • Key Concepts Interest Measure
  • Formal Problem Definition
  • Algorithms
  • Naïve Approach
  • MDCOP-Miner
  • FastMDCOP-Miner
  • Evaluation
  • Conclusion and Future Work

16
Key Concepts-1
Sketch of the game
time slot t3
time slot t2
time slot t1
time slot t0
  • A spatial co-location Col is a set of object-types
    that are frequently co-located.
  • Col = {W, C}: co-locations occur in t0 and t3.
  • Participation ratio: PR(oi, Col) = (# instances of oi
    in co-location Col) / (# instances of oi)

t0: PR(W,Col) = 2/4, PR(C,Col) = 2/2
t1: PR(W,Col) = 0, PR(C,Col) = 0
t2: PR(W,Col) = 0, PR(C,Col) = 0
t3: PR(W,Col) = 3/4, PR(C,Col) = 2/2
  • Spatial prevalence measure: participation index
    PI(Col) = min over oi of PR(oi, Col)
  • Col = {W, C} => PI(Col) = min(PR(W,Col),
    PR(C,Col)) = 2/4 for t0
  • => PI(Col) = min(PR(W,Col),
    PR(C,Col)) = 3/4 for t3
  • A co-location is called spatially prevalent if its
    PI is not less than a given threshold θp.
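The participation ratio and participation index defined above can be sketched in a few lines. The instance lists for t0 and t3 are taken from the FastMDCOP-Miner execution trace slides; the function names and the "Type.id" naming convention for objects are assumptions made for this illustration.

```python
from fractions import Fraction

def participation_ratio(object_type, instances, total_counts):
    """PR(o, Col): share of o's objects that take part in some instance of Col."""
    participating = {
        obj
        for instance in instances
        for obj in instance
        if obj.split(".")[0] == object_type  # objects are named "Type.id"
    }
    return Fraction(len(participating), total_counts[object_type])

def participation_index(pattern, instances, total_counts):
    """PI(Col): minimum participation ratio over the object-types in Col."""
    return min(participation_ratio(o, instances, total_counts) for o in pattern)

total_counts = {"W": 4, "C": 2}  # 4 wide receivers, 2 cornerbacks
t0_instances = [("W.1", "C.1"), ("W.4", "C.2")]
t3_instances = [("W.2", "C.1"), ("W.4", "C.1"), ("W.1", "C.2")]

pi_t0 = participation_index(("W", "C"), t0_instances, total_counts)
pi_t3 = participation_index(("W", "C"), t3_instances, total_counts)
print(pi_t0, pi_t3)
```

Exact fractions are used so that comparisons against a threshold such as 0.5 are not affected by floating-point rounding.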

17
Key Concepts-2
  • Definition 1: The time prevalence (persistence)
    measure of a pattern P:
  • TP(P, T) = (# of time slots where the pattern
    occurs) / (total # of time slots)
  • Time interval T = T0, T1, ..., Tn
  • Example
  • {W, C} pairs are co-located in time slots t0 and
    t3.
  • Time prevalence of pattern {W, C} = 2/4

A pattern is time prevalent if its time
prevalence measure is not less than a given
threshold θtime; e.g., if θtime = 0.5, then {W, C}
is time prevalent, because TP({W, C}) = 2/4 ≥
θtime.
Spatial prevalence measure per time slot, and time prevalence index:
Pattern   t0    t1    t2    t3    Time prevalence index
W,C       2/4   0     0     3/4   2/4
W,L       2/4   0     0     0     1/4
W,S       0     0     0     3/4   1/4
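The time prevalence column of the table can be reproduced as follows. This is a minimal sketch that assumes a pattern "occurs" in a time slot when its spatial prevalence in that slot is at least θp = 0.5; the names `spatial_prevalence` and `time_prevalence` are illustrative.

```python
from fractions import Fraction

# Spatial prevalence (participation index) per time slot t0..t3, from the table.
spatial_prevalence = {
    ("W", "C"): [Fraction(2, 4), 0, 0, Fraction(3, 4)],
    ("W", "L"): [Fraction(2, 4), 0, 0, 0],
    ("W", "S"): [0, 0, 0, Fraction(3, 4)],
}

def time_prevalence(pi_per_slot, theta_p):
    """Fraction of time slots in which the pattern is spatially prevalent."""
    hits = sum(1 for pi in pi_per_slot if pi >= theta_p)
    return Fraction(hits, len(pi_per_slot))

theta_p = Fraction(1, 2)
theta_time = Fraction(1, 2)
tp = {p: time_prevalence(v, theta_p) for p, v in spatial_prevalence.items()}
time_prevalent = [p for p, value in tp.items() if value >= theta_time]
print(tp, time_prevalent)
```

With both thresholds at 0.5, only {W, C} remains, which agrees with the example on the slide.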
18
Key Concepts-3
  • Definition 2: The mixed-drove prevalence measure
    of a pattern Pi is
  • a composition of the spatial prevalence and time
    prevalence measures.
  • The mixed-drove prevalence measure is monotonically
    decreasing with respect to MDCOP size.
  • Example
  • If θP = 0.5, co-location {W,C} is spatially
    prevalent in time slots t0 and t3.
  • Prob({W,C}) = 2/4

A pattern is mixed-drove prevalent if its
mixed-drove prevalence satisfies thresholds θP
and θtime; e.g., if θP = 0.5 and θtime = 0.5,
then {W,C} is prevalent, since Prob({W,C}) = 2/4 ≥
0.5.
Spatial prevalence measure per time slot, and time prevalence index:
Pattern   t0    t1    t2    t3    Time prevalence index
W,C       2/4   0     0     3/4   2/4
W,L       2/4   0     0     0     1/4
W,S       0     0     0     3/4   1/4
19
Problem Definition
  • Given
  • A set P of Boolean ST object-types over a common
    ST framework
  • A neighbor relation R over locations
  • A spatial prevalence index threshold, θP
  • A time prevalence index threshold, θtime
  • Find
  • Mixed-drove spatio-temporal co-occurrence
    patterns whose spatial prevalence ≥ θP and time
    prevalence ≥ θtime
  • Objective
  • Minimize computation cost
  • Constraints
  • Correctness
  • Completeness
  • Monotonic composite interest measure

20
Example An American Football Play
time slot t0
time slot t1
time slot t2
time slot t3
Sketch of the game
  • Input
  • Each play is a spatio-temporal dataset.
  • Boolean object-types are the roles of players (e.g.,
    wide receiver, cornerback, linebacker).
  • Objects are instances of object types (instances
    of W are W.1, W.2, W.3, W.4).
  • The duration of the play is 4 time units.
  • The neighborhood relation may be "less than 1 meter" or
    "an average arm's length".
  • Spatial prevalence threshold θp = 0.5
  • Time prevalence threshold θtime = 0.5
  • Output
  • Pattern {W, C}

21
Proposed Approach Key Ideas
Key decision: timing of temporal pruning
  • i) Naïve Approach: post-processing after reuse of co-location-miner
  • ii) MDCOP-Miner: after pattern size 1, 2, 3, ...
  • iii) FastMDCOP-Miner: as soon as possible
22
Execution Trace Summary of MDCOPs
time slot t0
time slot t1
time slot t2
time slot t3
Pattern t0 t1 t2 t3 Time prevalence index after sizek
Pattern t0 t1 t2 t3 Time prevalence index FastMDCOP MDCOP-Miner Naive
W C 2/4 0 0 3/4 2/4 Survived Survived Survived
Generate W L 2/4 0 0 0 1/4 Already Pruned Survived
Generate W L 2/4 0 FastMDCOP not calculated pruned
size 2 W S 0 0 0 2/4 1/4 Already Pruned Survived
size 2 W S 0 0 FastMDCOP no generation not calculated pruned
C S 0 0 0 1 1/4 Already Pruned Survived
C S 0 0 FastMDCOP no generation not calculated pruned
Pruning FastMDCOP
MDCOP-Miner
Generate size 3 W S C 0 0 0 2/4 1/4 not generated not generated Survived
Pruning at the post processing Pruning at the post processing Pruning at the post processing Pruning at the post processing Naïve
23
Naïve Approach vs. MDCOP-Miner
[Figure: candidate lattices for time slots t0-t3. Pairs considered in each slot: WC, WL, WS, WQ, CS, CL, CQ, SL, SQ, LQ.]
  • Input (both algorithms): spatio-temporal dataset, θP = 0.5, θtime = 0.5
  • Naïve Approach: the triple WSC is generated; temporal pruning happens in post-processing (prunes WL, WS, CS, WSC)
  • MDCOP-Miner: temporal pruning after the pairs (prunes WL, WS, CS); no triples are generated
  • Output MDCOPs (both algorithms): WC
  • Legend: bold black and blue = common; blue underlined = spatial pruning; bold red = difference
24
Proposed Approaches
  • Naïve Approach
  • Step 1) Find spatial co-locations for all time
    slots.
  • Step 2) Post-processing: prune time
    non-prevalent MDCOPs.
  • Limitation: redundant generation of
    non-prevalent MDCOPs before post-processing
  • Pseudo-code
  • Initialization
  • while k in (1, 2, 3, ..., K)
  • For each time slot
  • generate co-locations
  • generate co-location instances
  • prune_spatial_non-prevalent_co-loc
  • Post-processing step
  • For size k spatial prevalent co-locations
  • calculate_time_prevalence_index
  • prune_temporal_non-prevalent_co-occur
  • MDCOP-Miner
  • Idea: Push the post-processing step inside the
    loop.
  • Step 1) Find MDCOP patterns by applying the MDCOP
    prevalence interest measure.
  • Advantage: MDCOP-Miner eliminates redundant
    candidate generation of non-prevalent MDCOPs.
  • Pseudo-code
  1. Initialization
  2. while k in (1, 2, 3, ..., K)
  3. For each time slot
  4. generate co-occur
  5. generate co-occur instances
  6. prune_spatial_non-prevalent_co-occur
  7. calculate_time_prevalence_index
  8. prune_temporal_non-prevalent_co-occur
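The MDCOP-Miner loop above can be sketched as a level-wise search in which temporal pruning runs after each pattern size. This is a simplified illustration, not the authors' implementation: it assumes the per-slot participation indices are already available in lookup tables (`pi_tables`, a hypothetical structure) instead of being computed from instances and a neighbor relation, and it starts at size 2.

```python
from fractions import Fraction
from itertools import combinations

def time_prevalent(pattern, pi_tables, theta_p, theta_time):
    """True if the pattern is spatially prevalent in enough time slots."""
    hits = sum(1 for table in pi_tables if table.get(pattern, 0) >= theta_p)
    return Fraction(hits, len(pi_tables)) >= theta_time

def mdcop_miner(object_types, pi_tables, theta_p, theta_time):
    results = []
    # Size-2 candidates: every pair of object-types.
    candidates = [frozenset(p) for p in combinations(sorted(object_types), 2)]
    while candidates:
        # Spatial and temporal pruning for the current size.
        survivors = {c for c in candidates
                     if time_prevalent(c, pi_tables, theta_p, theta_time)}
        results.extend(sorted(survivors, key=sorted))
        k = len(next(iter(survivors))) + 1 if survivors else 0
        # Next size: join survivors, keep unions whose (k-1)-subsets all survived.
        unions = {a | b for a in survivors for b in survivors if len(a | b) == k}
        candidates = [u for u in unions
                      if all(frozenset(s) in survivors
                             for s in combinations(u, k - 1))]
    return results

def fs(*types):
    return frozenset(types)

# Participation indices per time slot, taken from the football example.
pi_tables = [
    {fs("W", "C"): Fraction(2, 4), fs("W", "L"): Fraction(2, 4)},  # t0
    {},                                                            # t1
    {},                                                            # t2
    {fs("W", "C"): Fraction(3, 4), fs("W", "S"): Fraction(3, 4),
     fs("C", "S"): Fraction(1), fs("W", "C", "S"): Fraction(2, 4)},  # t3
]
found = mdcop_miner("WCLSQ", pi_tables, Fraction(1, 2), Fraction(1, 2))
print(found)
```

Because WS and CS are pruned after size 2, the triple WCS is never generated, whereas a post-processing design would first build it for t3 and discard it later.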

25

FastMDCOP-Miner vs. MDCOP-Miner
[Figure: candidate lattices for time slots t0-t3. Pairs: WC, WL, WS, WQ, CS, CL, CQ, SL, SQ, LQ.]
  • Input (both algorithms): spatio-temporal dataset, θP = 0.5, θtime = 0.5
  • FastMDCOP-Miner: temporal pruning is applied during the scan of the pairs; one time slot lists only WC and WL, with the remaining pairs shown as "----"; no triples are generated
  • MDCOP-Miner: temporal pruning after the pairs (prunes WL, WS, CS); no triples are generated
  • Output MDCOPs (both algorithms): WC
  • Legend: bold black and blue = common; blue underlined = spatial pruning; bold red = difference
26
Proposed Approaches
  • FastMDCOP-Miner
  • Idea: Prune time non-prevalent patterns as early
    as possible.
  • Step 1) Find MDCOP patterns by applying the MDCOP
    prevalence interest measure.
  • Advantage: MDCOP-Miner eliminates redundant
    candidate generation of time non-prevalent MDCOPs.
  • Pseudo-code
  1. Initialization
  2. while k in (1, 2, 3, ..., K)
  3. For each time slot
  4. generate co-occur
  5. generate co-occur instances
  6. prune_spatial_non-prevalent_co-occur
  7. calculate_time_prevalence_index
  8. prune_temporal_non-prevalent_co-occur
  • MDCOP-Miner
  • Idea: Push the post-processing step inside the
    loop.
  • Step 1) Find MDCOP patterns by applying the MDCOP
    prevalence interest measure.
  • Pseudo-code
  1. Initialization
  2. while k in (1, 2, 3, ..., K)
  3. For each time slot
  4. generate co-occur
  5. generate co-occur instances
  6. prune_spatial_non-prevalent_co-occur
  7. calculate_time_prevalence_index
  8. prune_temporal_non-prevalent_co-occur
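One way to realize "prune as early as possible" is an upper-bound test while the time slots are scanned: a pattern can be dropped once the slots in which it was prevalent so far, plus all slots still to come, cannot reach θtime. The slides do not spell out this rule, so the bound and the names below are assumptions for illustration only.

```python
def scan_with_early_pruning(pattern_hits, theta_time):
    """pattern_hits maps a pattern to 0/1 flags per time slot
    (1 = spatially prevalent in that slot)."""
    n_slots = len(next(iter(pattern_hits.values())))
    hits_so_far = {p: 0 for p in pattern_hits}
    pruned_at = {}
    for t in range(n_slots):
        remaining = n_slots - (t + 1)
        for p in list(hits_so_far):
            hits_so_far[p] += pattern_hits[p][t]
            # Best case: the pattern is prevalent in every remaining slot.
            if hits_so_far[p] + remaining < theta_time * n_slots:
                pruned_at[p] = t
                del hits_so_far[p]  # never evaluated in later slots
    return sorted(hits_so_far), pruned_at

# Flags for t0..t3 with theta_p = 0.5, from the football example.
pattern_hits = {
    "WC": [1, 0, 0, 1],
    "WL": [1, 0, 0, 0],
    "WS": [0, 0, 0, 1],
}
survivors, pruned_at = scan_with_early_pruning(pattern_hits, 0.5)
print(survivors, pruned_at)
```

Under this rule WS is dropped after t2 and is never evaluated at t3, while WL is dropped only after t3, which is consistent with the "Already Pruned" and "Pruned" entries in the execution trace.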

27
Execution Trace (FastMDCOP-Miner) t0
time slot t1
time slot t2
time slot t3
time slot t0
Step 2
Step 1
t0 Time prevalence
WC 1 1/4
WL 1 1/4
WS 0 0
WQ 0 0
CL 0 0
CS 0 0
CQ 0 0
LS 0 0
LQ 0 0
SQ 0 0
Time slot t0 (θP = 2/4)
Pattern   Instances                 P. Ratio        P. Index   Result
WC        (W.1, C.1), (W.4, C.2)    W 2/4, C 2/2    2/4        not pruned
WL        (W.2, L.1), (W.3, L.2)    W 2/4, L 2/2    2/4        not pruned
WS, WQ, CL, CS, CQ, LS, LQ, SQ: no instances, PRUNED
28
Execution Trace (FastMDCOP-Miner) t1
time slot t0
time slot t1
time slot t2
time slot t3
Step 2
Step 1
t0 t1 Time prevalence
WC 1 0 1/4
WL 1 0 1/4
WS 0 0 0
WQ 0 0 0
CL 0 0 0
CS 0 0 0
CQ 0 0 0
LS 0 0 0
LQ 0 0 0
SQ 0 0 0
Time slot t1 (θP = 2/4)
Patterns: WC WL WS WQ CL CS CQ LS LQ SQ
Instances, P. Ratio, P. Index: none for any pattern
Result: all ten patterns PRUNED in this time slot
29
Execution Trace (FastMDCOP-Miner) t2
time slot t0
time slot t1
time slot t2
time slot t3
Step 2
Step 1
t0 t1 t2 Time prevalence
WC 1 0 0 1/4
WL 1 0 0 1/4
WS 0 0 0 0
WQ 0 0 0 0
CL 0 0 0 0
CS 0 0 0 0
CQ 0 0 0 0
LS 0 0 0 0
LQ 0 0 0 0
SQ 0 0 0 0
Time slot t2 (θP = 2/4)
Patterns: WC WL WS WQ CL CS CQ LS LQ SQ
Instances, P. Ratio, P. Index: none for any pattern
Result: all ten patterns PRUNED in this time slot
30
Execution Trace (FastMDCOP-Miner) t3
time slot t0
time slot t1
time slot t2
time slot t3
Step 2
Step 1
t0 t1 t2 t3 Time prevalence
WC 1 0 0 1 2/4
WL 1 0 0 0 1/4
WS 0 0 0 0
WQ 0 0 0 0
CL 0 0 0 0
CS 0 0 0 0
CQ 0 0 0 0
LS 0 0 0 0
LQ 0 0 0 0
SQ 0 0 0 0
Time slot t3 (θP = 2/4)
Pattern   Instances                             P. Ratio        P. Index
WC        (W.2, C.1), (W.4, C.1), (W.1, C.2)    W 3/4, C 2/2    3/4
WL, WS, WQ, CL, CS, CQ, LS, LQ, SQ: no instances listed
31
Execution Trace (FastMDCOP-Miner) time prev.
index
  • Calculate time prevalence indices of spatial
    prevalent co-locations (step 8)
  • Find mixed-drove prevalent MDCOPs (step 9)
  • Result: {W,C}

[Figure: sketch of the game at time slots t0, t1, t2, t3]
Pattern   t0   t1   t2   t3   Time prevalence
WC        1    0    0    1    2/4
WL        1    0    0    0    Pruned
WS        0    0    0    -    Already Pruned
WQ        0    0    0    -    Already Pruned
CL        0    0    0    -    Already Pruned
CS        0    0    0    -    Already Pruned
CQ        0    0    0    -    Already Pruned
LS        0    0    0    -    Already Pruned
LQ        0    0    0    -    Already Pruned
SQ        0    0    0    -    Already Pruned

32
Analytical Evaluation
  • Lemma 1: A spatial prevalence measure, e.g., the
    participation index, is monotonically
    non-increasing in the size of the MDCOPs at each
    time slot.
  • Lemma 2: A mixed-drove prevalence index measure
    is monotonically non-increasing with the size of
    the MDCOP over space and time.

Example for Lemma 1 (spatial prevalence index per time slot) and Lemma 2 (time prevalence index):
Pattern   t0    t1   t2   t3    Time prevalence index
WC        2/4   -    -    3/4   2/4
WS        -     -    -    3/4   1/4
CS        -     -    -    1     1/4
WCS       -     -    -    2/4   1/4
The spatial prevalence of pattern {W,C,S} is less
than or equal to the spatial prevalence of its
sub-patterns {W,C}, {W,S}, {C,S}. The mixed-drove
prevalence index of pattern {W,C,S} is
less than or equal to the mixed-drove prevalence index of
its sub-patterns {W,C}, {W,S}, {C,S}.
33
Analytical Evaluation
  • Theorem 1: MDCOP-Miner and FastMDCOP-Miner
    are complete.
  • Proof: The algorithms find all MDCOPs whose
    spatial prevalence ≥ θP and time prevalence ≥
    θtime.
  • Any subset of an MDCOP prevalent pattern is MDCOP
    prevalent (Lemma 2).
  • None of the functions of the algorithms
    (prune_spatial_non-prevalent_co-occur and
    prune_temporal_non-prevalent_co-occur) misses any
    prevalent MDCOP.
  • Theorem 2: MDCOP-Miner and FastMDCOP-Miner are
    correct: if an MDCOP pattern P is returned by the
    algorithms, then P is a prevalent MDCOP.
  • Proof: The pruning steps
    prune_spatial_non-prevalent_co-occur and
    prune_temporal_non-prevalent_co-occur prune out
    candidates not meeting the given thresholds.

34
Outline
  • Introduction
  • Related Work
  • MDCOP Mining Problem and Algorithm
  • Evaluation
  • Analytical Evaluation
  • Performance Evaluation
  • Conclusion and Future Work

35
Performance Evaluation Experiment Design
  • Experiment goals: compare FastMDCOP-Miner and
    MDCOP-Miner with the Naïve Approach
  • What is the effect of the number of time slots?
  • What is the effect of the number of object-types?
  • What is the effect of the spatial prevalence
    threshold θP?
  • What is the effect of the time prevalence threshold
    θtime?
  • Metric of comparison: computational complexity
  • Workload: vehicle movement dataset and synthetic
    dataset
  • Hardware: Intel Centrino PIV 1.60 GHz, 512 MB of
    RAM

36
Real Dataset Description
  • Vehicle movement dataset
  • 15 time slots; x and y coordinates are in meters
  • 22 distinct vehicle types and their instances
  • Minimum instance number 2, maximum instance
    number 78
  • Average instance number 19

Output: Spatio-temporal Co-occurrence Pattern
(Manpack_stinger <M1, M2>, fire cover (e.g.,
Bradley tank <T1, T2>))
Example Input from Spatio-temporal Dataset
37
Real Dataset What is the effect of number of
time slots?
  • Fixed parameters
  • Spatial threshold θP = 0.2
  • Time threshold θtime = 0.8
  • Distance = 150 m
  • # of object types = 22
  • Execution times of the algorithms increase as
    the number of time slots increases.
  • MDCOP-Miner and the Naïve approach generate the same
    number of size 2 candidates.
  • FastMDCOP-Miner generates fewer candidates than
    the other algorithms due to early pruning.
  • FastMDCOP-Miner outperforms the other algorithms.

38
Real Dataset What is the effect of number of
object-types?
  • Fixed parameters
  • Spatial threshold θP = 0.2
  • Time threshold θtime = 0.8
  • # of time slots = 15
  • Distance = 150 m
  • Execution times of the algorithms increase as
    the number of object-types increases.
  • MDCOP-Miner and the Naïve approach generate the same
    number of size 2 candidates.
  • FastMDCOP-Miner outperforms the other algorithms by
    generating fewer candidates.

39
Real Dataset What is the effect of time
prevalence index threshold?
  • Fixed parameters
  • Spatial threshold θp = 0.5
  • # of time slots = 15
  • Distance = 150 m
  • # of object types = 22
  • Execution times of the algorithms decrease as
    the time prevalence threshold increases.
  • FastMDCOP-Miner outperforms the other algorithms. Its
    advantage increases as the threshold increases.

40
Real Dataset What is the effect of spatial
prevalence index threshold?
  • Fixed parameters
  • Time threshold θtime = 0.5
  • # of time slots = 15
  • Distance = 150 m
  • # of object types = 22
  • Execution times of the algorithms decrease as
    the spatial prevalence threshold increases.
  • FastMDCOP-Miner outperforms the other algorithms.

41
Synthetic Dataset Generation
42
Synthetic Dataset What is the effect of number
of time slots?
  • Fixed parameters
  • Spatial threshold θP = 0.3
  • Time threshold θtime = 0.9
  • Distance = 10
  • # of object types = 200
  • Execution times of the algorithms increase as
    the number of time slots increases.
  • MDCOP-Miner and the Naïve approach generate the same
    number of size 2 candidates.
  • FastMDCOP-Miner generates fewer candidates than
    the other algorithms.

43
Synthetic Dataset What is the effect of number
of object-types?
  • Fixed parameters
  • Spatial threshold θp = 0.3
  • Time threshold θtime = 0.8
  • # of time slots = 20
  • Distance = 10 m
  • MDCOP-Miner and the Naïve approach generate the same
    number of size 2 candidates.
  • The execution time of the Naïve approach grows
    faster than that of the other algorithms as the
    number of object-types increases.
  • The Naïve approach generates redundant non-persistent
    co-locations.
  • FastMDCOP-Miner outperforms the other algorithms by
    generating fewer candidates.

44
Synthetic Dataset What is the effect of time
prevalence index threshold?
  • Fixed parameters
  • Spatial threshold θp = 0.4
  • # of time slots = 50
  • Distance = 10 m
  • # of object types = 200
  • Execution times of the algorithms decrease as
    the time prevalence threshold increases.
  • FastMDCOP-Miner outperforms the other algorithms. Its
    advantage increases as the threshold increases.
  • The execution time of the Naïve approach grows
    faster than that of the other algorithms as the
    number of object-types increases.

45
Synthetic Dataset What is the effect of spatial
prevalence index threshold?
  • Fixed parameters
  • Time threshold θtime = 0.8
  • # of time slots = 50
  • Distance = 10 m
  • # of object types = 200
  • Execution times of the algorithms decrease as
    the spatial prevalence threshold increases.
  • FastMDCOP-Miner outperforms the other algorithms.

46
Synthetic Dataset What is the effect of noise
instances and average number of co-occurrence
instances?
  • Fixed parameters
  • Spatial threshold θp = 0.3
  • Time threshold θtime = 0.8
  • # of time slots = 20
  • Distance = 10 m
  • Execution times of the algorithms increase as
    the spatial prevalence threshold is increased.
  • FastMDCOP-Miner is more robust than the other algorithms.
  • FastMDCOP-Miner outperforms the other algorithms.

47
Summary of Experimental Results
  • What is the effect of number of time slots?
  • The cost of the Naïve approach and of non-persistent
    candidate generation increases as the number of
    time slots increases.
  • FastMDCOP-Miner outperforms the other algorithms.
  • What is the effect of the number of object-types?
  • The execution time of the Naïve approach grows
    faster than that of the other algorithms as the
    number of object-types increases.
  • FastMDCOP-Miner is more robust than the other
    algorithms. It detects non-persistent patterns as
    early as possible.
  • What is the effect of the spatial prevalence
    threshold?
  • The cost of the Naïve approach is higher than that of
    the other algorithms for low values of the spatial
    prevalence threshold.
  • What is the effect of the time prevalence threshold?
  • The Naïve approach is not sensitive to the time
    prevalence threshold, since the threshold is used
    only in the post-processing step.
  • FastMDCOP-Miner is the most sensitive algorithm
    to the time threshold. It performs well,
    especially for high time thresholds.
  • In all experiments, the Naïve approach and
    MDCOP-Miner generate the same number of size 2
    candidates.

48
Outline
  • Introduction
  • Related Work
  • MDCOP Mining Problem
  • Proposed MDCOP Mining Algorithms
  • Evaluation
  • Conclusion and Future Work

49
Contributions described today
  • Mixed-drove spatio-temporal co-occurrence
    patterns (MDCOPs) and the MDCOP mining problem
    are defined.
  • A new monotonic composite interest measure is
    defined.
  • Novel and computationally efficient
    MDCOP mining algorithms are developed.
  • The proposed algorithms are correct and complete in
    finding mixed-drove prevalent patterns.
  • Performance evaluation using real and synthetic
    datasets

50
Spatio-temporal Co-occurrence Pattern Taxonomy
  • 1. Spatial co-location
  • Global and zonal co-location patterns, etc.
  • 2. Co-occurrence patterns of moving objects
  • Flock pattern, mixed-drove pattern, follow
    pattern, moving clusters, etc.
  • 3. Emerging or vanishing co-occurrence patterns
  • Emerging pattern: interest measure gets
    stronger over time
  • Vanishing pattern: interest measure gets
    weaker over time
  • 4. Co-evolving patterns
  • 5. Periodic co-occurrence patterns
  • 6. Spatio-temporal cascade patterns
  • ...
  • ICDM05 - Discovering co-evolving spatio-temporal
    event sets
  • TKDE08 and ICDM06 - Mixed-Drove Spatio-Temporal
    Co-occurrence
  • Pattern Mining
  • ICDE-STDM07 - Mining At Most Top-K Mixed-drove
    Spatio-temporal
  • Co-occurrence Patterns
  • ICDM07 Zonal Co-location Pattern Mining
  • ICDM05 Joinless Approach for Co-location
    Pattern Mining
  • ICTAI06 - Sustained Emerging Spatio-temporal
    Co-occurrence Pattern Mining

51
Chapter 2- Zonal Co-location Pattern Discovery
  • Given: different object types of spatial events
    and zone boundaries
  • Find: co-located subsets of event types specific
    to zones
  • Method: a novel algorithm using an indexing
    structure
[Figure: map with zones 1-4; pattern labels "Zones 2,4" and "Zone 3"]
52
Chapter 4 - Sustained Emerging ST Co-occurrence
Pattern Discovery
  • Given: a set P of Boolean ST object-types over a
    common ST framework
  • Find: sustained emerging spatio-temporal
    co-occurrence patterns whose prevalence measure
    increases over time
  • Method: developing novel algorithms by defining
    monotonic interest measures

53
Future Work Short Term
  • Spatial co-location
  • Interest measure participation index
  • Global and zonal co-location patterns, etc.
  • Co-occurrence patterns of moving objects
  • Flock pattern, mixed-drove pattern, follow
    pattern, cross pattern, moving clusters, etc.
  • Emerging or vanishing co-occurrence patterns
  • Emerging pattern: interest measure gets
    stronger over time
  • Vanishing pattern: interest measure gets
    weaker over time
  • Co-evolving patterns
  • Periodic co-occurrence patterns
  • Spatio-temporal cascade patterns
  • Efficient methods
  • Comparison of interest measures with statistical
    interest measures

54
Future Work Long Term
  • Spatial and Spatio-temporal Pattern Mining Design
  • Crime Analysis, GIS, Epidemiology
  • Challenges
  • discovering patterns and anomalies from enormous
    frequently updated spatial and spatio-temporal
    datasets,
  • developing an ontological framework for spatial
    and spatio-temporal analysis,
  • integrating spatial and spatio-temporal data from
    multiple agencies, distributed data, and
    multi-scale data

55
Acknowledgements
  • Adviser: Prof. Shashi Shekhar
  • Committee: Prof. Jaideep Srivastava, Prof.
    Arindam Banerjee, and Prof. Sudipto Banerjee
  • Spatial Databases and Data Mining Group
  • TEC collaborators: James P. Rogers, James A.
    Shine
  • Dept. of Computer Science

56
References
  • [1] J. Gudmundsson, M. v. Kreveld, and B.
    Speckmann, Efficient Detection of Motion Patterns
    in Spatio-Temporal Data Sets, ACM-GIS, pp. 250-257,
    2004.
  • [2] P. Laube and S. Imfeld, Analyzing relative
    motion within groups of trackable moving point
    objects, in GIScience, number 2478 in Lecture
    Notes in Computer Science, Berlin: Springer, pp.
    132-144, 2002.
  • [3] P. Kalnis, N. Mamoulis, and S. Bakiras, On
    Discovering Moving Clusters in Spatio-temporal
    Data, 9th Int'l Symp. on Spatial and Temporal
    Databases (SSTD), Angra dos Reis, Brazil, 2005.
  • [4] Y. Huang, S. Shekhar, and H. Xiong,
    Discovering Co-location Patterns from Spatial
    Datasets: A General Approach, IEEE Trans. on
    Knowledge and Data Eng. (TKDE), vol. 16(12), pp.
    1472-1485, 2004.
  • [5] M. Hadjieleftheriou, G. Kollios, P. Bakalov,
    and V. J. Tsotras, Complex Spatio-Temporal
    Pattern Queries, VLDB, pp. 877-888, 2005.
  • [6] C. du Mouza and P. Rigaux, Mobility Patterns,
    GeoInformatica, 9(4), pp. 297-319, 2005.
  • [7] J. S. Yoo and S. Shekhar, A Join-less
    Approach for Mining Spatial Co-location Patterns,
    IEEE Trans. on Knowledge and Data Eng. (TKDE),
    vol. 18, no. 10, 2006.