Title: Dot Plots For Time Series Analysis
1 Dot Plots For Time Series Analysis
- Dragomir Yankov, Eamonn Keogh, Stefano Lonardi
- Dept. of Computer Science & Eng.
- University of California Riverside
- Ada Waichee Fu
- Dept. of Computer Science & Eng.
- The Chinese University of Hong Kong
2 Sequence analysis with dot plots
- Introduced by Gibbs & McIntyre (1970)
[Figure: dot plot of the sequences "t a g t a" (vertical axis) and "a t g t a g" (horizontal axis)]
- Observed patterns
- Matches (homologies)
- Reverses
- Gaps (differences or mutations)
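As a rough illustration of the classic symbolic dot plot, the sketch below places a dot at (i, j) whenever the i-th symbol of one sequence equals the j-th symbol of the other. The two sequences are the ones from the slide's figure labels; the function names `dot_plot` and `render` and the axis orientation are assumptions made for this example only.

```python
def dot_plot(seq_a, seq_b):
    """Return a 0/1 matrix with 1 at (i, j) when seq_a[i] == seq_b[j]."""
    return [[1 if a == b else 0 for b in seq_b] for a in seq_a]


def render(matrix, seq_a, seq_b):
    """Render the matrix as text: one row per symbol of seq_a."""
    lines = ["  " + " ".join(seq_b)]
    for symbol, row in zip(seq_a, matrix):
        cells = " ".join("*" if cell else "." for cell in row)
        lines.append(symbol + " " + cells)
    return "\n".join(lines)


# Sequences taken from the figure labels (orientation assumed).
plot = dot_plot("tagta", "atgtag")
print(render(plot, "tagta", "atgtag"))
```

In such a plot, runs of dots along a diagonal correspond to matches, runs along an anti-diagonal to reverses, and breaks in a diagonal to gaps.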
3 Dot Plots For Time Series Analysis
- Problem statement: How can we meaningfully adapt dot plot (DP) analysis to real-valued data?
- The DP method would ideally be:
- Robust to noise
- Invariant to value and time shifts
- Invariant to a certain amount of time warping
- Efficiently computable
4 Related work
Recurrence plots (Eckmann et al., 1987)
- Provide an intuitive 2D view of multidimensional dynamical systems
- The matrix is computed with the Heaviside function
Problem with recurrence plots:
matches are local (point-based) rather than subsequence-based
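To make the point-based nature of recurrence plots concrete, here is a minimal sketch of the usual thresholded form, R(i, j) = H(eps - |x_i - x_j|), where H is the Heaviside step. It is simplified to a scalar series rather than phase-space vectors, and the names `heaviside`, `recurrence_matrix`, and `eps` are assumptions of this example, not taken from the slides.

```python
def heaviside(x):
    """Heaviside step: 1 for x >= 0, else 0 (the value at 0 is a convention)."""
    return 1 if x >= 0 else 0


def recurrence_matrix(series, eps):
    """Mark (i, j) when the single points series[i] and series[j] lie within eps."""
    return [[heaviside(eps - abs(a - b)) for b in series] for a in series]


# Each entry compares two individual points, never whole subsequences.
rec = recurrence_matrix([0.0, 1.0, 0.05], eps=0.1)
print(rec)
```

Because each entry depends on only two samples, a single noisy point can add or remove a dot, which is the limitation the slide refers to.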
5 The proposed solution
- Reducing the dot plot procedure to the motif
finding problem
- Applying the Random Projection algorithm for
finding motifs in time series data (Chiu et al.,
2003)
It satisfies the initial requirements of
robustness to outliers and invariance to time and
value shifts
- Presegmenting the series to achieve time warping
invariance
6 Dot plots and motif finding
- Def.: match, trivial match, motif
- D(P,Q) < R: we say that Q is a match of P
- D(P,Q) < R and D(P,Q1) < R: we say that Q1 is a trivial match of P
- A non-trivial match is a motif
- Def.: time series dot plot: a plot that contains a point at position (i,j) iff TS1(i) and TS2(j) represent the same motif
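The match definition above can be illustrated with a brute-force baseline. This is not the Random Projection method the slides propose, only a quadratic-time sketch of what the dot plot is meant to contain. It assumes Euclidean distance for D, z-normalization of each window to obtain value-shift invariance, and it does not filter trivial matches; `w`, `r`, and all function names are hypothetical.

```python
import math


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
    if std == 0:
        return [0.0] * len(window)
    return [(v - mean) / std for v in window]


def euclidean(p, q):
    """Euclidean distance between two equal-length windows."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def brute_force_dot_plot(ts1, ts2, w, r):
    """Return the set of (i, j) whose length-w windows satisfy D(P, Q) < r."""
    subs1 = [znorm(ts1[i:i + w]) for i in range(len(ts1) - w + 1)]
    subs2 = [znorm(ts2[j:j + w]) for j in range(len(ts2) - w + 1)]
    return {
        (i, j)
        for i, p in enumerate(subs1)
        for j, q in enumerate(subs2)
        if euclidean(p, q) < r
    }


# The second series is a shifted and scaled copy of the first pattern.
points = brute_force_dot_plot([0, 1, 0, 1, 0, 1], [10, 12, 10, 12, 10], w=2, r=0.1)
print(sorted(points))
```

Comparing every pair of windows costs time proportional to the product of the series lengths, which motivates a hashing-based approach.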
7 The Random Projection algorithm
- Based on PROJECTION (Buhler & Tompa, 2002)
- Algorithm outline
- Split the TS into subsequences and symbolize them
- Separate the symbolic sequences into equivalence classes using PROJECTION
- Mark as motifs the sequences from the same equivalence class
8 Random Projection: symbolization
Utilizes the Symbolic Aggregate Approximation
(SAX) scheme
- Applies PAA (Piecewise Aggregate Approximation)
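The symbolization step can be sketched as follows: z-normalize the subsequence, reduce it with PAA to a few frame means, then map each mean to a letter using breakpoints that cut the standard normal distribution into equiprobable regions. The function names, the four-letter alphabet, and the requirement that the length be a multiple of the number of segments are assumptions of this example.

```python
import bisect
import math
from statistics import NormalDist


def znorm(series):
    """Z-normalize a series; a constant series maps to all zeros."""
    mean = sum(series) / len(series)
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / len(series))
    if std == 0:
        return [0.0] * len(series)
    return [(v - mean) / std for v in series]


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: the mean of each equal-length frame.

    Assumes len(series) is a multiple of n_segments.
    """
    frame = len(series) // n_segments
    return [sum(series[k * frame:(k + 1) * frame]) / frame
            for k in range(n_segments)]


def sax_word(series, n_segments, alphabet="abcd"):
    """Convert a real-valued series into a SAX word over the given alphabet."""
    a = len(alphabet)
    # Breakpoints split N(0, 1) into `a` regions of equal probability.
    breakpoints = [NormalDist().inv_cdf(k / a) for k in range(1, a)]
    reduced = paa(znorm(series), n_segments)
    return "".join(alphabet[bisect.bisect_left(breakpoints, v)] for v in reduced)


word = sax_word([-2, -2, -0.5, -0.5, 0.5, 0.5, 2, 2], n_segments=4)
print(word)
```

Z-normalizing before PAA is what gives the resulting words their invariance to value shifts and scaling.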
9 Random Projection: motif finding
- d random dimensions are masked and the strings
are divided into separate bins
- The symbolic representations of the plotted
time series are stored into tables
10 Random Projection: motif finding
- Updating the dot plot collision matrix
- The update is performed for m iterations.
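The masking and collision-counting steps might look roughly like the sketch below. It reads "masked" as "hidden", so the remaining positions form the bin key; the opposite reading only changes which positions are used. The collision matrix is stored sparsely as a dictionary, and `words1`, `words2`, `n_masked`, `n_iterations`, and `seed` are hypothetical names for the symbolized subsequences of the two series and the algorithm parameters.

```python
import random
from collections import defaultdict


def random_projection_dot_plot(words1, words2, n_masked, n_iterations, seed=0):
    """Count, per pair (i, j), in how many iterations the words share a bin."""
    rng = random.Random(seed)
    word_len = len(words1[0])
    collisions = defaultdict(int)  # sparse collision matrix
    for _ in range(n_iterations):
        # Hide n_masked random positions; the kept ones define the bin key.
        masked = set(rng.sample(range(word_len), n_masked))
        kept = [p for p in range(word_len) if p not in masked]
        bins = defaultdict(list)
        for j, word in enumerate(words2):
            bins["".join(word[p] for p in kept)].append(j)
        for i, word in enumerate(words1):
            key = "".join(word[p] for p in kept)
            for j in bins.get(key, ()):
                collisions[(i, j)] += 1
    return collisions


counts = random_projection_dot_plot(
    ["abca", "dddd"], ["abca", "abcb", "cccc"], n_masked=1, n_iterations=10
)
print(dict(counts))
```

Identical words collide in every iteration, words that differ in one symbol collide only when that symbol is hidden, and unrelated words never produce an entry, so high counts indicate likely motif pairs.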
11 Random Projection for streaming
- Complexity: space O(M), time O(mM)
- For practical data sets M is very sparse
- For time series data, small values of m (on the order of 10) generate highly descriptive plots
- Random Projection as online algorithm
- Good time performance
- Updatability
12 Experimental evaluation
Dot Plots for anomaly detection
- Recurrent data with variable state length
- The anomaly is of the same type (A)
- Small time warpings (shifts) are detected (B)
- Larger time warpings are omitted (C)
13 Experimental evaluation
Dot Plots for anomaly detection
Recurrent data with fixed state length
14 Experimental evaluation
Dot Plots for pattern detection
Stock market data
15 Experimental evaluation
Dot Plots for pattern detection
Audio data
16 Experimental evaluation
Dot Plots for pattern detection
MUMmer
Random Projection
Discrete data: for some tasks, obtaining a real-valued representation is beneficial
17 Dynamic sliding window
- The fixed window does not perform well when:
- The size of the recurrent states varies
- We do not guess the size of the states correctly
- Solution: use time series segmentation heuristics and a dynamic sliding window
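The slides do not say which segmentation heuristic is used, so the following is only a hypothetical illustration of the idea: cut the series wherever it crosses its mean upward, and let each resulting segment act as one variable-length window. Both function names and the mean-crossing rule are assumptions of this sketch.

```python
def segment_boundaries(series):
    """Indices where the series crosses its mean upward (hypothetical heuristic)."""
    mean = sum(series) / len(series)
    return [i for i in range(1, len(series))
            if series[i - 1] < mean <= series[i]]


def dynamic_windows(series):
    """Split the series into variable-length windows between consecutive cuts."""
    cuts = segment_boundaries(series)
    return [series[a:b] for a, b in zip(cuts, cuts[1:])]


# Recurring states of length 3 and 4 yield windows of matching lengths.
windows = dynamic_windows([0, 1, 0, 0, 1, 0, 0, 0, 1, 0])
print(windows)
```

Since PAA reduces every window to the same number of frame means, windows of different lengths can still be symbolized into words of equal length and compared.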
18 Dynamic sliding window
Comparison of the dynamic and fixed sliding
windows
Tide data set
Synthetic dataset
The dynamic sliding window preserves
more information about the frequency variability
19 Conclusion
- This work studies the problem of building dot plots for real-valued time series data
- It demonstrates the problem's equivalence to the motif finding problem
- An efficient and robust approach for building the dot plots is introduced
- The performance of the tool is evaluated empirically on a number of data sets with different characteristics
- Finally, a dynamic sliding window technique is proposed, which improves the quality and descriptiveness of the plots