Time Series - PowerPoint PPT Presentation

Slides: 57
Provided by: Andre267

Transcript and Presenter's Notes

Title: Time Series


1
Time Series
  • Math 419/592
  • Winter 2009
  • Prof. Andrew Ross
  • Eastern Michigan University

2
Overview of Stochastic Models
3
But first, a word from our sponsor
  • Take Math 560
  • (Optimization)
  • this fall!
  • Sign up soon or it will disappear

4
Outline
  • Look at the data!
  • Common Models
  • Multivariate Data
  • Cycles/Seasonality
  • Filters

5
Look at the data!
  • or else!

6
Atmospheric CO2
Years: 1958 to now; vertical scale: 300 to 400ish
7
(No Transcript)
8
Ancient sunspot data
9
Our Basic Procedure
  • Look at the data
  • Quantify any pattern you see
  • Remove the pattern
  • Look at the residuals
  • Repeat at step 2 until no patterns left

10
Our basic procedure, version 2.0
  • Look at the data
  • Suck the life out of it
  • Spend hours poring over the noise
  • What should noise look like?

11
One of these things is not like the others
12
Stationarity
  • The upper-right-corner plot is Stationary.
  • Mean doesn't change in time
  • no Trend
  • no Seasons (known frequency)
  • no Cycles (unknown frequency)
  • Variance doesn't change in time
  • Correlations don't change in time
  • Up to here, weakly stationary
  • Joint Distributions don't change in time
  • That makes it strongly stationary

13
Our Basic Notation
  • Time is t, not n
  • even though it's discrete
  • State (value) is Y, not X
  • to avoid confusion with x-axis, which is time.
  • Value at time t is Yt, not Y(t)
  • because time is discrete
  • Of course, other books do other things.

14
Detrending deterministic trend
  • Fit a plain linear regression, then subtract it
    out
  • Fit Y_t = m t + b,
  • New data is Z_t = Y_t - m t - b
  • Or use quadratic fit, exponential fit, etc.
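
A minimal sketch of the regression detrend on this slide, in plain Python, assuming equally spaced times t = 0, 1, 2, ... The function name `detrend_linear` and the sample series are made up for illustration; they are not from the slides.

```python
def detrend_linear(y):
    """Fit Y_t = m*t + b by least squares; return (m, b, residuals Z_t)."""
    n = len(y)
    t = list(range(n))
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    # Ordinary least-squares slope and intercept
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    m = sxy / sxx
    b = y_bar - m * t_bar
    # Z_t = Y_t - m*t - b
    z = [yi - (m * ti + b) for ti, yi in zip(t, y)]
    return m, b, z

# Hypothetical data: a straight line plus a small alternating wiggle
y = [2.0 + 0.5 * t + (0.1 if t % 2 == 0 else -0.1) for t in range(10)]
m, b, z = detrend_linear(y)
print(m, b)
print(z)
```

The residuals `z` are what you would look at next, following the basic procedure from earlier in the talk.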

15
Detrending stochastic trend
  • Differencing
  • For linear trend, new data is Z_t = Y_t - Y_{t-1}
  • To remove quadratic trend, do it again
  • W_t = Z_t - Z_{t-1} = Y_t - 2Y_{t-1} + Y_{t-2}
  • Like taking derivatives
  • What's the equivalent if you think the trend is
    exponential, not linear?
  • Hard to decide: regression or differencing?
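
A small sketch of differencing, assuming made-up noise-free series so the effect is easy to see. The last example takes logs before differencing, which is one common answer to the exponential-trend question; the slide itself does not state an answer.

```python
import math

def difference(y):
    """Return Z_t = Y_t - Y_{t-1}; the result is one point shorter than y."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]

linear = [3 + 2 * t for t in range(6)]
quadratic = [t * t for t in range(6)]
exponential = [5 * 1.1 ** t for t in range(6)]

z = difference(linear)                                # constant: linear trend gone
w = difference(difference(quadratic))                 # second difference is constant
g = difference([math.log(v) for v in exponential])    # constant, about log(1.1)

print(z)
print(w)
print(g)
```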

16
Removing Cycles/Seasons
  • Will get to it later.
  • For the next few slides, assume no cycles/seasons.

17
A brief big-picture moment
  • How do you compare two quantities?
  • Multiply them!
  • If they're both positive, you'll get a big,
    positive answer
  • If they're both big and negative
  • If one is positive and one is negative
  • If one is big and positive and the other is
    small and positive

18
Where have we seen this?
  • Dot product of two vectors
  • Proportional to the cosine of the angle between
    them (do they point in the same direction?)
  • Inner product of two functions
  • Integral from a to b of f(x)g(x) dx
  • Covariance of two data sets x_i, y_i
  • Sum_i (x_i y_i)

19
Autocorrelation Function
  • How correlated is the series with itself at
    various lag values?
  • E.g. If you plot Y_{t+1} versus Y_t and find the
    correlation, that's the correl. at lag 1
  • ACF lets you calculate all these correls. without
    plotting at each lag value.
  • ACF is a basic building block of time series
    analysis.

20
Fake data on bus IATs
21
Properties of ACF
  • At lag 0, ACF = 1
  • Symmetric around lag 0
  • Approx. confidence-interval bars around ACF = 0
  • To help you decide when ACF drops to near-0
  • Less reliable at higher lags
  • Often assume ACF dies off fast enough so its
    absolute sum is finite.
  • If not, called long-term memory, e.g.
  • River flow data over many decades
  • Traffic on computer networks

22
How to calculate ACF
  • R, Splus, SAS, SPSS, Matlab, Scilab will do it
    for you
  • Excel: download PopTools (free!)
  • http://www.cse.csiro.au/poptools/
  • Excel, etc.: do it yourself.
  • First find avg. and std.dev. of data
  • Next, find AutoCoVariance Function (ACVF)
  • Then, divide by variance of data to get ACF

23
ACVF at lag h
  • Y-bar is mean of whole data set
  • Not just mean of N-h data points
  • Left side: old way, can produce correl. > 1
  • Right side: new way
  • Difference is End Effects
  • Pg 30 of Peña, Tiao, Tsay
  • (if it makes a difference, you're up to no good?)
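
The slide's formulas did not survive in this transcript, so here is a sketch of the do-it-yourself recipe under a stated assumption: the "new way" is taken to be the usual estimator that uses the mean of the whole data set and divides by N rather than N - h. The function names are illustrative.

```python
def acvf(y, h):
    """Sample autocovariance at lag h.

    Uses Y-bar from the WHOLE series (not just N - h points) and divides
    by N (not N - h). This is an assumption about the slide's 'new way'.
    """
    n = len(y)
    y_bar = sum(y) / n
    return sum((y[t] - y_bar) * (y[t + h] - y_bar) for t in range(n - h)) / n

def acf(y, h):
    """Autocorrelation at lag h: ACVF divided by the variance (ACVF at lag 0)."""
    return acvf(y, h) / acvf(y, 0)

data = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up series
print([acf(data, h) for h in range(4)])
```

With this estimator the values always stay between -1 and 1, which is the point of the end-effects remark.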

24
Common Models
  • White Noise
  • AR
  • MA
  • ARMA
  • ARIMA
  • SARIMA
  • ARMAX
  • Kalman Filter
  • Exponential Smoothing, trend, seasons

25
White Noise
  • Sequence of I.I.D. Variables e_t
  • mean = zero, finite std. dev., often unknown
  • Often, but not always, Gaussian

26
AR: AutoRegressive
  • Order 1: Y_t = a Y_{t-1} + e_t
  • E.g. New = (90% of old) + random fluctuation
  • Order 2: Y_t = a_1 Y_{t-1} + a_2 Y_{t-2} + e_t
  • Order p denoted AR(p)
  • p = 1, 2 common; > 2 rare
  • AR(p) like p'th order ODE
  • AR(1) not stationary if |a| >= 1
  • E[Y_t] = 0, can generalize
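
A quick simulation sketch of the order-1 case, assuming Gaussian white noise and a start value of Y_0 = 0 (both choices are mine, not the slide's). For a stationary AR(1), the lag-1 autocorrelation equals a, so the estimate below should land near 0.9.

```python
import random

def simulate_ar1(a, n, sigma=1.0, seed=0):
    """Simulate Y_t = a*Y_{t-1} + e_t with Gaussian e_t, starting at Y_0 = 0."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(a * y[-1] + rng.gauss(0.0, sigma))
    return y

def lag1_correlation(y):
    """Sample autocorrelation at lag 1, using the mean of the whole series."""
    n = len(y)
    y_bar = sum(y) / n
    c0 = sum((v - y_bar) ** 2 for v in y)
    c1 = sum((y[t] - y_bar) * (y[t + 1] - y_bar) for t in range(n - 1))
    return c1 / c0

y = simulate_ar1(a=0.9, n=5000)   # "new = 90% of old + random fluctuation"
r1 = lag1_correlation(y)
print(r1)
```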

27
Things to do with AR
  • Find appropriate order
  • Estimate coefficients
  • via Yule-Walker eqn.
  • Estimate std.dev. of white noise
  • If estimated a > 0.98, try differencing.
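
A sketch of the Yule-Walker step for AR(2), assuming the standard equations r_1 = a_1 + a_2 r_1 and r_2 = a_1 r_1 + a_2, where r_h is the ACF at lag h. The helper names are hypothetical. To keep the check deterministic, the example feeds in the theoretical ACF of a known AR(2); in practice r_1 and r_2 would come from the sample ACF.

```python
def yule_walker_ar2(r1, r2):
    """Solve the two Yule-Walker equations for (a1, a2) given ACF values r1, r2."""
    denom = 1 - r1 ** 2
    a1 = r1 * (1 - r2) / denom
    a2 = (r2 - r1 ** 2) / denom
    return a1, a2

def ar2_acf(a1, a2):
    """Theoretical ACF at lags 1 and 2 of a stationary AR(2)."""
    r1 = a1 / (1 - a2)
    r2 = a1 * r1 + a2
    return r1, r2

r1, r2 = ar2_acf(0.5, 0.3)
a1_hat, a2_hat = yule_walker_ar2(r1, r2)
print(a1_hat, a2_hat)
```

For AR(1) the same idea collapses to a single equation: the estimate of a is just the lag-1 autocorrelation.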

28
MA: Moving Average
  • Order 1:
  • Y_t = b_0 e_t + b_1 e_{t-1}
  • Order q: MA(q)
  • In real data, much less common than AR
  • But still important in theory of filters
  • Stationary regardless of b values
  • E[Y_t] = 0, can generalize

29
ACF of an MA process
  • Drops to zero after lag = q
  • That's a good way to determine what q should be!
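
A sketch of why the cutoff happens, assuming uncorrelated white noise with a common variance (which cancels in the ratio). The autocovariance of an MA(q) at lag h is proportional to the sum of b_j * b_{j+h}, and that sum is empty once h exceeds q. The function name `ma_acf` is made up.

```python
def ma_acf(b, max_lag):
    """Theoretical ACF of Y_t = b[0]*e_t + b[1]*e_{t-1} + ... + b[q]*e_{t-q}."""
    q = len(b) - 1

    def gamma(h):
        # Autocovariance up to the noise variance, which cancels below
        if h > q:
            return 0.0
        return sum(b[j] * b[j + h] for j in range(q - h + 1))

    g0 = gamma(0)
    return [gamma(h) / g0 for h in range(max_lag + 1)]

print(ma_acf([1.0, 0.5], 3))        # MA(1): nonzero at lags 0 and 1 only
print(ma_acf([1.0, 0.5, 0.25], 4))  # MA(2): nonzero through lag 2
```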

30
ACF of an AR process?
  • Never completely dies off, not useful for finding
    order p.
  • AR(1) has exponential decay in ACF
  • Instead, use Partial ACF = PACF, which dies after
    lag = p
  • PACF of MA never dies.

31
ARMA
  • ARMA(p,q) combines AR and MA
  • Often p, q <= 1 or 2

32
ARIMA
  • AR-Integrated-MA
  • ARIMA(p,d,q)
  • d = order of differencing before applying ARMA(p,q)
  • For nonstationary data w/stochastic trend

33
SARIMA, ARMAX
  • Seasonal ARIMA(p,d,q)-and-(P,D,Q)_S
  • Often S =
  • 12 (monthly) or
  • 4 (quarterly) or
  • 52 (weekly)
  • Or, S = 7 for daily data inside a
    week
  • ARMAX = ARMA with outside explanatory variables
    (halfway to multivariate time series)

34
State Space Model, Kalman Filter
  • Underlying process that we don't see
  • We get noisy observations of it
  • Like a Hidden Markov Model (HMM), but state is
    continuous rather than discrete.
  • AR/MA, etc. can be written in this form too.
  • State evolution (vector): S_t = F S_{t-1} + h_t
  • Observations (scalar): Y_t = H S_t + e_t

35
ARCH, GARCH(p,q)
  • (Generalized) AutoRegressive Conditional
    Heteroskedastic (heteroscedastic?)
  • Like ARMA but variance changes randomly in time
    too.
  • Used for many financial models

36
Exponential Smoothing
  • More a method than a model.

37
Exponential Smoothing = EWMA
  • Very common in practice
  • Forecasting w/o much modeling of the process.
  • A_t = forecast of series at time t
  • Pick some parameter a between 0 and 1
  • A_t = a Y_t + (1-a) A_{t-1}
  • or A_t = A_{t-1} + a (error in period t)
  • Why call it Exponential?
  • Weight on Y_{t-k} (lag k) is a (1-a)^k
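
A minimal sketch of the recursion, assuming the smoother is started at A_0 = Y_0 (the slide does not say how to initialize it). The function names and sample numbers are illustrative.

```python
def ewma(y, a):
    """A_t = a*Y_t + (1 - a)*A_{t-1}, started at A_0 = Y_0 (an assumption)."""
    smoothed = [y[0]]
    for value in y[1:]:
        smoothed.append(a * value + (1 - a) * smoothed[-1])
    return smoothed

def lag_weights(a, k_max):
    """Weight placed on the observation k steps back: a*(1 - a)**k."""
    return [a * (1 - a) ** k for k in range(k_max + 1)]

print(ewma([1.0, 2.0, 3.0], 0.5))
print(lag_weights(0.5, 4))   # shrinks geometrically, hence 'exponential'
```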

38
How to determine the parameter
  • Train the model: try various values of a
  • Pick the one that gives the lowest sum of
    absolute forecast errors
  • The larger a is, the more weight given to recent
    observations
  • Common values are 0.10, 0.30, 0.50
  • If best a is over 0.50, there's probably some
    trend or seasonality present
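
A sketch of the training step, assuming one-step-ahead forecasts (the forecast for period t is the smoothed value from period t-1) and a start value equal to the first observation. The candidate list and the trending series are made up; the trend is there to show the "best a is large" symptom.

```python
def sum_abs_errors(y, a):
    """Sum of absolute one-step-ahead forecast errors for smoothing parameter a."""
    level = y[0]                 # assumed starting forecast
    total = 0.0
    for value in y[1:]:
        total += abs(value - level)               # error in this period
        level = a * value + (1 - a) * level       # update the forecast
    return total

def best_alpha(y, candidates):
    """Pick the candidate with the lowest sum of absolute forecast errors."""
    return min(candidates, key=lambda a: sum_abs_errors(y, a))

trending = [float(t) for t in range(10)]
print(best_alpha(trending, [0.1, 0.3, 0.5, 0.9]))
```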

39
Holt-Winters
  • Exponential smoothing: no trend or seasonality
  • Excel/Analysis Toolpak can do it if you tell it a
  • Holt's method accounts for trend.
  • Also known as double-exponential smoothing
  • Holt-Winters accounts for trend + seasons
  • Also known as triple-exponential smoothing

40
Multivariate
  • Along with ACF, use Cross-Correlation
  • Cross-Correl is not necessarily 1 at lag = 0
  • Cross-Correl is not necessarily symmetric around lag = 0
  • Leading Indicator: one series' behavior helps
    predict another after a little lag
  • "Leading" means coming before, not better than
    others
  • Can also do cross-spectrum, aka coherence

41
Cycles/Seasonality
  • Suppose a yearly cycle
  • Sample quarterly: 3-med, 6-hi, 9-med, 12-lo
  • Sample every 6 months: 3-med, 9-med
  • Or 6-hi, 12-lo
  • To see a cycle, must sample at least twice per
    cycle (strictly, more than twice its freq.)
  • Demo spreadsheet
  • This is the Nyquist limit
  • Compact Disc samples at 44.1 kHz, top of
    human hearing is 20 kHz
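
The demo spreadsheet is not part of this transcript, so here is a stand-in sketch using a made-up yearly cycle that is high in month 6 and low in month 12. It reproduces the med/hi/med/lo pattern from the slide and shows how sampling every 6 months can miss the cycle entirely.

```python
import math

def yearly_cycle(month):
    """Hypothetical yearly cycle: +1 (hi) in month 6, -1 (lo) in month 12."""
    return -math.cos(2 * math.pi * month / 12)

def sample(months):
    # Round away floating-point dust; adding 0.0 turns -0.0 into 0.0
    return [round(yearly_cycle(m), 6) + 0.0 for m in months]

quarterly = sample([3, 6, 9, 12])    # med, hi, med, lo
half_year_a = sample([3, 9])         # med, med: the cycle is invisible
half_year_b = sample([6, 12])        # hi, lo: the cycle shows up

print(quarterly)
print(half_year_a)
print(half_year_b)
```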

42
The basic problem
  • We have data, want to find:
  • Cycle length (e.g. Business cycles), or
  • Strength of seasonal components
  • Idea: use sine waves as explanatory variables
  • If a sine wave at a certain frequency explains
    things well, then there's a lot of strength.
  • Could be our cycle's frequency
  • Or strength of known seasonal component
  • Explains = correlates

43
Correlate with Sine Waves
  • Ordinary covar
  • At freq. Omega,
  • (means are zero)
  • Problem: what if that sine is out of phase with
    our cycle?

44
Solution
  • Also correlate with a cosine
  • 90 degrees out of phase with sine
  • Why not also with a 180-out-of-phase?
  • Because if that had a strong correl, our original
    sine would have a strong correl of opposite sign.
  • Sines, Cosines, Oh My: combine using complex
    variables!
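
A sketch of the sine-plus-cosine idea, assuming a zero-mean series and a frequency that fits a whole number of cycles into the data. The names are illustrative. The individual sine and cosine sums change with the phase of the cycle, but their combined magnitude does not, and it matches the magnitude of a single complex sum.

```python
import cmath
import math

def sine_cosine_strength(y, omega):
    """Correlate y with cos and sin at frequency omega; return (c, s, magnitude)."""
    c = sum(v * math.cos(omega * t) for t, v in enumerate(y))
    s = sum(v * math.sin(omega * t) for t, v in enumerate(y))
    return c, s, math.hypot(c, s)

def complex_strength(y, omega):
    """Same information packed into one complex number: c - i*s."""
    return sum(v * cmath.exp(-1j * omega * t) for t, v in enumerate(y))

T = 64
omega = 2 * math.pi * 4 / T          # 4 full cycles in the data
for phase in (0.0, 1.0, 2.0):
    y = [math.cos(omega * t + phase) for t in range(T)]
    c, s, mag = sine_cosine_strength(y, omega)
    print(phase, round(c, 3), round(s, 3), round(mag, 3))
```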

45
The Discrete Fourier Transform
  • Often a scaling factor like 1/T, 1/sqrt(T),
    1/2pi, etc. out front.
  • Some people use i instead of -i
  • Often look only at the frequencies
  • k = 0, ..., T-1

46
Hmm, a sum of products
  • That reminds me of matrix multiplication.
  • Define a matrix F whose j,k entry is
  • exp(-i j k 2 pi / T)
  • Then
  • Matrix multiplication takes T^2 operations
  • This matrix has a special structure, can do it in
    about T log T operations
  • That's the FFT = Fast Fourier Transform
  • Easiest if T is a power of 2
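
A sketch of the matrix view, assuming the -i sign convention and no scaling factor out front (the slides note that conventions vary). This is the slow T^2 version, not an FFT; the function names are made up.

```python
import cmath

def dft_matrix(T):
    """T-by-T matrix whose (j, k) entry is exp(-i * j * k * 2*pi / T)."""
    return [[cmath.exp(-2j * cmath.pi * j * k / T) for k in range(T)]
            for j in range(T)]

def dft(y):
    """Discrete Fourier Transform as a plain matrix-vector product."""
    T = len(y)
    F = dft_matrix(T)
    return [sum(F[j][k] * y[k] for k in range(T)) for j in range(T)]

print(dft([1, 0, 0, 0]))   # a single spike contains every frequency equally
print(dft([1, 1, 1, 1]))   # a constant series has only the k = 0 term
```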

47
So now we have complex values...
  • Take magnitude and argument of each DFT result
  • Plot squared magnitude vs. frequency
  • This is the Periodogram
  • Large value = that frequency is very strong
  • Often plotted on semilog-y scale, decibels
  • Example spreadsheet
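
In place of the example spreadsheet, a small periodogram sketch under the same assumptions as before (-i convention, no scaling). Only frequencies up to T/2 are computed, since for real-valued data the rest mirror them. The test series is a made-up pure cosine with 3 cycles in 32 points.

```python
import cmath
import math

def periodogram(y):
    """Squared magnitude of the DFT at frequencies k = 0, ..., T//2."""
    T = len(y)
    values = []
    for k in range(T // 2 + 1):
        coeff = sum(y[t] * cmath.exp(-2j * cmath.pi * k * t / T) for t in range(T))
        values.append(abs(coeff) ** 2)
    return values

T = 32
y = [math.cos(2 * math.pi * 3 * t / T) for t in range(T)]
p = periodogram(y)
peak = max(range(len(p)), key=lambda k: p[k])
print(peak)   # index of the strongest frequency
```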

48
Spreadsheet Experiments
  • First, play with amplitudes
  • (1,0) then (0,1) then (1,.5) then (1,.7)
  • Next, play with frequency1
  • 2pi/8 then 2pi/4
  • 2pi/6, 2pi/7, 2pi/9, 2pi/10
  • 2pi/100, 2pi/1000
  • Summarize your results for yourself. Write it
    down!
  • Reset to 2pi/8 then play with phase2
  • 0, 1, 2, 3, 4, 5, 6...
  • Now add some noise to Yt

49
Interpretations
  • Value at k = 0 is the mean of the data series
    (up to the scaling factor)
  • Called DC component
  • Area under periodogram is proportional to
    Var(data series)
  • Height at each point = how much of variance is
    explained by that frequency
  • Plotting argument vs. frequency shows phase
  • Often need to smooth with moving avg.

50
What is FT of White Noise?
  • Try it!
  • Why is it called white noise?
  • Pink noise, etc. (look up in Wikipedia)

51
Filtering part 1
  • Zero out the frequencies you don't want
  • Invert the FT
  • FT is (almost) its own inverse: same formula, up
    to the sign of i and a scaling factor. Not like
    Laplace Transform.
  • This is frequency-domain filtering
  • MP3 files filter out the freqs. you wouldn't
    hear
  • because they're overwhelmed by stronger
    frequencies

52
Filtering part 2
  • Time-domain filtering: example spreadsheet
  • Smoothing: moving average
  • Filters out high frequencies (noise is high-freq)
  • Low-pass filter
  • Detrending: differencing
  • Filters out trends and slow cycles (which look
    like trends, locally)
  • High-pass filter
  • Band-pass filter
  • Band-reject filter (esp. 12-month cycles)
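
A sketch of the two time-domain filters named on this slide, applied to a made-up slow trend and a made-up fast alternation. The moving average uses a trailing window, which is one of several possible choices.

```python
def moving_average(y, width):
    """Trailing moving average: mean of the latest `width` points."""
    return [sum(y[t - width + 1:t + 1]) / width for t in range(width - 1, len(y))]

def difference(y):
    """First difference: Y_t - Y_{t-1}."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]

slow = [0.1 * t for t in range(8)]        # trend-like, low frequency
fast = [(-1.0) ** t for t in range(8)]    # alternating, highest frequency

print(moving_average(fast, 2))   # low-pass: the fast wiggle is wiped out
print(difference(slow))          # high-pass: the trend shrinks to a small constant
print(difference(fast))          # high-pass: the fast wiggle is amplified
```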

53
Filtering
  • Time-domain filter's freq. response comes from
    the FT of its averaging coefficients
  • Example spreadsheet
  • This curve is called the Transfer Function
  • Good audio speakers publish their frequency
    response curves
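
A sketch of the frequency response calculation, assuming the filter is written as a list of coefficients applied to the current and earlier points. The gain at frequency omega is the magnitude of the Fourier sum of those coefficients; the function name `gain` is made up.

```python
import cmath
import math

def gain(coeffs, omega):
    """Magnitude of the transfer function at angular frequency omega."""
    return abs(sum(c * cmath.exp(-1j * omega * j) for j, c in enumerate(coeffs)))

two_point_average = [0.5, 0.5]   # smoothing: low-pass
first_difference = [1.0, -1.0]   # differencing: high-pass

for omega in (0.0, math.pi / 2, math.pi):
    print(round(omega, 3),
          round(gain(two_point_average, omega), 3),
          round(gain(first_difference, omega), 3))
```

Plotting `gain` against omega from 0 to pi gives the kind of curve the slide calls the Transfer Function.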

54
Long-history time series
  • Ordinary theory assumes that ACF dies off faster
    than 1/h
  • But some time series don't satisfy that
  • River flows
  • Packet amounts on data networks
  • Connected to chaos and fractals

55
Bibliography
  • Enders: Applied Econometric Time Series
  • Kedem and Fokianos: Regression Models for Time
    Series Analysis
  • Peña, Tiao, Tsay: A Course in Time Series
    Analysis
  • Brillinger: lecture notes for Stat 248 at UC
    Berkeley
  • Brillinger: Time Series: Data Analysis and Theory
  • Brockwell and Davis: Introduction to Time Series
    and Forecasting

56
1 real way, 2 fake ways