Time Series Forecasting


Slides: 52

Provided by: rou669


1
Time Series Forecasting Part I

What is a Time Series?
Components of Time Series
Evaluation Methods of Forecast
Smoothing Methods of Time Series
Time Series Decomposition

by Duong Tuan Anh, Faculty of Computer Science and
Engineering, September 2011
2
What is a Time Series?

A time series is a collection of observations
made sequentially in time.

A study of a random sample of 4000 graphics from 15
of the world's newspapers published between
1974 and 1989 found that more than 75% of all
graphics were time series.
Examples: financial time series, scientific time
series.
3
Time series models

Regression models
Predict the response over time of the variable
under study to changes in one or more of the
explanatory variables.
Deterministic models of time series
Stochastic models of time series
All three kinds of models can be used for
forecasting.

4
Components of a time series

The pattern or behavior of the data in a time
series has several components.
Theoretically, any time series can be decomposed
into four components:
Trend
Cyclical
Seasonal
Irregular
However, this decomposition is often not
straightforward because these factors interact.

5
Trend component

The trend component accounts for the gradual
shifting of the time series to relatively higher
or lower values over a long period of time.
Trend is usually the result of long-term factors
such as changes in the population, demographics,
technology, or consumer preferences.

6
Seasonal component

The seasonal component accounts for regular
patterns of variability within certain time
periods, such as a year.
The variability does not always correspond with
the seasons of the year (i.e. winter, spring,
summer, fall).
There can be, for example, within-week or
within-day seasonal behavior.

7
Cyclical component

Any regular pattern of sequences of values above
and below the trend line lasting more than one
year can be attributed to the cyclical component.
Usually, this component is due to multiyear
cyclical movements in the economy.

8
Evaluating Methods of Forecasts

A forecasting method is selected, often by
intuition, previous experience, or computer
resource availability.
Divide the data into two sections: an
initialization part and a test part.
Use the forecasting technique to determine the
fitted values for the initialization data set.
Use the forecasting technique to forecast the test
data set and determine the forecast errors.
Evaluate the errors (MAD, MPE, MSD, MAPE).
Use the technique as is, modify it, or develop a new model.
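
The split-and-evaluate steps above can be sketched in Python. This is only an illustration under stated assumptions: the helper names, the made-up data, and the naive "last value" forecaster are hypothetical placeholders, and the test part is forecast one step at a time, which is one possible reading of the procedure.

```python
def holdout_errors(series, n_init, forecaster):
    """Return one-step-ahead forecast errors on the test part of a series.

    The first `n_init` values form the initialization part. `forecaster` is a
    hypothetical callable that maps the history seen so far to a forecast.
    """
    errors = []
    for t in range(n_init, len(series)):
        history = series[:t]              # everything observed before time t
        forecast = forecaster(history)    # forecast for time t
        errors.append(series[t] - forecast)
    return errors


def naive_forecaster(history):
    """Placeholder technique: forecast the next value as the last one seen."""
    return history[-1]


data = [12, 14, 13, 15, 16, 18, 17, 19]   # made-up illustrative values
errors = holdout_errors(data, n_init=5, forecaster=naive_forecaster)
print(errors)   # forecast errors for the three test observations
```

The resulting errors are what the accuracy measures (MAD, MSD, MAPE) summarize.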

9
Evaluation Methods of Forecasts

There are three measures of the accuracy of a
fitted model: MAPE, MAD and MSD. They can be computed for each of the
simple forecasting and smoothing methods.
For all three measures, the smaller the value,
the better the fit of the model.
Use these statistics to compare the fit of the
different methods.
MAPE (Mean Absolute Percentage Error) measures the
accuracy of fitted time series values. It
expresses accuracy as a percentage.
$$ \text{MAPE} = \frac{\sum |(y_t - \hat{y}_t)/y_t|}{n} \times 100 \qquad (y_t \neq 0) $$

10
MAPE, MAD, and MSD

where $y_t$ is the actual value, $\hat{y}_t$ is the fitted
value and n is the number of observations.
MAD (Mean Absolute Deviation) expresses accuracy
in the same units as the data, which helps
conceptualize the amount of error.
$$ \text{MAD} = \frac{\sum |y_t - \hat{y}_t|}{n} $$
where $y_t$ is the actual value, $\hat{y}_t$ is the fitted
value and n is the number of observations.

11
MAPE, MAD, and MSD

MSD (Mean Squared Deviation) is more sensitive
than MAD to an unusually large forecast
error.
$$ \text{MSD} = \frac{\sum (y_t - \hat{y}_t)^2}{n} $$
where $y_t$ is the actual value, $\hat{y}_t$ is the fitted
value and n is the number of observations.
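
The three accuracy measures can be computed directly from their definitions. The Python sketch below is illustrative only: the function names and the sample values are assumptions, not part of the original slides.

```python
def mape(actual, fitted):
    """Mean Absolute Percentage Error; every actual value must be nonzero."""
    return 100.0 * sum(abs((y - f) / y) for y, f in zip(actual, fitted)) / len(actual)


def mad(actual, fitted):
    """Mean Absolute Deviation, in the same units as the data."""
    return sum(abs(y - f) for y, f in zip(actual, fitted)) / len(actual)


def msd(actual, fitted):
    """Mean Squared Deviation; squaring penalizes large errors more heavily."""
    return sum((y - f) ** 2 for y, f in zip(actual, fitted)) / len(actual)


actual = [100.0, 110.0, 120.0, 130.0]   # made-up actual values
fitted = [90.0, 115.0, 120.0, 140.0]    # made-up fitted values
print(mad(actual, fitted), msd(actual, fitted), mape(actual, fitted))
```

In this example the two errors of size 10 dominate the MSD (56.25) far more than the MAD (6.25), which shows why MSD reacts more strongly to unusually large errors.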

12
Methods of smoothing time series

Arithmetic Moving Average
Exponential Smoothing Methods
Holt-Winters method for Exponential Smoothing
Smoothing a time series eliminates some of the
short-term fluctuations.
Smoothing can also be done to remove seasonal
fluctuations, i.e., to deseasonalize a time
series.
These models are deterministic in that no
reference is made to the sources or nature of the
underlying randomness in the series.
The models involve extrapolation techniques.

13
Averaging Methods

Simple averages: quick and inexpensive (should only
be used on stationary data).
The moving average method consists of computing an
average of the most recent n data values for the
series and using this average to forecast the
value of the time series for the next period.
Moving averages are useful if one can assume that the item
to be forecast will stay steady over time.
A series of arithmetic means used only for
smoothing provides an overall impression of the data
over time.
$$ \text{Moving Average} = \frac{\sum (\text{most recent } n \text{ data items})}{n} $$
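
A minimal Python sketch of the moving average forecast follows. The function name and the sample data are illustrative assumptions.

```python
def moving_average_forecast(series, n):
    """Forecast the next period as the mean of the most recent n values."""
    if n <= 0 or n > len(series):
        raise ValueError("n must be between 1 and len(series)")
    window = series[-n:]          # the most recent n data items
    return sum(window) / n


sales = [20, 22, 21, 25, 24, 26]  # made-up illustrative values
forecast = moving_average_forecast(sales, 3)
print(forecast)   # mean of 25, 24 and 26
```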

14
Moving average methods

Works best with stationary data.
The smaller the number of periods n, the more weight is given to
recent periods.
A smaller n is desirable when there are
sudden shifts in the level of the series.
The greater n is, the less weight is given to
more recent periods.
The larger the order of the moving average, the
greater the smoothing effect. Use a larger n when
there are wide, infrequent fluctuations in the
data.
Smoothing recent actual values removes
randomness.

15
Weighted Moving Averages

Weighted moving average: places more weight on
recent observations. The weights need to sum to
1.
Used when a trend is present.
Older data are usually less important.
$$ \text{WMA} = \frac{\sum (\text{weight for period } n)(\text{value in period } n)}{\sum \text{weights}} $$
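
The weighted moving average can be sketched in Python as below. The function name, the weights and the data are illustrative assumptions; dividing by the sum of the weights keeps the result correct even if the weights do not sum exactly to 1.

```python
def weighted_moving_average(values, weights):
    """Weighted average of recent values; weights pair up with values in order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


recent = [21, 25, 24]        # made-up values, oldest to newest
weights = [0.2, 0.3, 0.5]    # heaviest weight on the newest observation
wma = weighted_moving_average(recent, weights)
print(wma)
```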

16
Notes on Moving Averages

MA models do not provide information about
forecast confidence.
We cannot calculate standard errors.
We cannot explain the stochastic component of
the time series. This stochastic component
creates the error in our forecast.

17
Exponential Smoothing Methods

Single Exponential Smoothing (Averaging)
Double Exponential Smoothing: Holt's Method
Winters' Model
Note:
- Single Exponential Smoothing is for series
without trend and without a seasonal component.
- Double Exponential Smoothing is for series
with trend and without a seasonal component.
- Winters' model is for series with trend
and a seasonal component.

18
Single Exponential Smoothing

Continually revising a forecast in light of more
recent experience. Averaging (smoothing) past
values of a series in a decreasing (exponential)
manner. The observations are weighted, with more
weight being given to the more recent
observations.
$$ A_t = \alpha Y_{t-1} + (1 - \alpha) A_{t-1} \qquad \text{(S1)} $$
New forecast = $\alpha$ × (old observation) + (1 - $\alpha$)
× (old forecast)
Here we denote the original series by $Y_t$ and
the smoothed series by $A_t$.
The equation can be rewritten as
$$ A_t = A_{t-1} + \alpha (Y_{t-1} - A_{t-1}) $$
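
Recursion (S1) can be sketched in a few lines of Python. The function name and data are illustrative assumptions, and the initial forecast is taken to be the first actual value, which is only one of several possible starting choices.

```python
def single_exponential_smoothing(y, alpha, initial=None):
    """Return forecasts A where A[t] is the forecast for y[t].

    The list has len(y) + 1 entries; the last one is the forecast for the
    period after the final observation.
    """
    a = y[0] if initial is None else initial   # assumed starting forecast
    forecasts = [a]
    for obs in y:
        a = a + alpha * (obs - a)   # old forecast plus alpha times its error
        forecasts.append(a)
    return forecasts


series = [10, 12, 11, 13]   # made-up illustrative values
forecasts = single_exponential_smoothing(series, alpha=0.5)
print(forecasts[-1])        # forecast for the next, unseen period
```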

19
Single Exponential Smoothing

Looking at the formula, the new forecast is
really the old forecast plus $\alpha$ times the error in
the old forecast.
To get started, we need a smoothing constant $\alpha$,
an initial forecast, and an actual value. We can
use the first actual value as the initial forecast, or we
can average the first n observations.
The smoothing constant serves as the weighting
factor. When $\alpha$ is close to 1, the new forecast
will include a substantial adjustment for any
error that occurred in the preceding forecast.
When $\alpha$ is close to 0, the new forecast is very
similar to the old forecast.

20
Single Exponential Smoothing (cont.)

The smoothing constant $\alpha$ is not an arbitrary
choice; it generally falls between 0.1 and 0.5.
If we want predictions to be stable and random
variation smoothed, use a small $\alpha$. If we want a
rapid response, a larger $\alpha$ is required.

21
Why Exponential?

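$$ A_t = \alpha Y_{t-1} + (1 - \alpha) A_{t-1} $$
$$ A_{t-1} = \alpha Y_{t-2} + (1 - \alpha) A_{t-2} $$
$$ A_{t-2} = \alpha Y_{t-3} + (1 - \alpha) A_{t-3} $$
Substituting repeatedly:
$$ A_t = \alpha Y_{t-1} + (1 - \alpha)\alpha Y_{t-2} + (1 - \alpha)^2 \alpha Y_{t-3} + \dots + (1 - \alpha)^k \alpha Y_{t-k-1} + \dots $$
The weight $\alpha (1 - \alpha)^k$ decreases exponentially with k.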
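
To see the decay numerically, the short Python sketch below lists the weights $\alpha(1-\alpha)^k$ placed on successive past observations. The value 0.5 for the smoothing constant and the helper name are illustrative assumptions.

```python
def smoothing_weights(alpha, n_terms):
    """Weights alpha * (1 - alpha)**k on Y_{t-1}, Y_{t-2}, ... (k = 0, 1, ...)."""
    return [alpha * (1 - alpha) ** k for k in range(n_terms)]


weights = smoothing_weights(0.5, 4)
print(weights)   # [0.5, 0.25, 0.125, 0.0625]
```

Each weight is a fixed fraction of the one before it, which is the sense in which the smoothing is "exponential".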

22
The small $\alpha$ here smooths the data.
23
The large $\alpha$ in this example responds quickly to
the data.
24
Tracking

Use a tracking signal (a measure of errors over
time) and set limits. For example, if we
forecast n periods, count the number of negative
and positive errors. If the number of positive
errors is substantially less or greater than n/2,
then the process is out of control.
We can also use a 95% prediction interval ($\pm 1.96 \sqrt{\text{MSE}}$).
If the forecast error is outside of the
interval, use a new optimal $\alpha$.
Looking back at the single exponential
smoothing with $\alpha = 0.1$:
$1.96 \sqrt{24261} \approx 305$, so the limits are $\pm 305$. Observation 21 is
out of control. We need to re-evaluate the $\alpha$
level because this technique is biased.
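
The prediction-interval check can be sketched in Python as follows. The MSE value 24261 comes from the slide; the function name and the error values are made-up assumptions for illustration.

```python
import math


def out_of_control(errors, mse, z=1.96):
    """Return 1-based positions of forecast errors outside +/- z * sqrt(MSE)."""
    limit = z * math.sqrt(mse)
    return [i for i, e in enumerate(errors, start=1) if abs(e) > limit]


limit = 1.96 * math.sqrt(24261)          # about 305, as on the slide
errors = [120.0, -80.0, 310.0, -40.0]    # made-up forecast errors
flagged = out_of_control(errors, mse=24261)
print(round(limit), flagged)
```

Only the third error exceeds the limit here, so it is the one flagged as out of control.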

25
Exponential Smoothing Adjusted for Trend: Holt's
Method

In some situations, the observed data are
trending and contain information that allows the
anticipation of future upward movement.
In that case, a linear trend forecast function is
needed.
Holt's smoothing method allows for an evolving local
linear trend in a time series and can be used to
forecast.
When there is a trend, an estimate of the current
slope and the current level is required.

26
Holt's Method

Holt's method uses two coefficients.
$\alpha$ is the smoothing constant for the level.
$\beta$ is the trend smoothing constant, used to
remove random error.
An advantage of Holt's method is that it provides
flexibility in selecting the rates at which the
level and trend are tracked.

27
Equations in Holt's Method

The exponentially smoothed series, or the current
level estimate:
$$ A_t = \alpha Y_t + (1 - \alpha)(A_{t-1} + T_{t-1}) \qquad \text{(S2)} $$
The trend estimate:
$$ T_t = \beta (A_t - A_{t-1}) + (1 - \beta) T_{t-1} \qquad \text{(S3)} $$
Forecast p periods into the future:
$$ \hat{Y}_{t+p} = A_t + p T_t $$
where
$A_t$ = new smoothed value (estimate of current level)
$Y_t$ = new actual value at time t
$T_t$ = trend estimate
$\hat{Y}_{t+p}$ = forecast for p periods into the future
$\alpha$ = smoothing constant for the level
$\beta$ = smoothing constant for the trend estimate
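
Equations (S2) and (S3) and the forecast equation translate directly into a short loop. The Python sketch below is a minimal illustration: the function name and data are assumptions, and the starting values (level equal to the first observation, trend equal to zero) are just one simple initialization.

```python
def holt_forecast(y, alpha, beta, p=1):
    """Holt's linear-trend smoothing; returns the forecast p periods ahead."""
    level, trend = y[0], 0.0   # assumed start: A1 = Y1, T1 = 0
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (prev_level + trend)   # (S2)
        trend = beta * (level - prev_level) + (1 - beta) * trend   # (S3)
    return level + p * trend


demand = [10, 12, 14, 16]   # made-up series with a steady upward trend
print(holt_forecast(demand, alpha=0.5, beta=0.5, p=1))
```

With smoothing constants below 1 and a trend started at zero, the forecast lags behind this steadily rising series at first; the lag shrinks as more observations are processed.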

28
How to Initiate Holt's Method

To get started, initial values for A and T in
equations (S2) and (S3) must be determined.
One approach is to set $A_1$ to $Y_1$ and $T_1$ to zero.
The second approach is to use the average of the
first five or six observations as $A_1$. $T_1$ is then
estimated by the slope of a line that is fit to
these five or six observations.

29
Holt's Method
Holt exponential smoothing with parameters $\alpha =
1.0$ and $\beta = 0.099$ for a time series of electricity
consumption.
30
Winters' Method

Winters' method is an easy way to account for
seasonality when data have a seasonal pattern.
It extends Holt's method to include an estimate
for seasonality.
$\alpha$ is the smoothing constant for the level.
$\beta$ is the trend smoothing constant, used to
remove random error.
$\gamma$ is the smoothing constant for seasonality.
The level formula removes seasonal effects. The
forecast is modified by multiplying by a seasonal
index.

31
Winters' Method

The four equations used in Winters'
(multiplicative) smoothing are:
The smoothed series or level estimate:
$$ A_t = \alpha \frac{Y_t}{S_{t-s}} + (1 - \alpha)(A_{t-1} + T_{t-1}) $$
The trend estimate:
$$ T_t = \beta (A_t - A_{t-1}) + (1 - \beta) T_{t-1} $$
The seasonality estimate:
$$ S_t = \gamma \frac{Y_t}{A_t} + (1 - \gamma) S_{t-s} $$
Forecast p periods into the future:
$$ \hat{Y}_{t+p} = (A_t + p T_t) S_{t-s+p} $$

where
$A_t$ = new smoothed value (estimate of current level)
$Y_t$ = new actual value at time t
$T_t$ = trend estimate
$S_t$ = seasonality estimate
$\hat{Y}_{t+p}$ = forecast for p periods into the future
$\alpha$ = smoothing constant for the level
$\beta$ = smoothing constant for the trend estimate
$\gamma$ = smoothing constant for the seasonality estimate
p = periods to be forecast into the future
s = length of seasonality
Winters' method is also called triple exponential
smoothing.
32
How to Initiate Winters' Method

To begin Winters' method, the initial values
for the smoothed series $A_t$, the trend $T_t$ and the
seasonal indices $S_t$ must be set.
One approach is to set the first estimate of $A_t$
to $Y_1$. The trend is set to 0 and the
seasonal indices are each set to 1.0.
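
A compact Python sketch of multiplicative Winters smoothing with the simple initialization just described (level equal to the first observation, trend 0, all seasonal indices 1.0) is given below. The function name, argument names and data are illustrative assumptions, and the sketch only forecasts up to one full season ahead.

```python
def winters_forecast(y, s, alpha, beta, gamma, p=1):
    """Multiplicative Winters smoothing; returns the forecast p periods ahead.

    `s` is the season length. Requires 1 <= p <= s so that the seasonal
    index needed for the forecast has already been estimated.
    """
    if not 1 <= p <= s:
        raise ValueError("p must satisfy 1 <= p <= s")
    level, trend = y[0], 0.0
    seasonal = [1.0] * s           # latest index for each position in the cycle
    for t in range(1, len(y)):
        prev_level = level
        s_old = seasonal[t % s]    # index from one season ago, S_{t-s}
        level = alpha * y[t] / s_old + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % s] = gamma * y[t] / level + (1 - gamma) * s_old
    last = len(y) - 1
    return (level + p * trend) * seasonal[(last + p) % s]


quarterly = [80, 120, 110, 90, 84, 126, 116, 95]   # made-up quarterly data
print(winters_forecast(quarterly, s=4, alpha=0.3, beta=0.1, gamma=0.2))
```

Starting every seasonal index at 1.0 means the first season or two of updates are rough; with short series a more careful initialization may matter.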

33
Winters' Method
34
Decomposition

Decomposition is a procedure to identify the
component factors of a time series.
How the components relate to the original series is described by
a model that expresses the time series variable Y
in terms of the components T (trend), C (cycle),
S (seasonal) and I (irregular).
There are two such models: the additive components model and the multiplicative
components model.
It is difficult to deal with the cyclical component
of a time series. To keep things simple we assume
that any cycle in the data is part of the trend.
Additive model: $Y_t = T_t + S_t + I_t$
Multiplicative model: $Y_t = T_t \times S_t \times I_t$

35
Additive and multiplicative models

The additive model works best when the time
series has roughly the same variability through
the length of the series.
That is, all the values of the series fall within
a band with constant width centered on the trend.
The multiplicative model works best when the
variability of the time series increases with the
level.
That is, the spread of the values becomes larger as
the trend increases.
See the figure in the next slide.
Most economic time series have seasonal variation
that increases with the level of the series, so
the multiplicative model suits them.

36
(a) A time series with constant variability
(b) A time series with variability increasing with level
37
Trend equations

Trend can be described by a straight line or a
smooth curve.
Linear trend: $T_t = a + bt$
Here $T_t$ is the predicted value for the trend at
time t. The symbol t used for the variable
represents time and takes integer values 1, 2, 3, ...
The slope b is the average increase or decrease
in T for each one-period increase in time.
Time trend equations can be fit to the data using
the method of least squares.
Recall that this method selects the values of the
coefficients in the trend equation (e.g. a and b)
so that the estimated trend values $T_t$ are close
to the actual values $Y_t$ as measured by the sum of
squared errors criterion
$$ \text{SSE} = \sum (Y_t - T_t)^2 $$
(See the Appendix of this chapter for how to find a
and b.)
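
A least-squares fit of the linear trend can be sketched in Python with the standard closed-form solution for a and b. The function name and the data are illustrative assumptions; time is taken as t = 1, 2, ..., n.

```python
def fit_linear_trend(y):
    """Least-squares fit of T_t = a + b*t for t = 1..n; returns (a, b)."""
    n = len(y)
    if n < 2:
        raise ValueError("need at least two observations")
    t = range(1, n + 1)
    sum_t = sum(t)
    sum_y = sum(y)
    sum_ty = sum(ti * yi for ti, yi in zip(t, y))
    sum_tt = sum(ti * ti for ti in t)
    b = (n * sum_ty - sum_t * sum_y) / (n * sum_tt - sum_t ** 2)   # slope
    a = (sum_y - b * sum_t) / n                                    # intercept
    return a, b


a, b = fit_linear_trend([3, 5, 7, 9])   # made-up data lying exactly on a line
print(a, b)                             # intercept 1.0, slope 2.0
```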

38
Trend line for the Car Registrations Time Series
39
Additional trend curves

The life cycle of a new product has 3 stages:
introduction, growth, and maturity and
saturation.
A curve is needed to model the trend over the life of a new
product.
A simple function that allows for curvature is
the quadratic trend
$$ T_t = b_0 + b_1 t + b_2 t^2 $$
When a time series starts slowly and then appears
to be increasing at an increasing rate, an
exponential trend is appropriate:
$$ T_t = b_0 b_1^t $$
The coefficient $b_1$ is related to the growth rate.

41
The increase in the number of salespeople is not
constant. It appears as if increasingly larger
numbers of people are being added in the later
years. An exponential trend curve fit to the
salespeople data has the equation
$$ T_t = 10.016 (1.313)^t $$
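
As a quick check of what this fitted curve implies, the Python sketch below evaluates the exponential trend with coefficients 10.016 and 1.313 from the slide. The function name and the range of t are illustrative assumptions; reading the second coefficient minus 1 as the growth rate per period follows from the form of the curve.

```python
def exponential_trend(t, b0=10.016, b1=1.313):
    """Exponential trend T_t = b0 * b1**t with the coefficients from the slide."""
    return b0 * b1 ** t


growth_rate = 1.313 - 1   # each period is about 31.3% above the previous one
trend_values = [exponential_trend(t) for t in range(1, 4)]
print(growth_rate, trend_values)
```

Because consecutive trend values always differ by the factor 1.313, the absolute increase grows every period, matching the observation that more people are added in later years.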
42
Seasonality

There are several methods for measuring seasonal variation.
The basic idea:
first estimate and remove the trend from the
original series and then smooth out the irregular
component. This leaves data containing only
seasonal variation.
The seasonal values are collected and summarized
to produce a number for each observed interval of
the year (week, month, quarter, and so on).

43
Identification of seasonal component

The identification of the seasonal component in a
time series differs from trend analysis in two
ways:
The trend is determined directly from the
original data, but the seasonal component is
determined indirectly after eliminating the other
components from the data.
The trend is represented by one best-fitting
curve, but a separate seasonal value has to be
computed for each observed interval.
If an additive decomposition is employed,
estimates of the trend, seasonal and irregular components are
added together to produce the original series.
If a multiplicative decomposition is employed,
estimates of the individual components must be
multiplied together to produce the original series.

44
Seasonal indices

The seasonal indices measure the seasonal
variation in the series.
Seasonal indices are percentages that show
changes over time.
Example:
With monthly data, a seasonal index of 1.0 for a
particular month means the expected value for
that month is 1/12 of the total for the year.
An index of 1.25 for a different month implies
the observation for that month is expected to be
25% more than 1/12 of the annual total.
A monthly index of 0.80 indicates that the
expected level of that month is 20% less than
1/12 of the total for the year.

45
Seasonal adjustment

After the seasonal component has been isolated,
it can be used to calculate seasonally adjusted
data.
Seasonal adjustment techniques are ad hoc methods
of computing seasonal indices and using those
indices to deseasonalize the series by removing
the seasonal variation.
For a multiplicative decomposition, the
seasonally adjusted data are computed by dividing
the original data by the seasonal component (i.e.
the seasonal index):
deseasonalized data = raw data / seasonal index
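
The division step can be sketched in Python as follows. The function name, the quarterly indices and the data are made-up assumptions; the first observation is assumed to belong to the first position of the seasonal cycle.

```python
def deseasonalize(raw, seasonal_indices):
    """Divide each observation by the seasonal index of its position in the cycle."""
    s = len(seasonal_indices)
    return [value / seasonal_indices[i % s] for i, value in enumerate(raw)]


indices = [0.8, 1.25, 1.0, 0.95]     # made-up quarterly seasonal indices
raw = [80.0, 125.0, 100.0, 95.0]     # made-up raw quarterly data
adjusted = deseasonalize(raw, indices)
print(adjusted)                      # every quarter comes out at about 100
```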

46
Seasonal adjustment technique

Seasonal adjustment techniques are based on the
idea that a time series $y_t$ can be represented as
the product of 4 components
$$ y_t = T \times S \times C \times I $$
The objective is to eliminate the seasonal
component S.
First, we try to isolate the combined trend and
cyclical components T × C. This cannot be done
exactly; instead an ad hoc smoothing procedure is
used to extract an approximation of T × C from the original time
series.
For example, suppose that $y_t$ consists of monthly
data. Then a 12-month average $\tilde{y}_t$ is computed:
$$ \tilde{y}_t = \frac{y_{t+6} + \dots + y_t + y_{t-1} + \dots + y_{t-5}}{12} $$
Presumably $\tilde{y}_t$ is relatively free of seasonal and
irregular fluctuations and is thus an estimate of
T × C.
Now, we divide the original data by this estimate
of T × C to obtain an estimate of the combined
seasonal and irregular components S × I.

47
Seasonal adjustment technique (cont.)

$$ S \times I = \frac{y_t}{\tilde{y}_t} = z_t $$
The next step is to eliminate the irregular
component I in order to obtain the seasonal
index. To do this, we average the values of S × I
corresponding to the same month.
In other words, suppose that $y_1$ (and hence $z_1$)
corresponds to January, $y_2$ to February, etc., and
there are 48 months of data. We thus compute
$$ \tilde{z}_1 = \tfrac{1}{4}(z_1 + z_{13} + z_{25} + z_{37}) $$
$$ \tilde{z}_2 = \tfrac{1}{4}(z_2 + z_{14} + z_{26} + z_{38}) $$
$$ \dots $$
$$ \tilde{z}_{12} = \tfrac{1}{4}(z_{12} + z_{24} + z_{36} + z_{48}) $$

48
Seasonal adjustment technique (cont.)

The rationale here is that when the
seasonal-irregular percentages $z_t$ are averaged
for each month (each quarter if the data are
quarterly), the irregular fluctuations will be
largely smoothed out.
The 12 averages $\tilde{z}_1, \dots, \tilde{z}_{12}$ will then be
estimates of the seasonal indices. They should
sum to approximately 12.
The deseasonalization of the original series $y_t$
is now straightforward: just divide each value in
the series by its corresponding seasonal index.
Thus, the seasonally adjusted series $y^a_t$ is obtained
from
$y^a_1 = y_1/\tilde{z}_1$, $y^a_2 = y_2/\tilde{z}_2$, ..., $y^a_{12} = y_{12}/\tilde{z}_{12}$, etc.
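
The whole procedure (moving average, seasonal-irregular ratios, per-month averaging, division) can be sketched in Python. This is a minimal illustration under stated assumptions: the function names and data are made up, the first observation is assumed to fall in the first position of the cycle, and ratios are only formed where the full averaging window is available, so the first and last few periods contribute no ratio.

```python
def seasonal_indices(y, period=12):
    """Estimate multiplicative seasonal indices from ratios to a moving average."""
    lead = period // 2             # future values in the window (6 for monthly data)
    lag = period - 1 - lead        # past values in the window (5 for monthly data)
    ratios = [[] for _ in range(period)]
    for t in range(lag, len(y) - lead):
        window = y[t - lag : t + lead + 1]       # `period` values around time t
        average = sum(window) / period           # estimate of T x C
        ratios[t % period].append(y[t] / average)  # z_t, estimate of S x I
    if any(len(r) == 0 for r in ratios):
        raise ValueError("series too short to estimate every seasonal index")
    return [sum(r) / len(r) for r in ratios]     # average z_t per position


def seasonally_adjust(y, indices):
    """Divide each value by the seasonal index of its position in the cycle."""
    period = len(indices)
    return [value / indices[t % period] for t, value in enumerate(y)]


pattern = [0.8, 1.2, 1.1, 0.9]                            # made-up quarterly pattern
quarterly = [100.0 * pattern[t % 4] for t in range(12)]   # three years, no trend
indices = seasonal_indices(quarterly, period=4)
adjusted = seasonally_adjust(quarterly, indices)
print(indices, sum(indices))
```

On this trend-free example the estimated indices reproduce the pattern and sum to about 4, the quarterly analogue of the "sum close to 12" check for monthly data.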

49
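Appendix: Least-squares parameter estimates

Our goal is to minimize $\sum (Y_i - \hat{Y}_i)^2$, where $\hat{Y}_i =
a + bX_i$ is the fitted value of Y corresponding to
a particular observation $X_i$.
We minimize the expression by taking the partial
derivatives with respect to a and to b, setting
each equal to 0, and solving the resulting pair
of simultaneous equations:
$$ \frac{\partial}{\partial a} \sum (Y_i - a - bX_i)^2 = -2 \sum (Y_i - a - bX_i) \qquad \text{(A.1)} $$
$$ \frac{\partial}{\partial b} \sum (Y_i - a - bX_i)^2 = -2 \sum X_i (Y_i - a - bX_i) \qquad \text{(A.2)} $$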
50
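Least-squares parameter estimates

Equating these derivatives to zero and dividing
by -2, we get
$$ \sum (Y_i - a - bX_i) = 0 \qquad \text{(A.3)} $$
$$ \sum X_i (Y_i - a - bX_i) = 0 \qquad \text{(A.4)} $$
Finally, by rewriting Eqs. (A.3) and (A.4), we
obtain the pair of simultaneous equations
$$ \sum Y_i = aN + b \sum X_i \qquad \text{(A.5)} $$
$$ \sum X_i Y_i = a \sum X_i + b \sum X_i^2 \qquad \text{(A.6)} $$
Now we can solve for a and b simultaneously by
multiplying Eq. (A.5) by $\sum X_i$ and Eq. (A.6) by N:
$$ \sum X_i \sum Y_i = aN \sum X_i + b \Big(\sum X_i\Big)^2 \qquad \text{(A.7)} $$
$$ N \sum X_i Y_i = aN \sum X_i + bN \sum X_i^2 \qquad \text{(A.8)} $$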
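
The transcript breaks off after (A.8). The remaining algebra is standard and is sketched here as a completion, not as the author's own text: subtracting (A.7) from (A.8) eliminates a, and substituting b back into (A.5) gives a. The bars denoting sample means are notation introduced for this sketch.

```latex
\begin{align*}
N\sum X_i Y_i - \sum X_i \sum Y_i
  &= b\left[\,N\sum X_i^2 - \Big(\sum X_i\Big)^2\right] \\
b &= \frac{N\sum X_i Y_i - \sum X_i \sum Y_i}
          {N\sum X_i^2 - \left(\sum X_i\right)^2} \\
a &= \frac{\sum Y_i - b\sum X_i}{N} = \bar{Y} - b\,\bar{X}
\end{align*}
```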