Title: A Brief Introduction to Statistical Forecasting
1 A Brief Introduction to Statistical Forecasting
2 Outline
- Principal Component Theory
- Applications
- Z Score
- VIPER
3 Basic Forecast Methods
Simulation modeling
[Figure: schematic with labels Snow, Rainfall, Heat, Snowpack, Soil water, Runoff]
Statistical regression
[Figure: S Fork Rio Grande, Colo; axis labels "May 1 snowpack avg" and "Apr-Jul streamflow avg"]
Credit Tom Pagano
4 The General Linear Regression Model
$$Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_n X_n$$
- where
- $Y$ = dependent variable
- $X_i$ = independent variables
- $b_i$ = regression coefficients
- $n$ = number of independent variables
Credit Dave Garen
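As a minimal sketch of fitting this model, the snippet below estimates the coefficients by ordinary least squares with NumPy. The data are synthetic, and the names `fit_linear_regression`, `X`, and `y` are illustrative assumptions rather than anything from the original slides.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Return [b0, b1, ..., bn] for the least-squares fit of y = b0 + sum(bi * Xi)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the intercept b0 is estimated as well.
    A = np.column_stack([np.ones(len(y)), X])
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs

# Synthetic, noise-free data generated from Y = 2 + 3*X1 - 1*X2.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 5.0]])
y = 2.0 + 3.0 * X[:, 0] - 1.0 * X[:, 1]
b = fit_linear_regression(X, y)
print(np.round(b, 6))
```

Because the synthetic data contain no noise, the fit should recover the generating coefficients; real forecasting data will not behave this cleanly.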
5 The Problem
- If Xs are intercorrelated, they contain redundant information, and the bs cannot be meaningfully estimated.
- However, we don't want to have to throw out most of the Xs but prefer to retain them for robustness.
Credit Dave Garen
6 Example
$$\text{Streamflow} = b_0 + b_1\,(\text{Snotel A}) + b_2\,(\text{Snotel B})$$
-> Snotel sites are very well correlated
-> An optimal b1 and b2 will be difficult to determine since the correlation is so strong
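The difficulty can be illustrated with a small simulation. This is a sketch under stated assumptions: the two "Snotel" series are synthetic random numbers, not real station data, and the noise levels and seed are arbitrary choices. It refits the two-predictor regression many times and compares how much b1 alone varies against how much the sum b1 + b2 varies.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable
n_years, n_trials = 30, 200
b1_values, sum_values = [], []

for _ in range(n_trials):
    snotel_a = rng.normal(size=n_years)
    # Site B is site A plus a little noise, so the two are very well correlated.
    snotel_b = snotel_a + 0.1 * rng.normal(size=n_years)
    # Synthetic streamflow depends equally on both sites, plus noise.
    streamflow = 1.0 * snotel_a + 1.0 * snotel_b + rng.normal(size=n_years)

    A = np.column_stack([np.ones(n_years), snotel_a, snotel_b])
    b0, b1, b2 = np.linalg.lstsq(A, streamflow, rcond=None)[0]
    b1_values.append(b1)
    sum_values.append(b1 + b2)

spread_b1 = np.std(b1_values)
spread_sum = np.std(sum_values)
print(f"spread of b1 across trials:      {spread_b1:.2f}")
print(f"spread of b1 + b2 across trials: {spread_sum:.2f}")
```

The combined effect of the two sites is estimated far more stably than either coefficient on its own, which is the sense in which the individual bs cannot be meaningfully estimated.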
7 The Solution
- Possibilities
- 1) Pre-combine Xs into composite index(es), e.g., Z-score method
- 2) Principal components regression
- These are similar in concept but differ in the mathematics.
Credit Dave Garen
8 Principal Components Analysis
- Principal components regression is just like standard regression except the independent variables are principal components rather than the original X variables.
- Principal components are linear combinations of the Xs.
Credit Dave Garen
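9 Principal Components Analysis
- Each principal component is a weighted sum of all the Xs
$$PC_1 = e_{11} X_1 + e_{12} X_2 + \dots + e_{1n} X_n$$
$$PC_2 = e_{21} X_1 + e_{22} X_2 + \dots + e_{2n} X_n$$
. . .
$$PC_n = e_{n1} X_1 + e_{n2} X_2 + \dots + e_{nn} X_n$$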
Credit Dave Garen
10 Principal Components Analysis
- The e's are the elements of the eigenvectors, which are derived from a matrix equation whose input is the correlation matrix of all the Xs with each other.
- Principal components are new variables that are not correlated with each other.
- The principal components transformation is equivalent to a rotation of axes.
Credit Dave Garen
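The three statements above can be checked numerically. The following is a minimal sketch, assuming standardized variables and NumPy's symmetric eigen-solver; the function name `principal_components` and the synthetic data are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def principal_components(X):
    """Return (eigenvalues, eigenvectors, pcs) from the correlation matrix of X.

    X has one row per year and one column per variable. Eigenvectors are
    stored as columns, ordered from largest to smallest eigenvalue.
    """
    X = np.asarray(X, dtype=float)
    # Standardize each column, since the correlation matrix is the input.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(X, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(R)  # eigh: R is symmetric
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    pcs = Z @ eigenvectors  # each PC is a weighted sum of all standardized Xs
    return eigenvalues, eigenvectors, pcs

rng = np.random.default_rng(1)
base = rng.normal(size=(40, 1))
X = base + 0.5 * rng.normal(size=(40, 3))  # three intercorrelated variables
eigenvalues, E, pcs = principal_components(X)

pc_corr = np.corrcoef(pcs, rowvar=False)
print(np.round(pc_corr, 6))   # off-diagonal entries are ~0: PCs are uncorrelated
print(np.round(E.T @ E, 6))   # identity matrix: E is orthogonal (a rotation of axes, possibly with a reflection)
```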
11 Principal Components Analysis
Credit Dave Garen
12 Principal Components Analysis
- The eigenvectors (weights) are based solely on the intercorrelations among the Xs and have no knowledge of Y (in contrast to Z-score, for which the opposite is true).
- Principal components can be used for purely descriptive purposes, but we want to use them as independent variables in a regression.
Credit Dave Garen
13
Credit Dennis Hartmann
14 Principal Components Analysis -- Example
- Independent Variables
- X1-X5: Snow water equivalent at 5 stations
- X6-X10: Water year to date precipitation at 5 stations
- X11: Antecedent streamflow
- X12: Climate teleconnection index
Credit Dave Garen
15 Correlation Matrix
(upper triangle shown; the matrix is symmetric)
       X1    X2    X3    X4    X5    X6    X7    X8    X9   X10   X11   X12     Y
X1    1.0   .72   .67   .76   .81   .54   .31   .54   .38   .50   .18   .64   .65
X2          1.0   .67   .45   .80   .62   .45   .47   .31   .49   .14   .39   .60
X3                1.0   .49   .72   .84   .76   .86   .68   .85   .48   .56   .80
X4                      1.0   .62   .42   .26   .36   .56   .38   .28   .59   .68
X5                            1.0   .62   .49   .51   .44   .62   .32   .59   .73
X6                                  1.0   .93   .87   .83   .90   .63   .43   .85
X7                                        1.0   .82   .85   .90   .67   .32   .76
X8                                              1.0   .74   .84   .64   .39   .70
X9                                                    1.0   .80   .70   .49   .84
X10                                                         1.0   .64   .46   .79
X11                                                               1.0   .36   .51
X12                                                                     1.0   .64
Credit Dave Garen
16 First Five Eigenvectors
           PC1     PC2     PC3     PC4     PC5
X1       0.265   0.444   0.004   0.074  -0.104
X2       0.249   0.325  -0.483  -0.030   0.315
X3       0.335   0.016  -0.178   0.149  -0.314
X4       0.229   0.353   0.456  -0.595  -0.009
X5       0.287   0.332  -0.148   0.120   0.412
X6       0.339  -0.168  -0.162  -0.106  -0.040
X7       0.308  -0.329  -0.150  -0.058  -0.015
X8       0.317  -0.197  -0.114   0.027  -0.261
X9       0.304  -0.240   0.299  -0.313  -0.103
X10      0.330  -0.197  -0.197   0.072  -0.129
X11      0.235  -0.349   0.351   0.168   0.692
X12      0.232   0.262   0.473   0.675  -0.212
% var.    62.7    15.8     7.8     3.8     3.2
Credit Dave Garen
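For readers who want to see where a table like this comes from, here is a sketch that eigen-decomposes the X1 to X12 block of the correlation matrix from slide 15. It assumes NumPy and uses the printed two-decimal correlations, so the output should be broadly comparable to the table but need not match digit for digit. Eigenvector signs are arbitrary; the sign convention used here (positive column sum) is an assumption and may differ from the slide for some components.

```python
import numpy as np

# Upper triangle of the slide-15 correlation matrix for X1..X12 (Y column omitted).
upper = [
    [1.0, .72, .67, .76, .81, .54, .31, .54, .38, .50, .18, .64],
    [1.0, .67, .45, .80, .62, .45, .47, .31, .49, .14, .39],
    [1.0, .49, .72, .84, .76, .86, .68, .85, .48, .56],
    [1.0, .62, .42, .26, .36, .56, .38, .28, .59],
    [1.0, .62, .49, .51, .44, .62, .32, .59],
    [1.0, .93, .87, .83, .90, .63, .43],
    [1.0, .82, .85, .90, .67, .32],
    [1.0, .74, .84, .64, .39],
    [1.0, .80, .70, .49],
    [1.0, .64, .46],
    [1.0, .36],
    [1.0],
]

n = len(upper)
R = np.zeros((n, n))
for i, row in enumerate(upper):
    R[i, i:] = row
R = R + R.T - np.diag(np.diag(R))  # mirror the upper triangle into the lower one

eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]          # largest variance first
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Assumed sign convention: flip each eigenvector so its weights sum to a positive number.
signs = np.where(eigenvectors.sum(axis=0) < 0, -1.0, 1.0)
eigenvectors = eigenvectors * signs

pct_var = 100.0 * eigenvalues / eigenvalues.sum()
print("percent variance, PC1-PC5:", np.round(pct_var[:5], 1))
print(np.round(eigenvectors[:, :5], 3))
```

Each eigenvalue divided by the number of variables gives the fraction of total variance carried by that component, which is what the "% var." row reports.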
17 Principal Components Regression Procedure
- Try the PCs in order
- Test for regression coefficient significance (t-test)
- Stop at first insignificant component
- Transform regression coefficients to be in terms of original variables
- Sign test: coefficient signs must be same as correlation with Y
Credit Dave Garen
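The procedure above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the VIPER implementation: the function name `pcr`, the fixed critical value `t_crit=2.0` (an assumed stand-in for a proper t-distribution lookup), and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def pcr(X, y, t_crit=2.0):
    """Sketch of principal components regression.

    t_crit is an assumed, fixed critical |t| value; a real test would look up
    the t distribution for the available degrees of freedom.
    Returns (b0, b, n_pcs, sign_ok) with b in terms of the original Xs.
    """
    X, y = np.asarray(X, float), np.asarray(y, float)
    n, p = X.shape
    mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mean) / std

    # Principal components from the correlation matrix, largest variance first.
    eigenvalues, E = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    E = E[:, np.argsort(eigenvalues)[::-1]]
    pcs = Z @ E

    # Try the PCs in order; stop at the first insignificant coefficient.
    n_pcs, coeffs = 0, np.array([y.mean()])
    for k in range(1, p + 1):
        A = np.column_stack([np.ones(n), pcs[:, :k]])
        c, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ c
        s2 = resid @ resid / (n - k - 1)               # residual variance
        se = np.sqrt(s2 * np.linalg.inv(A.T @ A)[k, k])  # std. error of newest coefficient
        if abs(c[k] / se) < t_crit:
            break
        n_pcs, coeffs = k, c

    # Transform PC coefficients back to the original X variables.
    b = (E[:, :n_pcs] @ coeffs[1:]) / std
    b0 = coeffs[0] - b @ mean

    # Sign test: each coefficient's sign must match the sign of corr(Xj, Y).
    r_xy = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(p)])
    sign_ok = bool(np.all(np.sign(b) == np.sign(r_xy)))
    return b0, b, n_pcs, sign_ok

# Synthetic example: four intercorrelated predictors sharing one signal.
rng = np.random.default_rng(2)
signal = rng.normal(size=(50, 1))
X = signal + 0.3 * rng.normal(size=(50, 4))
y = 10.0 + 2.0 * signal[:, 0] + 0.5 * rng.normal(size=50)
b0, b, n_pcs, sign_ok = pcr(X, y)
fitted = b0 + X @ b
print(n_pcs, np.round(b, 3), sign_ok)
```

Because the retained PCs are uncorrelated, the back-transformed coefficients spread the shared signal across all the original predictors instead of assigning it arbitrarily to one of them.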
18 Summary
- Principal components analysis is a standard multivariate statistical procedure
- Can be used for descriptive purposes to reduce the dimensionality of correlated variables
- Can be taken a step further to provide new, non-correlated independent variables for regression
- PCs taken in order, subject to t-test and sign test
- Final model is expressed in terms of original X variables
Credit Dave Garen
19 Soil Moisture at the interannual timescale
- Another example demonstrating the importance of land surface processes in the climate system (Werner, 1999)
- GCM run with and without an active land surface model in South America to explore the importance of land surface processes in the climate system variability in the Nordeste region.
- Both simulations include a full atmospheric model, a slab ocean model (no ocean dynamics), and a dynamic land surface model everywhere except tropical South America in the Data Land simulation.
20 Soil Moisture at the interannual timescale
- Modeled variability
- Full dynamic land surface model simulation contains variability resembling observed variability with connection between NH and SH SSTs.
- Fixed land surface model shows no connected variability between NH and SH SSTs
21 Resources
- Dave Garen VIPER slides
- Dennis Hartmann lecture notes (http://www.atmos.washington.edu/dennis/)
22 What does z-score regression do?
1. Combines predictors into weighted indices, emphasizing good stations, minimizing bad ones.
2. Compensates for missing data with remaining data.
3. Regresses index against target predictand
Credit Tom Pagano
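A rough sketch of these three steps follows. The slides do not state how stations are weighted, so the squared correlation with the predictand used here is an assumption, as are the function names `station_weights` and `zscore_index` and the synthetic data. Missing values are represented as NaN.

```python
import numpy as np

def station_weights(X, y):
    """Assumed weighting: squared correlation of each station with the predictand."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    weights = []
    for j in range(X.shape[1]):
        ok = ~np.isnan(X[:, j])            # use only years where the station reported
        r = np.corrcoef(X[ok, j], y[ok])[0, 1]
        weights.append(r ** 2)
    return np.array(weights)

def zscore_index(X, weights):
    """Weighted average of station z-scores per year, skipping missing (NaN) values."""
    X = np.asarray(X, dtype=float)
    z = (X - np.nanmean(X, axis=0)) / np.nanstd(X, axis=0, ddof=1)
    present = ~np.isnan(z)
    w = np.where(present, weights, 0.0)    # missing stations get zero weight
    # Renormalize by the weights actually present, so the remaining data fill in.
    return np.nansum(w * z, axis=1) / w.sum(axis=1)

rng = np.random.default_rng(3)
signal = rng.normal(size=40)
X = np.column_stack([
    signal + 0.2 * rng.normal(size=40),    # "good" station
    signal + 0.5 * rng.normal(size=40),
    signal + 2.0 * rng.normal(size=40),    # "bad" station
])
y = 100.0 + 20.0 * signal + 5.0 * rng.normal(size=40)
X[5, 0] = np.nan                           # one missing observation

weights = station_weights(X, y)            # step 1: emphasize good stations
index = zscore_index(X, weights)           # steps 1-2: weighted index, tolerant of gaps
slope, intercept = np.polyfit(index, y, 1) # step 3: regress predictand on the index
print(np.round(weights, 2), round(slope, 2), round(intercept, 2))
```

The year with the missing observation still receives an index value, computed from the two stations that did report.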
23 What is a z-score?
A z-score is a normalized anomaly
$$Z = \frac{\text{value} - \text{average}}{\text{standard deviation}}$$
Credit Tom Pagano
26 What is a z-score?
A z-score is a normalized anomaly
$$Z = \frac{\text{value} - \text{average}}{\text{standard deviation}}$$
avg    stdev
135    30
60     15
$$Z = (90 - 60)/15 = 2$$
Credit Tom Pagano
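The worked example can be reproduced directly. The function name `z_score` is an illustrative choice, and the value 195 for the second case is a hypothetical observation picked only to show that different raw values can share the same z-score.

```python
def z_score(value, average, standard_deviation):
    """Normalized anomaly: how many standard deviations a value lies from its average."""
    return (value - average) / standard_deviation

# Slide example: value 90 where avg = 60 and stdev = 15.
print(z_score(90, 60, 15))    # 2.0

# Hypothetical value of 195 where avg = 135 and stdev = 30.
print(z_score(195, 135, 30))  # 2.0
```

Expressing each station as a z-score puts stations with very different averages and spreads on a common scale, which is what allows them to be combined into one index.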
27 How good are the results?
Under conditions of serially complete data and relatively normal conditions, PCA and Z-Score are effectively indistinguishable.
Skill and behavior are similar to the official published outlooks.
However: "Any tool is a weapon if you hold it right." (aka "A fool with a tool is still a fool")
Credit Tom Pagano
Viper technical note: 1 basin
Pagano dissertation: 29 basins
28 Super Quick Primer on VIPER
29-34 The Viper Main Interface: Layout and interpretation
[Figure: screenshot of the Viper main interface, annotated progressively across slides 29-34. Callout labels:]
- Selecting predictors and predictands
- Forecast vs observed time series
- Station availability, weights
- Global month changes
- Predictors: quality, availability
- Fcst vs obs scatterplot
- Helper variable scatterplot / Forecast progression
- Settings
- Probability bounds
- Historical statistics
Credit Tom Pagano
35 The Viper Main Interface: Layout and interpretation
There's more if you scroll right: relate any variable to another
Credit Tom Pagano