A Brief Introduction to Statistical Forecasting - PowerPoint PPT Presentation

Slides: 36
Provided by: kvw
1
A Brief Introduction to Statistical Forecasting
  • Kevin Werner

2
Outline
  • Principal Component Theory
  • Applications
  • Z Score
  • VIPER

3
Basic Forecast Methods
  • Simulation modeling [diagram; labels: Snow, Rainfall, Heat, Snowpack, Soil water, Runoff]
  • Statistical regression [scatterplot for S Fork Rio Grande, Colo.: Apr-Jul streamflow (% avg) vs. May 1 snowpack (% avg)]
Credit Tom Pagano
4
The General Linear Regression Model
  • $Y = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_n X_n$
  • where
  • Y = dependent variable
  • X_i = independent variables
  • b_i = regression coefficients
  • n = number of independent variables

Credit Dave Garen
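A minimal sketch of fitting the model above by ordinary least squares. The predictor values and the coefficients used to build Y are hypothetical, chosen only so the fit can be checked; NumPy's `lstsq` is one common way to estimate the b's, not necessarily what VIPER uses.

```python
import numpy as np

# Hypothetical data: 6 years, 2 predictors (e.g. two snowpack indices).
X = np.array([
    [80.0, 95.0],
    [120.0, 110.0],
    [100.0, 90.0],
    [60.0, 70.0],
    [140.0, 125.0],
    [95.0, 105.0],
])

# Build Y from known (made-up) coefficients: Y = 5 + 0.6*X1 + 0.3*X2.
true_b = np.array([5.0, 0.6, 0.3])
Y = true_b[0] + X @ true_b[1:]

# Prepend a column of ones so the intercept b0 is estimated as well.
A = np.column_stack([np.ones(len(X)), X])
b, *_ = np.linalg.lstsq(A, Y, rcond=None)

print(b)  # estimated [b0, b1, b2]
```

Because Y here has no noise, the estimated coefficients recover the ones used to build it; with real data the b's are only estimates.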
5
The Problem
  • If Xs are intercorrelated, they contain
    redundant information, and the bs cannot be
    meaningfully estimated.
  • However, we don't want to have to throw out most
    of the Xs but prefer to retain them for
    robustness.

Credit Dave Garen
6
Example
Streamflow = b0 + b1 (Snotel A) + b2 (Snotel B)
-> Snotel sites are very well correlated
-> An optimal b1 and b2 will be difficult to determine since the correlation is so strong
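The difficulty can be illustrated with synthetic data. This sketch assumes two standardized predictors with a chosen correlation `rho` and a made-up relationship y = X1 + X2 + noise. It refits the regression on many random samples and reports how much the estimate of b1 scatters. All names and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def coef_spread(rho, n_years=30, n_trials=500):
    """Std. dev. of the fitted b1 across repeated synthetic samples
    when the two predictors have correlation rho."""
    cov = np.array([[1.0, rho], [rho, 1.0]])
    b1 = []
    for _ in range(n_trials):
        X = rng.multivariate_normal([0.0, 0.0], cov, size=n_years)
        y = X[:, 0] + X[:, 1] + rng.normal(0.0, 1.0, n_years)
        A = np.column_stack([np.ones(n_years), X])
        b, *_ = np.linalg.lstsq(A, y, rcond=None)
        b1.append(b[1])
    return float(np.std(b1))

spread_weak = coef_spread(rho=0.1)     # nearly independent predictors
spread_strong = coef_spread(rho=0.98)  # two very well correlated "sites"
print(spread_weak, spread_strong)
```

The scatter in b1 is much larger in the strongly correlated case, even though the combined prediction is about as good in both.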
7
The Solution
  • Possibilities
  • 1) Pre-combine Xs into composite index(es),
    e.g., Z-score method
  • 2) Principal components regression
  • These are similar in concept but differ in the
    mathematics.

Credit Dave Garen
8
Principal Components Analysis
  • Principal components regression is just like
    standard regression except the independent
    variables are principal components rather than
    the original X variables.
  • Principal components are linear combinations of
    the Xs.

Credit Dave Garen
9
Principal Components Analysis
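  • Each principal component is a weighted sum of all
    the Xs

$PC_1 = e_{11} X_1 + e_{12} X_2 + \dots + e_{1n} X_n$
$PC_2 = e_{21} X_1 + e_{22} X_2 + \dots + e_{2n} X_n$
. . .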
Credit Dave Garen
10
Principal Components Analysis
  • The es are called eigenvectors, derived from a
    matrix equation whose input is the correlation
    matrix of all the Xs with each other.
  • Principal components are new variables that are
    not correlated with each other.
  • The principal components transformation is
    equivalent to a rotation of axes.

Credit Dave Garen
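A short sketch of these three points, using synthetic data (three hypothetical stations sharing a common signal). It assumes the Xs are standardized before the weights are applied, which is the usual convention when the eigenvectors come from the correlation matrix.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic, intercorrelated predictors: 3 "stations" with a shared signal.
n_years = 40
common = rng.normal(size=n_years)
X = np.column_stack([common + 0.3 * rng.normal(size=n_years) for _ in range(3)])

# Standardize each column and form the correlation matrix of the Xs.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(X, rowvar=False)

# eigh handles symmetric matrices and returns eigenvalues in ascending order,
# so reorder to put the largest-variance component first.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Each PC is a weighted sum of the standardized Xs; the weights are the
# eigenvector columns.
PCs = Z @ eigvecs

pc_corr = np.corrcoef(PCs, rowvar=False)
print(np.round(pc_corr, 6))              # off-diagonal terms are ~0
print(np.round(eigvecs.T @ eigvecs, 6))  # identity: an orthogonal rotation
```

The eigenvector matrix is orthogonal, which is what "rotation of axes" means here, and the eigenvalues sum to the number of variables.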
11
Principal Components Analysis
Credit Dave Garen
12
Principal Components Analysis
  • The eigenvectors (weights) are based solely on
    the intercorrelations among the Xs and have no
    knowledge of Y (in contrast to Z-score, for which
    the opposite is true).
  • Principal components can be used for purely
    descriptive purposes, but we want to use them as
    independent variables in a regression.

Credit Dave Garen
13
Credit Dennis Hartmann
14
Principal Components Analysis -- Example
  • Independent Variables
  • X1-X5 = Snow water equivalent at 5 stations
  • X6-X10 = Water year to date precipitation at 5
    stations
  • X11 = Antecedent streamflow
  • X12 = Climate teleconnection index

Credit Dave Garen
15
Correlation Matrix
       X1   X2   X3   X4   X5   X6   X7   X8   X9  X10  X11  X12    Y
X1    1.0  .72  .67  .76  .81  .54  .31  .54  .38  .50  .18  .64  .65
X2         1.0  .67  .45  .80  .62  .45  .47  .31  .49  .14  .39  .60
X3              1.0  .49  .72  .84  .76  .86  .68  .85  .48  .56  .80
X4                   1.0  .62  .42  .26  .36  .56  .38  .28  .59  .68
X5                        1.0  .62  .49  .51  .44  .62  .32  .59  .73
X6                             1.0  .93  .87  .83  .90  .63  .43  .85
X7                                  1.0  .82  .85  .90  .67  .32  .76
X8                                       1.0  .74  .84  .64  .39  .70
X9                                            1.0  .80  .70  .49  .84
X10                                                1.0  .64  .46  .79
X11                                                     1.0  .36  .51
X12                                                          1.0  .64
Credit Dave Garen
16
First Five Eigenvectors
            PC1     PC2     PC3     PC4     PC5
X1        0.265   0.444   0.004   0.074  -0.104
X2        0.249   0.325  -0.483  -0.030   0.315
X3        0.335   0.016  -0.178   0.149  -0.314
X4        0.229   0.353   0.456  -0.595  -0.009
X5        0.287   0.332  -0.148   0.120   0.412
X6        0.339  -0.168  -0.162  -0.106  -0.040
X7        0.308  -0.329  -0.150  -0.058  -0.015
X8        0.317  -0.197  -0.114   0.027  -0.261
X9        0.304  -0.240   0.299  -0.313  -0.103
X10       0.330  -0.197  -0.197   0.072  -0.129
X11       0.235  -0.349   0.351   0.168   0.692
X12       0.232   0.262   0.473   0.675  -0.212
% var.     62.7    15.8     7.8     3.8     3.2
Credit Dave Garen
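As a rough cross-check, the eigenvectors and percent variance can be recomputed from the X1..X12 block of the correlation matrix on slide 15 (the Y column is left out, since the eigenvectors have no knowledge of Y). This sketch assumes the transcribed correlations are accurate. They are rounded to two decimals, so the results should be broadly comparable to the table above rather than identical.

```python
import numpy as np

# Upper triangle of the X1..X12 correlation matrix, as transcribed on slide 15.
upper = [
    [1.0, .72, .67, .76, .81, .54, .31, .54, .38, .50, .18, .64],
    [1.0, .67, .45, .80, .62, .45, .47, .31, .49, .14, .39],
    [1.0, .49, .72, .84, .76, .86, .68, .85, .48, .56],
    [1.0, .62, .42, .26, .36, .56, .38, .28, .59],
    [1.0, .62, .49, .51, .44, .62, .32, .59],
    [1.0, .93, .87, .83, .90, .63, .43],
    [1.0, .82, .85, .90, .67, .32],
    [1.0, .74, .84, .64, .39],
    [1.0, .80, .70, .49],
    [1.0, .64, .46],
    [1.0, .36],
    [1.0],
]

# Fill in the full symmetric matrix from the upper triangle.
n = len(upper)
R = np.zeros((n, n))
for i, row in enumerate(upper):
    R[i, i:] = row
    R[i:, i] = row

# Eigen-decomposition, largest eigenvalue first.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pct_var = 100.0 * eigvals / eigvals.sum()

# The sign of an eigenvector is arbitrary; flip PC1 so its weights are positive.
pc1 = eigvecs[:, 0]
pc1 = pc1 if pc1.sum() > 0 else -pc1

print(np.round(pct_var[:5], 1))             # compare with the "% var." row
print(np.round(np.cumsum(pct_var)[:5], 1))  # cumulative percent variance
print(np.round(pc1, 3))                     # compare with the PC1 column
```

Other eigenvector columns may come out with all signs flipped relative to the table; that does not change the component.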
17
Principal Components Regression Procedure
  • Try the PCs in order
  • Test for regression coefficient significance
    (t-test)
  • Stop at first insignificant component
  • Transform regression coefficients to be in terms
    of original variables
  • Sign test: coefficient signs must be the same as
    the sign of the correlation with Y

Credit Dave Garen
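The procedure above can be sketched as follows. This is an illustration under stated assumptions, not the VIPER implementation: the function name `pcr`, the synthetic data, and the fixed cutoff `t_crit=2.0` (roughly a 5% two-sided test for moderate sample sizes) are all hypothetical. A real test would look up the critical value from the t distribution.

```python
import numpy as np

def pcr(X, y, t_crit=2.0):
    """Principal components regression sketch; t_crit is an assumed cutoff."""
    n, p = X.shape
    mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mean) / std

    # Eigenvectors of the correlation matrix, largest eigenvalue first.
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
    E = eigvecs[:, np.argsort(eigvals)[::-1]]
    PCs = Z @ E

    # Try the PCs in order; stop at the first insignificant component.
    kept, gamma = 0, np.array([y.mean()])
    for k in range(1, min(p, n - 2) + 1):
        A = np.column_stack([np.ones(n), PCs[:, :k]])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        sigma2 = resid @ resid / (n - k - 1)
        se = np.sqrt(sigma2 * np.linalg.inv(A.T @ A)[k, k])
        if abs(coef[k] / se) < t_crit:  # t-test on the newest PC
            break
        kept, gamma = k, coef

    # Transform coefficients back to be in terms of the original variables.
    beta = (E[:, :kept] @ gamma[1:]) / std
    intercept = gamma[0] - beta @ mean

    # Sign test: each coefficient's sign should match the sign of corr(X_j, Y).
    r_xy = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(p)])
    sign_ok = np.sign(beta) == np.sign(r_xy)
    return intercept, beta, kept, sign_ok

# Hypothetical example: 4 correlated predictors driven by one common signal.
rng = np.random.default_rng(2)
common = rng.normal(size=40)
X = np.column_stack([common + 0.4 * rng.normal(size=40) for _ in range(4)])
y = 2.0 * common + rng.normal(size=40)

intercept, beta, kept, sign_ok = pcr(X, y)
print(kept, np.round(beta, 3), sign_ok)
```

The sketch only reports the sign test result. What to do when a coefficient fails it is left open here.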
18
Summary
  • Principal components analysis is a standard
    multivariate statistical procedure
  • Can be used for descriptive purposes to reduce
    the dimensionality of correlated variables
  • Can be taken a step further to provide new,
    non-correlated independent variables for
    regression
  • PCs taken in order, subject to t-test and sign
    test
  • Final model is expressed in terms of original X
    variables

Credit Dave Garen
19
Soil Moisture at the interannual timescale
  • Another example demonstrating the importance of land
    surface processes in the climate system (Werner,
    1999)
  • GCM run with and without an active land surface
    model in South America to explore the importance
    of land surface processes in climate
    variability in the Nordeste region.
  • Both simulations include full atmospheric model,
    slab ocean model (no ocean dynamics), and dynamic
    land surface model everywhere except tropical
    South America in the Data Land simulation.

20
Soil Moisture at the interannual timescale
  • Modeled variability
  • Full dynamic land surface model simulation
    contains variability resembling observed
    variability with connection between NH and SH
    SSTs.
  • Fixed land surface model shows no connected
    variability between NH and SH SSTs

21
Resources
  • Dave Garen VIPER slides
  • Dennis Hartmann lecture notes
    (http://www.atmos.washington.edu/dennis/)

22
What does z-score regression do?
  1. Combines predictors into weighted indices, emphasizing good stations, minimizing bad ones.
  2. Compensates for missing data with remaining data.
  3. Regresses index against target predictand.
Credit Tom Pagano
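A minimal sketch of the three steps. The weighting scheme is an assumption: each station is weighted by its squared correlation with the predictand, which is one plausible way to emphasize good stations. The slide does not give the weights VIPER actually uses. The station values, the function name `zscore_index`, and the use of NaN for missing data are all illustrative.

```python
import numpy as np

def zscore_index(X, y):
    """Weighted z-score index per year; NaN marks a missing station value."""
    # Standardize each station using only the years it reported.
    mean = np.nanmean(X, axis=0)
    std = np.nanstd(X, axis=0, ddof=1)
    Z = (X - mean) / std

    # Assumed weights: squared correlation of each station with the predictand.
    weights = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        ok = ~np.isnan(X[:, j])
        weights[j] = np.corrcoef(X[ok, j], y[ok])[0, 1] ** 2

    # Weighted average over whichever stations are present in each year,
    # so a missing station is compensated for by the remaining ones.
    present = ~np.isnan(Z)
    num = np.where(present, Z * weights, 0.0).sum(axis=1)
    den = (present * weights).sum(axis=1)
    return num / den, weights

# Hypothetical data: 6 years, 3 stations, one missing value.
X = np.array([
    [10.0, 20.0, 5.0],
    [14.0, 26.0, np.nan],
    [8.0, 15.0, 7.0],
    [12.0, 24.0, 4.0],
    [16.0, 30.0, 6.0],
    [9.0, 18.0, 5.0],
])
y = np.array([100.0, 140.0, 80.0, 120.0, 160.0, 90.0])

index, weights = zscore_index(X, y)

# Regress the predictand on the single composite index.
slope, intercept = np.polyfit(index, y, 1)
print(np.round(weights, 3), np.round(index, 3), slope, intercept)
```

The third station tracks the predictand poorly, so it receives a small weight. Squaring the correlation discards its sign, so a negatively correlated station would need its z-scores flipped first, which this sketch does not handle.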
23
What is a z-score?
A z-score is a normalized anomaly
Z = (value - average) / standard deviation
Credit Tom Pagano
25
What is a z-score?
A z-score is a normalized anomaly
Z = (value - average) / standard deviation
avg   stdev
135   30
60    15
Credit Tom Pagano
26
What is a z-score?
A z-score is a normalized anomaly
Z = (value - average) / standard deviation
avg   stdev
135   30
60    15
Z = (90 - 60) / 15 = 2
Credit Tom Pagano
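The worked example on this slide, as a small sketch (the function name `z_score` is illustrative):

```python
def z_score(value, average, stdev):
    """Normalized anomaly: how many standard deviations from the average."""
    return (value - average) / stdev

# Numbers from the slide: value 90, average 60, standard deviation 15.
z = z_score(90, 60, 15)
print(z)  # 2.0
```

A z-score of 2 means the value sits two standard deviations above that station's average, which lets stations with different averages and spreads be compared on the same scale.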
27
How good are the results?
  • Under conditions of serially complete data and relatively normal conditions, PCA and Z-Score are effectively indistinguishable
  • Skill and behavior are similar to the official published outlooks
  • However: "Any tool is a weapon if you hold it right." (aka "A fool with a tool is still a tool")
Credit Tom Pagano
Viper technical note: 1 basin
Pagano dissertation: 29 basins
28
Super Quick Primer on VIPER
29
The Viper Main Interface: Layout and interpretation
Credit Tom Pagano
30
The Viper Main Interface: Layout and interpretation
Selecting predictors and predictands
Global month changes
Credit Tom Pagano
31
The Viper Main Interface: Layout and interpretation
Selecting predictors and predictands
Global month changes
Predictors quality, availability
Historical statistics
Credit Tom Pagano
32
The Viper Main Interface: Layout and interpretation
Selecting predictors and predictands
Forecast vs observed time series
Station availability, weights
Global month changes
Predictors quality, availability
Historical statistics
Credit Tom Pagano
33
The Viper Main Interface: Layout and interpretation
Selecting predictors and predictands
Forecast vs observed time series
Station availability, weights
Global month changes
Predictors quality, availability
Fcst vs obs scatterplot
Helper variable Scatterplot/ Forecast progression
Historical statistics
Credit Tom Pagano
34
The Viper Main Interface: Layout and interpretation
Selecting predictors and predictands
Forecast vs observed time series
Station availability, weights
Global month changes
Predictors quality, availability
Fcst vs obs scatterplot
Helper variable Scatterplot/ Forecast progression
Settings
Probability bounds
Historical statistics
Credit Tom Pagano
35
The Viper Main Interface: Layout and interpretation
There's more if you scroll right
Relate any variable to another
Credit Tom Pagano