PROBABILISTIC FORECASTS AND THEIR VERIFICATION - PowerPoint Presentation Transcript
1
PROBABILISTIC FORECASTS AND THEIR VERIFICATION
  • Zoltan Toth
  • Environmental Modeling Center
  • NOAA/NWS/NCEP
  • Acknowledgments: Yuejian Zhu and Olivier Talagrand (1)
  • (1) École Normale Supérieure and LMD, Paris, France
  • http://wwwt.emc.ncep.noaa.gov/gmb/ens/index.html

2
OUTLINE / SUMMARY
  • SCIENCE OF FORECASTING
  • GOAL OF SCIENCE: Forecasting
  • VERIFICATION: Model development, user feedback
  • GENERATION OF PROBABILISTIC FORECASTS
  • SINGLE FORECASTS: Statistical rendition of pdf
  • ENSEMBLE FORECASTS: NWP-based, case-dependent pdf
  • ATTRIBUTES OF FORECAST SYSTEMS
  • RELIABILITY: Forecasts look like nature statistically
  • RESOLUTION: Forecasts indicate actual future developments
  • VERIFICATION OF PROBABILISTIC ENSEMBLE FORECASTS
  • UNIFIED PROBABILISTIC MEASURES: Dimensionless
  • ENSEMBLE MEASURES: Evaluate finite sample
  • STATISTICAL POSTPROCESSING OF FORECASTS
  • STATISTICAL RELIABILITY: Make it perfect
  • STATISTICAL RESOLUTION: Keep it unchanged

3
SCIENCE OF FORECASTING
  • Ultimate goal of science:
  • Forecasting
  • Meteorology is at the forefront
  • Weather forecasting constantly in the public's eye
  • Approach:
  • Observe what is relevant and available
  • Analyze data
  • Build general knowledge about nature based on analysis
  • Generalization / abstraction: laws, relationships
  • Build model of reality based on general knowledge
  • Conceptual
  • Quantitative/numerical, including various physical etc. processes
  • Analog
  • Predict what's not observable in:
  • Space (e.g., data assimilation)
  • Time (e.g., future weather)
  • Variables / processes
  • Verify (i.e., compare with observations)

4
PREDICTIONS IN TIME
  • Method:
  • Use model of nature for projection in time
  • Start model with estimate of state of nature at initial time
  • Sources of errors:
  • Discrepancy between model and nature
  • Added at every time step
  • Discrepancy between estimated and actual state of nature
  • Initial error
  • Chaotic systems:
  • Common type of dynamical system
  • Characterized by at least one perturbation pattern that amplifies
  • All errors project onto amplifying directions
  • Any initial and/or model error
  • Predictability limited
  • Ed Lorenz's legacy
  • Verification quantifies the situation

5
MOTIVATION FOR ENSEMBLE FORECASTING
  • FORECASTS ARE NOT PERFECT - IMPLICATIONS FOR:
  • USERS:
  • Need to know how often / by how much forecasts fail
  • Economically optimal behavior depends on:
  • Forecast error characteristics
  • User-specific application
  • Cost of weather-related adaptive action
  • Expected loss if no action taken
  • EXAMPLE: To protect or not your crop against possible frost
  • Cost = 10k, potential loss = 100k => will protect if P(frost) > Cost/Loss = 0.1 (see the sketch below)
  • NEED FOR PROBABILISTIC FORECAST INFORMATION
  • DEVELOPERS:
  • Need to improve performance - reduce error in estimate of first moment
  • Traditional NWP activities (i.e., model and data assimilation development)
  • Need to account for uncertainty - estimate higher moments
  • New aspect: How to do this?
  • Forecast is incomplete without information on forecast uncertainty
  • NEED TO USE PROBABILISTIC FORECAST FORMAT
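A minimal sketch of this cost/loss decision rule (Python; the numbers mirror the frost example above, while the function and variable names are illustrative):

```python
# Cost/loss decision sketch for the frost example above (illustrative names).
cost = 10_000       # cost of protective action ("10k")
loss = 100_000      # loss if frost occurs and no action is taken ("100k")
threshold = cost / loss  # = 0.1; protect when P(frost) exceeds this ratio

def expected_expense(p_frost: float) -> float:
    """Expected expense of the economically optimal action given P(frost)."""
    protect = cost              # pay the cost regardless of outcome
    no_action = p_frost * loss  # risk the loss with probability p_frost
    return min(protect, no_action)

for p in (0.05, 0.10, 0.25):
    action = "protect" if p > threshold else "do nothing"
    print(f"P(frost)={p:.2f}: {action}, expected expense ${expected_expense(p):,.0f}")
```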

6
GENERATION OF PROBABILISTIC FORECASTS
  • How to determine forecast probability?
  • Fully statistical methods are losing relevance
  • Numerical modeling:
  • Liouville equations provide pdfs
  • Not practical (computationally intractable)
  • Finite sample of pdf:
  • Single or multiple (ensemble) integrations
  • Increasingly finer-resolution estimates of probabilities
  • How to make (probabilistic) forecasts reliable?
  • Construct pdf
  • Assess reliability
  • Construct frequency distribution of observations following forecast classes
  • Replace form of forecast with associated frequency distribution of observations
  • Production and verification of forecasts connected in operations

7
FORECASTING IN A CHAOTIC ENVIRONMENT
PROBABILISTIC FORECASTING BASED ON A SINGLE
FORECAST: One integration with an NWP model,
combined with past verification statistics
DETERMINISTIC APPROACH - PROBABILISTIC FORMAT
  • Does not contain all forecast information
  • Not the best estimate for the future evolution of the system
  • UNCERTAINTY CAPTURED IN A TIME-AVERAGE SENSE -
  • NO ESTIMATE OF CASE-DEPENDENT VARIATIONS IN FCST UNCERTAINTY

8
SCIENTIFIC BACKGROUND: WEATHER FORECASTS ARE
UNCERTAIN (Buizza 2002)
9
  • FORECASTING IN A CHAOTIC ENVIRONMENT - 2
  • DETERMINISTIC APPROACH - PROBABILISTIC FORMAT
  • PROBABILISTIC FORECASTING -
  • Based on Liouville equations
  • Continuity equation for probabilities, given dynamical eqs. of motion
  • Initialize with probability distribution function (pdf) at analysis time
  • Dynamical forecast of pdf based on conservation of probability values
  • Prohibitively expensive -
  • Very high-dimensional problem (state space x probability space)
  • Separate integration for each lead time
  • Closure problems when simplified solution is sought

10
FORECASTING IN A CHAOTIC ENVIRONMENT - 3
DETERMINISTIC APPROACH - PROBABILISTIC FORMAT
  • MONTE CARLO APPROACH: ENSEMBLE FORECASTING
  • IDEA: Sample sources of forecast error
  • Generate initial ensemble perturbations
  • Represent model-related uncertainty
  • PRACTICE: Run multiple NWP model integrations
  • Advantage of perfect parallelization
  • Use lower spatial resolution if short on resources
  • USAGE: Construct forecast pdf based on finite sample
  • Ready to be used in real-world applications
  • Verification of forecasts
  • Statistical post-processing (remove bias in 1st, 2nd, higher moments)
  • CAPTURES FLOW-DEPENDENT VARIATIONS
  • IN FORECAST UNCERTAINTY

11
[Schematic: 6-hour ET / breeding cycle - ensemble perturbations rescaled every 6 hours (cycles at T00Z, T06Z, T12Z, T18Z), with each cycle integrated up to 16 days before the next T00Z cycle]
12
USER REQUIREMENTS: PROBABILISTIC FORECAST
INFORMATION IS CRITICAL

13
HOW TO DEAL WITH FORECAST UNCERTAINTY?
  • No matter what (or how sophisticated) forecast methods we use:
  • Forecast skill is limited
  • Skill varies from case to case
  • Forecast uncertainty must be assessed by meteorologists

THE PROBABILISTIC APPROACH
14
SOCIO-ECONOMIC BENEFITS OF A SEAMLESS
WEATHER/CLIMATE FORECAST SUITE
[Diagram: applications - commerce, energy, ecosystem, health, hydropower, agriculture, reservoir control, recreation, transportation, fire weather, flood mitigation, navigation, protection of life/property - arranged by lead time from minutes through hours, days, weeks, months, and seasons to years; initial-condition sensitivity dominates at short ranges, boundary-condition sensitivity at long ranges]
15
ENSEMBLE FORECASTS
  • Definition:
  • Finite sample to estimate full probability distribution
  • Full solution (Liouville eqs.) computationally intractable
  • Interpretation (assignment of probabilities):
  • Crude (see the sketch below):
  • Step-wise increase in cumulative forecast probability distribution
  • Performance dependent on size of ensemble
  • Enhanced:
  • Inter-/extrapolation (dressing)
  • Performance improvement depends on quality of inter-/extrapolation
  • Based on assumptions:
  • Linear interpolation (each member equally likely)
  • Based on verification statistics:
  • Kernel or other methods (inclusion of some statistical bias correction)
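As a concrete illustration of the crude interpretation, here is a minimal sketch (Python with NumPy; the ensemble values are made up):

```python
import numpy as np

def event_probability(members, threshold):
    """Crude interpretation: each member is equally likely, so the forecast
    probability of an event is the fraction of members in which it occurs."""
    return float(np.mean(np.asarray(members) > threshold))

# Hypothetical 10-member 2-m temperature ensemble (K); event: T > freezing.
ens = [271.2, 272.8, 273.4, 273.9, 274.1, 274.6, 275.0, 275.3, 276.1, 277.5]
print(event_probability(ens, 273.15))  # -> 0.8
```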

16-17
(Image slides - no transcript)
18
[144-hr forecast example: poorly predictable large-scale wave over the Eastern Pacific / Western US; highly predictable small-scale wave over the Eastern US; verification shown]
19-21
(Image slides - no transcript)
22
FORECAST EVALUATION
  • Statistical approach:
  • Evaluates a set of forecasts, not a single forecast
  • Interest is in comparing forecast systems:
  • Forecasts generated by the same procedure
  • Sample size affects how fine a stratification is possible:
  • Level of detail is limited
  • Size of sample limited by available obs. record (even for hindcasts)
  • Statistical significance in comparative verification
  • Error in proxy for truth:
  • Observations or numerical analysis
  • Types:
  • Forecast statistics:
  • Depend only on forecast properties
  • Verification statistics:
  • Comparison of forecast and proxy for truth in a statistical sense
  • Depend on both natural and forecast systems
  • Nature represented by proxy:
  • Observations (including observational error)

23
FORECAST VERIFICATION
  • Types:
  • Measures of quality:
  • Environmental science issues
  • Main focus here
  • Measures of utility:
  • Multidisciplinary
  • Social / economic issues, beyond environmental sciences
  • Socio-economic value of forecasts is the ultimate measure
  • Approximate measures can be constructed
  • Quality vs. utility:
  • Improved quality:
  • Generally permits enhanced utility (assumption)
  • How to improve utility if quality is fixed?
  • Providers communicate all available information
  • E.g., offer probabilistic or other information on forecast uncertainty
  • Engage in education, training
  • Users identify forecast aspects important to them
  • Can providers selectively improve certain aspects of forecasts?

24
EVALUATING QUALITY OF FORECAST SYSTEMS
  • Goal:
  • Infer comparative information about forecast systems
  • Value added by:
  • New methods
  • Subsequent steps in end-to-end forecast process (e.g., manual changes)
  • Critical for monitoring and improving operational forecast systems
  • Attributes of forecast systems:
  • Traditionally, forecast attributes defined separately for each fcst format
  • General definition needed:
  • Need to compare forecasts
  • From any system
  • Of any type / format
  • Single, ensemble, categorical, probabilistic, etc.
  • Supports systematic evaluation of:
  • End-to-end (provider-user) forecast process
  • Statistical post-processing as integral part of system

25
FORECAST SYSTEM ATTRIBUTES
  • Abstract concept (like length):
  • Reliability and Resolution
  • Both can be measured through different statistics
  • Statistical property:
  • Interpreted for a large set of forecasts
  • Describe behavior of forecast system, not a single forecast
  • For their definition, assume that:
  • Forecasts:
  • Can be of any format
  • Single value, ensemble, categorical, probabilistic, etc.
  • Take a finite number of different classes Fa
  • Observations:
  • Can also be grouped into a finite number of classes, like Oa

26
STATISTICAL RELIABILITY (TEMPORAL AGGREGATE):
STATISTICAL CONSISTENCY OF FORECASTS WITH
OBSERVATIONS
  • BACKGROUND:
  • Consider a particular forecast class Fa
  • Consider the frequency distribution of observations that follow forecasts Fa - fdo_a
  • DEFINITION:
  • If forecast Fa has the exact same form as fdo_a, for all forecast classes,
  • the forecast system is statistically consistent with observations =>
  • The forecast system is perfectly reliable
  • MEASURES OF RELIABILITY:
  • Based on different ways of comparing Fa and fdo_a (see the sketch below)
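A minimal sketch of how fdo_a can be estimated in practice for probability forecasts of a binary event (Python with NumPy; the bin count and names are illustrative):

```python
import numpy as np

def observed_frequencies(fcst_prob, obs_event, bins=10):
    """Group forecasts into probability classes F_a and compute the frequency
    of the observed event following each class (an estimate of fdo_a).
    Perfect reliability: observed frequency matches the class probability."""
    fcst_prob = np.asarray(fcst_prob, float)
    obs_event = np.asarray(obs_event, float)
    edges = np.linspace(0.0, 1.0, bins + 1)
    idx = np.clip(np.digitize(fcst_prob, edges) - 1, 0, bins - 1)
    freq = np.full(bins, np.nan)   # NaN where a class was never forecast
    for a in range(bins):
        mask = idx == a
        if mask.any():
            freq[a] = obs_event[mask].mean()
    return edges, freq
```

Plotting freq against the class centers gives the reliability diagram used later in this presentation.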

27
STATISTICAL RESOLUTION (TEMPORAL EVOLUTION):
ABILITY TO DISTINGUISH, AHEAD OF TIME, AMONG
DIFFERENT OUTCOMES
  • BACKGROUND:
  • Assume observed events are classified into a finite number of classes, like Oa
  • DEFINITION:
  • If all observed classes (Oa, Ob, ...) are preceded by
  • Distinctly different forecasts (Fa, Fb, ...) =>
  • The forecast system resolves the problem =>
  • The forecast system has perfect resolution
  • MEASURES OF RESOLUTION:
  • Based on degree of separation of the fdo's that follow the various forecast classes
  • Measured by difference between the fdo's and the climate distribution
  • Measures differ by how differences between distributions are quantified

[Schematic: examples of forecast and observation class distributions]
28
CHARACTERISTICS OF RELIABILITY RESOLUTION
  • Reliability:
  • Related to form of forecast, not forecast content
  • Fidelity of forecast:
  • Reproduces nature - when resolution is perfect, the forecast looks like nature
  • Not related to time sequence of forecast/observed systems
  • How to improve?
  • Make model more realistic
  • Also expected to improve resolution
  • Statistical bias correction: can be statistically imposed at one time level
  • If both natural and forecast systems are stationary in time
  • If there is a large enough set of observed-forecast pairs
  • Link with verification:
  • Replace forecast with corresponding fdo
  • Resolution:
  • Related to inherent predictive value of forecast system
  • Not related to form of forecasts:
  • Statistical consistency at one time level (reliability) is irrelevant
  • How to improve?

29
CHARACTERISTICS OF FORECAST SYSTEM ATTRIBUTES
  • RELIABILITY AND RESOLUTION ARE:
  • General forecast attributes:
  • Valid for any forecast format (single, categorical, probabilistic, etc.)
  • Independent attributes:
  • For example:
  • A climate pdf forecast is perfectly reliable, yet has no resolution
  • A reversed rain / no-rain forecast can have perfect resolution and no reliability
  • To separate them, they must be measured according to the general definition:
  • If measured according to traditional, narrower definitions:
  • Reliability and resolution can be mixed
  • Function of forecast quality:
  • There is no other relevant forecast attribute
  • Perfect reliability and perfect resolution = perfect forecast system:
  • Deterministic forecast system that is always correct
  • Both needed for utility of forecast systems

30
FORMAT OF FORECASTS: PROBABILISTIC FORMAT
  • Do we have a choice?
  • When forecasts are imperfect:
  • Only the probabilistic format can be reliable/consistent with nature
  • Abstract concept:
  • Related to forecast system attributes
  • Space of probability: dimensionless pdf or similar format
  • For environmental variables (not those variables themselves)
  • Definition:
  • Define event:
  • Function of concrete variables, features, etc.
  • E.g., temperature above freezing; thunderstorm
  • Determine probability of event occurring in the future:
  • Based on knowledge of initial state and evolution of system

31
OPERATIONAL PROB/ENSEMBLE FORECAST VERIFICATION
  • Requirements:
  • Use the same general, dimensionless probabilistic measures for verifying:
  • Any event
  • Against either:
  • Observations or
  • Numerical analysis
  • Measures used at NCEP:
  • Probabilistic forecast measures (ensemble interpreted probabilistically):
  • Reliability:
  • Component of BSS, RPSS, CRPSS
  • Attributes and Talagrand diagrams
  • Resolution:
  • Component of BSS, RPSS, CRPSS
  • ROC, attributes diagram, potential economic value
  • Special ensemble verification procedures:
  • Designed to assess performance of a finite set of forecasts
  • Most likely member statistics, PECA

32
FORECAST PERFORMANCE MEASURES
COMMON CHARACTERISTIC: Function of both forecast
and observed values
MEASURES OF RELIABILITY - DESCRIPTION: Statistically
compares any sample of forecasts with the sample of
corresponding observations. GOAL: To assess the
similarity of the samples (e.g., whether 1st and 2nd
moments match). EXAMPLES: Reliability component of
Brier Score / Ranked Probability Score, Analysis Rank
Histogram, Spread vs. Ens. Mean error, etc.
MEASURES OF RESOLUTION - DESCRIPTION: Compares the
distributions of observations that follow different
classes of forecasts with the climate distribution
(as reference). GOAL: To assess how well the
observations are separated when grouped by different
classes of preceding fcsts. EXAMPLES: Resolution
component of Brier Score / Ranked Probability Score,
Information content, Relative Operating
Characteristics, Relative Economic Value, etc.
COMBINED (REL + RES) MEASURES: Brier and Cont. Ranked
Prob. Scores, rmse, PAC, ...
33
EXAMPLE: PROBABILISTIC FORECASTS
RELIABILITY: Forecast probabilities for a given
event match observed frequencies of that event
(given the prob. fcst). RESOLUTION: Many forecasts
fall into classes corresponding to high or low
observed frequency of the given event (occurrence
and non-occurrence of the event are well resolved
by the fcst system)
34
(Image slide - no transcript)
35
PROBABILISTIC FORECAST PERFORMANCE MEASURES
TO ASSESS THE TWO MAIN ATTRIBUTES OF PROBABILISTIC
FORECASTS, RELIABILITY AND RESOLUTION:
Univariate measures - statistics accumulated point
by point in space. Multivariate measures - spatial
covariance is considered.
EXAMPLE: BRIER SKILL SCORE (BSS) -
COMBINED MEASURE OF RELIABILITY AND RESOLUTION
36
BRIER SKILL SCORE (BSS)
COMBINED MEASURE OF RELIABILITY AND RESOLUTION
  • METHOD:
  • Compares pdf against analysis
  • Resolution (random error)
  • Reliability (systematic error)
  • EVALUATION:
  • BSS: higher is better
  • Resolution: higher is better
  • Reliability: lower is better
  • RESULTS:
  • Resolution dominates initially
  • Reliability becomes important later
  • ECMWF best throughout
  • Good analysis/model?
  • NCEP good at days 1-2
  • Good initial perturbations?
  • No model perturbations hurts later?
  • CANADIAN good at days 8-10

May-June-July 2002 average Brier skill score for
the EC-EPS (grey lines with full circles), the
MSC-EPS (black lines with open circles) and the
NCEP-EPS (black lines with crosses). Bottom:
resolution (dotted) and reliability (solid)
contributions to the Brier skill score. Values
refer to the 500 hPa geopotential height over the
northern hemisphere latitudinal band 20º-80ºN,
and have been computed considering 10
equally-climatologically-likely intervals (from
Buizza, Houtekamer, Toth et al., 2004)
37
BRIER SKILL SCORE
COMBINED MEASURE OF RELIABILITY AND RESOLUTION
38
RANKED PROBABILITY SCORE
COMBINED MEASURE OF RELIABILITY AND RESOLUTION
39
Continuous Ranked Probability Score

CRPS = integral of [P_fcst(x) - H(x - x_o)]² dx, where P_fcst is the forecast cumulative distribution and H is the Heaviside function stepping from 0 to 100% probability at the observation (truth) x_o. The CRP Skill Score is CRPSS = 1 - CRPS / CRPS_ref.

[Figure: step-wise forecast CDF built from the 10 ordered ensemble members (p01, p02, ..., p10), compared against the Heaviside step at the observed value Xo]
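A minimal sketch of the CRPS integral for a finite ensemble (Python with NumPy; this is exact for the step-wise CDF, since the integrand is piecewise constant between break points):

```python
import numpy as np

def crps_ensemble(members, obs):
    """CRPS = integral of [P_fcst(x) - H(x - obs)]^2 dx, with P_fcst the
    step-wise CDF of the sorted members (each member equally likely)."""
    x = np.sort(np.asarray(members, float))
    n = x.size
    pts = np.sort(np.append(x, obs))  # integrand changes only at these points
    total = 0.0
    for lo, hi in zip(pts[:-1], pts[1:]):
        mid = 0.5 * (lo + hi)
        cdf = np.searchsorted(x, mid, side="right") / n  # forecast CDF
        heaviside = 1.0 if mid >= obs else 0.0           # observed "CDF"
        total += (cdf - heaviside) ** 2 * (hi - lo)
    return total

print(crps_ensemble([0.0, 1.0], 0.5))  # -> 0.25
```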
40
ANALYSIS RANK HISTOGRAM (TALAGRAND DIAGRAM)
MEASURE OF RELIABILITY
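A minimal sketch of the rank histogram computation (Python with NumPy; ties between the observation and ensemble members are ignored for simplicity, and the synthetic data are illustrative):

```python
import numpy as np

def rank_histogram(ens, obs):
    """Talagrand diagram: count where the verifying value ranks within each
    N-member ensemble (N+1 possible bins). A flat histogram indicates
    statistical consistency; a U-shape indicates under-dispersion."""
    ens, obs = np.asarray(ens), np.asarray(obs)
    n_members = ens.shape[1]
    ranks = np.sum(ens < obs[:, None], axis=1)  # members below the obs
    return np.bincount(ranks, minlength=n_members + 1)

rng = np.random.default_rng(0)
ens = rng.normal(size=(5000, 10))   # synthetic, statistically consistent case
obs = rng.normal(size=5000)         # truth drawn from the same pdf
print(rank_histogram(ens, obs))     # roughly flat counts across 11 bins
```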
41
ENSEMBLE MEAN ERROR VS. ENSEMBLE SPREAD
MEASURE OF RELIABILITY
Statistical consistency between the ensemble and
the verifying analysis means that the verifying
analysis should be statistically
indistinguishable from the ensemble members
=> the ensemble mean error (distance between ens.
mean and analysis) should be equal to the ensemble
spread (distance between ensemble mean and
ensemble members)
In case of a statistically consistent ensemble,
ens. spread = ens. mean error, and they are both
a MEASURE OF RESOLUTION. In the presence of bias,
both rms error and PAC will be a combined measure
of reliability and resolution
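A minimal sketch of the spread vs. ensemble-mean-error comparison (Python with NumPy; the array shapes are assumptions):

```python
import numpy as np

def spread_vs_error(ens, analysis):
    """ens: (n_cases, n_members); analysis: (n_cases,).
    For a statistically consistent ensemble, the RMS ensemble-mean error
    should match the ensemble spread (up to a finite-ensemble-size factor)."""
    ens_mean = ens.mean(axis=1)
    rms_error = np.sqrt(np.mean((ens_mean - analysis) ** 2))
    spread = np.sqrt(np.mean(ens.var(axis=1, ddof=1)))
    return spread, rms_error
```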
42
INFORMATION CONTENT
MEASURE OF RESOLUTION
43
RELATIVE OPERATING CHARACTERISTICS
MEASURE OF RESOLUTION
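A minimal sketch of how ROC points are obtained from probabilistic forecasts of a binary event (Python with NumPy; the threshold set is illustrative):

```python
import numpy as np

def roc_points(fcst_prob, obs_event, thresholds=np.linspace(0, 1, 11)):
    """Hit rate vs. false alarm rate as the probability threshold for
    issuing a 'yes' forecast is varied; the area under the resulting
    curve summarizes resolution."""
    p = np.asarray(fcst_prob, float)
    o = np.asarray(obs_event, bool)
    points = []
    for t in thresholds:
        yes = p >= t
        hits = np.sum(yes & o)
        misses = np.sum(~yes & o)
        false_alarms = np.sum(yes & ~o)
        corr_negatives = np.sum(~yes & ~o)
        hit_rate = hits / max(hits + misses, 1)
        fa_rate = false_alarms / max(false_alarms + corr_negatives, 1)
        points.append((fa_rate, hit_rate))
    return points
```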
44
ECONOMIC VALUE OF FORECASTS
MEASURE OF RESOLUTION
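A minimal sketch of the potential-economic-value calculation for a user with cost/loss ratio a, following the standard static cost/loss model (an assumption here; the slide itself shows only the resulting value curves):

```python
def relative_value(hit_rate, false_alarm_rate, base_rate, cost_loss):
    """V = (E_clim - E_fcst) / (E_clim - E_perfect), expenses in units of
    the loss L. cost_loss = C/L; base_rate s = climatological frequency."""
    a, s = cost_loss, base_rate
    e_clim = min(a, s)            # best of always / never protecting
    e_perfect = a * s             # protect exactly when the event occurs
    e_fcst = (a * (hit_rate * s + false_alarm_rate * (1 - s))
              + (1 - hit_rate) * s)   # misses incur the full loss
    return (e_clim - e_fcst) / (e_clim - e_perfect)
```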
45
PERTURBATION VS. ERROR CORRELATION ANALYSIS (PECA)
MULTIVARIATE COMBINED MEASURE OF RELIABILITY AND
RESOLUTION
  • METHOD: Compute correlation between ens. perturbations and error in the control fcst for:
  • Individual members
  • Optimal combination of members
  • Each ensemble
  • Various areas, all lead times
  • EVALUATION: Large correlation indicates the ens. captures the error in the control forecast
  • Caveat: errors defined by analysis
  • RESULTS:
  • Canadian best on large scales
  • Benefit of model diversity?
  • ECMWF gains most from combinations
  • Benefit of orthogonalization?
  • NCEP best on small scales, short term
  • Benefit of breeding (best estimate of initial error)?
  • PECA increases with lead time:
  • Lyapunov convergence
  • Nonlinear saturation
  • Higher values on small scales
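A minimal sketch of the core PECA computation for a single case and domain (Python with NumPy; the array shapes and names are assumptions):

```python
import numpy as np

def peca(perturbations, control_error):
    """Correlate each ensemble perturbation (member minus control) with the
    control forecast error (analysis minus control) over a domain.
    perturbations: (n_members, n_gridpoints); control_error: (n_gridpoints,).
    Large correlations => the ensemble captures the control's error."""
    e = control_error - control_error.mean()
    corrs = []
    for p in perturbations:
        p = p - p.mean()
        corrs.append(np.dot(p, e) / np.sqrt(np.dot(p, p) * np.dot(e, e)))
    return np.array(corrs)
```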

46
WHAT DO WE NEED FOR POSTPROCESSING TO WORK?
  • LARGE SET OF FCST - OBS PAIRS:
  • Consistency is defined over a large sample - the same is needed for post-processing
  • The larger the sample, the more detailed the corrections that can be made
  • BOTH FCST AND REAL SYSTEMS MUST BE STATIONARY IN TIME:
  • Otherwise post-processing can make things worse
  • Subjective forecasts are difficult to calibrate

HOW DO WE MEASURE STATISTICAL INCONSISTENCY?
  • MEASURES OF STATISTICAL RELIABILITY:
  • Time mean error
  • Analysis rank histogram (Talagrand diagram)
  • Reliability component of Brier etc. scores
  • Reliability diagram

47
SOURCES OF STATISTICAL INCONSISTENCY
  • TOO FEW FORECAST MEMBERS:
  • Single forecast inconsistent by definition, unless perfect
  • MOS fcst hedged toward climatology as fcst skill is lost
  • Small ensemble: sampling error due to limited ensemble size (Houtekamer 1994?)
  • MODEL ERROR (BIAS):
  • Deficiencies due to various problems in NWP models
  • Effect is exacerbated with increasing lead time
  • SYSTEMATIC ERRORS (BIAS) IN ANALYSIS:
  • Induced by observations:
  • Effect dies out with increasing lead time
  • Model related:
  • Bias manifests itself even in initial conditions
  • ENSEMBLE FORMATION (IMPROPER SPREAD):
  • Inappropriate initial spread
  • Lack of representation of model-related uncertainty in ensemble
  • i.e., use of a simplified model that is not able to account for model-related uncertainty

48
HOW TO IMPROVE STATISTICAL CONSISTENCY?
  • MITIGATE SOURCES OF INCONSISTENCY:
  • TOO FEW MEMBERS:
  • Run a larger ensemble
  • MODEL ERRORS:
  • Make models more realistic
  • INSUFFICIENT ENSEMBLE SPREAD:
  • Enhance models so they can represent model-related forecast uncertainty
  • OTHERWISE =>
  • STATISTICALLY ADJUST FCST TO REDUCE INCONSISTENCY:
  • Not the preferred way of doing it:
  • What we learn can feed back into development to mitigate problems at their sources
  • Can have a LARGE impact on (inexperienced) users
  • Two separate issues:
  • Bias-correct against NWP analysis:
  • Reduce lead-time-dependent model behavior
  • Downscale NWP analysis:
  • Connect with observed variables that are unresolved by NWP models

49
(Image slide - no transcript)
50
OUTLINE / SUMMARY
  • SCIENCE OF FORECASTING
  • GOAL OF SCIENCE: Forecasting
  • VERIFICATION: Model development, user feedback
  • GENERATION OF PROBABILISTIC FORECASTS
  • SINGLE FORECASTS: Statistical rendition of pdf
  • ENSEMBLE FORECASTS: NWP-based, case-dependent pdf
  • ATTRIBUTES OF FORECAST SYSTEMS
  • RELIABILITY: Forecasts look like nature statistically
  • RESOLUTION: Forecasts indicate actual future developments
  • VERIFICATION OF PROBABILISTIC ENSEMBLE FORECASTS
  • UNIFIED PROBABILISTIC MEASURES: Dimensionless
  • ENSEMBLE MEASURES: Evaluate finite sample
  • STATISTICAL POSTPROCESSING OF FORECASTS
  • STATISTICAL RELIABILITY: Make it perfect
  • STATISTICAL RESOLUTION: Keep it unchanged

51
http://wwwt.emc.ncep.noaa.gov/gmb/ens/ens_info.html

Toth, Z., O. Talagrand, and Y. Zhu, 2005: The
Attributes of Forecast Systems: A Framework for
the Evaluation and Calibration of Weather
Forecasts. In: Predictability Seminars, 9-13
September 2002, Ed. T. Palmer, ECMWF, pp. 584-595.

Toth, Z., O. Talagrand, G. Candille, and Y. Zhu,
2003: Probability and ensemble forecasts. In:
Forecast Verification: A Practitioner's Guide in
Atmospheric Science. Eds. I. T. Jolliffe and D. B.
Stephenson. Wiley, pp. 137-164.
52
BACKGROUND
53
NOTES FOR NEXT YEAR
  • Define predictand:
  • Exhaustive set of events, e.g.:
  • Continuous temperature
  • Precipitation type (categorical)

54
SUMMARY
  • WHY DO WE NEED PROBABILISTIC FORECASTS?
  • Isn't the atmosphere deterministic? YES, but it's also CHAOTIC
  • FORECASTER'S PERSPECTIVE / USER'S PERSPECTIVE:
  • Ensemble techniques: probabilistic description
  • WHAT ARE THE MAIN ATTRIBUTES OF FORECAST SYSTEMS?
  • RELIABILITY: Statistical consistency with distribution of corresponding observations
  • RESOLUTION: Different events are preceded by different forecasts
  • WHAT ARE THE MAIN TYPES OF FORECAST METHODS?
  • EMPIRICAL: Good reliability, limited resolution (problems in new situations)
  • THEORETICAL: Potentially high resolution, prone to inconsistency
  • ENSEMBLE METHODS:
  • Only practical way of capturing fluctuations in forecast uncertainty due to:
  • Case-dependent dynamics acting on errors in:
  • Initial conditions
  • Forecast methods
  • HOW CAN PROBABILISTIC FORECAST PERFORMANCE BE MEASURED?

55
OUTLINE
  • STATISTICAL EVALUATION OF FORECAST SYSTEMS
  • ATTRIBUTES OF FORECAST SYSTEMS
  • FORECAST METHODS
  • EMPIRICALLY BASED
  • THEORETICALLY BASED
  • LIMITS OF PREDICTABILITY
  • LIMITING FACTORS
  • ASSESSING PREDICTABILITY
  • Ensemble forecasting
  • VERIFICATION MEASURES
  • MEASURING FORECAST SYSTEM ATTRIBUTES
  • STATISTICAL POST-PROCESSING OF FORECASTS
  • IMPROVING STATISTICAL RELIABILITY

56
CRPS Decomposition
  • Yuejian Zhu
  • Environmental Modeling Center
  • NOAA/NWS/NCEP
  • Acknowledgments:
  • Zoltan Toth, EMC

57
Continuous Ranked Probability Score

CRPS = integral of [P_fcst(x) - H(x - x_o)]² dx; the CRP Skill Score is CRPSS = 1 - CRPS / CRPS_ref.

[Figure: step-wise forecast CDF from the 10 ordered ensemble members (p01, p02, ..., p10) vs. the Heaviside step H at the observation (truth) Xo]
58
CRPS Decomposition

[Figure: general example - step-wise forecast CDF (P-probability, 0-100%) from the 10 ordered ensemble members (p01, p02, ..., p10) and the observation (truth) Xo]
59
CRPS Decomposition

Example of outlier (right):
[Figure: all 10 ensemble members fall below the observation (truth) Xo, which lies to the right of the entire forecast CDF]
60
CRPS Decomposition

Example of outlier (left):
[Figure: all 10 ensemble members fall above the observation (truth) Xo, which lies to the left of the entire forecast CDF]
61
CRPS Decomposition

[Decomposition equations shown as images in the original slide]
62
CRPS Decomposition
Time and space average:
[Figure: general example - CDF of the forecast vs. CDF of the observation frequency (P-probability, 0-100%) over the 10 ordered ensemble members (p01, p02, ..., p10)]
63
CRPS Decomposition
Reliability diagram:
[Figure: observed relative frequency (%) vs. forecast probability (%), both axes 0-100]
64
CRPS Decomposition
Reliability diagram:
[Figure: left and right outliers plot at the 100% unreliable extremes of the diagram - observed relative frequency (%) vs. forecast probability (%)]
65
CRPS Decomposition
Reliability diagram:
[Figure: points on the diagonal are 100% reliable - observed relative frequency (%) vs. forecast probability (%)]
66
CRPS Decomposition
Ranges of the decomposition terms:
CRPS: 0 to 1.0; RELI: 0 to 0.5; RESO: 0 to 1.0; UNCE: 0 to 1.0
67
Ranked Probability Score
The Ranked (ordered) Probability Score (RPS)
verifies multi-category probability forecasts,
measuring both reliability and resolution, based
on climatologically equally likely bins:

RPS = sum over m = 1..K of [ sum over k = 1..m of (P_k - O_k) ]²

where P_k and O_k are the forecast and observed
probabilities of category k, and K is the number
of categories.

Example with K = 10 climatologically equally likely
bins and a 10-member ensemble; the verifying
analysis falls in bin i5:

Bin:                i1  i2  i3  i4  i5  i6  i7  i8  i9  i10
OBS O_n:             0   0   0   0   1   0   0   0   0   0
FCST PROB P_n (%):   0   0  20  10   0  10  30  20   0  10
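A minimal sketch of the RPS for the example above (Python with NumPy):

```python
import numpy as np

def rps(fcst_prob, obs_category):
    """Ranked Probability Score over K ordered categories: sum of squared
    differences between the cumulative forecast and observed distributions."""
    p = np.asarray(fcst_prob, float)
    o = np.zeros_like(p)
    o[obs_category] = 1.0
    return float(np.sum((np.cumsum(p) - np.cumsum(o)) ** 2))

# The slide's example: 10 equally likely bins, analysis falling in bin i5.
p = np.array([0, 0, 20, 10, 0, 10, 30, 20, 0, 10]) / 100.0
print(rps(p, 4))  # -> 1.09
```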
68
RMSE and Spread
[Figures: mean and absolute errors, and CRPSS, for 10-meter wind (u-component) - a less biased variable, so there is less room to improve the skill by bias correction alone]
69
[Figures: 24-h improvement by NAEFS; RPSS vs. CRPSS; ROC score - Winter 2006-2007 NH 2-m temperature for the NCEP raw forecast (black), NCEP bias-corrected forecast (red), and NAEFS forecast (pink)]
70
Brier Score (and decomposition)
See <<Statistical Methods in the Atmospheric
Sciences>> by D. S. Wilks, Chapter 7: Forecast
Verification
1. BS (Brier Score)

BS = (1/n) * sum over k = 1..n of (y_k - o_k)²

where y is a forecast probability and o is an
observation (probability); index k runs over the n
forecast/event pairs. y and o are limited to the
range 0 to 1 in the probability sense.
BS = 0 is a perfect forecast; BS = 1 misses
everything.
2. BSS (Brier Skill Score)

BSS = (BS - BS_ref) / (BS_perf - BS_ref) = 1 - BS / BS_ref

decomposable into Resolution, Reliability, and
Uncertainty terms; ref is the reference, which is
usually climatology; BS_perf = 0 for a perfect
forecast; BSS ranges from 0 to 1.
71
Brier Score (and decomposition)
3. Algebraic Decomposition of the Brier Score
After some algebra, the Brier Score can be
expressed as three separate terms:

BS = (1/n) * sum over i = 1..I of N_i (y_i - obar_i)²    [Reliability]
   - (1/n) * sum over i = 1..I of N_i (obar_i - obar)²   [Resolution]
   + obar (1 - obar)                                     [Uncertainty]

where obar_i is the conditional probability of the
observation given the N_i forecasts in class i (with
forecast value y_i), and obar is the sample
climatology.
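A minimal sketch of this decomposition (Python with NumPy; forecasts are grouped into discrete probability classes by binning, which approximates the algebraic identity when forecast values are continuous):

```python
import numpy as np

def brier_decomposition(y, o, bins=10):
    """BS = Reliability - Resolution + Uncertainty (cf. Wilks, Ch. 7).
    y: forecast probabilities in [0, 1]; o: binary outcomes (0 or 1)."""
    y, o = np.asarray(y, float), np.asarray(o, float)
    n, obar = y.size, o.mean()
    edges = np.linspace(0.0, 1.0, bins + 1)
    idx = np.clip(np.digitize(y, edges) - 1, 0, bins - 1)
    rel = res = 0.0
    for i in range(bins):
        m = idx == i
        if m.any():
            n_i, y_i, obar_i = m.sum(), y[m].mean(), o[m].mean()
            rel += n_i * (y_i - obar_i) ** 2   # reliability term
            res += n_i * (obar_i - obar) ** 2  # resolution term
    unc = obar * (1.0 - obar)                  # uncertainty term
    return rel / n, res / n, unc
```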
72
Brier Score (and decomposition)
4. Example of BS calculation
Consider three equally likely bins: C_b < 22,
22 <= C_n <= 26, and C_a > 26.
The average Brier Score is 0.133 for this case:
BS = 0.133 (range from 0 to 1).
73
Brier Score (and decomposition)
5. Example of BS decomposition calculation
Rel = 0.0056, Res = 0.0889, Unc = 0.2222;
BS = Rel - Res + Unc = 0.1389
74
Prob. Evaluation (multi-categories)
  • 4. Reliability and possible calibration (remove bias)
  • For period precipitation evaluation

[Figure: reliability diagram - raw vs. calibrated forecast curves, skill line, and resolution line at the climatological probability (0.16); observed frequency (%) vs. forecast probability]