Title: Ensemble prediction research for ESRL and NOAA
1 Ensemble prediction research for ESRL and NOAA
NOAA Earth System Research Lab
- Tom Hamill
- ESRL/PSD
- tom.hamill_at_noaa.gov, x3060
2
- No forecast is complete without a description of its uncertainty.
- Effective incorporation of uncertainty information will require a fundamental and coordinated shift by all sectors of the enterprise.
- and NOAA expected to lead.
NRC, 2006
3 The potential of ensembles
ECMWF ensemble, T42h forecast of Lothar
A reforecast using the 2005 ECMWF system. The deterministic forecast totally misses the damaging storm over France; some ensemble members forecast it well.
From Tim Palmer's book chapter, 2006, in Predictability of Weather and Climate.
4 December 1999 Lothar storm (France into central Europe)
- 150 mph gust in Zurich
- 1 in every 12 schools in France damaged
- 100 billion estimated damage across Europe
- 140 lives lost
- Ensembles provided advance warning
maximum gusts (km/h)
c/o Meteo France
5 Outline
- How does the NOAA global ensemble compare internationally?
- Main challenges in ensemble forecasting
- Defining the initial conditions
- Model error
- A proposed ensemble research agenda for ESRL
Theme: ensemble prediction is a discipline, with numerical principles every bit as important for success as in the design of dynamical cores and parameterization schemes.
6 Global ensemble model configurations
ECMWF is ahead and staying ahead, with approximately double the NCEP resolution.
Source: WGNE
7 THORPEX Interactive Grand Global Ensemble (TIGGE)
The TIGGE database is useful for understanding the performance of global ensemble systems and the potential of multi-model approaches.
8 NCEP is in the middle of the pack: worse than ECMWF, Japan, and Britain; comparable with Canada.
(Each forecast is verified against its own analysis.)
From Matsueda and Tanaka, 2008, SOLA, DOI: 10.2151/sola.2008-020
9 Extratropical cyclone error characteristics
NCEP again in middle of pack
Performance of the individual TIGGE systems for
several characteristics of extra-tropical
cyclones (courtesy of Lizzie Froude). All results
are for the northern hemisphere extra-tropics and
cover the 6-month period from 1st February to
31st July 2008. The verification is the ECMWF
analysis. Upper left: mean vorticity bias of the
ensemble (excluding the control). A positive
(negative) bias in vorticity corresponds to an
over (under) prediction of cyclone intensity.
Upper right: mean propagation speed bias of the
ensemble (excluding the control). The negative
bias in propagation speed corresponds to the
forecast cyclones propagating too slowly. Lower
left: RMSE of the ensemble mean cyclone position.
Lower right: RMSE of the ensemble mean vorticity.
The ensemble mean error is calculated by taking
the mean of the ensemble member storm tracks and
then calculating the error compared to the
corresponding analysis storm track.
10 TIGGE ensemble spread and error
11 TIGGE ensemble spread and error
ECMWF's spread growth is more rapid than NCEP's. NCEP spread actually starts out very large over the North Pole and decreases.
12 TIGGE ensemble spread and error
ECMWF's error starts smaller than NCEP's and grows somewhat less rapidly.
13 Conclusion from this comparison: opportunity (and responsibility) for ESRL
- NWS ensembles are intermediate in quality relative to international peers (not much prestige in beating Brazil).
- ESRL can step up and lead NOAA's ensemble system R&D, via
- THORPEX / HMT
- HFIP
- DTC
- NEXGEN
- ESRL base?
- If we do step up, what's the most crucial R&D to do?
14 Outline
- How does the NOAA global ensemble compare internationally?
- Main challenges in ensemble forecasting
- Defining the initial conditions
- Model error
- Limited-area ensemble issues
- A proposed ensemble research agenda for ESRL
Theme: ensemble prediction is a discipline, with numerical principles every bit as important for success as in the design of dynamical cores and parameterization schemes.
15 Main challenges in ensemble prediction (1): defining the initial conditions
- There is an underlying theory for the correct method for generating ensembles (under linear, Gaussian assumptions, no model error). To maximize explained forecast variance, ensembles should project onto the leading analysis-error covariance singular vectors.
- With a limited-size ensemble (here, 1 perturbed member) we choose the vector at the initial time, consistent with initial analysis errors, that maps to the largest perturbation at the final time.
[Figure: analysis error covariance (t0) and forecast error covariance (t24h), each with directions 1 and 2 labeled]
Ref: Ehrendorfer and Tribbia 1997 JAS; Hamill et al. 2002 MWR
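The selection rule on this slide can be sketched numerically. This is a minimal illustration, not the method of the cited papers: it assumes a small linear propagator `M` and analysis-error covariance `A` (both random stand-ins here), a Euclidean norm at the final time, and hypothetical names throughout. Under those assumptions, the initial perturbations with unit size in the analysis-error norm that grow most are found from the SVD of `M @ L`, where `A = L @ L.T`.

```python
import numpy as np

def aec_singular_vectors(M, A):
    """Analysis-error covariance singular vectors for a linear propagator M.

    Assumes a Euclidean norm at final time and the A^{-1} norm at initial time.
    """
    L = np.linalg.cholesky(A)           # A = L L^T
    U, s, Vt = np.linalg.svd(M @ L)     # singular values sorted descending
    initial = L @ Vt.T                  # columns x0 satisfy x0^T A^{-1} x0 = 1
    final = U * s                       # columns are the evolved perturbations M x0
    return initial, final, s

# Stand-in 4-variable system (illustrative values only).
rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))         # hypothetical tangent-linear propagator
B = rng.standard_normal((n, n))
A = B @ B.T + n * np.eye(n)             # symmetric positive-definite covariance

initial, final, s = aec_singular_vectors(M, A)
```

With a single perturbed member, the first column of `initial` is the vector this rule would pick. The squared singular values equal the eigenvalues of the linearly propagated forecast-error covariance `M @ A @ M.T`, which is the sense in which these vectors maximize explained forecast variance.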
16 Properties of analysis errors
[Figure: analysis spectrum and analysis error spectrum; curves labeled 12h and 72h]
Analysis errors tend to have a red-white spectrum and are a larger fraction of the climatological variance at small scales than at large scales. Still, there is more total error in the large scales than in the small scales.
Ref: Hakim, MWR, March 2005, using analyses from multiple centres
17 Other properties of analysis errors
Analysis temporal variability (contour) and analysis error; error is larger in data voids and storm tracks. Analysis error EOFs (heavy lines) and EOF of temporal variability (light lines). Wind error EOFs max near tropopause, temperature EOFs near surface.
Ref: Hakim, MWR, March 2005
18 How do ECMWF's total-energy singular vectors compare?
- Perturbations are those that grow most rapidly, with initial and final size measured in total energy
- Not consistent with 3D-Var analysis-error statistics.
- Small in scale initially, little amplitude near surface, tropopause, as with previous analysis errors.
- But total energy singular vectors grow very quickly.
- (not shown) Leading SVs associated with storm tracks, not data voids.
- ECMWF now blends in some evolved singular vectors (akin to bred perturbations) to resemble analysis errors more.
[Figure panels: analysis-error covariance singular vectors; total-energy singular vectors; initial and 48-h evolved]
From Barkmeijer et al., QJRMS, 1999
19 The ensemble Kalman filter: a sound theoretical method for initializing ensemble forecasts?
(Parallel analysis and forecast data assimilation cycles. The ensemble of analyses provides an estimate of analysis uncertainty and may be useful for initializing ensemble forecasts.)
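As a rough illustration of how an ensemble of analyses arises, here is a minimal sketch of one update step of a perturbed-observation EnKF. That variant is chosen here for brevity and is an assumption; the slides do not say which EnKF formulation is used. The function name and the toy one-variable setup are hypothetical.

```python
import numpy as np

def enkf_update(X, y, H, R, rng):
    """One perturbed-observation EnKF update.

    X: (n_state, n_members) background ensemble; y: (n_obs,) observations;
    H: (n_obs, n_state) linear observation operator; R: (n_obs, n_obs) obs-error covariance.
    """
    n_members = X.shape[1]
    Xp = X - X.mean(axis=1, keepdims=True)            # state perturbations
    HX = H @ X
    HXp = HX - HX.mean(axis=1, keepdims=True)         # perturbations in observation space
    PHt = Xp @ HXp.T / (n_members - 1)                # sample estimate of P H^T
    HPHt = HXp @ HXp.T / (n_members - 1)              # sample estimate of H P H^T
    K = PHt @ np.linalg.inv(HPHt + R)                 # Kalman gain
    noise = rng.multivariate_normal(np.zeros(len(y)), R, size=n_members).T
    Y = y[:, None] + noise                            # each member gets its own perturbed obs
    return X + K @ (Y - HX)

# Toy example: one state variable, one direct observation (illustrative values).
rng = np.random.default_rng(1)
X_b = 2.0 * rng.standard_normal((1, 500))             # background ensemble
y = np.array([3.0])
H = np.array([[1.0]])
R = np.array([[1.0]])
X_a = enkf_update(X_b, y, H, R, rng)                  # analysis ensemble
```

The spread of `X_a` is the ensemble's own estimate of analysis uncertainty, and its members are the candidate initial conditions for an ensemble forecast.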
20 GDAS and EnKF analyses for Rita, 06 UTC 22 September 2005 (SLP, 4 hPa contours, 1000 hPa thicker)
GDAS has less intense initial vortex, which is a
synthetic vortex that has been manually
relocated to be consistent with observed data.
EnKF has more intense analyzed vortex.
Colors denote spread in the ensemble,
larger around eye, where small change.
21 NHC official track forecast (note official track far south of actual track, south of EnKF ensemble track forecast)
22
observed position
72-h track forecasts from EnKF ensemble.
Intense vortices in several members
23 Growth of EnKF's analysis-error covariance singular vectors as a function of localization radius
- Growth rate of leading singular vector in the span of the EnKF ensemble using a two-level primitive equation model, perfect-model assumptions.
Some of the parameters that are used to tune the EnKF to minimize error, such as the covariance localization radius, also have an effect on the growth rate of perturbations. More imbalance in the initial ensemble with small localization radius.
Underlying message: we have been concentrating on EnKFs for reducing analysis errors. May need to adjust their configuration for ensemble forecast purposes.
24 Outline
- How does the NOAA global ensemble compare internationally?
- Main challenges in ensemble forecasting
- Defining the initial conditions
- Model error
- Limited-area ensemble issues
- A proposed ensemble research agenda for ESRL
Theme: ensemble prediction is a discipline, with numerical principles every bit as important for success as in the design of dynamical cores and parameterization schemes.
25 Model error in context of ensemble prediction
- (1) Systematic drift
- (2) Treating inherently stochastic processes as deterministic.
[Figure: ensemble forecasts vs. reality]
Two grid points with different sub-gridscale details, equivalent gridscale thermodynamics. The convective parameterization output is deterministic, only a function of the resolved-scale thermodynamics. Reality may produce very different convective adjustments for the two grid points. Lack of stochasticity limits ensemble spread.
[Figure: each grid point feeds the same Convective Parameterization, giving X precipitation rate and Y convective heating]
26 Model error in context of ensemble prediction
- (3) Models dissipate too much kinetic energy numerically, with no backscatter: unrealistic power spectra, lack of ensemble spread growth?
Power spectra of T799 ECMWF model before and after CASBS
Cellular automaton stochastic backscatter algorithm (CASBS)
From Shutts et al., QJRMS, 2005
27 Main challenges (2): dealing with model error
- Possible remedies
- Improve the model(s) used.
- Multi-model ensembles.
- Stochastic treatments / stochastic backscatter
- Post-processing.
28 Improving the model: higher-resolution models, faster error growth, good for ensembles!
The reduced error-doubling time in recent
versions of the ECMWF model implies that
differences between ensemble members will grow
more quickly in high-resolution models. Given
lack of spread in ensembles, this is a good
thing. Higher-resolution, better models improve
the ensemble forecasts, too.
Simmons and Hollingsworth, QJ, 2002
29 Multi-model approaches
Simulated reflectivity from ensembles used in the US NWS Storm Prediction Center's 2007 Spring Experiment. Observed nicely bracketed by ensemble; simulations suggest rotating severe thunderstorms of the type that may spawn tornadoes. And (next slide)
yellow: radar observed > 40 dBZ
c/o Steve Weiss, Jack Kain, many others; see http://hwt.nssl.noaa.gov/Spring_2007/loops/wrfs/
30 Indeed, there were tornadoes in the region of the ensemble's forecast supercells.
31 Multi-model forecasts of pressure and surface temperature
P(MSLP > climatological mean)
P(2-m temp > climatological mean)
- Each center's forecast bias corrected using recent F/O
- Not much benefit from multi-model for sea-level pressure, where biases are relatively small.
- More substantial benefit for temperature, especially temperature extremes.
P(2-m temp > 90th percentile)
Ref: Johnson and Swinbank 2008, taken from upcoming Bougeault et al. BAMS article on TIGGE.
32 When will multi-model approaches help?
Spread (S) within a single model
Error (E)
Spread between models (M)
A graphical way of interpreting the
potential helpfulness of multi-model ensembles
33 When will multi-model approaches help?
[Figure: vectors S, M, E]
Here, the between-model spread is orthogonal to the spread from a single model, and the error is better explained with multi-models.
34 When will multi-model approaches not help?
[Figure: vectors M, S, E]
Here, differences between models are small and don't add appreciably to the overall spread growth or the explanatory power of the overall ensemble. Much of the error remains unexplained.
Example: MJO. Poor multi-model forecasts when none of the constituent models simulates it well.
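The S, M, E picture can be made concrete with a small projection calculation. This is only a sketch under assumed toy vectors in a 3-variable space: it measures what fraction of the squared error E lies in the subspace spanned by the available spread directions. The function name and all values are hypothetical.

```python
import numpy as np

def explained_fraction(error, directions):
    """Fraction of |error|^2 captured by projecting onto span(directions)."""
    D = np.column_stack(directions)
    coeffs, *_ = np.linalg.lstsq(D, error, rcond=None)   # least-squares projection
    projection = D @ coeffs
    return float(projection @ projection / (error @ error))

E = np.array([1.0, 1.0, 0.0])             # forecast error (toy values)
S = np.array([1.0, 0.0, 0.0])             # spread within a single model

# Helpful case: between-model spread M is orthogonal to S.
M_orthogonal = np.array([0.0, 1.0, 0.0])
single = explained_fraction(E, [S])
multi_helps = explained_fraction(E, [S, M_orthogonal])

# Unhelpful case: M is small and points along S, so it adds no new direction.
M_parallel = np.array([0.1, 0.0, 0.0])
multi_no_help = explained_fraction(E, [S, M_parallel])
```

In the first case the explained fraction rises from one half to all of the error; in the second it stays at one half, and the rest of the error remains unexplained however many such models are added.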
35 Implications for ESRL research from this
- Multi-model (MM) ensemble usefulness is constrained by accuracy of constituent models.
- FIM-GFS MM ensemble will show increasing benefit when FIM simulates some aspects of atmosphere more realistically than GFS.
- Don't let MM potential distract from task of model improvement.
- A few good models in MM better than many mediocre ones.
36 Stochastic approaches
- Stochastic convection
- Different parameterizations (US SREF, Canada)
- Different constants (trigger functions) in parameterizations (e.g., Grell/Devenyi)
- Multiplying diabatic tendencies by a random number (Buizza et al. in ECMWF; C. Penland experimentation in GFS)
- Stochastic backscatter (Shutts, Berner)
- Integration of stochastic differential equations
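The "multiply diabatic tendencies by a random number" idea in the list above can be sketched as follows. This is a toy illustration, not any center's operational scheme: the uniform distribution, the `half_width` value, and the function name are assumptions made here, and the random numbers are drawn independently for every point, whereas real schemes generally impose correlation in space and time.

```python
import numpy as np

def perturb_tendency(tendency, rng, half_width=0.5):
    """Scale each tendency by a random factor drawn from [1 - half_width, 1 + half_width)."""
    factor = rng.uniform(1.0 - half_width, 1.0 + half_width, size=tendency.shape)
    return factor * tendency

rng = np.random.default_rng(2)

# 20 members x 5 grid points with identical resolved-scale state, so a
# deterministic parameterization returns the same tendency for every member.
tendency = np.full((20, 5), 3.0)
perturbed = perturb_tendency(tendency, rng)

deterministic_spread = tendency.std(axis=0)   # zero at every grid point
stochastic_spread = perturbed.std(axis=0)     # nonzero at every grid point
```

The point of the sketch is the contrast in spread: members that agree on the resolved scales stay identical under the deterministic scheme and diverge under the stochastic one.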
37 Post-processing to correct model errors
"The truth is where the sculptor's chisel chipped away the lie." They Might Be Giants
38 Post-processing, general thoughts
- Stable training data sets (reforecasts from frozen model) help even the best forecast models.
- Impact of a large training data set > impact of type of calibration technique.
- Won't solve every forecast deficiency out there, but will make a lot of customers happier.
- EMC, MDL, ESRL all wish to provide leadership on post-processing. Working out roles and funding now.
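A minimal sketch of calibration against a stable training set follows. It assumes the simplest possible technique, a linear regression of observations on past forecasts, which is a stand-in chosen here and not the method used in the studies cited on later slides. The data are synthetic and all names are hypothetical.

```python
import numpy as np

def fit_linear_calibration(train_fcst, train_obs):
    """Least-squares fit of obs = intercept + slope * forecast."""
    slope, intercept = np.polyfit(train_fcst, train_obs, deg=1)
    return slope, intercept

def apply_calibration(fcst, slope, intercept):
    return intercept + slope * np.asarray(fcst)

def rmse(a, b):
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

# Synthetic "reforecast" record from a frozen model with a warm bias of 2 units.
rng = np.random.default_rng(3)
obs = rng.standard_normal(2000)
fcst = 2.0 + obs + 0.5 * rng.standard_normal(2000)

# Train on the first half, evaluate on the second half.
slope, intercept = fit_linear_calibration(fcst[:1000], obs[:1000])
calibrated = apply_calibration(fcst[1000:], slope, intercept)

raw_rmse = rmse(fcst[1000:], obs[1000:])
calibrated_rmse = rmse(calibrated, obs[1000:])
```

The fit only transfers to new forecasts if the model that produced the training record is the same one being corrected, which is the argument for reforecasts from a frozen model.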
39 GFS and ECMWF, raw and post-processed
In this metric, calibrated 4-5 day forecasts are now as skillful as uncalibrated 1-day forecasts.
Big impact of post-processing, even with state-of-the-art ECMWF EPS.
Note: 5th and 95th percentile confidence intervals are very small (0.02 or less) and are not plotted.
Hagedorn, Hamill, Whitaker, MWR, July 2008
40 Precipitation skill with multi-decadal weekly, 30-day, and full training data sets
Skill of raw ensemble not shown, but much lower. Substantial benefit of multi-decadal weekly training data set relative to last 30-day training data, especially at high thresholds.
Hamill et al., MWR, July 2008
41 Summertime convection in US Great Plains
- Week-long simulation of WRF model over US using 4-km grid spacing, explicit convection.
- Forecast and observed Hovmollers show eastward propagating streaks of precipitation. This eastward propagation is not forecast correctly in models with convective parameterizations (not shown; see Davis et al. 2003).
- Statistical post-processing won't help much in a situation where the forecast model can't correctly propagate the feature of interest.
- For this mode of convection, there appears to be little substitute for a high-resolution, explicitly resolved ensemble.
Ref: Trier et al., JAS, Oct 2006. See also Davis et al., MWR, 2003.
42 Outline
- How does the NOAA global ensemble compare internationally?
- Main challenges in ensemble forecasting
- Defining the initial conditions
- Model error
- Limited-area ensemble issues
- A proposed ensemble research agenda for ESRL
Theme: ensemble prediction is a discipline, with numerical principles every bit as important for success as in the design of dynamical cores and parameterization schemes.
43 Lateral boundary condition issues for limited-area models (and limited-area ensembles)
- With 1-way LBCs, small scales in domain cannot interact with scales larger than some limit defined by domain size.
- LBCs generally provided by coarser-resolution forecast models, and this sweeps in low-resolution information, sweeps out developing high-resolution information.
- Physical process parameterizations for model driving LBCs may be different than for interior. Can cause spurious gradients.
- LBC info may introduce erroneous information for other reasons, e.g., model numerics.
- LBC initialization can produce transient gravity-inertia modes.
Ref: Warner et al. review article, BAMS, November 1997
44 Lateral boundary conditions (now universally accepted that perturbed LBCs are necessary in LAEFs)
Example: SREF Z500 spread for a 19 May 98 case of 5-member, 32-km Eta model ensemble (only small impact on precipitation field). Ref: Du and Tracton, 1999, WMO report for WGNE.
[Figure labels: 0-h, 12-h, 24-h, 36-h; perturb both IC and LBC; perturb LBC only; perturb IC only]
45 Influence of domain size
T-126 global model driving lateral boundary conditions for nests with 80-km and 40-km grid spacing of limited-area model.
From Warner et al., Nov 1997 BAMS, and Treadon and Peterson (1993), Preprints, 13th Conf. on Weather Analysis and Forecasting
46 Influence of domain size, continued
[Figure panels: large nested domain (left); small nested domain (right)]
40-km nested domain in global model had thin, realistic jet streak using large domain (left) and smeared-out, unrealistic jet streak using small domain (right). High resolution of small interior domain not useful here because of sweeping in of low-resolution information.
Ref: ibid.
47 Outline
- How does the NOAA global ensemble compare internationally?
- Main challenges in ensemble forecasting
- Defining the initial conditions
- Model error
- Limited-area ensemble issues
- A proposed ensemble research agenda for ESRL
Theme: ensemble prediction is a discipline, with numerical principles every bit as important for success as in the design of dynamical cores and parameterization schemes.
48 Make FIM a world competitor in global ensemble prediction
- (1) EnKFs for high-quality initial conditions for ensembles (ongoing: Whitaker / Vukicevic / Hamill). Major issues:
- Satellite-data assimilation bias correction
- Quality control
- Portable, modular software (separate forward operator library)
- Adequately rapid growth of spread?
- Testing and intercomparisons
- Characteristics relative to GEFS
- Suitability for global hurricane forecasts
- Resolution / ensemble size tradeoffs
- Supplying ICs / LBCs for local-area models and EnKFs
49 Make FIM a world competitor in global ensemble prediction
- (2) Deal with model error in systematic way
- Basic research program in model error (new)
- Stochastic convection (Bao, THORPEX)
- Stochastic backscatter (collaboration with J. Berner, NCAR?)
- Stochastic model integration (C. Penland)
- Reforecast based on stable FIM model followed by calibration research
50 Limited-area, high-resolution ensemble research
- Pre-requisite: good global system (FIM) to provide LBCs.
- Components of program
- Data assimilation and ensemble initialization at the mesoscales.
- Ensemble size/resolution/configuration tradeoffs.
- What are strengths/weaknesses of hi-res ensembles (dynamical downscaling) and global ensembles with statistical downscaling?
- Lagged ensembles vs. conventional multi-initial condition synoptic ensembles.
- Application-specific ensemble research (HMT, HFIP, NEXGEN)
51 Other research areas
- Ensemble verification
- Ensemble visualization
- Probabilistic forecaster editing tools
- Uncertainty communications research
- Post-processing technique development
- Ensemble nowcasting: severe/aviation applications.
52 How does this get funded?
- THORPEX (1.2M, of which 400K comes directly to ESRL for FY09, committed to existing projects. Flexibility in FY10)
- THORPEX/HMT (1.456M starting FY10, but must be related to HMT interests).
- HFIP
- DTC?
- NEXGEN?
- ESRL base?
- Alternative process?
53
- Effective incorporation of uncertainty information will require a fundamental and coordinated shift by all sectors of the enterprise.
- and NOAA expected to lead.
- ESRL is capable of leading NOAA, but are we willing to make it a priority?
NRC, 2006
54 Thank you
- and thanks to Jeff Whitaker, Chris Snyder, Tomi
Vukicevic, Zoltan Toth, Craig Bishop, Eugenia
Kalnay, and many other local and international
colleagues who have stimulated my thinking on
ensembles.
55 Example of covariance localization
[Figure label: obs location]
Background-error correlations estimated from 25 members of a 200-member ensemble exhibit a large amount of structure that does not appear to have any physical meaning. Without correction, an observation at the dotted location would produce increments across the globe. The proposed solution is element-wise multiplication of the ensemble estimates (a) with a smooth correlation function (c) to produce (d), which now resembles the large-ensemble estimate (b). This has been dubbed covariance localization.
From Hamill, Chapter 6 of Predictability of Weather and Climate
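The element-wise multiplication described on this slide can be sketched in a few lines. This is an illustration under stated assumptions: a 1-D row of 40 grid points, a true covariance equal to the identity, and a Gaussian-shaped taper standing in for the "smooth correlation function" (the slide does not say which function is used). All names and values are hypothetical.

```python
import numpy as np

def localize(sample_cov, positions, length_scale):
    """Element-wise (Schur) product of a sample covariance with a distance-based taper."""
    distance = np.abs(positions[:, None] - positions[None, :])
    taper = np.exp(-0.5 * (distance / length_scale) ** 2)   # 1 on the diagonal, decays with distance
    return sample_cov * taper

rng = np.random.default_rng(4)
n_points, n_members = 40, 25
positions = np.arange(n_points, dtype=float)

# Members drawn from a distribution whose true covariance is the identity,
# so every off-diagonal entry of the sample covariance is pure sampling noise.
members = rng.standard_normal((n_points, n_members))
sample_cov = np.cov(members)                                # (40, 40) noisy estimate
localized_cov = localize(sample_cov, positions, length_scale=2.0)

truth = np.eye(n_points)
raw_error = np.linalg.norm(sample_cov - truth)
localized_error = np.linalg.norm(localized_cov - truth)
```

The variances on the diagonal are untouched, while the spurious long-distance covariances are damped toward zero, so a single observation no longer produces increments far from its location.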
56 Spreads and errors as function of localization length scale
(1) Small ensembles minimize error with a short length scale, large ensembles with a longer length scale. That is, less localization is needed for a large ensemble. (2) The longer the localization length scale, the less spread in the posterior ensemble.
Ref: Houtekamer and Mitchell, 2001, MWR, 129, pp. 123-137.
57 Spread growth in EnKF
- Designed (in current version) to produce quality analyses, not optimized for ensemble forecast performance. Slow growth of spread (due to covariance localization? due to transient model error?), but initial ensemble consistent with analysis error statistics.
Hunch: will need to configure EnKF to provide larger spread growth. I'm working on this currently.
58 Problems caused by using outer domain convective parameterization with explicit convection in nest
Simulation of nested domains, explicitly resolved convection on inner (3.3 km grid spacing), various parameterized convection on outer (10, 30 km). Rainfall on inner domain affected by choice of what is done on outer domain. E123 is explicit on each domain; KF12E3 is Kain-Fritsch on domains 1 and 2, explicit on domain 3.
Warner and Hsu, MWR, July 2000
59 Paul Nutter et al.'s experiments with nested ensemble forecasts
Experiments using modified barotropic channel model with smaller interior domains. 25-km grid spacing. Coarser resolution of driving model for LBCs simulated by filtering.
From Nutter et al., Oct. 2004 MWR
60 Nutter et al.'s experiments, cont'd.
Simulating the effects of initializing high-resolution interior domain with coarse-resolution analysis. Here, as a baseline, the model forecasts throughout the full channel are cycled for a while at high resolution. The variance spectrum in the M domain is calculated (global). The simulation is then repeated, but initial and LBC information provided to the M domain is filtered to remove scales below 150 km, simulating initialization with a coarse-resolution analysis and coarse-resolution information from LBCs. The variance spectrum in the M domain is recalculated (LAM). The ratio of the filtered to unfiltered variance in the M domain (LAM/global) is plotted. The small grid spacing on the interior is useless at first, inheriting global ensembles without small scales. Even after a long time, not much variance develops at the small scales due to the higher interior resolution. The extra resolution is largely wasted.
[Figure axis label: Variance ratio (LAM/global)]
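The filtering and variance-ratio diagnostic described for this experiment can be sketched on a 1-D periodic field. This is a toy stand-in, not the barotropic channel model: the 25-km spacing and 150-km cutoff are taken from the slides, while the field, the sharp spectral cutoff, and all names are assumptions made here.

```python
import numpy as np

def lowpass(field, dx_km, cutoff_km):
    """Remove all wavelengths shorter than cutoff_km from a periodic 1-D field."""
    spectrum = np.fft.rfft(field)
    freq = np.fft.rfftfreq(field.size, d=dx_km)      # cycles per km
    spectrum[freq > 1.0 / cutoff_km] = 0.0
    return np.fft.irfft(spectrum, n=field.size)

def variance_spectrum(field):
    """Squared Fourier amplitude at each wavenumber (mean removed)."""
    return np.abs(np.fft.rfft(field - field.mean())) ** 2

dx_km = 25.0
x_km = dx_km * np.arange(64)                         # 1600-km periodic domain

# "Global" high-resolution field: a 400-km wave plus a 100-km wave.
field = np.sin(2 * np.pi * x_km / 400.0) + 0.5 * np.sin(2 * np.pi * x_km / 100.0)
filtered = lowpass(field, dx_km, cutoff_km=150.0)    # what a coarse driving model would supply

# Wavenumber 4 is the 400-km wave, wavenumber 16 is the 100-km wave.
ratio = variance_spectrum(filtered)[[4, 16]] / variance_spectrum(field)[[4, 16]]
```

A ratio near 1 means the scale survives the coarse-resolution information; a ratio near 0 means the interior grid could represent that scale but is never given any variance there.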
61 Nutter et al.'s experiments, cont'd.
[Figure labels: Variance ratio (LAM/global); t0]
Here the initial condition does initially contain all scales of motion, but the M domain thereafter receives filtered lateral boundary conditions. Even with a quality initialization, the small scales are swept away with time by the lower-resolution information coming in from the LBCs.
62 Nutter et al.'s experiments, cont'd. Effects of using linear temporal interpolation with 3-hourly boundary conditions
Sc domain. Here interior and exterior resolutions are the same, but correct LBCs are used only every third hour and are otherwise interpolated, as is commonly practiced. Shading is vorticity error, contours are streamfunction error. Notice the pulsing of errors: reduced on boundaries at 0, 3, 6, 9, but larger at in-between times. However, errors generally grow as a result of temporal interpolation.
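The pulsing described here follows directly from linear interpolation in time, which a short sketch can show. The boundary signal below is a made-up sinusoid with a 4-hour period, chosen only so that it varies within the 3-hour update interval; it is not taken from the experiment, and the function name is hypothetical.

```python
import numpy as np

def interpolated_boundary(times_h, update_interval_h, boundary_fn):
    """Boundary values known exactly at update times, linearly interpolated in between."""
    update_times = np.arange(times_h.min(), times_h.max() + update_interval_h, update_interval_h)
    return np.interp(times_h, update_times, boundary_fn(update_times))

def true_boundary(t_h):
    # Hypothetical boundary signal with a 4-hour period.
    return np.sin(2 * np.pi * t_h / 4.0)

times_h = np.arange(0.0, 9.5, 0.5)                   # model time steps from 0 h to 9 h
lbc = interpolated_boundary(times_h, 3, true_boundary)
error = np.abs(lbc - true_boundary(times_h))

on_update = times_h % 3 == 0                         # hours 0, 3, 6, 9
```

The boundary error vanishes at hours 0, 3, 6, and 9 and is largest between them. This reproduces only the pulsing at the boundary itself; the overall error growth noted on the slide comes from the model carrying those errors into the interior.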