Title: PQPF using ensemble forecasts and reforecasts
1 PQPF using ensemble forecasts and reforecasts
NOAA Earth System Research Laboratory
- Tom Hamill
- NOAA Earth System Research Lab
- Boulder, CO
- tom.hamill_at_noaa.gov
2 Two grand successes of NWP: (1) improved,
high-resolution forecast models
We now have convection-permitting forecast
models that produce forecasts that look, for the
first time, somewhat like radar images of
precipitation.
3 Grand success (2): ensemble forecasts
Multiple simulations of the weather from slightly
different initial conditions, perhaps with different
forecast models.
The deterministic forecast totally misses a damaging
storm over France; some ensemble members forecast
it well.
Probabilities are commonly estimated from the
frequency of the event in the ensemble.
From Tim Palmer's book chapter, 2006.
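As a minimal sketch of the frequency-based estimate described above, the probability of an event can be taken as the fraction of ensemble members in which it occurs. The member values, the 10-mm threshold, and the function name `event_probability` are illustrative assumptions, not taken from the talk.

```python
import numpy as np


def event_probability(members, threshold):
    """Fraction of ensemble members at or above `threshold`, per grid point.

    members: array of shape (n_members, n_points).
    """
    members = np.asarray(members, dtype=float)
    return (members >= threshold).mean(axis=0)


# Hypothetical 5-member ensemble of 24-h precipitation (mm) at 3 grid points.
precip = np.array([
    [0.0, 12.0, 30.0],
    [2.0, 8.0, 41.0],
    [0.0, 15.0, 26.0],
    [1.0, 3.0, 55.0],
    [0.0, 11.0, 19.0],
])

prob_10mm = event_probability(precip, threshold=10.0)
print(prob_10mm)  # probabilities of >= 10 mm at each grid point
```

With only a handful of members the estimate is coarse (here it can only take values in steps of 0.2), and it inherits any bias or spread deficiency in the ensemble itself.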
4 Combining successes: high-resolution ensembles!
From NCEP/SPC's Spring Experiment, a test of
high-resolution (4-km grid spacing) ensembles.
Some phenomena, like supercells, simply won't
exist in lower-resolution models.
c/o NCEP/SPC Spring Experiment (Weiss, Kain, et
al., 2007)
5 Problem with current ensemble forecast systems
- Forecasts may be biased and/or deficient in
spread, so that probabilities are mis-estimated.
This is more of a problem for surface temperature
and precipitation than for Z500.
Heavy rain in an area where none of the ensemble
members predicted it.
http://www.spc.noaa.gov/exper/sref/
6 Sources of errors in ensemble systems
- (1) Initial condition errors: we would like the
ensemble to sample the distribution of plausible
analysis states given current and past observations.
- (2) Model errors: insufficient resolution,
faulty parameterizations, use of limited-area
models, etc.
7 Model error at mesoscale. Example: cloud
microphysical processes
Conversion processes, like snow to graupel
conversion by riming, are very difficult to
parameterize but very important in convective
clouds.
Especially for snow and graupel the particle
properties like particle density and fall speeds
are important parameters. The assumption of a
constant particle density is questionable.
Aggregation processes assume certain collision
and sticking efficiencies, which are not well
known.
Most schemes do not include hail processes like
wet growth, partial melting or shedding (or only
very simple parameterizations).
The so-called ice multiplication (or
Hallett-Mossop process) may be very important, but
is still not well understood.
from Axel Seifert presentation to NCAR ASP summer
colloquium
8 Model error at mesoscale. Summary of
microphysical issues in convection-resolving NWP
- Many fundamental problems in cloud microphysics
are still unsolved.
- The lack of in-situ observations makes any
progress very slow and difficult.
- Most of the current parameterizations have been
designed, operationally applied and tested for
stratiform precipitation only.
- Most of the empirical relations used in the
parameterizations are based on surface
observations or measurements in stratiform cloud
(or storm anvils, stratiform regions).
- Many basic parameterization assumptions, like
N0 = const., are at least questionable in
convective clouds.
- Many processes which are currently neglected,
or not well represented, may become important in
deep convection (shedding, collisional breakup,
...).
- One-moment schemes might be insufficient to
describe the variability of the size
distributions in convective clouds.
- Two-moment schemes haven't been used long
enough to draw any conclusions.
- Spectral methods are overwhelmingly complicated
and computationally expensive. Nevertheless, they
suffer from our lack of understanding of the
fundamental processes.
from Axel Seifert presentation to NCAR ASP summer
colloquium
9 Lateral boundary condition issues for
limited-area ensemble forecast systems
- With 1-way LBCs, small scales in the domain cannot
interact with scales larger than some limit
defined by the domain size.
- LBCs are generally provided by coarser-resolution
forecast models, and this sweeps in
low-resolution information and sweeps out developing
high-resolution information.
- Physical process parameterizations for the model
driving the LBCs may be different than for the
interior. This can cause spurious gradients.
- LBCs may introduce erroneous information for
other reasons, e.g., model numerics.
- LBC initialization can produce transient
gravity-inertia modes.
Ref: Warner et al. review article, BAMS, November
1997
10 Influence of domain size
T-126 global model driving lateral boundary
conditions for nests with 80-km and 40-km grid
spacing of limited-area model.
From Warner et al., Nov 1997 BAMS, and Treadon and
Petersen (1993), Preprints, 13th Conf. on Weather
Analysis and Forecasting.
11 Influence of domain size, continued
Panels: large nested domain (left), small nested
domain (right).
The 40-km nested domain had a thin, realistic jet
streak when the large domain was used (left) and a
smeared-out, unrealistic jet streak when the small
domain was used (right). The high resolution of the
interior domain is not useful here because
low-resolution information is swept in through the
lateral boundaries.
Ref: ibid.
12 Model error at mesoscale: (1) errors from
too-coarse grid spacing
- George Bryan (NCAR) tested convection in simple
models with grid spacings from 8 km to 125 m.
Ref: http://www.mmm.ucar.edu/people/bryan/Presentations/bryan_2007_nssl_resolution.pdf
13 Grid spacing: 4 km, 1 km, 0.25 km
- Across-the-squall-line vertical cross section for
25 m s-1 wind shear. Shading: mixing ratio
(g kg-1); contours: vertical velocity (every
4 m s-1).
- Dramatic changes in the structure of the squall
line, the updraft, and the positioning of the cold
pool.
Ref: ibid.
14 Grid spacing: 4 km, 1 km, 0.25 km
- Along-the-squall-line vertical cross section for
20 m s-1 wind shear. Shading: mixing ratio
(g kg-1); contours: vertical velocity (every
4 m s-1).
- Updrafts increase in number and intensity, and
decrease in size, with increasing resolution.
Ref: ibid.
15 Grid spacing: 4 km, 1 km, 0.25 km
- System propagation approximately converged at 1
km for high-shear cases.
- For a low-shear environment (more weakly forced),
grid spacings coarser than 1 km are increasingly
inadequate.
Ref: ibid.
16 What are the implications for hydrological PQPF?
17 Example: 1-2 day lead hydrologic forecast for a
basin in northern Italy
Hydrologic model forced with multi-model weather
ensemble data.
Skill of the hydrologic forecast is tied to the
skill of the precipitation/temperature forecasts.
Here, all forecasts missed the timing of the
rainfall event, so the subsequent hydrologic
forecasts missed the event. Reservoir regulation and
the hydrologic model may also have had effects.
Source: A meteo-hydrological prediction system
based on a multi-model approach for ensemble
precipitation forecasting. Tommaso Diomede et
al., ARPA-SIM, Bologna, Italy.
18 In an ideal world: what I think hydrologists want
- An ensemble of data to feed into ensemble
streamflow models, rather than just probability
forecasts. Hydrologists will then run ensembles
of streamflow models.
- Ensembles must be reliable (when the frequency in
this ensemble says P = 90%, it happens 90% of the
time), even (especially!) for extreme events.
- Sharpness (more 0% and 100%, less of the
climatological probability, if still reliable).
- Geographic specificity, to the extent it's
predictable (e.g., more snow in west Boulder than
east Boulder).
- Correct spatial and temporal correlations.
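The reliability requirement above can be checked with a simple table: group past forecasts by their stated probability and compare with how often the event actually occurred in each group. This is a minimal sketch; the bin count, the synthetic forecast/outcome pairs, and the function name `reliability_table` are assumptions for illustration only.

```python
import numpy as np


def reliability_table(prob, occurred, n_bins=5):
    """Return (mean forecast prob, observed frequency, count) per non-empty bin."""
    prob = np.asarray(prob, dtype=float)
    occurred = np.asarray(occurred, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin index for each forecast; a probability of exactly 1.0 goes in the last bin.
    idx = np.clip(np.digitize(prob, edges) - 1, 0, n_bins - 1)
    rows = []
    for b in range(n_bins):
        in_bin = idx == b
        if in_bin.any():
            rows.append((prob[in_bin].mean(), occurred[in_bin].mean(), int(in_bin.sum())))
    return rows


# Synthetic example: ten 90% forecasts (event occurred 9 times) and
# ten 10% forecasts (event occurred once).
prob = np.array([0.9] * 10 + [0.1] * 10)
occurred = np.array([1] * 9 + [0] + [1] + [0] * 9)

rows = reliability_table(prob, occurred)
for mean_prob, obs_freq, count in rows:
    print(f"forecast {mean_prob:.2f}  observed {obs_freq:.2f}  n={count}")
```

A reliable system has observed frequency close to forecast probability in every bin; sharpness shows up as most of the counts sitting in the bins near 0 and 1.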
19 Possible paths forward
- (1) Use CPU resources to rapidly develop
higher-resolution ensembles with improved
physical veracity. Improve methods of generating
initial conditions, and develop ways of dealing with
uncertainty of the forecast model itself. (What
we've been doing.)
- (2) Use those CPU cycles to run a fixed model and
data assimilation system, albeit an older,
low-resolution one. Run it in real time, plus many
past forecast cases. Diagnose the forecast error
characteristics and generate statistically
adjusted forecasts (reforecasting).
- (3) Compromise between the two.
20 Approach 1 (building and continually improving
a highest-resolution ensemble)
- ADVANTAGES
- (1) CPU cycles dedicated to forecasts at highest
resolution, with best physics.
- (2) Small-scale features may actually be resolved
by the model, rather than inferred from
larger-scale conditions and statistical voodoo.
- (3) As soon as an improved model is developed, it
can be implemented.
- (4) Some phenomena require high-resolution
forecasts, and no amount of statistical
post-processing can get around this (next slide).
- DISADVANTAGES
- (1) Raw probability forecasts are biased. And don't
expect bias → 0 with the next implementation.
- (2) Correction of model problems is difficult for
a human (or computer) to estimate without a long,
careful look.
- (3) Rapid changes → little experience with the
model before the next version.
- (4) Resolving a feature ≠ successfully predicting
a feature. You may be led into a sense of
overconfidence by a high-resolution model.
21 Summertime convection in US Great Plains
- Week-long simulation of the WRF model over the US
using 4-km grid spacing, explicit convection.
- Forecast and observed Hovmöller diagrams show
eastward-propagating streaks of precipitation. This
eastward propagation is not forecast correctly in
models with convective parameterizations (not
shown; see Davis et al. 2003).
- Statistical downscaling won't help much in a
situation where the forecast model can't
correctly propagate the feature of interest.
- For this mode of convection, there appears to be
little substitute for a high-resolution,
explicitly resolved ensemble.
Ref: Trier et al., JAS, Oct 2006. See also Davis
et al., MWR, 2003.
22 Approach 2: Reforecasting (correcting our
mistakes)
- ADVANTAGES
- (1) Preliminary results show that the equivalent
of > 10 yrs. of NWP model development can be
obtained through judicious forecast calibration
with a large set of reforecasts.
- (2) Can nearly eliminate bias and spread
deficiencies, and downscale.
- (3) End users like a stable, known product, and
the forecast characteristics of reforecast-based
products won't change often.
- DISADVANTAGES
- (1) Major improvements may not be able to be
implemented quickly. With a new model, one must
take the time to run reforecasts (expensive).
- (2) Processes that form precipitation, like
thunderstorms, can't be resolved, and must be
parameterized.
- (3) You learn much about the error
characteristics of an old model, not a new one.
23 Rest of today's talk
- Won't talk about:
  - Approach 1, developing and improving hi-res.
  models. You're probably well-educated already.
  - Climate forecasting and reforecasting. Marginal
  skill, not low-hanging fruit. Anyway, Kevin
  Werner will talk about this.
- Will talk about:
  - Reforecasting for shorter-range forecasts, 1 day
  to several weeks. Here is where there is a large
  gain from statistical post-processing.
  - How reforecasting may fit into NWS plans.
24 Do we really need reforecasts extending over
years or decades?
Consider training with a short sample in a
climatologically dry region. How could you
calibrate this latest forecast?
You'd like enough training data to have
some similar events at a similar time of year to
this one.
25 Why not boost sample size by compositing
statistics over different locations?
Probably a good idea, if done with care.
However, even nearby grid points may have
different forecast errors.
Panels (a) and (b) provide the cumulative
distribution function (CDF) of 1-day forecasts of
precipitation for 1 January (CDFs determined from
reforecast data and observations in Dec-Jan).
Panel (a) is for a location on the CA coast, just
north of San Francisco, and panel (b) is for
Sacramento, CA. Panel (c) provides the implied
function for a bias correction from the forecast
amount to a presumed observed amount. Note the
very different corrections implied at two nearby
locations.
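One common way to build a correction function like the one in panel (c) is CDF matching (quantile mapping): a forecast amount is replaced by the observed amount that has the same cumulative probability. The sketch below assumes this is the intended construction; the samples are synthetic stand-ins for two nearby locations, not the reforecast data, and `quantile_map` is a hypothetical helper name.

```python
import numpy as np


def quantile_map(value, fcst_sample, obs_sample, n_quantiles=101):
    """Map a forecast amount to the observed amount at the same CDF level."""
    q = np.linspace(0.0, 1.0, n_quantiles)
    fcst_q = np.quantile(fcst_sample, q)  # forecast climatology quantiles
    obs_q = np.quantile(obs_sample, q)    # observed climatology quantiles
    # Piecewise-linear lookup from forecast quantiles to observed quantiles.
    return float(np.interp(value, fcst_q, obs_q))


# Synthetic training samples (mm): the same forecast climatology at both
# locations, but observations twice as wet at one and half as wet at the other.
fcst = np.arange(10.0)
obs_coast = 2.0 * fcst
obs_inland = 0.5 * fcst

corrected_coast = quantile_map(4.0, fcst, obs_coast)
corrected_inland = quantile_map(4.0, fcst, obs_inland)
print(corrected_coast, corrected_inland)
```

In real precipitation samples many values are zero, so the forecast quantiles contain ties and the mapping near zero needs extra care. The point of the example is only that the same forecast amount can map to very different corrected amounts at two locations, which is why compositing across locations has to be done carefully.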
26 NOAA's reforecast data set
- Model: T62L28 NCEP GFS, circa 1998.
- Initial conditions: NCEP-NCAR Reanalysis II plus
7 +/- bred modes.
- Duration: 15-day runs every day at 00Z from
19781101 to now
(http://www.cdc.noaa.gov/people/jeffrey.s.whitaker/refcst/week2).
- Data: selected fields (winds, hgt, temp on 5
pressure levels, precip, t2m, u10m, v10m, pwat,
prmsl, rh700, heating). NCEP/NCAR reanalysis
verifying fields included (Web form to download
at http://www.cdc.noaa.gov/reforecast).
- Real-time probabilistic precipitation forecasts:
http://www.cdc.noaa.gov/reforecast/narr
27 Theory underlying analog calibration technique
the probability distribution of the true state
given today's forecast, where
Here, before simplification, xT refers to the
true state vector (presumably high-resolution),
and xf refers to the (lower-resolution)
ensemble-forecast state vector.
28 A simplified construct for calibration using
reforecasts
- (1) Most of the information in our GFS reforecast
ensemble is contained in the ensemble mean, so
calibrate using the ensemble mean.
- (2) Let's find the distribution of the observed
conditional upon the part of the forecast state
that's nearby, i.e., don't worry about
Washington, DC when making a forecast for
Washington State.
29 Producing a distribution of observed given
forecast using analogs
On the left are old forecasts similar to today's
ensemble-mean forecast. For making
probabilistic forecasts, form an ensemble from
the accompanying analyzed weather on
the right-hand side.
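The analog step described above can be sketched as follows: find the past forecasts closest to today's ensemble-mean forecast over a local region, then treat the analyses that verified on those dates as an ensemble. This is a minimal sketch, not the operational procedure; the use of root-mean-square difference as the similarity measure, the toy arrays, and the function name `analog_ensemble` are assumptions for illustration.

```python
import numpy as np


def analog_ensemble(todays_fcst, past_fcsts, past_obs, n_analogs):
    """Observed values on the dates whose forecasts best match today's.

    todays_fcst: (n_points,) ensemble-mean forecast in a local region.
    past_fcsts:  (n_dates, n_points) reforecast ensemble means, same region.
    past_obs:    (n_dates,) analyzed value at the point of interest.
    """
    todays_fcst = np.asarray(todays_fcst, dtype=float)
    past_fcsts = np.asarray(past_fcsts, dtype=float)
    past_obs = np.asarray(past_obs, dtype=float)
    # Root-mean-square difference between each past forecast and today's.
    rms_diff = np.sqrt(((past_fcsts - todays_fcst) ** 2).mean(axis=1))
    closest = np.argsort(rms_diff)[:n_analogs]
    return past_obs[closest]


# Toy reforecast archive: 5 dates, 2 grid points (precipitation, mm).
past_fcsts = np.array([[1.0, 1.0], [5.0, 5.0], [10.0, 10.0], [5.5, 4.5], [0.0, 0.0]])
past_obs = np.array([2.0, 8.0, 20.0, 6.0, 0.0])

members = analog_ensemble([5.0, 5.0], past_fcsts, past_obs, n_analogs=2)
prob_over_7mm = (members >= 7.0).mean()
print(members, prob_over_7mm)
```

Because the ensemble is built from observed (analyzed) values, it carries the bias correction and the finer-scale detail of the analyses automatically, provided the archive is long enough to contain close analogs.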
31 Verified over 25 years of forecasts; skill
scores use the conventional method of calculation,
which may overestimate skill (Hamill and Juras
2006, QJRMS, Oct).
32 Comparison against NCEP medium-range T126
ensemble, ca. 2002
The improvement is a little bit of increased
reliability, a lot of increased resolution.
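Reliability and resolution can be quantified separately with the standard decomposition of the Brier score, BS = reliability - resolution + uncertainty. The sketch below assumes that this is the sense in which the two terms are used on the slide; the forecast/outcome pairs are synthetic and the function name `brier_decomposition` is hypothetical.

```python
import numpy as np


def brier_decomposition(prob, occurred):
    """Return (brier, reliability, resolution, uncertainty).

    Forecasts are grouped by their distinct probability values, for which
    brier == reliability - resolution + uncertainty holds exactly.
    """
    prob = np.asarray(prob, dtype=float)
    occurred = np.asarray(occurred, dtype=float)
    base_rate = occurred.mean()
    reliability = 0.0
    resolution = 0.0
    for p in np.unique(prob):
        sel = prob == p
        weight = sel.mean()             # fraction of cases with this forecast
        obs_freq = occurred[sel].mean() # how often the event then occurred
        reliability += weight * (p - obs_freq) ** 2       # smaller is better
        resolution += weight * (obs_freq - base_rate) ** 2  # larger is better
    uncertainty = base_rate * (1.0 - base_rate)
    brier = np.mean((prob - occurred) ** 2)
    return brier, reliability, resolution, uncertainty


# Synthetic sample: ten 90% forecasts (8 events), ten 10% forecasts (1 event).
prob = np.array([0.9] * 10 + [0.1] * 10)
occurred = np.array([1] * 8 + [0] * 2 + [1] + [0] * 9)

brier, rel, res, unc = brier_decomposition(prob, occurred)
print(brier, rel, res, unc)
```

In these terms, "a little bit of increased reliability" means a slightly smaller reliability term, and "a lot of increased resolution" means a much larger resolution term.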
33 Analog example: Day 4-6 heavy precipitation in
California, 0000 UTC 29 December 1996 - 0000
UTC 1 January 1997
This sort of spatial detail added by reforecast
calibration can be expected in regions of complex
terrain, where precipitation climatology varies a
lot.
34 (Will reforecasts still add value when using a
much improved model? Yes.)
Here we also have T255 ECMWF reforecasts.
35 Effect of training sample size
ECMWF reforecasts were available once per
week over a 20-year period. We compared the skill of
forecasts calibrated with these reforecasts to the
skill of forecasts calibrated using only the last 30
days of forecasts.
36 Real-time products from GFS
37 What's next for reforecasting?
- Growing interest from NWP centers worldwide.
- ECMWF is operational with once-weekly, 5-member
ensemble reforecasts covering the past 18 years.
- Canadians are hoping to do 5-year ensemble
reforecasts.
- NCEP is envisioning a 1-member, real-time
reforecast for bias correction.
- NOAA/ESRL is hoping to develop a more complete
2nd-generation reforecast data set that would be
used to determine a long-term strategy for how
reforecasts would be implemented into operations.
38 Research questions
- Given the computational expense of reforecasts, how
do we best limit the number of reforecasts that we
need to do (fewer ensemble members, not every day,
etc.)?
- Can we do things like composite the data across
different locations to boost sample size?
- Do we need a new reanalysis every time we do a
new reforecast?
- Do the benefits of reforecasts propagate down to
users like hydrological forecasters?
- We welcome your thoughts and requirements for a
next-generation reforecast system.
39 References
Hamill, T. M., J. S. Whitaker, and X. Wei, 2004:
Ensemble re-forecasting: improving medium-range
forecast skill using retrospective forecasts.
Mon. Wea. Rev., 132, 1434-1447.
http://www.cdc.noaa.gov/people/tom.hamill/reforecast_mwr.pdf
Hamill, T. M., J. S. Whitaker, and S. L. Mullen,
2006: Reforecasts, an important dataset for
improving weather predictions. Bull. Amer. Meteor.
Soc., 87, 33-46.
http://www.cdc.noaa.gov/people/tom.hamill/refcst_bams.pdf
Whitaker, J. S., F. Vitart, and X. Wei, 2006:
Improving week two forecasts with multi-model
re-forecast ensembles. Mon. Wea. Rev., 134,
2279-2284.
http://www.cdc.noaa.gov/people/jeffrey.s.whitaker/Manuscripts/multimodel.pdf
Hamill, T. M., and J. S. Whitaker, 2006:
Probabilistic quantitative precipitation forecasts
based on reforecast analogs: theory and
application. Mon. Wea. Rev., in press.
http://www.cdc.noaa.gov/people/tom.hamill/reforecast_analog_v2.pdf
Hamill, T. M., and J. Juras, 2006: Measuring
forecast skill: is it real skill or is it the
varying climatology? Quart. J. Royal Meteor. Soc.,
132, 2905-2923.
http://www.cdc.noaa.gov/people/tom.hamill/skill_overforecast_QJ_v2.pdf
Wilks, D. S., and T. M. Hamill, 2007: Comparison of
ensemble-MOS methods using GFS reforecasts. Mon.
Wea. Rev., 135, 2379-2390.
http://www.cdc.noaa.gov/people/tom.hamill/WilksHamill_emos.pdf
Hamill, T. M., and J. S. Whitaker, 2006: White
Paper. Producing high-skill probabilistic forecasts
using reforecasts: implementing the National
Research Council vision. Available at
http://www.cdc.noaa.gov/people/tom.hamill/whitepaper_reforecast.pdf
Hagedorn, R., T. M. Hamill, and J. S. Whitaker,
2008: Probabilistic forecast calibration using
ECMWF and GFS reforecasts. Part I: Two-meter
temperatures. Mon. Wea. Rev., 136, 2608-2619.
http://www.cdc.noaa.gov/people/tom.hamill/ecmwf_refcst_temp.pdf
Hamill, T. M., R. Hagedorn, and J. S. Whitaker,
2008: Probabilistic forecast calibration using
ECMWF and GFS reforecasts. Part II: Precipitation.
Mon. Wea. Rev., 136, 2620-2632.
http://www.cdc.noaa.gov/people/tom.hamill/ecmwf_refcst_ppn.pdf
40 Framing the calibration problem
- Suppose the climate were stationary (unchanging
from decade to decade).
- Suppose that we had quality weather observations
going back many millennia.
- Suppose we had an ensemble of reforecasts
available back over those many millennia.
- Then how might we utilize the reforecasts to
improve today's forecast?
41 Estimating the conditional distribution with
analogs
Suppose we have old forecasts that are
identical to today's.
Then
estimated from
42 Asymptotic behavior of analog technique
- Q: What happens as corr(F,O) → 0? A: The ensemble
of observed analogs becomes a random draw from
climatology.
- Q: What happens as corr(F,O) → 1? A: The ensemble
of observed analogs looks just like today's
forecast. Sharp, skillful forecasts.
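The two limits above can be illustrated with a small simulation. This is a sketch under strong simplifying assumptions that are not from the talk: forecast F and observation O are scalar, standardized, and jointly Gaussian with a prescribed correlation, and analogs are the past forecasts nearest to today's value. The function name `analog_obs_stats` and all parameter values are illustrative.

```python
import numpy as np


def analog_obs_stats(corr, todays_fcst=1.5, n_past=20000, n_analogs=50, seed=0):
    """Mean and standard deviation of the observed analogs for a given corr(F, O)."""
    rng = np.random.default_rng(seed)
    fcst = rng.standard_normal(n_past)
    noise = rng.standard_normal(n_past)
    # Observations with unit variance and correlation `corr` with the forecasts.
    obs = corr * fcst + np.sqrt(1.0 - corr**2) * noise
    # Analogs: the past dates whose forecasts are closest to today's forecast.
    closest = np.argsort(np.abs(fcst - todays_fcst))[:n_analogs]
    return obs[closest].mean(), obs[closest].std()


mean_0, spread_0 = analog_obs_stats(corr=0.0)  # forecasts carry no information
mean_1, spread_1 = analog_obs_stats(corr=1.0)  # forecasts are perfect

print(f"corr=0: mean {mean_0:.2f}, spread {spread_0:.2f}")
print(f"corr=1: mean {mean_1:.2f}, spread {spread_1:.2f}")
```

With zero correlation the observed analogs scatter like the climatology (mean near 0, spread near 1 in these standardized units); with perfect correlation they cluster tightly around today's forecast value.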
43 Nov 06 OR-WA floods, 3-6 day forecast
44 Bias, spread, and downscaling corrections in
analog technique
raw ens
refcst analogs
Can't find any other reforecast analogs
with precip as heavy. But large scatter is
introduced by taking the associated observed analogs.
Again, few close reforecast analogs.
But the observed data recognize the overforecast bias.
Here there are close reforecast analogs.
The observed data introduce spread and increase the
amount.