Title: Verification of weather parameters
Slide 1: Verification of weather parameters
Anna Ghelli, ECMWF
Slide 2: Overview
- Deterministic forecast performance for different weather parameters
- Precipitation forecast scores and their confidence
- SYNOP on the GTS
- Precipitation analysis
- Ensemble Prediction System: its performance relative to precipitation
Slide 3: 2m temperature skill (RMSE) for different forecast ranges
Top panel: North America
Higher skill in winter. The positive trend of the timeseries for both winter and summer periods indicates continuous forecast improvements.
Bottom panel: Europe
Higher skill in winter, interrupted in 2005 by a period of strong low-level inversion over Central Europe that the model did not represent properly.
Slide 4: Skill for different forecast ranges (Europe)
Top panel: specific humidity
Higher skill (MAE) in winter. The last four winters have consistently maintained a higher level of performance.
Bottom panel: 10m wind speed
Skill (RMSE): changes in the forecasting model have not greatly improved its performance in forecasting wind speed.
Slide 5: Skill (RMSE) plotted against forecast day for different parameters
The skill has been averaged over a year.
Plot labels: 1998, 2001
Total cloud cover: black. 2m temperature: blue.
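A minimal sketch of how skill curves of this kind could be computed, assuming forecasts and matching observations are stored as NumPy arrays of shape (cases, forecast days). The array layout, the function names and the numbers are illustrative assumptions. The slide does not say how the yearly averaging was done; here the errors of all cases are pooled before the root is taken.

```python
import numpy as np

def rmse_by_forecast_day(forecast, observed):
    """RMSE per forecast day, pooled over all cases (e.g. one year of forecasts)."""
    error = np.asarray(forecast, dtype=float) - np.asarray(observed, dtype=float)
    # Average the squared error over cases (axis 0), then take the square root.
    return np.sqrt(np.mean(error ** 2, axis=0))

def mae_by_forecast_day(forecast, observed):
    """Mean absolute error per forecast day, pooled over all cases."""
    error = np.asarray(forecast, dtype=float) - np.asarray(observed, dtype=float)
    return np.mean(np.abs(error), axis=0)

# Hypothetical 2m temperatures: two cases (rows), two forecast days (columns).
obs = np.array([[10.0, 10.0], [12.0, 12.0]])
fc = np.array([[11.0, 10.0], [11.0, 16.0]])

rmse = rmse_by_forecast_day(fc, obs)
mae = mae_by_forecast_day(fc, obs)
print(rmse, mae)
```

RMSE penalises large errors more heavily than MAE, so the two measures can rank forecast days or model versions differently.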
Slide 6
1. FREQUENCY BIAS INDEX
2. TRUE SKILL SCORE
3. HIT RATE
4. FALSE ALARM RATE
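The four scores listed on this slide can be sketched from a 2x2 contingency table of yes/no forecasts and observations. The function name and the counts are illustrative. The false alarm rate is assumed to follow the usual definition, false alarms divided by all observed non-events, which differs from the false alarm ratio (false alarms divided by all "yes" forecasts).

```python
def contingency_scores(hits, false_alarms, misses, correct_negatives):
    """FBI, hit rate, false alarm rate and TSS from 2x2 contingency counts."""
    forecast_yes = hits + false_alarms
    observed_yes = hits + misses
    observed_no = false_alarms + correct_negatives
    if observed_yes == 0 or observed_no == 0:
        raise ValueError("scores are undefined when an observed category is empty")
    hit_rate = hits / observed_yes
    false_alarm_rate = false_alarms / observed_no
    return {
        "FBI": forecast_yes / observed_yes,   # > 1 over-forecasting, < 1 under-forecasting
        "H": hit_rate,
        "F": false_alarm_rate,
        "TSS": hit_rate - false_alarm_rate,   # true skill score
    }

# Hypothetical counts for one threshold and one verification period.
scores = contingency_scores(hits=30, false_alarms=20, misses=10, correct_negatives=140)
print(scores)
```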
Slide 7: Europe
24-hour accumulated precipitation verified against SYNOP on GTS
Plot labels: t+42, t+66, 2 mm/24h, 25 mm/24h
The forecast is reduced to a yes/no event by selecting thresholds. Confidence intervals have been plotted for each TSS value.
High thresholds have large confidence intervals, which is important to remember when assessing the performance of the system.
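A sketch of the two steps described on this slide: reducing precipitation amounts to a yes/no event with a threshold, then attaching a confidence interval to the TSS. The slide does not say how the intervals were obtained; a percentile bootstrap over forecast/observation pairs is assumed here purely for illustration, and all names and the synthetic data are hypothetical.

```python
import numpy as np

def tss(forecast_yes, observed_yes):
    """True skill score (hit rate minus false alarm rate) from boolean arrays."""
    hits = np.sum(forecast_yes & observed_yes)
    misses = np.sum(~forecast_yes & observed_yes)
    false_alarms = np.sum(forecast_yes & ~observed_yes)
    correct_negatives = np.sum(~forecast_yes & ~observed_yes)
    if hits + misses == 0 or false_alarms + correct_negatives == 0:
        return np.nan  # undefined if the event never occurs or always occurs
    return hits / (hits + misses) - false_alarms / (false_alarms + correct_negatives)

def tss_with_bootstrap_ci(forecast_mm, observed_mm, threshold_mm,
                          n_resamples=1000, level=0.95, seed=0):
    """Return (TSS, lower, upper) using a percentile bootstrap (assumed method)."""
    forecast_yes = np.asarray(forecast_mm) >= threshold_mm  # yes/no forecast event
    observed_yes = np.asarray(observed_mm) >= threshold_mm  # yes/no observed event
    rng = np.random.default_rng(seed)
    n = forecast_yes.size
    samples = np.empty(n_resamples)
    for i in range(n_resamples):
        idx = rng.integers(0, n, size=n)  # resample pairs with replacement
        samples[i] = tss(forecast_yes[idx], observed_yes[idx])
    alpha = (1.0 - level) / 2.0
    lower, upper = np.nanquantile(samples, [alpha, 1.0 - alpha])
    return tss(forecast_yes, observed_yes), lower, upper

# Synthetic 24-hour totals in mm; these are not real SYNOP data.
rng = np.random.default_rng(42)
observed = rng.gamma(shape=0.5, scale=8.0, size=2000)
forecast = observed * rng.lognormal(mean=0.0, sigma=0.5, size=2000)

low = tss_with_bootstrap_ci(forecast, observed, threshold_mm=2.0)
high = tss_with_bootstrap_ci(forecast, observed, threshold_mm=25.0)
print(low, high)
```

Because far fewer cases exceed 25 mm/24h than 2 mm/24h, the resampled TSS values scatter much more at the high threshold, which is the effect the slide warns about.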
Slide 8: Europe
t+42: solid shading. t+66: dotted shading.
24-hour accumulated precipitation verified against SYNOP on GTS
FBI measures the ratio between the frequency of the forecast events and the frequency of the observed events. FBI > 1: over-estimate. FBI < 1: under-estimate.
The forecasting system over-estimates the number of events for a threshold of 1 mm/24h. A decrease in FBI was observed in autumn 1999, when vertical resolution was increased and a new convection scheme was implemented.
Slide 9: Europe
Threshold: 5 mm/24h
t+42: solid shading. t+66: dotted shading.
24-hour accumulated precipitation verified against SYNOP on GTS
FBI decreases to values closer to 1 as the threshold increases, but higher thresholds have larger confidence intervals!
Slide 10: Precipitation analysis for Europe
- High-density networks in Europe (Member and Co-operating States)
- Upscaling (simple box averaging to obtain an areal precipitation value)
Slide 11
Each grid box contains a certain number of stations, and this number is not constant from day to day. The number of stations per grid box indicates how representative the analysis is for the specific grid point.
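The upscaling step can be sketched as a simple box average that also returns the number of stations per grid box. This assumes a regular latitude/longitude grid defined by box edges. The function name, the grid and the station values are illustrative and not the operational setup.

```python
import numpy as np

def upscale_to_grid(lats, lons, precip_mm, lat_edges, lon_edges):
    """Box-average station precipitation onto a regular grid.

    Returns (mean_precip, station_count), both of shape
    (len(lat_edges) - 1, len(lon_edges) - 1). Boxes without stations are NaN.
    """
    lats, lons, precip_mm = (np.asarray(a, dtype=float) for a in (lats, lons, precip_mm))
    bins = [lat_edges, lon_edges]
    count, _, _ = np.histogram2d(lats, lons, bins=bins)
    total, _, _ = np.histogram2d(lats, lons, bins=bins, weights=precip_mm)
    mean = np.full(count.shape, np.nan)
    # Divide only where at least one station reported; other boxes stay NaN.
    np.divide(total, count, out=mean, where=count > 0)
    return mean, count.astype(int)

# Three hypothetical stations on a 2 x 2 grid of 1-degree boxes.
lat_edges = [45.0, 46.0, 47.0]
lon_edges = [5.0, 6.0, 7.0]
mean, count = upscale_to_grid(
    lats=[45.2, 45.8, 46.5],
    lons=[5.5, 5.1, 6.5],
    precip_mm=[2.0, 4.0, 10.0],
    lat_edges=lat_edges,
    lon_edges=lon_edges,
)
print(mean)
print(count)
```

Keeping the station count alongside the mean makes it possible to mask or down-weight boxes whose value rests on very few stations.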
Slide 12: Europe
- FBI plotted for two thresholds (0.25 mm/24h and 10 mm/24h)
- Verification against precipitation analysis (yellow shading)
- Verification against SYNOP on GTS (blue dotted)
Forecast range: t+42
Panels: 0.25 mm/24h, 10 mm/24h
FBI values are higher in the verification against SYNOP on the GTS for lower thresholds, and lower in the verification against the analysis for higher thresholds.
Slide 13: Europe
- TSS (threshold 0.25 mm/24h) plotted for two forecast ranges, t+42 (top) and t+90 (bottom)
- Verification against precipitation analysis (yellow shading)
- Verification against SYNOP on GTS (blue dotted)
TSS values decrease as the forecast range increases. In January 2003 there was a model change: improved cloud scheme numerics, revised cloud scheme and convection.
Slide 14: Europe
- TSS plotted for two thresholds (5 mm/24h and 15 mm/24h)
- Verification against precipitation analysis (yellow shading)
- Verification against SYNOP on GTS (blue dotted)
Forecast range: t+90
TSS values are higher for winter months. Confidence intervals become larger as the threshold increases.
Slide 15: Timeseries of Brier Skill Score for Europe
The BSS is written as BSS = 1 - BS/BSref, with the sample climate as the reference system. BS measures the mean squared difference between forecast and observation in probability space; it is the equivalent of the MSE for a deterministic forecast.
Forecast vs. observations
Plot annotation: increased resolution
Improvements date back to autumn 1999. High-threshold performance dropped at the beginning of 2005: linked to drier conditions over Europe?
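A minimal sketch of the Brier score and of BSS = 1 - BS/BSref with the sample climate as reference, as defined on this slide. Taking the forecast probability as the fraction of ensemble members at or above the threshold is an assumption made for illustration, as are the function names and the numbers.

```python
import numpy as np

def ensemble_probability(members_mm, threshold_mm):
    """Fraction of ensemble members at or above the threshold (axis 0 = members)."""
    return np.mean(np.asarray(members_mm, dtype=float) >= threshold_mm, axis=0)

def brier_score(prob_forecast, event_observed):
    """Mean squared difference between forecast probability and outcome (0 or 1)."""
    p = np.asarray(prob_forecast, dtype=float)
    o = np.asarray(event_observed, dtype=float)
    return float(np.mean((p - o) ** 2))

def brier_skill_score(prob_forecast, event_observed):
    """BSS = 1 - BS / BSref, where BSref always forecasts the sample frequency."""
    o = np.asarray(event_observed, dtype=float)
    bs_ref = brier_score(np.full(o.shape, o.mean()), o)
    if bs_ref == 0.0:
        raise ValueError("reference score is zero: the event always or never occurred")
    return 1.0 - brier_score(prob_forecast, o) / bs_ref

# Hypothetical 4-member ensemble (rows) for 4 cases (columns), in mm/24h.
members = [[0.0, 6.0, 7.0, 1.0],
           [0.0, 5.0, 2.0, 0.0],
           [1.0, 8.0, 6.0, 0.0],
           [0.0, 9.0, 0.0, 3.0]]
observed_mm = np.array([0.2, 7.0, 5.5, 1.0])

prob = ensemble_probability(members, threshold_mm=5.0)
event = observed_mm >= 5.0
bs = brier_score(prob, event)
bss = brier_skill_score(prob, event)
print(prob, bs, bss)
```

A BSS of 1 corresponds to a perfect forecast, 0 to no improvement over the sample climate, and negative values to a forecast worse than the reference.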
Slide 16: Timeseries of Brier Skill Score for Europe
The BSS is written as BSS = 1 - BS/BSref, with the sample climate as the reference system. BS measures the mean squared difference between forecast and observation in probability space; it is the equivalent of the MSE for a deterministic forecast.
Forecast vs. proxy
Plot annotation: increased resolution
Slide 17: Europe
Rainy season: October to April. Forecast range: t+96. Verification against SYNOP on GTS.
Consistent picture for the two seasons.
2003-2004, 1 mm/24h: BS = 0.153
2004-2005, 1 mm/24h: BS = 0.157
Slide 18: Europe
Rainy season: October to April. Forecast range: t+96. Verification against SYNOP on GTS.
Consistent picture for the two seasons. Higher thresholds show better reliability.
2003-2004, 10 mm/24h: BS = 0.045
2004-2005, 10 mm/24h: BS = 0.04
Slide 19: Europe
Rainy season: October to April. Forecast range: t+96. Verification against SYNOP on GTS.
Consistent picture for the two seasons.
Panels: 2003-2004, 5 mm/24h; 2004-2005, 5 mm/24h
Full symbol: T511. Shape: T255.
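The reliability diagrams referred to on these slides compare forecast probabilities with observed frequencies. A minimal sketch of the underlying calculation follows, assuming equal-width probability bins. The binning, the function name and the data are illustrative choices that the slides do not specify.

```python
import numpy as np

def reliability_curve(prob_forecast, event_observed, n_bins=10):
    """Mean forecast probability, observed frequency and sample count per bin."""
    p = np.asarray(prob_forecast, dtype=float)
    o = np.asarray(event_observed, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Bin index of each forecast; a probability of exactly 1.0 goes in the last bin.
    index = np.clip(np.digitize(p, edges) - 1, 0, n_bins - 1)
    count = np.bincount(index, minlength=n_bins)
    sum_p = np.bincount(index, weights=p, minlength=n_bins)
    sum_o = np.bincount(index, weights=o, minlength=n_bins)
    mean_p = np.full(n_bins, np.nan)
    obs_freq = np.full(n_bins, np.nan)
    # Empty bins stay NaN so they can be left out of the plot.
    np.divide(sum_p, count, out=mean_p, where=count > 0)
    np.divide(sum_o, count, out=obs_freq, where=count > 0)
    return mean_p, obs_freq, count

# Hypothetical probabilities and outcomes (1 = event observed).
mean_p, obs_freq, count = reliability_curve(
    prob_forecast=[0.1, 0.1, 0.1, 0.1, 0.9, 0.9],
    event_observed=[0, 0, 0, 1, 1, 1],
    n_bins=2,
)
print(mean_p, obs_freq, count)
```

For a reliable system the points (mean_p, obs_freq) lie close to the diagonal. The count per bin shows how much weight each point deserves, which matters at high thresholds where the high-probability bins are sparsely populated.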
Slide 20: Europe
ROC area. Verification against SYNOP on GTS for t+96.
Plot annotations: drier conditions over Europe? Increased resolution.
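A sketch of how a ROC area could be obtained from probability forecasts: each probability threshold yields a hit rate and a false alarm rate, and the area under the resulting curve is integrated with the trapezoidal rule. The choice of thresholds, the integration method and the names are assumptions for illustration only.

```python
import numpy as np

def roc_area(prob_forecast, event_observed, thresholds=None):
    """Area under the ROC curve built from a set of probability thresholds."""
    p = np.asarray(prob_forecast, dtype=float)
    o = np.asarray(event_observed, dtype=bool)
    if thresholds is None:
        thresholds = np.linspace(0.0, 1.0, 11)
    n_events = o.sum()
    n_non_events = (~o).sum()
    if n_events == 0 or n_non_events == 0:
        raise ValueError("ROC is undefined without both events and non-events")
    # A warning is issued whenever the forecast probability reaches the threshold.
    hit_rate = [np.sum((p >= t) & o) / n_events for t in thresholds]
    false_alarm_rate = [np.sum((p >= t) & ~o) / n_non_events for t in thresholds]
    # Add the "never warn" (0, 0) and "always warn" (1, 1) corners.
    f = np.concatenate(([0.0], false_alarm_rate, [1.0]))
    h = np.concatenate(([0.0], hit_rate, [1.0]))
    order = np.lexsort((h, f))  # sort by false alarm rate, then by hit rate
    f, h = f[order], h[order]
    # Trapezoidal rule along the false alarm rate axis.
    return float(np.sum(np.diff(f) * (h[1:] + h[:-1]) / 2.0))

# Hypothetical forecasts: one set separates events perfectly, one has no skill.
perfect = roc_area([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])
no_skill = roc_area([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])
print(perfect, no_skill)
```

An area of 1 indicates perfect discrimination between events and non-events, and 0.5 indicates none.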
Slide 21: Conclusion
- 2m temperature: positive trends show increased skill for Europe and North America. A strong inversion in the winter was not properly forecast by the T511.
- Specific humidity shows increased skill. Winters are more skilful than summers.
- Wind: the changes in the model have not brought large improvements in the wind speed forecast.
- TCC: small improvements in forecast skill. A new cloud scheme was introduced in April 2005.
- Importance of confidence intervals
- Precipitation forecast improvements are slow, but evident.
- FBI indicates over-estimation of small-threshold events; verification against the precipitation analysis shows a better picture.
- Precipitation analysis can be used for verification in a delayed mode. The number of stations per grid box gives an indication of how representative the analysis is at any grid point.
Slide 22: Conclusion
- Brier skill score and ROC area: the increase in resolution has improved the system. In recent years the system has maintained its good performance. The drier conditions of the recent winter show up in the timeseries: small sample size effects?
- Reliability diagrams for the last two rainy seasons show a skilful system.