Title: Spatial extremes in climate analysis
1Spatial extremes in climate analysis
Workshop, Summer School, St. Petersburg,
Russia, 9-13 September 2007
2Outline
- What are climate extremes?
- R and RClim
- Methodology of spatial extremes
- Factors controlling extremes
- Clustering of extremes
- Teleconnections of extremes
3What are climate extremes?
4Examples of wet and windy extremes
Hurricane
Convective severe storm
Extra-tropical cyclone
Polar low
Extra-tropical cyclone
5Examples of dry and hot extremes
Drought
Dust storm
Wild fire
Dust storm
6What do we mean by extreme?
- Large meteorological values
- Maximum value (i.e. a local extremum)
- Exceedance above a high threshold
- Record breaker (threshold = max of past values)
- Rare event (e.g. less than 1 in 100 years, p = 0.01)
- Large losses (severe or high-impact), e.g. 200 billion if a hurricane hits Miami
- risk = p(hazard) x vulnerability x exposure
7IPCC 2001 definitions
- Simple extremes: individual local weather variables exceeding critical levels on a continuous scale
- Complex extremes: severe weather associated with particular climatic phenomena, often requiring a critical combination of variables
- Extreme weather event: an event that is rare within its statistical reference distribution at a particular place. Definitions of "rare" vary, but an extreme weather event would normally be as rare as or rarer than the 10th or 90th percentile.
- Extreme climate event
8Origin of Extreme Events
- 1. Rapid growth due to instabilities
- Fast growth of weather systems caused by positive feedbacks, e.g. convective instability, baroclinic growth, etc.
- 2. Displacement
- Survival of a weather system into a new spatial region or time period, e.g. transition of a tropical cyclone into mid-latitudes.
- 3. Conjunction
- Simultaneous superposition of several non-rare events, e.g. freak waves.
- 4. Intermittency
- Varying variance of a process in space or time, e.g. precipitation.
- 5. Persistence or frequent recurrence
- Chronic weather conditions leading to a climate extreme, e.g. drought, unusually stormy wet season, persistent blocking.
9R and RClim
10What is R?
- R is an integrated suite of software facilities for data manipulation, calculation and graphical display. Among other things it has
- an effective data handling and storage facility,
- a suite of operators for calculations on arrays, in particular matrices,
- a large, coherent, integrated collection of intermediate tools for data analysis,
- graphical facilities for data analysis and display either directly at the computer or on hardcopy,
- a well developed, simple and effective programming language (called S) which includes conditionals, loops, user defined recursive functions and input and output facilities. (Indeed most of the system supplied functions are themselves written in the S language.)
11Why R?
- It is free and, like the commercial S-PLUS, implements the S language!
- http://www.r-project.org
- Works on several platforms
- Windows
- Mac
- UNIX/Linux
- Several packages available
- http://cran.r-project.org
- A large and very active community working and
updating the software
12What is RClim?
- Initiative by C. A. S. Coelho, C. A. T. Ferro, D. B. Stephenson and D. J. Steinskog.
- Goals
- develop statistical methods for analysing spatial extremes
- read/write large gridded fields in netCDF format
- draw geographical contour maps
- do general climate analysis at many grid points
13Development
- Spring 2005: Initiative started
- Spring 2006: Completed in its current form; implemented in KNMI's Climate Explorer
- August 2006: Paper submitted to Journal of Climate
- Future development: to be updated and expanded with methods for daily data
14How to get started with RClim?
- Webpage: http://www.secam.ex.ac.uk/index.php?nav=699
- Packages that must be installed
- RNetCDF
- evd
- ismev
- maps
- mapdata
- mapproj
15Installation
- Source the file rclim.txt to install all functions.
- It can be found at http://www.nersc.no/dagjs/rcourse_nzu/DJS/Day4/rclim.txt
- Command
- source("rclim.txt")
16Complete course
- A complete course was given in Beijing, China in August 2007
- Webpage
- www.nersc.no/dagjs/rcourse_nzu
- Made by Hans Wackernagel and Dag Johan Steinskog
17Extreme value analysis
18Motivation
- Weather and climate time series on large grid-point arrays can be analysed in many ways
- Composites
- Correlation maps
- Principal component analysis
- Isolate leading patterns of climate variability (ENSO, NAO, ...)
- Why not use these methods when analysing extremes?
19- These methods are based on the whole distribution of a certain variable and mask the extreme events in the tail of the distribution
20How are tails
21related to the whole animal?
PDF: Probability Density Function or
Probable Dinosaur Function??
22Generalized Pareto Distribution
For a sufficiently large threshold u, the distribution of the excesses above u is approximated by the Generalized Pareto Distribution (GPD).
Figure: GPD probability density function for three shape values. Shape = -0.4: upper cutoff. Shape = 0.0: exponential tail. Shape = 10: power law tail.
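The three tail behaviours named on this slide can be reproduced numerically. The following is a minimal Python sketch (not RClim code) of the standard GPD density and distribution function for excesses over the threshold. The function names and the shape values -0.4, 0.0 and 0.5 are illustrative assumptions.

```python
import math


def gpd_cdf(y, scale, shape):
    """Probability that an excess over the threshold is <= y under the GPD."""
    if y <= 0.0:
        return 0.0
    if abs(shape) < 1e-12:  # shape = 0: exponential tail
        return 1.0 - math.exp(-y / scale)
    z = 1.0 + shape * y / scale
    if z <= 0.0:  # only reachable for shape < 0: y lies beyond the upper cutoff
        return 1.0
    return 1.0 - z ** (-1.0 / shape)


def gpd_pdf(y, scale, shape):
    """GPD probability density of an excess y (zero outside the support)."""
    if y < 0.0:
        return 0.0
    if abs(shape) < 1e-12:
        return math.exp(-y / scale) / scale
    z = 1.0 + shape * y / scale
    if z <= 0.0:
        return 0.0
    return z ** (-1.0 / shape - 1.0) / scale


# Illustrative shapes (assumed values): negative, zero and positive, scale = 1.
for shape in (-0.4, 0.0, 0.5):
    densities = [round(gpd_pdf(y, 1.0, shape), 3) for y in (0.0, 1.0, 2.0, 3.0)]
    print(shape, densities)
```

With shape = -0.4 and scale = 1 the density is zero beyond y = 2.5, which is the upper cutoff. For positive shape the density decays like a power law.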
23Example Central England Temperature
- n = 3082 values
- Min: -3.1 °C
- Max: 19.7 °C
- 90th quantile: 15.6 °C
24GPD fit to values above 15.6 °C
- Location parameter (threshold): u = 15.6 °C
- Maximum likelihood estimates
- Scale parameter: 1.38 ± 0.09 °C
- Shape parameter: -0.30 ± 0.04 (dimensionless)
- → Upper limit estimate: u - scale/shape
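For a negative shape parameter the GPD has a finite upper end point at u - scale/shape. A small Python sketch of that arithmetic, using the estimates quoted on this slide (the function name is an illustrative assumption):

```python
def gpd_upper_limit(threshold, scale, shape):
    """Upper end point of a GPD fitted above `threshold` (infinite if shape >= 0)."""
    if shape >= 0.0:
        return float("inf")
    return threshold - scale / shape


# Estimates quoted on the slide: u = 15.6, scale = 1.38, shape = -0.30.
upper = gpd_upper_limit(15.6, 1.38, -0.30)
print(round(upper, 1))
```

With these rounded estimates the result is about 20.2 °C. The next slide quotes 20.3 °C, and the small difference presumably comes from rounding of the quoted parameters.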
25How good is the model fit?
Figure annotations: threshold u = 15.6 °C, upper limit = 20.3 °C
→ Good fit to 308 observed exceedances
26Return level plot
Figure axes: return level (deg C) against return period (years)
→ Fit can be used to make predictions of return values
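A return level curve of this kind can be computed from the fitted GPD with the standard formula x_m = u + (scale/shape) * ((m * zeta)^shape - 1), where zeta is the probability of exceeding u and m is the number of observations in the return period (Coles 2001, chapter 4). The Python sketch below uses the Central England estimates from the earlier slides. The value of 12 observations per year is an assumption made only for illustration, since the slides do not state the sampling frequency.

```python
import math


def gpd_return_level(period_years, threshold, scale, shape, exceed_prob, obs_per_year):
    """Level exceeded on average once every `period_years` years."""
    m = period_years * obs_per_year  # number of observations in the period
    if m * exceed_prob <= 1.0:
        raise ValueError("return period too short: level would not exceed the threshold")
    if abs(shape) < 1e-12:
        return threshold + scale * math.log(m * exceed_prob)
    return threshold + scale / shape * ((m * exceed_prob) ** shape - 1.0)


# Slide values: u = 15.6, scale = 1.38, shape = -0.30, 308 exceedances out of 3082.
# obs_per_year = 12 is an assumed sampling frequency.
levels = [
    gpd_return_level(p, 15.6, 1.38, -0.30, 308 / 3082, 12)
    for p in (10, 100, 1000)
]
print([round(v, 2) for v in levels])
```

Because the shape is negative, the return levels increase with the return period but stay below the upper limit u - scale/shape.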
27Methodology of spatial extremes
28Need for extreme value theory methods
- New tools of extreme value theory (EVT) introduced (Coelho et al., 2007)
- EVT: the branch of probability theory and statistical science that deals with the modelling of, and inference for, extreme values
- Two main approaches
- Generalized extreme value (GEV) distribution: maxima of blocks of data
- Generalized Pareto distribution (GPD): values above a high threshold
29Dataset used in this example
- HadCRUT2v: monthly mean gridded surface temperature (Jones and Moberg, 2003)
- Available from http://www.cru.uea.ac.uk/cru/data/temperature/
- Time span: January 1870 to December 2005
- Regular 5° x 5° global grid
- Long record: 136 years
- Grid points with more than 50% missing values and the Southern Hemisphere (SH) are omitted.
30Example
- European heat wave 2003
- Estimated mortality: 35000-50000
- Map of temperature anomaly
- Beniston (2003): such a summer may be normal late in this century
31Definition of extreme events
- Maximum value
- Not a very reliable summary of the distribution of extreme events
- Not resistant to outliers
- Excesses above a pre-defined threshold (t - u)
- Peak-over-threshold method (Coles 2001, chapter 4)
- What about the block maxima approach?
- Annual blocks are not appropriate in this example: only 12 observations are available each year
- Larger blocks? Decades? These reduce the sample size for estimation of the GEV distribution parameters
- More appropriate for daily temperatures
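The difference in sample size between the two approaches can be seen on a toy series. This Python sketch (hypothetical data and function names, not RClim code) extracts the excesses t - u above a threshold and, for comparison, the block maxima.

```python
def excesses_over_threshold(values, threshold):
    """Peak-over-threshold sample: t - u for every value t above the threshold u."""
    return [v - threshold for v in values if v > threshold]


def block_maxima(values, block_size):
    """Maximum of each complete, non-overlapping block of `block_size` values."""
    last_start = len(values) - block_size
    return [max(values[i:i + block_size]) for i in range(0, last_start + 1, block_size)]


# Hypothetical series of 12 values, e.g. one year of monthly data.
values = [1.0, 3.5, 2.0, 4.2, 0.5, 3.9, 2.2, 5.1, 1.1, 0.3, 4.8, 2.7]
excesses = excesses_over_threshold(values, 3.0)
maxima = block_maxima(values, 6)
print(len(excesses), maxima)
```

Here the threshold approach keeps five values for estimation while blocks of six keep only two, which illustrates why large blocks shrink the sample available for a GEV fit.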
32Choice of Threshold
- Climate data have seasonal and trend components
- If these are not taken into account in the choice of threshold, a bias will occur
- Strategies for avoiding this
- Detrend the time series
- Time varying threshold
33Choice of threshold cont.
- Time varying threshold
- Approximately constant exceedance frequency
- Analysis is not biased towards the warmer climate
- Excesses are measured relative to the contemporary
climate
34Choice of threshold cont.
- The threshold chosen in the course is the 75th quantile time varying threshold
- Definition of threshold: the sum of
- Long term trend component
- Mean annual cycle
- Constant increment chosen so that 25% of the observed values lie above the threshold
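The three components of the threshold can be sketched as follows. This is a simplified Python illustration and not the RClim tvt function: it assumes a linear least-squares trend (RClim may use a different smoother), a complete series without missing values, and synthetic data generated only for the example.

```python
import math
import random


def time_varying_threshold(series, period=12, quantile=0.75):
    """Threshold = linear trend + mean annual cycle + constant increment."""
    n = len(series)
    t_mean = (n - 1) / 2.0
    y_mean = sum(series) / n
    # Long-term trend: ordinary least-squares straight line (an assumption).
    sxy = sum((i - t_mean) * (y - y_mean) for i, y in enumerate(series))
    sxx = sum((i - t_mean) ** 2 for i in range(n))
    slope = sxy / sxx
    trend = [y_mean + slope * (i - t_mean) for i in range(n)]
    # Mean annual cycle of the detrended series, one value per calendar month.
    detrended = [y - tr for y, tr in zip(series, trend)]
    cycle = [sum(detrended[m::period]) / len(detrended[m::period]) for m in range(period)]
    # Constant increment: empirical quantile of what remains.
    residuals = sorted(d - cycle[i % period] for i, d in enumerate(detrended))
    k = max(0, min(n - 1, math.ceil(quantile * n) - 1))
    increment = residuals[k]
    return [tr + cycle[i % period] + increment for i, tr in enumerate(trend)]


# Synthetic 20-year monthly series: warming trend + annual cycle + noise.
rng = random.Random(0)
series = [0.01 * i + 5.0 * math.sin(2 * math.pi * i / 12) + rng.gauss(0.0, 1.0)
          for i in range(240)]
threshold = time_varying_threshold(series)
fraction_above = sum(v > t for v, t in zip(series, threshold)) / len(series)
print(round(fraction_above, 2))
```

The threshold follows both the seasonal cycle and the trend, so roughly a quarter of the values exceed it throughout the record rather than mostly in the warmest years.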
35GPD scale parameter estimate
→ Large over extra-tropical land regions
36GPD shape parameter estimate
Generally negative → finite upper temperature
limit
37Upper limit for excesses
→ Largest over high-latitude land regions
38Return periods
- Using the GP distribution, it is possible to estimate the return period
- The return period is the average time interval between recurrences of a given event, i.e. the reciprocal of its expected frequency.
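Under a fitted GPD, the return period of an excess y follows from the probability of exceeding the threshold and the GPD survival function: period = 1 / (observations per year x P(exceed threshold) x P(excess > y)). The Python sketch below illustrates this. It is not the RClim returnperiod function, and the parameter values, including three summer months per year, are assumptions chosen for the example.

```python
import math


def gpd_return_period(excess, scale, shape, exceed_prob, obs_per_year):
    """Return period in years of an excess over the threshold under a GPD."""
    if abs(shape) < 1e-12:
        survival = math.exp(-excess / scale)
    else:
        z = 1.0 + shape * excess / scale
        if z <= 0.0:  # beyond the upper bound of the fitted distribution
            return math.inf
        survival = z ** (-1.0 / shape)
    return 1.0 / (obs_per_year * exceed_prob * survival)


# Assumed values: 75th quantile threshold (exceed_prob = 0.25), 3 summer
# months per year, scale = 1.0, shape = -0.2 (upper bound of excesses = 5).
for y in (0.0, 2.0, 4.0, 6.0):
    print(y, gpd_return_period(y, 1.0, -0.2, 0.25, 3))
```

An excess of zero recurs every 1/(3 x 0.25) = 1.33 years on average, and the return period grows rapidly as the excess approaches the upper bound.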
39Return periods for August 2003 event
→ Central Europe: return period of 133 years (cf. Schär et al. (2004): 46000 years!)
40Note about return periods
- In Schär et al. (2004), the return period of the heat wave in Europe was estimated to be 46000 years
- We estimated 133 years
- So who is right?
- Schär et al. (2004) used no EVD analysis!!
- Assumed the data had a Gaussian distribution
41Factors controlling extremes?
42Factors controlling extremes
- The relationship between extremes and factors (e.g. time and ENSO) can be examined by modelling the shape and scale parameters of the GP distribution as functions of these factors.
- For instance, such a model can be used to analyse how the variability of summer temperature excesses is related to ENSO
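The model equation from this slide did not survive extraction. One common choice, assumed here purely for illustration, is a log-linear scale parameter, log(scale) = s0 + s1 * x, with a constant shape, where x is an ENSO index. The Python sketch below gives the corresponding negative log-likelihood, which a numerical optimiser would minimise to estimate s0, s1 and the shape. All names are hypothetical.

```python
import math


def gpd_nll_with_covariate(params, excesses, covariate):
    """Negative log-likelihood of a GPD with log(scale) = s0 + s1 * x."""
    s0, s1, shape = params
    total = 0.0
    for y, x in zip(excesses, covariate):
        scale = math.exp(s0 + s1 * x)  # the log link keeps the scale positive
        if abs(shape) < 1e-12:  # exponential limit
            total += math.log(scale) + y / scale
            continue
        z = 1.0 + shape * y / scale
        if z <= 0.0:  # excess outside the support: parameters are impossible
            return math.inf
        total += math.log(scale) + (1.0 + 1.0 / shape) * math.log(z)
    return total


# Hypothetical excesses and ENSO index values.
excesses = [1.0, 2.0]
enso_index = [0.5, -0.5]
print(gpd_nll_with_covariate((0.0, 0.0, 0.0), excesses, enso_index))
print(gpd_nll_with_covariate((0.0, 1.0, 0.0), excesses, enso_index))
```

A fitted s1 that differs clearly from zero would indicate that the variability of the excesses changes with the ENSO index.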
43The role of large-scale modes
→ ENSO effect on temperature extremes in NH
44Clustering of extremes
45Temporal clustering of extreme events
- Annual frequency of extreme events is a proxy for clustering of extremes
- The average number of summer exceedances is defined in terms of
- the binary variable e, with e = 1 if an extreme event is observed and e = 0 if not
- N, the total number of summers with at least one observed exceedance
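The formula itself is missing from the extracted slide. Reading the definitions above, a natural interpretation (an assumption) is the total number of exceedances divided by N, the number of summers with at least one exceedance. A Python sketch with hypothetical indicator data:

```python
def mean_exceedances_per_summer(indicators_by_summer):
    """Average number of exceedances over summers with at least one exceedance.

    `indicators_by_summer` holds one list of binary values e per summer.
    """
    counts = [sum(summer) for summer in indicators_by_summer]
    active = [c for c in counts if c > 0]  # the N summers with an exceedance
    if not active:
        return float("nan")
    return sum(active) / len(active)


# Four hypothetical summers of three months each (e = 1 marks an exceedance).
summers = [[0, 1, 1], [0, 0, 0], [1, 0, 0], [1, 1, 1]]
print(mean_exceedances_per_summer(summers))  # 2.0
```

Values well above 1 indicate that extreme months tend to occur together within the same summer.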
46Average number of exceedances
47Teleconnections of extremes
48Teleconnections between extreme events
- xdependence - Compute extreme dependence measures between a given p x q x n three-dimensional array and a given time series of length n.
- Assume that we are interested in, e.g., investigating how extreme temperatures at one place are related to extreme temperatures at another place
- The χ statistic provides a measure of extreme dependence for asymptotically dependent distributions.
49Teleconnections cont.
- However, χ fails to discriminate between degrees of dependence for asymptotically independent distributions (Coles, 2001).
- Alternative statistic suggested
- Defined for thresholds u in the range 0 < u < 1. The statistic ranges from -1 to 1.
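The formulas for the two statistics were lost from the slides. The Python sketch below assumes the usual empirical definitions, with the alternative statistic taken to be chi-bar: chi(u) = 2 - log P(U < u, V < u) / log P(U < u) and chibar(u) = 2 log P(U > u) / log P(U > u, V > u) - 1, where U and V are the two series transformed to uniform margins by ranks. It is not the RClim xdependence function, and it assumes series without ties or missing values.

```python
import math


def to_uniform(values):
    """Rank-transform a series to approximately uniform margins on (0, 1)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return [r / (n + 1) for r in ranks]


def chi_and_chibar(x, y, u):
    """Empirical chi(u) and chibar(u) for two series of equal length, 0 < u < 1."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    ux, uy = to_uniform(x), to_uniform(y)
    n = len(ux)
    p_below = sum(a < u for a in ux) / n
    p_both_below = sum(a < u and b < u for a, b in zip(ux, uy)) / n
    p_above = sum(a > u for a in ux) / n
    p_both_above = sum(a > u and b > u for a, b in zip(ux, uy)) / n
    if not 0.0 < p_below < 1.0 or p_both_below == 0.0:
        return float("nan"), float("nan")  # too few values to estimate
    chi = 2.0 - math.log(p_both_below) / math.log(p_below)
    if p_both_above == 0.0:
        chibar = -1.0  # limiting value when no joint exceedances occur
    elif p_both_above == 1.0:
        chibar = float("nan")
    else:
        chibar = 2.0 * math.log(p_above) / math.log(p_both_above) - 1.0
    return chi, chibar


# Two hypothetical cases: identical series and exactly opposed series.
x = [float(i) for i in range(100)]
print(chi_and_chibar(x, x, 0.8))
print(chi_and_chibar(x, x[::-1], 0.8))
```

Identical series give chi = chibar = 1, while opposed series never exceed the threshold together and give chibar = -1.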
50Teleconnections between extremes
51One-point association map for extreme events
→ Association with extremes in subtropical
Atlantic
52Methods in RClim
53Methods in RClim
- acs - Compute average cluster size for a given three-dimensional p x q x n array of excesses. The first two dimensions p and q are space dimensions (e.g. longitude and latitude). The third dimension n is time.
- boundexcesses - Compute upper bound of excesses for a given 2 x p x q array of Generalized Pareto distribution parameters. The first index of the first dimension of the array represents the scale parameter. The second index of the first dimension represents the shape parameter.
- mygpd.fit - Same as the gpd.fit function from the ismev package but with standard error calculation deactivated.
- returnperiod - Compute return period for a given p x q matrix of excesses and a given 2 x p x q array of Generalized Pareto distribution parameters.
- tvt - Compute time-varying threshold for a given monthly time series.
- xdependence - Compute extreme dependence measures between a given p x q x n three-dimensional array and a given time series of length n.
54Methods in RClim cont.
- xdependence1 - Same as xdependence, but also allows specification of the fraction of non-missing values required for the computation of the statistics. Grid points with a larger fraction of missing values than specified are excluded.
- xexcess - Compute mean excess and variance of excess for a given n x p x q three-dimensional array.
- xgev - Compute location, shape and scale parameters of a Generalized Extreme Value Distribution for block annual maxima or minima of a given p x q x n three-dimensional array.
- xindex - Compute the intervals estimator for the extremal index, an index for time clusters, for a given time series and threshold.
- xindexfield - Compute the intervals estimator for the extremal index at each grid point of a p x q x n three-dimensional array.
- xpareto - Compute shape and scale parameters of a Generalized Pareto Distribution for a given p x q x n three-dimensional array.
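The intervals estimator mentioned for xindex can be sketched from the times between successive exceedances. The Python code below follows the estimator of Ferro and Segers (2003) as it is usually stated. It illustrates the technique and is not the RClim implementation, and the toy series are hypothetical.

```python
def extremal_index_intervals(series, threshold):
    """Intervals estimator of the extremal index for one time series."""
    times = [i for i, v in enumerate(series) if v > threshold]
    if len(times) < 2:
        return float("nan")  # at least two exceedances are needed
    gaps = [b - a for a, b in zip(times, times[1:])]  # inter-exceedance times
    m = len(gaps)
    if max(gaps) <= 2:
        numerator = 2.0 * sum(gaps) ** 2
        denominator = m * sum(g * g for g in gaps)
    else:
        numerator = 2.0 * sum(g - 1 for g in gaps) ** 2
        denominator = m * sum((g - 1) * (g - 2) for g in gaps)
    return min(1.0, numerator / denominator)


# Hypothetical series: isolated exceedances every 10 steps ...
isolated = [1.0 if i % 10 == 0 else 0.0 for i in range(120)]
# ... and exceedances arriving in clusters of four.
clustered = [1.0 if i % 30 < 4 else 0.0 for i in range(120)]
print(extremal_index_intervals(isolated, 0.5))
print(round(extremal_index_intervals(clustered, 0.5), 3))
```

An index near 1 means exceedances occur singly, and smaller values mean they arrive in clusters. The reciprocal of the index is commonly read as a mean cluster size.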
55Methods in RClim cont.
- xparetotvt - Fit Generalized Pareto distribution with time-varying threshold at each grid point for a given p x q x n three-dimensional array of monthly data.
- xparetotvtcov - Fit Generalized Pareto distribution with time-varying threshold at each grid point for a given p x q x n three-dimensional array of monthly data. Allows linear modelling of the parameters.
56Reference
- Coelho, C. A. S., C. A. T. Ferro, D. B. Stephenson and D. J. Steinskog: Exploratory tools for the analysis of extreme weather and climate events in gridded datasets. Under revision, Journal of Climate.
- Contact info
- David Stephenson, d.b.stephenson_at_reading.ac.uk
- Dag Johan Steinskog, dag.johan.steinskog_at_nersc.no