Title: TRAINING SESSION ON HOMOGENISATION METHODS
1. TRAINING SESSION ON HOMOGENISATION METHODS
A first glance at homogeneity
Bologna, 17th-18th May 2005
Maurizio Maugeri, University of Milan
2. Introduction
- The problem: the real climatic signal, which we try to reconstruct by studying long (secular) records of meteorological data, is generally hidden behind non-climatic noise caused by actual errors, station relocations, changes in instruments, changes in observing times, observers and observing regulations, algorithms for the calculation of means, and so on.
- Climatic time series should not be used for climate research unless there is clear knowledge about the state of the data in terms of quality and homogeneity.
3. Quality
A) Quality of the observations and real data availability
- Classification of the institutions: observatories, high schools, etc.
- Data sources: hand-written original observations, annuals, pre-existing data sets, etc.
- Time resolution: yearly, monthly, daily, etc.
B) Actual errors (original sources / digital versions)
- Identifying and applying methods to check each individual datum
4. Homogeneity
5. Homogeneity
The problem is not easy to manage.
Meteorological series can be tested for homogeneity and homogenised by both direct and indirect methodologies. The direct approach is based on objective information that can be extracted from the station history or from other sources; the indirect approach uses statistical methods, generally based on comparison with other series. Both direct and indirect methodologies have considerable limitations.
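To make the comparison-based (indirect) idea concrete, here is a minimal sketch in Python (hypothetical data and function names; not the procedure used in this session): a simple shift test applied to a candidate-minus-reference difference series, in which the shared climatic signal cancels and an artificial break in the candidate shows up as a step.

```python
import numpy as np

def difference_series(candidate, reference):
    """Candidate-minus-reference series: the shared climatic signal cancels,
    so an artificial shift in the candidate stands out as a step."""
    return np.asarray(candidate) - np.asarray(reference)

def most_likely_break(diff):
    """Return the index that best splits the difference series into two
    segments with different means (largest absolute t statistic)."""
    best_idx, best_t = None, 0.0
    n = len(diff)
    for k in range(5, n - 5):          # keep at least 5 values per segment
        a, b = diff[:k], diff[k:]
        se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
        t = abs(a.mean() - b.mean()) / se
        if t > best_t:
            best_idx, best_t = k, t
    return best_idx, best_t

# Hypothetical annual means: the candidate has a +0.8 °C jump in 1960
years = np.arange(1930, 1990)
rng = np.random.default_rng(0)
signal = 0.01 * (years - 1930) + rng.normal(0, 0.3, years.size)
reference = signal + rng.normal(0, 0.2, years.size)
candidate = signal + rng.normal(0, 0.2, years.size) + np.where(years >= 1960, 0.8, 0.0)

k, t = most_likely_break(difference_series(candidate, reference))
print("suspect break at", years[k], "with t =", round(t, 1))
```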
6. Homogeneity
Direct methodologies are not easy to use because:
1) it is generally very difficult to recover complete information on the history of the observations (metadata);
2) even when available, metadata rarely give quantitative estimates of the inhomogeneities in the measurements.
Indirect methodologies also have important shortcomings:
1) they require some hypotheses about the data (e.g. a homogeneous climatic signal over the same region);
2) inhomogeneities and errors are present in all meteorological series, so it is often difficult to decide where to apply corrections and, when the results are not clear, the risk of applying subjective corrections is very high.
7. Homogeneity
- How to overcome the intrinsic limits of indirect homogenisation methods is, at present, still an open question.
- The possibilities range from homogenising all suspect periods to correcting the series only if the results of the statistical methods are very clear and also supported by metadata.
8. So, at present, a universal approach to the problem is still missing
- At present, we have to acknowledge that there are no universally accepted (or objective) procedures to deal with suspect or poorly reliable data.
- Lanzante et al. (2003) address the task of possibly defining a priori and/or a posteriori objective criteria for detecting artificial changepoints and real climatic signals. The difficulty of this problem results from the fact that the time history of instruments, which is not always known, is unique to a specific country, and sometimes to a particular station within a country.
Lanzante et al., 2003: Temporal Homogenisation of Monthly Radiosonde Temperature Data. Part I: Methodology. J. Climate, 16, 224-240.
9. So, at present, a universal approach to the problem is still missing
Around the mid-1990s, a group of Italian researchers set up a wide research program with the aim of gaining a better understanding of the evolution of the Italian climate over the last 100/150 years. Within this program we have developed a methodology to manage the quality and homogeneity problems.
10. Getting started with homogenisation
- Get a clear picture of:
  - the history of the observations (i.e. understand the evolution of the meteorological network and reconstruct the history of each available station in the data set);
  - the actual data availability.
The reconstruction of the network evolution is important not only to get a picture of the data quality, but also because homogenisation is generally based on statistical methods. These methods often fail to identify breaks that affect a high fraction of stations within a short period. This happens when breaks are due to changes in instruments and methods imposed by the network management, for example as a consequence of new national or international standards.
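A toy numerical sketch of why this matters (hypothetical values, not data from the Italian network): if the candidate and all its reference stations shift in the same year, the difference series used by relative statistical tests stays flat, so the break goes undetected.

```python
import numpy as np

# A network-wide change of instruments adds +1.0 °C to the candidate AND
# to every neighbouring reference station in 1975.
rng = np.random.default_rng(1)
years = np.arange(1950, 2000)
network_break = np.where(years >= 1975, 1.0, 0.0)

candidate = 15 + rng.normal(0, 0.3, years.size) + network_break
references = [15 + rng.normal(0, 0.3, years.size) + network_break for _ in range(5)]
reference_mean = np.mean(references, axis=0)

diff = candidate - reference_mean
print("step in the candidate itself:  ",
      round(candidate[25:].mean() - candidate[:25].mean(), 2))   # about +1.0 °C
print("step in the difference series: ",
      round(diff[25:].mean() - diff[:25].mean(), 2))             # about 0 °C
```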
11. Getting started with homogenisation: the preliminary quality checks
Example: daily temperatures (a code sketch of these checks follows the list)
- Removal of absurd daily values, e.g. t_max < -15 °C or t_max > 49 °C; t_min < -25 °C or t_min > 45 °C
- Flipping of the values whenever t_max < t_min (check with neighbouring stations)
- Check with neighbouring stations whenever the daily temperature range (DTR) > 25 °C
- Removal of daily temperatures if they differ by more than 10 °C from the values of neighbouring stations
- Check with neighbouring stations if t_max or t_min stays the same for 5 or more consecutive days
- Check with neighbouring stations if t_max and t_min stay the same for 3 or more consecutive days
- Check with neighbouring stations if t_max = t_min
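A minimal pandas sketch of these checks. The thresholds are the ones listed above; the column names, the neighbour_diff input and the simplified run-length logic are assumptions for illustration, not the original procedure.

```python
import numpy as np
import pandas as pd

def flag_daily_temperatures(df, neighbour_diff):
    """df: daily DataFrame with 't_max' and 't_min' columns in °C.
    neighbour_diff: absolute differences from neighbouring-station values,
    same columns and index (supplied separately)."""
    out = df.copy()

    # 1) remove absurd daily values
    out.loc[(out.t_max < -15) | (out.t_max > 49), "t_max"] = np.nan
    out.loc[(out.t_min < -25) | (out.t_min > 45), "t_min"] = np.nan

    # 2) flip the values whenever t_max < t_min (to be confirmed against neighbours)
    swap = out.t_max < out.t_min
    out.loc[swap, ["t_max", "t_min"]] = out.loc[swap, ["t_min", "t_max"]].values

    # 3) mark for a neighbour check when the daily temperature range exceeds 25 °C
    out["check_dtr"] = (out.t_max - out.t_min) > 25

    # 4) remove values differing by more than 10 °C from neighbouring stations
    out.loc[neighbour_diff.t_max > 10, "t_max"] = np.nan
    out.loc[neighbour_diff.t_min > 10, "t_min"] = np.nan

    # 5) flag days lying in runs of identical values (>= 5 days for t_max or
    #    t_min alone, >= 3 days for both), and days with t_max equal to t_min
    run_max = out.t_max.groupby((out.t_max != out.t_max.shift()).cumsum()).transform("size")
    run_min = out.t_min.groupby((out.t_min != out.t_min.shift()).cumsum()).transform("size")
    out["check_runs"] = (run_max >= 5) | (run_min >= 5) | ((run_max >= 3) & (run_min >= 3))
    out["check_equal"] = out.t_max == out.t_min
    return out
```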
12. ...but the best is to check each individual datum
An example from HISTALP
Checking outliers by representing spatial anomaly patterns
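One way to read such spatial anomaly maps programmatically is sketched below (function name, inputs and the 3-sigma threshold are assumptions, not the HISTALP procedure): a value is suspect when its anomaly departs strongly from the anomaly pattern of the surrounding stations.

```python
import numpy as np

def spatial_outlier_flags(station_anom, neighbour_anoms, threshold=3.0):
    """station_anom: 1-D array of anomalies for the tested station.
    neighbour_anoms: 2-D array (neighbours x time) of anomalies.
    Flags times where the station departs from the neighbourhood mean
    by more than `threshold` local standard deviations."""
    neigh_mean = np.nanmean(neighbour_anoms, axis=0)
    neigh_std = np.nanstd(neighbour_anoms, axis=0)
    departure = np.abs(station_anom - neigh_mean)
    return departure > threshold * np.where(neigh_std > 0, neigh_std, np.nan)
```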
13. Checking outliers by representing temporal evolution
Rovigo, T_min
14. Checking outliers by representing temporal evolution
Rovigo, T_max
15. Our methodological approach to homogenisation
1) Collecting as many data and as much information (metadata) as possible:
- History of the national network and of the involved institutions
- History of instruments and observation methods
- History of every meteorological station
2) Performing a first homogenisation by means of direct methodologies
3) Performing the final homogenisation by means of indirect methodologies