Title: JOINT RESEARCH CENTRE
1. Some comments on modifiable areal units
Transforming data from one geographical system to another
Javier.gallego_at_jrc.it
2. The modifiable areal unit problem
- Various problems linked with the change from one set of areal units to a different one
- How to aggregate data?
  - Simple addition (easy in principle)
  - Smoothing (making maps easy to interpret)
- How to disaggregate data?
- Other phenomena, e.g. what happens with correlations when data are aggregated?
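The effect of aggregation on correlations can be illustrated with a small constructed data set. The sketch below is not from the slides: the values, the names `units`, `offsets` and the helper `pearson` are invented for illustration. Inside each unit the two variables move in opposite directions, while the unit means move together, so the correlation computed on unit means differs sharply from the one computed on individual values.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Hypothetical data: 5 areal units (g = 0..4), 5 individuals per unit.
# Within a unit, x = g + e and y = g - e, so x and y are opposed locally.
offsets = [-2, -1, 0, 1, 2]
units = {g: [(g + e, g - e) for e in offsets] for g in range(5)}

# Correlation on individual values
xs = [x for pts in units.values() for x, _ in pts]
ys = [y for pts in units.values() for _, y in pts]
r_individual = pearson(xs, ys)

# Correlation on unit means (data aggregated per areal unit)
mean_x = [sum(x for x, _ in pts) / len(pts) for pts in units.values()]
mean_y = [sum(y for _, y in pts) / len(pts) for pts in units.values()]
r_aggregated = pearson(mean_x, mean_y)

print(r_individual, r_aggregated)
```

In this constructed case the individual values are uncorrelated while the unit means are perfectly correlated; real data would normally show a less extreme shift.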
3. Data disaggregation
- $Y_i$ is known for geographic units $A_i$, $i = 1, \dots, I$.
- We want $Y_k$ for subunits $B_k$,
- so that $\sum_{B_k \subset A_i} Y_k = Y_i$.
- Different situations:
  - Only the target variable Y for units $A_i$ is known.
  - A covariable Z is known for units $B_k$, but with limited information on the link between Y and Z.
  - A covariable Z is known for units $B_k$, and the link between Y and Z is rather well known.
  - Individual data with co-ordinates known (generally confidential)
    - Possibly with a covariable
4. Disaggregation without additional information
- First step:
  - Ask yourself: are you really sure that you do not have any additional information?
- If you are sure, you have several options, but none of them is usually good:
  - Attribution proportional to area
  - Smoothing
    - Good for map simplification
    - May be used for disaggregation if the target variable tends to be geographically smooth.
5. Simple areal weighting: illustration
Simulated example of administrative units and catchments
- Aim: attributing to catchments the values of a statistical magnitude known for administrative units
- Method: attributing to each intersection an amount proportional to its area and reaggregating per catchment
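The method above can be sketched in a few lines. This is a minimal illustration, assuming the intersection areas between administrative units and catchments have already been computed (for example with a GIS overlay) and that the catchments cover each administrative unit completely. The function name `areal_weighting`, the unit identifiers and all numbers are hypothetical.

```python
from collections import defaultdict

def areal_weighting(source_values, intersections):
    """Reallocate values from source units to target units by area.

    source_values: {source_id: value known for that source unit}
    intersections: {(source_id, target_id): area of the intersection}
    """
    # Area of each source unit, taken as the sum of its intersections
    source_area = defaultdict(float)
    for (src, _), area in intersections.items():
        source_area[src] += area

    # Each intersection receives a share proportional to its area,
    # then shares are reaggregated per target unit
    target = defaultdict(float)
    for (src, tgt), area in intersections.items():
        target[tgt] += source_values[src] * area / source_area[src]
    return dict(target)

# Hypothetical example: two administrative units, three catchments
admin_values = {"A1": 100.0, "A2": 60.0}
intersections = {
    ("A1", "C1"): 30.0,
    ("A1", "C2"): 10.0,
    ("A2", "C2"): 20.0,
    ("A2", "C3"): 20.0,
}
catchment_values = areal_weighting(admin_values, intersections)
print(catchment_values)
```

The total over catchments equals the total over administrative units, but nothing in the method uses information on where the item actually sits inside each administrative unit.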
6. Effect when the item is spatially concentrated
The representation by administrative unit gives a
poor picture, but reallocating to catchments
worsens things
7. Item with homogeneous distribution in each administrative unit
Reallocation gives a completely wrong picture.
8. Effect when the item is homogeneous per catchment
Representation by administrative unit is quite
bad, but reallocating to catchments does not
improve things.
9. Covariable Z known for sub-units with good information on the link Y-$Z_j$
- Examples of covariables: thematic maps (land cover, soil, DEM, etc.)
- Areal weighting with coefficients proportional to known $U_j$
- For subunit $B_k$
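The formula for subunit $B_k$ is not preserved in the text. One plausible form, given here as an assumption, is $Y_k = Y_i \, W_k / \sum_{k'} W_{k'}$ with $W_k = \sum_j U_j a_{kj}$, where $a_{kj}$ is the area of class $j$ of the covariable inside $B_k$ and the sum over $k'$ runs over the subunits of $A_i$. The sketch below implements that assumed form; the function name, class names and numbers are hypothetical.

```python
def covariable_weighting(y_source, class_areas, coefficients):
    """Split the value of ONE source unit among its subunits.

    y_source:     value Y_i known for the source unit A_i
    class_areas:  {subunit_id: {class_name: area of that class in the subunit}}
    coefficients: {class_name: U_j}, relative intensity of Y per unit area
    """
    # W_k = sum_j U_j * a_kj for every subunit
    weights = {
        k: sum(coefficients[j] * area for j, area in areas.items())
        for k, areas in class_areas.items()
    }
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("all weights are zero: coefficients give no information")
    # Y_k = Y_i * W_k / sum of all W_k, so the subunit values add up to Y_i
    return {k: y_source * w / total for k, w in weights.items()}

# Hypothetical coefficients and land cover areas
U = {"arable": 4.0, "pasture": 1.0, "forest": 0.0}
areas = {
    "B1": {"arable": 10.0, "forest": 30.0},
    "B2": {"arable": 5.0, "pasture": 40.0},
}
y_sub = covariable_weighting(1000.0, areas, U)
print(y_sub)
```

Only the ratios between the coefficients matter here, because the result is rescaled to the known total of the source unit.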
10. Example of disaggregation with good information from co-variables
- Target variable Y: use of fertilizers
- Co-variable Z: CORINE Land Cover (CLC)
- We assume we have reliable data by NUTS 2
- We need:
  - Approximate input per ha of each crop in the area
  - Proportion of area of each crop in each CLC class (can be estimated from LUCAS)
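The two ingredients listed above can be combined into one coefficient per CLC class: the expected input per ha of that class is the sum over crops of (input per ha of the crop) times (proportion of the class area occupied by the crop). The sketch below shows this arithmetic only. All figures, crop names and class names are invented for illustration and are not real fertilizer or LUCAS data.

```python
def class_coefficients(input_per_ha, crop_share_by_class):
    """Expected input per ha for each land cover class.

    input_per_ha:        {crop: approximate input per ha of that crop}
    crop_share_by_class: {clc_class: {crop: share of class area under the crop}}
    """
    return {
        clc: sum(input_per_ha[crop] * share for crop, share in shares.items())
        for clc, shares in crop_share_by_class.items()
    }

# Hypothetical inputs (kg per ha) and hypothetical crop shares per class
input_per_ha = {"wheat": 150.0, "maize": 200.0}
crop_share_by_class = {
    "arable land": {"wheat": 0.5, "maize": 0.2},
    "heterogeneous agriculture": {"wheat": 0.2, "maize": 0.1},
    "pastures": {"wheat": 0.0, "maize": 0.0},
}
coefficients = class_coefficients(input_per_ha, crop_share_by_class)
print(coefficients)
```

Coefficients of this kind could then serve as the weights in an areal weighting constrained to the NUTS 2 totals.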
11. Raw profiles of CLC classes from LUCAS (EU15 except Sweden)
12. But things are not so easy
- CLC profiles with LUCAS need to be improved:
  - Cleaning noise from co-location inaccuracy
  - Adaptation to different geographical areas
- Input per ha of a given crop is not homogeneous.
- Data per NUTS 2 are not necessarily reliable.
- Etc.
- But the perfect is sometimes the enemy of the good.
13. Covariable Z known for sub-units with little information on the link Y-Z
- Disaggregation based on a model with parameters estimated using Y and Z:
  - Defining a mask
  - EM algorithm
  - Iterative estimation with several levels of aggregation
  - Etc.
- Examples of covariables: thematic maps (land cover, soil, DEM, etc.)
14. Simple areal weighting combined with a mask
The mask improves the mapping, but reaggregating
in a different system degrades it again.
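A mask can be combined with areal weighting by counting only the unmasked area of each intersection. The sketch below assumes that the area excluded by the mask (for example water or forest, where the item cannot occur) is known for every intersection. The function name, identifiers and numbers are hypothetical.

```python
from collections import defaultdict

def masked_areal_weighting(source_values, pieces):
    """Areal weighting restricted to the area left after applying a mask.

    source_values: {source_id: known value}
    pieces: list of (source_id, target_id, area, excluded_area), where
            excluded_area is the part of the intersection removed by the mask
    """
    # Usable area per source unit
    usable = defaultdict(float)
    for src, _, area, excluded in pieces:
        usable[src] += area - excluded

    target = defaultdict(float)
    for src, tgt, area, excluded in pieces:
        if usable[src] <= 0:
            raise ValueError(f"source unit {src} is fully masked")
        # Share proportional to the unmasked area of the intersection
        target[tgt] += source_values[src] * (area - excluded) / usable[src]
    return dict(target)

# Hypothetical example: same layout as a plain areal weighting,
# but part of two intersections is masked out
values = {"A1": 100.0, "A2": 60.0}
pieces = [
    ("A1", "C1", 30.0, 20.0),
    ("A1", "C2", 10.0, 0.0),
    ("A2", "C2", 20.0, 20.0),
    ("A2", "C3", 20.0, 0.0),
]
masked_result = masked_areal_weighting(values, pieces)
print(masked_result)
```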
15. Example of disaggregation with an iterative algorithm
- Target variable: population (available by commune)
- Co-variable: land cover map (CLC)
- Output: estimated population density map with the resolution of CLC
16. Disaggregating population density: principle of the iterative algorithm
[Figure: aggregation levels, distinguishing the known levels from the level to be estimated]
17. Iterative algorithm
- Pretend that you only know data at the highest level (NUTS 2)
- Disaggregate to commune level with your covariable (CLC) and an initial set of coefficients
- Measure the disagreement with the known commune data
- Get new coefficients that reduce the disagreement
- Repeat until the disagreement becomes stable
- Apply the estimated coefficients to the commune data.
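The steps above can be sketched as follows. The slides do not state how the new coefficients are obtained, so the multiplicative update used here is an assumption chosen for illustration, not the author's rule. Likewise the sum of absolute differences as disagreement measure, the function names and the toy data are hypothetical. An update is kept only if it reduces the disagreement.

```python
def disaggregate(total, areas, coef):
    """Split `total` among units in proportion to sum_j coef[j] * area_j."""
    weights = {c: sum(coef[j] * a for j, a in cls.items()) for c, cls in areas.items()}
    norm = sum(weights.values())
    return {c: total * w / norm for c, w in weights.items()}

def disagreement(est, known):
    """Sum of absolute differences between estimated and known values."""
    return sum(abs(est[c] - known[c]) for c in known)

def fit_coefficients(known, areas, classes, max_iter=200, tol=1e-9):
    total = sum(known.values())           # pretend only the regional total is known
    coef = {j: 1.0 for j in classes}      # initial set of coefficients
    est = disaggregate(total, areas, coef)
    best = disagreement(est, known)
    for _ in range(max_iter):
        trial = {}
        for j in classes:
            # Assumed update: raise coef[j] when communes rich in class j
            # are underestimated, lower it when they are overestimated
            num = sum(known[c] * areas[c].get(j, 0.0) for c in known)
            den = sum(est[c] * areas[c].get(j, 0.0) for c in known)
            trial[j] = coef[j] * num / den if den > 0 else coef[j]
        trial_est = disaggregate(total, areas, trial)
        d = disagreement(trial_est, known)
        if d > best - tol:                # disagreement has become stable
            break
        coef, est, best = trial, trial_est, d
    return coef, best

def class_densities(known, areas, coef):
    """Apply the coefficients to commune data: density per class and commune."""
    out = {}
    for c, cls in areas.items():
        w = sum(coef[j] * a for j, a in cls.items())
        out[c] = {j: known[c] * coef[j] / w for j in cls}
    return out

# Toy data: known population per commune and land cover areas per commune
population = {"A": 820.0, "B": 280.0, "C": 100.0}
land_cover = {
    "A": {"urban": 8.0, "rural": 2.0},
    "B": {"urban": 2.0, "rural": 8.0},
    "C": {"urban": 0.0, "rural": 10.0},
}
start = disagreement(
    disaggregate(1200.0, land_cover, {"urban": 1.0, "rural": 1.0}), population
)
coef, final = fit_coefficients(population, land_cover, ["urban", "rural"])
density = class_densities(population, land_cover, coef)
print(start, final, coef)
```

By construction the densities, multiplied by the class areas, add up to the known population of each commune, so the last step preserves the commune totals whatever coefficients the iteration produces.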
18. Individual data known (e.g. area frame survey)
- Aggregating data to a different spatial system is easy in principle
  - Possible impact on the variance
- If a covariable is known: small area estimators (Bayesian technique), which use:
  - Sample units inside the small area
  - Link between sample and co-variable everywhere
- Same spatial system desirable for co-variable and results.
19. General comments on disaggregation
- It is always possible to disaggregate and produce a map.
- A different question is the quality of the disaggregation.
- The key point is using pertinent covariables.
- A number of algorithms can be used.
- Assess how precise the link between the co-variable and the target variable is.