Title: Robert Beatty Northern Ireland Statistics and Research Agency
1 Robert Beatty, Northern Ireland Statistics and
Research Agency
2 Designing small area geographies for census and
social data
- David Martin
- School of Geography, University of Southampton
- ESRC/JISC 2001 Census Programme
3 Overview
- Background: why design a new geography?
- 2001 England and Wales (E&W) output area design
- Output area characteristics
- Output area applications
- Development towards Super Output Areas and
Neighbourhood Statistics
4 E&W geography briefing
- Unit postcode: mean 15 addresses
- Postcode and census geographies incompatible
- 1991 Census enumeration district (ED): mean 419
persons
- 2001 Census output area (OA): mean 297 persons
- NB England/Wales, Scotland, NI differences:
transferability comments
5 Why design a new geography?
- Overtaken by population change
- Unsuitability of enumeration geography for data
output
- Population range
- Sub-threshold populations
- No linkage to postal geography/EDs unique to
census
- Boundary placement/internal variation
6 Case for redesign
- Demands from census users
- Everyday and statutory geographies
- Uniformity of population sizes (all above
threshold)
- Control over shape (observe settlement pattern
and topographic features)
- Internal homogeneity of population
- Compatibility with previous census geographies!
7 Output area design overview
- Synthetic building block polygons
- Unit postcodes nesting within wards (Dec 2002)
and parishes, incorporating topographic features
- Automated zoning procedure
- Iterative recombination of building blocks
seeking best trade-off of design constraints
(Openshaw, 1977)
8 Address-based Thiessen polygons
- Thiessen polygons around individual
ADDRESS-POINTS intersected with ED, ward, parish
boundaries and road centrelines
9 Unit postcode building blocks
- Address polygon boundaries dissolved to form
unit postcode polygon building blocks
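The two steps on slides 8 and 9 can be approximated on a grid: each cell goes to its nearest address (the Thiessen rule), and cells are then grouped by that address's postcode (the dissolve). This is only a minimal raster sketch. The address coordinates, the postcodes and the function names are hypothetical, and the intersection with ED, ward and parish boundaries and road centrelines is not attempted.

```python
from math import hypot

# Hypothetical address points: (x, y, unit postcode). Illustrative values only.
ADDRESSES = [
    (1.0, 1.0, "SO17 1AA"),
    (2.0, 1.5, "SO17 1AA"),
    (7.0, 2.0, "SO17 1AB"),
    (8.0, 7.0, "SO17 1AC"),
]

def nearest_address(x, y, addresses):
    """Index of the address closest to (x, y), i.e. the Thiessen cell owner."""
    return min(range(len(addresses)),
               key=lambda i: hypot(addresses[i][0] - x, addresses[i][1] - y))

def postcode_blocks(addresses, width, height):
    """Rasterised stand-in for Thiessen polygons dissolved by postcode.

    Each unit grid cell is assigned to its nearest address and then labelled
    with that address's postcode, so cells owned by addresses sharing a
    postcode end up in the same building block.
    """
    blocks = {}
    for row in range(height):
        for col in range(width):
            owner = nearest_address(col + 0.5, row + 0.5, addresses)
            blocks.setdefault(addresses[owner][2], set()).add((col, row))
    return blocks

blocks = postcode_blocks(ADDRESSES, width=10, height=10)
for postcode, cells in sorted(blocks.items()):
    print(postcode, len(cells))
```

A vector implementation would build true polygons and clip them to the boundary datasets; the grid version only shows the assignment logic.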
10 Transferability issues (1)
- Unit postcodes as building blocks because they
represent everyday geography
- Could be street blocks or any other small unit to
which social data can be referenced
- Building from existing digital datasets but not
a trivial task!
11 Automated output area design
Initial Random Aggregation of Building Blocks
Iterative Recombination
Design Constraints (Contiguity, Thresholds, Shape,
Size, Homogeneity)
2001 Output Areas
12 OA design (1)
Initial random aggregation of postcodes into
potential output areas
13 OA design (2)
Choose one postcode at random as candidate for
swapping into a different output area
14 OA design (3)
Make the swap and evaluate the impact on the
overall solution
15 OA design (4)
If swap does not result in an improvement, go
back to the previous configuration
16 OA design (5)
Choose another postcode at random as candidate
for swapping into another output area
17 OA design (6)
If the swap results in an overall improvement,
keep it as part of the solution and examine a new
potential swap
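The six OA design steps can be sketched as a simple accept-if-better loop. This is a minimal illustration, not the production procedure. The function names, the household counts and the size-only score are assumptions made for the example, and the contiguity and threshold checks are left out. The sketch also lets a postcode move to any zone, whereas a real design run would presumably only consider neighbouring output areas.

```python
import random

def design_zones(blocks, n_zones, score, iterations=2000, seed=0):
    """Swap-based recombination: keep a random move only if the score improves.

    blocks: list of building-block ids.
    score:  function(assignment) -> float, lower is better.
    """
    rng = random.Random(seed)
    # Step 1: initial random aggregation of blocks into zones.
    assignment = {b: rng.randrange(n_zones) for b in blocks}
    best = score(assignment)
    for _ in range(iterations):
        # Steps 2 and 5: choose one block at random as the swap candidate.
        block = rng.choice(blocks)
        old_zone = assignment[block]
        new_zone = rng.randrange(n_zones)
        if new_zone == old_zone:
            continue
        # Step 3: make the swap and evaluate the overall solution.
        assignment[block] = new_zone
        trial = score(assignment)
        if trial < best:
            best = trial                  # Step 6: improvement, keep the swap.
        else:
            assignment[block] = old_zone  # Step 4: no improvement, revert.
    return assignment, best

# Hypothetical household counts per postcode (illustrative values only).
COUNTS = [40, 35, 50, 45, 30, 50, 60, 65, 55, 70]
HOUSEHOLDS = {f"PC{i}": h for i, h in enumerate(COUNTS)}
N_ZONES = 4
TARGET = 125

def size_score(assignment):
    """Sum of squared deviations of zone household totals from the target."""
    totals = [0] * N_ZONES
    for block, zone in assignment.items():
        totals[zone] += HOUSEHOLDS[block]
    return sum((t - TARGET) ** 2 for t in totals)

assignment, best = design_zones(list(HOUSEHOLDS), N_ZONES, size_score)
print(best)
```

Because only improving swaps are kept, the result depends on the random start and may be a local optimum.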
18 Contiguities, thresholds, urban/rural
- Output areas assembled from contiguous postcodes
(NB treatment of stacked postcodes)
- Output areas above 100 person and 40 household
thresholds (NB treatment of sub-threshold
parishes)
- Initial postcode classification to urban/rural
based on DETR boundaries
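The contiguity and threshold requirements can be expressed as two yes/no checks on a candidate output area. The sketch below assumes a hypothetical adjacency table between postcodes and treats the thresholds as "at least 100 persons and at least 40 households". Stacked postcodes and sub-threshold parishes, flagged on the slide as special cases, are not handled.

```python
from collections import deque

MIN_PERSONS = 100      # thresholds quoted on the slide
MIN_HOUSEHOLDS = 40

def is_contiguous(members, neighbours):
    """True if all member postcodes form one connected group.

    Uses a breadth-first search that only steps between adjacent postcodes
    that both belong to the candidate output area.
    """
    members = set(members)
    if not members:
        return False
    start = next(iter(members))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for other in neighbours.get(current, ()):
            if other in members and other not in seen:
                seen.add(other)
                queue.append(other)
    return seen == members

def meets_thresholds(members, persons, households):
    """True if the candidate reaches both the person and household minimum."""
    return (sum(persons[m] for m in members) >= MIN_PERSONS
            and sum(households[m] for m in members) >= MIN_HOUSEHOLDS)

# Hypothetical postcodes A-D: A, B, C lie in a row, D touches none of them.
NEIGHBOURS = {"A": ["B"], "B": ["A", "C"], "C": ["B"], "D": []}
PERSONS = {"A": 60, "B": 50, "C": 45, "D": 30}
HOUSEHOLDS = {"A": 25, "B": 20, "C": 18, "D": 12}

print(is_contiguous(["A", "B", "C"], NEIGHBOURS))
print(meets_thresholds(["A", "B"], PERSONS, HOUSEHOLDS))
```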
19 Size and shape
- Output areas should be as uniformly sized as
possible: target 125 households
- minimize $\sum (\mathrm{OApop} - \mathrm{target})^2$
- Output areas should be as compact as possible
- minimize distance to OA centroid
20 Distance to OA centroid
[Figure: postcode (PC) mean centroids and the OA centroid]
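The size and shape statistics can be written as two small penalty functions. This is a sketch under assumptions: size is measured in households against the 125 target, and the OA centroid is taken as the unweighted mean of the postcode centroids. The slides do not specify either detail, and the function names are illustrative.

```python
from math import hypot

TARGET_HOUSEHOLDS = 125

def size_penalty(oa_households):
    """Sum of squared deviations of OA sizes from the target size."""
    return sum((h - TARGET_HOUSEHOLDS) ** 2 for h in oa_households)

def shape_penalty(postcode_centroids):
    """Total distance from each postcode centroid to the OA centroid.

    The OA centroid is assumed to be the plain mean of the postcode
    centroids; smaller totals indicate a more compact output area.
    """
    n = len(postcode_centroids)
    cx = sum(x for x, _ in postcode_centroids) / n
    cy = sum(y for _, y in postcode_centroids) / n
    return sum(hypot(x - cx, y - cy) for x, y in postcode_centroids)

compact = [(0, 0), (1, 0), (0, 1), (1, 1)]    # square cluster of postcodes
elongated = [(0, 0), (1, 0), (2, 0), (3, 0)]  # postcodes strung along a road
print(shape_penalty(compact) < shape_penalty(elongated))
print(size_penalty([120, 130]))
```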
21 Intra-area correlations
- Maximize intra-area correlations (IAC): ratio of
area level to individual level variance
- Higher correlations: greater internal
homogeneity
- Tenure (4) and dwelling type (7) categories used
22 Combination of constraints
- All constraint statistics recomputed at each
iteration
- Must always meet contiguity and threshold
requirements; urban/rural if possible above
threshold
- Population, shape and homogeneity constraints
combined with equal weighting
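One way to read this slide is that contiguity and thresholds act as hard constraints while the other three statistics are averaged. The sketch below makes two assumptions that the slides do not state: each statistic is expressed as a penalty (lower is better, so homogeneity would enter as something like 1 minus the IAC), and each is scaled by its value in a baseline solution so that equal weights are meaningful. All names and figures are hypothetical.

```python
from math import inf

KEYS = ("size", "shape", "homogeneity")

def combined_score(stats, baseline, feasible):
    """Equal-weight mean of scaled penalties, or infinity if infeasible.

    stats, baseline: dicts of penalties keyed by KEYS; baseline values are
    assumed to be positive. feasible: True only if contiguity and threshold
    requirements are met.
    """
    if not feasible:
        return inf   # hard constraints: such a solution is never accepted
    return sum(stats[k] / baseline[k] for k in KEYS) / len(KEYS)

baseline = {"size": 4000.0, "shape": 12.0, "homogeneity": 0.8}
candidate = {"size": 2000.0, "shape": 12.0, "homogeneity": 0.8}
print(combined_score(candidate, baseline, feasible=True))
```

A score below 1 then means the candidate is better overall than the baseline under this weighting.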
23 2001 Output Areas (n = 175,434)
England and Wales
24 OA sizes summarized
25 1991 Ward, 1991 ED, Code-Point
Portswood, Southampton
26 2001 OA, 1991 Ward, Code-Point
Portswood, Southampton
27 Portswood, Southampton
28 2001 OA, 1991 Parish, 1991 ED
South Molton, Devon
29 County, District, Ward, Output Area
30 Neighbourhood statistics
- In support of national strategy for neighbourhood
renewal
- Aggregated administrative source records,
updated
- Durable core geography
31 Developments towards Super Output Areas
- OAs as building blocks: three-tier geography
- Tier one (pop mean 1500, n ≈ 35000)
- Automated zone design, nesting within census
wards, now created
- Tier two (pop mean 7500, n ≈ 7000)
- Automated design, not constrained to wards, test
areas consultation
- Tier three (pop mean 25k)
- Methodology under consideration
32 Greenwich lower layer SOAs
33 Conclusion: designing small area geographies
- Recognition that collection geographies are
rarely the best output geographies
- User demand for small area statistics related to
everyday use (postcodes, addresses)
- Benefits of stable geographical base
- Knock-on implications for official statistics
34 ESRC/JISC 2001 Census Programme
http://census.ac.uk