The United Kingdom National Area Classification of Output Areas

1
The United Kingdom National Area Classification of Output Areas

Daniel Vickers
with Phil Rees and Mark Birkin
School of Geography, University of Leeds

2
PopFest 2004 was held 22nd - 24th June at the
School of Geography, University of Leeds

Presentations and abstracts can be viewed online at
http://www.geog.leeds.ac.uk/conferences/popfest2004/

3
What will I be talking about today?

Introduction to Area Classification and Output Areas
How the Classification system was made, including:
  What data goes in?
  Methods of standardisation
  Issues of cluster number selection
  Cluster selection
  Cluster creation
  Naming the clusters
How well does the classification discriminate?
  Census data
  Comparing the Core Cities
  Voting patterns
  Deconstructing Rural England
Mapping the Classification
  Focus on Leeds
  A look around the country

4
What is an Area Classification?

A segmentation system which groups similar
neighbourhoods into categories, based on the
characteristics of their residents: a
simplification of complex datasets.

What is an Output Area?

The smallest area for census output
223,060 in the UK
England and Wales: 175,434; minimum size 40 households, 100 people
Scotland: 42,604; minimum size 20 households, 50 people
Northern Ireland: 5,022; minimum size 40 households, 100 people

5
What Goes In?

41 Census Variables covering:
Demographic attributes
  Including: age, ethnicity, country of birth and population density
Household composition
  Including: living arrangements, family type and family size
Housing characteristics
  Including: tenure, type, size, and quality/overcrowding
Socio-economic traits
  Including: education, socio-economic class, car ownership, commuting and health care
Employment attributes
  Including: level of economic activity and employment class type
How many data inputs are involved?
223,060 Output Areas x 41 Variables = 9,145,460 data points
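
The size of the input matrix follows directly from the two counts quoted on the slide. A minimal check of that arithmetic (the variable names are illustrative only):

```python
# Counts as quoted on the slide (2001 Census Output Areas and input variables).
output_areas = 223_060
variables = 41

# One value per Output Area per variable.
data_points = output_areas * variables
print(f"{data_points:,}")
```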

6
The Three Census Problem

The Census in the UK is run by three separate agencies:
  ONS in England and Wales
  GROS in Scotland
  NISRA in Northern Ireland
More than just a problem of stitching the tables together:
  Some tables are given different numbers
  Some of the questions on each table are different
  Some of the questions on the tables are in different places

7
Standardising the Data

Log Transformation

Why?
Reduces the effect of extreme values (outliers)

Range standardisation between 0 and 1

Why?
Problems will occur if there are differing scales or
magnitudes among the variables. In general,
variables with larger values and greater
variation will have more impact on the final
similarity measure. It is therefore necessary to
make each variable equally represented in the
distance measure by standardising the data.
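
The two steps on this slide can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' code: the slide does not give the log base or how zeros were handled, so the sketch uses log10(x + 1), and the input values are made up.

```python
import math

def log_transform(values):
    """Apply log10(x + 1). The base and the +1 offset are assumptions,
    chosen here so that zero values stay defined."""
    return [math.log10(v + 1.0) for v in values]

def range_standardise(values):
    """Rescale values linearly onto [0, 1] using the observed min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:  # a constant variable carries no information
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical values for one variable across five Output Areas,
# with one extreme value (99.0).
raw = [0.0, 2.0, 5.0, 9.0, 99.0]

scaled_raw = range_standardise(raw)                 # range standardisation only
scaled_log = range_standardise(log_transform(raw))  # log first, then range

for r, a, b in zip(raw, scaled_raw, scaled_log):
    print(r, round(a, 3), round(b, 3))
```

Without the log step, the single extreme value squeezes the other four areas into the bottom tenth of the 0 to 1 range. With it, they spread across the lower half, which is the outlier effect the slide describes.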
8
Issues of Cluster Number Selection

When choosing the number of clusters to have in
the classification, there were three main issues
to consider.
Issue 1: Analysis of average distance from
cluster centres for each cluster number option.
The ideal solution would be the number of
clusters which gives the smallest average distance
from the cluster centre across all clusters.
Issue 2: Analysis of cluster size homogeneity for
each cluster number option. It would be useful,
where possible, to have clusters as similar in
size as possible in terms of the number of
members within each.
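
The first two issues can be compared across candidate cluster numbers with a short script. The sketch below is illustrative only. It uses a plain k-means written from scratch, synthetic two-dimensional points in place of the census variables, and the gap between the largest and smallest cluster as a stand-in measure of size homogeneity. None of these choices come from the presentation.

```python
import math
import random

def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means on a list of equal-length tuples (illustrative only)."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # start from k randomly chosen points
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centres[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous centre
                centres[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centres

def evaluate(points, k):
    """Return (average distance to assigned centre, list of cluster sizes)."""
    labels, centres = kmeans(points, k)
    mean_dist = sum(math.dist(p, centres[l])
                    for p, l in zip(points, labels)) / len(points)
    sizes = [labels.count(c) for c in range(k)]
    return mean_dist, sizes

# Synthetic stand-in data: three groups of 30 points each.
rng = random.Random(1)
true_centres = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
          for cx, cy in true_centres for _ in range(30)]

results = {k: evaluate(points, k) for k in range(1, 7)}
for k, (mean_dist, sizes) in results.items():
    # Issue 1: average distance; Issue 2: spread between largest and smallest cluster
    print(k, round(mean_dist, 3), sizes, max(sizes) - min(sizes))
```

Average distance to the centre generally keeps falling as clusters are added, so in practice it has to be weighed against size homogeneity and the target number of groups rather than minimised on its own.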

9
Issues of Cluster Number Selection

Issue 3: The number of clusters produced should
be as close to the perceived ideal as possible.
This means that the number of clusters needs to
be of a size that is useful for further analysis.
At the highest level of aggregation, the cluster
groups should be about 6 in number to enable good
visualisation, and these clusters should also be
given descriptive names.
At the next level of aggregation, the number of
groups should be about 20. This would be good for
conceptual customer profiling.
At the next level of aggregation, the number of
groups should be about 50. This can be used for
market propensity measures from the larger
commercial surveys.
(Personal Communication 2003, from Martin
Callingham, Independent Market Research
Consultant and Birkbeck College, co-editor of
Qualitative Market Research: Principle and
Practice, Sage, 2003)

10
Cluster Selection

A three-tier hierarchy: 7, 21 and 52 clusters

First level: target 6; 7 selected based on
analysis of average distance from cluster centre
and size of each cluster.
Second level: target 20; 21 selected based on
analysis of average distance from cluster centre
and size of each cluster.
Third level: target 50; 52 selected based on size
of each cluster. Each second-level cluster split
into either 2 or 3 groups.

11
Cluster Creation

Modified k-means clustering
First level: run as standard k-means
Second level: the first-level clusters are split into
separate files and each file is clustered separately
Third level: the second-level clusters are split into
separate files and each file is clustered separately
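
The top-down procedure described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation. In-memory groups stand in for the separate files, the k-means routine is a plain one written for the example, the data are synthetic, and every parent is split into a fixed number of children, whereas the classification itself used 2 or 3 at the third level.

```python
import math
import random

def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means on a list of equal-length tuples (illustrative only)."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centres[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # an empty cluster keeps its previous centre
                centres[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centres

def split_level(points, parent_labels, children_per_parent):
    """Cluster the members of each parent cluster separately.

    Labels are tuples, so a child label is its parent's label plus one
    extra element, e.g. (1,) -> (1, 0) -> (1, 0, 1).
    """
    child_labels = [None] * len(points)
    for parent in sorted(set(parent_labels)):
        idx = [i for i, l in enumerate(parent_labels) if l == parent]
        subset = [points[i] for i in idx]  # stands in for one "file"
        k = min(children_per_parent, len(subset))
        sub_labels, _ = kmeans(subset, k)
        for i, s in zip(idx, sub_labels):
            child_labels[i] = parent + (s,)
    return child_labels

# Synthetic stand-in data: two broad groups, each made of two sub-groups.
rng = random.Random(2)
blob_centres = [(0, 0), (0, 3), (10, 0), (10, 3)]
points = [(rng.gauss(cx, 0.4), rng.gauss(cy, 0.4))
          for cx, cy in blob_centres for _ in range(20)]

top_labels, _ = kmeans(points, 2)              # first level: standard k-means
level1 = [(l,) for l in top_labels]
level2 = split_level(points, level1, 2)        # second level: split each first-level cluster
level3 = split_level(points, level2, 2)        # third level: split each second-level cluster

print(len(set(level1)), len(set(level2)), len(set(level3)))
```

Because each level is clustered only within its parent, every lower-level cluster nests exactly inside one higher-level cluster, which a single k-means run at each level would not guarantee.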

12
Cluster Creation
13
Naming the Clusters
The naming of the clusters is a near-impossible
task and one that always provokes much debate.
However, the task is very important: if it is
done wrongly, it can create a false impression of
the people within a cluster. The naming must
follow two general principles:
1. Must not offend residents
2. Must not contradict other classifications or
use already established names.

14
How Well Does It Discriminate?
Detached Housing
15
How Well Does It Discriminate?
Population Density
16
How Well Does It Discriminate?
Indian, Pakistani and Bangladeshi
17
How Well Does It Discriminate?
Unemployed
18
Comparing the Core Cities
19
Who Does Each Type Vote For?
2001 Election Data courtesy of Ed Fieldhouse,
CCSR, University of Manchester
20
Deconstructing Rural England (Devon case study)
Devon average: 31; UK average: 12.5
21
Focus On Leeds
Map appears in the forthcoming book Twenty-First
Century Leeds: Geographies of a Regional City,
edited by Rachael Unsworth and John Stillwell
Boundaries: Community Areas, as defined by Pete
Shepherd, School of Geography, University of
Leeds (built from Output Areas)
22
Consultation
62 respondents so far: 33 Academics, 28 Local Government
Two most confused types: 4 Blue Collar Communities and 6 Constraints of Circumstance
Easiest type to identify: 5 Idyllic Countryside
Consultation to end 4/10/2004
Results as at 10/9/04
23
Where would you like to go?
Belfast Brighton Birmingham Bradford Bristol Cambridge Cardiff Carlisle Derby Dundee Edinburgh Exeter Glasgow Hull Ipswich
Leeds Leicester Lincoln Liverpool London Manchester Newcastle Norwich Nottingham Oxford Plymouth Sheffield Southampton Swansea York
