Title: Rich Harris Scott Orford
1 Rich Harris, Scott Orford
- There's space in them lines
- The geography of multilevel modelling
- Downloadable from www.geodemographics.info
2 Introduction
- We argue that geographers should take numbers and
statistics seriously not because they are
necessarily nature's own language, which was
their justification in the quantitative
revolution, but because they are a crucial
component in the construction of social reality
(Barnes, 2001: 379).
- The still apparent dismissal of quantitative
approaches under the label of positivism (by
some) hides the complex spatial formations
implicit or explicit to specific techniques.
3 About multilevel modelling (MLM)
- Addresses the geographical paradox of traditional
regression models
- In traditional regression modelling, spatial
autocorrelation (localized patterns) in the
residuals is a violation of the regression
assumptions.
- In contrast, in multilevel modelling:
- In conceptual terms, it can be seen that within a
multilevel approach heterogeneity and difference
both between people and contexts are seen as
the norm, not an aberration (Duncan et al., 1998:
104, emphasis added).
- In a geographical context this is usually the
difference(s) between people at one scale and
places at another.
4 Conceptual model
[Figure: diagram labelled i, j, x and y]
5 which (in model terms) is equivalent to
[Figure: diagram labelled i, j, x and y]
6 Spatial structure
- Vertical not horizontal linkage
- i.e. individuals are placed into groups at a
higher level
- The physical distances between members within
groups make no difference in algorithmic terms
- Discrete object (vector) view of space
- Contrasts with e.g. Geographically Weighted
Regression (GWR)
- $y_i = \beta_0(u_i, v_i) + \sum_k \beta_k(u_i, v_i) x_{ik} + \varepsilon_i$
- $\beta_k(u, v)$ is a continuous function (at a fixed
level)
- Regression coefficients are not assumed to be
random but a deterministic function of location
in geographical space
- A model of spatial interaction is assumed a
priori
- i.e. the (x, y) location of an observation is
another attribute (measurement) that is used in
the calculation
- Discrete/Field distinction is not clear-cut.
7 [Figure: locations i, j and k, with weights $W_{ij} \propto 1/d^{c}$]
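The contrast with GWR can be made concrete with a small sketch. It reads the weight on the slide above as an inverse-distance power weight, 1/d^c, which is an assumption about the garbled original, and it fits a one-predictor weighted least squares line around a focal location. The function names, the toy data and the small `floor` guard against zero distances are all hypothetical choices for illustration, not part of the talk and not a full GWR implementation.

```python
import math

def inverse_distance_weights(focal, points, c=2.0, floor=1e-9):
    """Weights w_j = 1 / d_j**c from a focal (u, v) location to each point.

    `floor` is an assumed guard so that a zero distance does not divide by zero.
    """
    weights = []
    for (u, v) in points:
        d = math.hypot(u - focal[0], v - focal[1])
        weights.append(1.0 / max(d, floor) ** c)
    return weights

def local_fit(focal, points, x, y, c=2.0):
    """Weighted least squares for y = b0 + b1 * x, weighted towards `focal`."""
    w = inverse_distance_weights(focal, points, c)
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Toy data: the x-y slope is 1 in the west (u = 0) and 3 in the east (u = 10).
points = [(0, 0), (0, 1), (0, 2), (10, 0), (10, 1), (10, 2)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
y = [1.0, 2.0, 3.0, 3.0, 6.0, 9.0]

west = local_fit((-1.0, 1.0), points, x, y)
east = local_fit((11.0, 1.0), points, x, y)
print("local slope near the west:", round(west[1], 2))
print("local slope near the east:", round(east[1], 2))
```

The estimated slope changes smoothly with the focal location because every observation contributes with a distance-dependent weight. In a multilevel model, by contrast, only group membership matters and the distances within a group play no role.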
8 Uses of multilevel modelling
- Often looking for neighbourhood (/contextual)
effects
- We would expect spatial differences anyway
- Compositional relationship
- $Y \propto X$
- So if $x_1 > x_2$ then $y_1 > y_2$
- But, we are looking for second order effects
- $y_{ij} \propto (x_{i(j)}, J)$
- These could be exogenous (e.g. regional
effects)
- Or, endogenous (spatial interaction or
collective effects)
- The underlying rationale of area-based policies
is that concentrations of deprivation give rise
to problems greater than the sum of its parts
(McCulloch, 2001: 687).
9 Partitioning variance
- There is nothing magical about multilevel
models; the principal difference between them and
simple OLS regression models is that multilevel
models permit complex error terms (i.e., variance
components) by using sophisticated computational
algorithms (Oakes, 2003: 1934).
- There is an individual-level, micro-model which
represents the within-place equation, and an
ecological, macro-model in which the parameters
of the within-place model are the responses in
the between-places models. This simultaneous
specification allows for the separation, in a
quantitative sense, of the compositional from the
contextual (Duncan et al., 1998: 102).
- $y_{ij} = \beta_{0j} + \beta_{1j} x_{ij} + e_{ij}$
- $\beta_{0j} = \beta_0 + u_{0j}$
- $\beta_{1j} = \beta_1 + u_{1j}$
- $N(0, \Omega)$
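As a rough illustration of what partitioning variance means, the sketch below simulates the intercept-only special case of the model above (a place-level departure plus an individual-level error, with no predictor) and recovers the two variance components. It uses a simple one-way ANOVA (method of moments) estimator for balanced groups rather than the iterative algorithms that multilevel software relies on. All names, sample sizes and parameter values are assumptions chosen for illustration.

```python
import random
import statistics

def simulate_null_model(n_groups=200, n_per_group=20, beta0=10.0,
                        sd_u=1.5, sd_e=3.0, seed=1):
    """Simulate y_ij = beta0 + u_0j + e_ij for a balanced two-level design."""
    rng = random.Random(seed)
    groups = []
    for _ in range(n_groups):
        u0j = rng.gauss(0.0, sd_u)  # place-level departure from the grand mean
        groups.append([beta0 + u0j + rng.gauss(0.0, sd_e)
                       for _ in range(n_per_group)])
    return groups

def variance_components(groups):
    """One-way ANOVA estimates of (between, within) variance, balanced groups."""
    n = len(groups[0])
    n_groups = len(groups)
    group_means = [statistics.fmean(g) for g in groups]
    grand_mean = statistics.fmean(group_means)
    ss_within = sum((y - m) ** 2
                    for g, m in zip(groups, group_means) for y in g)
    ss_between = n * sum((m - grand_mean) ** 2 for m in group_means)
    ms_within = ss_within / (n_groups * (n - 1))
    ms_between = ss_between / (n_groups - 1)
    sigma2_e = ms_within
    # The moment estimate can go negative in small samples; truncate at zero.
    sigma2_u = max((ms_between - ms_within) / n, 0.0)
    return sigma2_u, sigma2_e

groups = simulate_null_model()
sigma2_u, sigma2_e = variance_components(groups)
vpc = sigma2_u / (sigma2_u + sigma2_e)  # share of variance between places
print(f"between-place variance: {sigma2_u:.2f}")
print(f"within-place variance:  {sigma2_e:.2f}")
print(f"variance partition coefficient: {vpc:.2f}")
```

With the simulated values used here (between-place variance 2.25, within-place variance 9), the share of variance lying between places should come out at roughly 0.2.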
10 Example of some recent research questions
- In a system where parents have constrained choice
as to which schools their children attend:
- What factors are associated with a pupil
attending a near school?
- Is a consequence of pupils not attending a near
school to increase observed ethnic segregation at
the school level over and above that expected
from the neighbourhood from which they are drawn?
- Does increased segregation (if it occurs) affect
school performance?
11 General model structure
- Cross-classified, multilevel model
12 Random intercepts, random slope model
[Figure: p(NR2) against X, with axis ticks 0 and 1. Annotations: grand mean; departure from mean for given Mosaic type; departure from mean for given school]
13 (for white pupils in settlement A)
14 [Figure labels: C20 Asian Enterprise; C26 South Asian Industry; D23 Industrial Grit; G42 Low Horizons]
Crudely, this is sensitivity to local whiteness
15 Some (technical) benefits of MLM
- Explicitly models spatial autocorrelation and
spatial heterogeneity (which are both forms of
geographical context)
- Improved (precision weighted) estimates and
adjusted standard errors
- Individual level parameters estimated using a MLM
will differ from those estimated using a single
level model in the presence of spatial
autocorrelation (SA)
- Hence some aspect of context (geography) is
built into (implicitly expressed by) the
individual level MLM estimates (i.e. components)
- Even if there were no higher / second order
effects, in the presence of spatial
autocorrelation you would still need a MLM
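The idea of a precision weighted estimate can be sketched for the simplest case, a random-intercept model in which the two variance components are treated as known. Each place mean is pulled towards the grand mean, and the pull is stronger when the place has few observations. The function names and the numbers are hypothetical; fitted multilevel software estimates the variance components and the grand mean jointly rather than taking them as given.

```python
def shrinkage_weight(n_j, sigma2_u, sigma2_e):
    """Reliability of a place mean: between variance over total variance of the mean."""
    return sigma2_u / (sigma2_u + sigma2_e / n_j)

def precision_weighted_mean(group_mean, n_j, grand_mean, sigma2_u, sigma2_e):
    """Shrink a raw place mean towards the grand mean according to its precision."""
    lam = shrinkage_weight(n_j, sigma2_u, sigma2_e)
    return lam * group_mean + (1.0 - lam) * grand_mean

# Assumed values: grand mean 6, raw place mean 10,
# between-place variance 1, within-place variance 4.
small = precision_weighted_mean(10.0, n_j=4, grand_mean=6.0,
                                sigma2_u=1.0, sigma2_e=4.0)
large = precision_weighted_mean(10.0, n_j=400, grand_mean=6.0,
                                sigma2_u=1.0, sigma2_e=4.0)
print("place with 4 observations:  ", small)
print("place with 400 observations:", round(large, 3))
```

A place with only four observations is shrunk halfway to the grand mean, while a place with four hundred keeps an estimate close to its raw mean.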
16 Critiques of multilevel modelling (1)
- The Modifiable Areal Unit / definition of
neighbourhood hierarchy problems
- There is no single, generalisable
interpretation of the neighbourhood (Kearns &
Parkinson, 2001: 2103).
- In all applications of multilevel modelling we
are required to believe that (a) people live out
their lives within a fixed spatial hierarchy,
that (b) we can identify and quantify that
hierarchy, and that (c) this hierarchy will be
appropriate for our whole sample population
(Mitchell, 2001: 1358).
17 Critiques of multilevel modelling (2)
- The missing variable problem
- If studies of neighbourhood effects fail to
control adequately for the influence of
neighbourhood and household characteristics, they
may attribute to neighbourhoods what are really
the effects of the omitted household and
individual variables
- Some relevant individual characteristics are
harder to observe, however, and are generally not
captured in empirical research (McCulloch, 2001:
670, emphasis added).
18 Critiques of multilevel modelling (3)
- The slicing scales problem
- Although human geography has worked for some
time with the truism that people make places,
just as places make people, it is striking how
often the story of health and place stops short
of embracing this mutuality (Smith & Easterlow,
2005: 176, emphasis added).
- Multilevel approaches ask us to make a formal
distinction between characteristics which are
individual and those which are area based; this
is a step backwards in terms of our understanding
of how people and place are related (Mitchell,
2001: 1358).
19 Critiques of multilevel modelling (4)
- The C-word dualism
- The distinction between composition and context
may not be as conceptually clear or as useful as
may appear at first glance
- Composition and context are frequently
treated as unproblematic and obvious
distinctions, and the underlying causal models
are often implicit
- Context is often treated as a residual
category, containing those factors which remain
once individual compositional characteristics
are taken into account (Macintyre et al., 2002:
129)
20 Critiques of multilevel modelling (5)
- The confounding / selection problem
- The gaze of MLM tends to be top down (i.e. how
the characteristics of neighbourhoods affect
those of individuals)
- What is missing is a sense of how biographical
outcomes are themselves influenced by health
trajectories; the possibility that people whose
health is already compromised might actively be
placed into deprivation is rarely entertained
(Smith & Easterlow, 2005: 177).
- With respect to neighbourhood effects research,
the trouble with observational designs is that
people are selected into neighbourhoods; they
are not randomly distributed (Oakes, 2003: 1932,
emphasis added).
- There can be no question that social structures
and relations impact health and that disturbing
disparities exist. And it is patently obvious
that health varies with neighbourhood. The
problem is that such phenomena are, per force,
dependent happenings and as such render
ineffective (multilevel) regression models aiming
to identify independent effects (Oakes, 2003:
1944).
21 Critiques of multilevel modelling (6)
- The over-emphasising context problem
- The indication from these quantitative studies
[Pickett & Pearl, 2001] was that area variations
in health are incidental rather than fundamental;
that similar people have similar health
experiences no matter where they live; that,
statistically, composition explains (much) more
than context (Smith & Easterlow, 2005: 175,
emphasis added).
22 Critiques of multilevel modelling (7)
- Who or what is the underlying population?
- MLM establishes significance estimates of second
order (neighbourhood) effects by treating the
higher level units as a sample of an underlying
population
- But what if your sample (size of dataset) means
it essentially is the population?
- What does (long term) probability mean within an
open and dynamic (social) system?
23 Critiques of multilevel modelling (8)
- Theories of neighbourhood effects are
underdeveloped
- Peer-group norms, the absence of successful role
models, access to community-based social capital,
real and perceived opportunity costs, and both
personal efficacy and collective efficacy might
all play important roles in explaining any
neighbourhood effects (McCulloch, 2001: 1367).
- Different groups of people living in the same
places may have different experiences and
concepts of neighbourhood (e.g. males and
females in a particular place may have different
local neighbourhoods, borne out by differences
in day-to-day experiences).
24 Critiques of multilevel modelling (9)
- All these critiques build up to challenge the
ability to establish cause
- The validity and generalisability of
neighbourhood effects remain open to question,
and as yet there has been little empirical
investigation of the causal pathways by which
social environments translate into biological
states of health and disease (Pickett & Pearl,
2001: 111).
- The causal effect of neighbourhood contexts on
health continues to confuse and elude us (Oakes,
2003: 1930, original emphasis).
25 Critical fallacy (1)
- Because there are no apparent contextual
effects there is no geography / geography does
not matter
- A compositional explanation for observed area
variations in social and economic problems argues
that areas of concentrated disadvantage arise
solely because of the varying distribution of
types of people whose individual characteristics
influence their social and economic outcomes.
That is, similar types of people will have
similar experiences no matter where they live. It
is therefore argued that people rather than areas
should be targeted (McCulloch, 2001: 668,
emphasis added).
- People often don't just live anywhere
- The distribution of people is the geography
(Mitchell, 2001: 1358)
- Ironically, multilevel practice may actually
obscure that geography by focusing on higher
order contextual effects
26 Critical fallacy (2)
- If there is no evidence of a second order
contextual effect then a MLM is not needed
- Even without evidence of a second order
contextual effect the modelling procedure needs
to accommodate the geography of composition (i.e.
the spatial autocorrelation)
- While the technique represents a powerful means
of investigating complex forms of contextuality,
it carries no built in assumption regarding the
importance of context. Multilevel models are just
as capable of showing that context does not
matter when theoretical and empirical work has
suggested it might (Duncan et al., 1998: 109,
after Mason, 1991).
27 Critical fallacy (3)
- Multilevel modelling can't be longitudinal
(spatio-temporal).
- Assuming the data are available, the movement of
individuals between places over time can be
investigated
28 Spatio-temporal model
[Figure: diagram labelled i, j(t0) and j(t1)]
29 Conclusions
- MLM research has highlighted that geographers
still have problems with basic concepts such as
context, scale, neighbourhood etc.
- Problems of making operational these vague and
contested concepts in quantitative research
(especially health research)
- Some issues are very old (i.e. MAUP, temporal
dimensions, the problems of dealing with
processes (i.e. dynamic v static), identifying and
measuring cause and effect)
- Methodological issues
- What spatial scales are appropriate (and what are
actually available)
- What time scales are appropriate (e.g. with
respect to effects on health)
- Better theorisation of what is meant by context
and how it is expressed at the different levels
of the ML hierarchy.
30
- A better theoretical understanding of how
socio-economic processes operate and the scales
(spatial and temporal) at which they operate.
- When to use MLM and when not to use MLM (e.g.
discrete object v field conceptualisations)
- Critical analysis of ML research in different
subject areas (e.g. health, education, housing,
voting) in order to gain a better understanding
of compositional / contextual effects.
- The dependency of MLM on iterative computation,
simulation, approximation and (probability based)
solution finding also raises interesting
questions on research practices within
statistical sciences, e.g. what it means to
prove anything