Title: Centre for Market and Public Organisation
1 Centre for Market and Public Organisation
Measuring socio-economic position in ALSPAC
Liz Washbrook, CMPO
Liz.Washbrook_at_bristol.ac.uk
ESRC/ALSPAC Large Grant Meeting, 5th November 2008
2 But first! US cohort studies
- Early Childhood Longitudinal Study, Birth Cohort (ECLS-B)
  - 10,000 children born in 2001, nationally representative when weighted
  - Over-samples of low birth weight babies, twins and some ethnic groups (e.g. Native Americans, Chinese)
  - Sampled from birth certificates; follow-ups at 9 months, 2 years, Fall prior to kindergarten (4y) and Fall of kindergarten year (5y). But no more!
  - Data from parent CAPI, direct child assessments, child care providers and teachers. Some resident and non-resident father questionnaires.
- Early Childhood Longitudinal Study, Kindergarten Cohort (ECLS-K)
  - 20,000 children starting kindergarten in 1998 (b. 1992/3)
  - Children sampled from 1277 schools in 100 counties. Target of 24 children per school. Nationally representative when weighted.
  - Follow-ups in Fall and Spring of kindergarten year (5-6y), Fall and Spring of 1st grade (6-7y), and Spring of 3rd grade (9y), 5th grade (11y) and 8th grade (14y)
  - Data from direct child assessments, parental phone interviews, teacher and school administrator questionnaires.
- Data are publicly available (on CD). See http://nces.ed.gov/ECLS/index.asp
3 US cohort studies
- Fragile Families
  - 5000 children born 1998-2000 in large US cities
  - Designed to follow children born to unmarried parents but includes a control sample of married parent families (25%).
  - Focus on deprived families: at baseline, 44% of mothers black, 35% Hispanic, 27% teenagers, 79% high school or less
  - Detailed information on fathers' roles and involvement
  - Parent interviews in hospital at birth, follow-ups at ages 1, 3, 5 and 9. Includes direct in-home child assessments.
  - Data publicly available: www.fragilefamilies.princeton.edu/index.asp
4 Aims
- Aim to stimulate discussion about the construction of an index of parental socio-economic position (SEP) from the ALSPAC data
- Talk will cover
  - The range of indicators available and their features
  - Sample selection/missingness issues (multiple imputation)
  - Combining the indicators into a single index (principal components analysis)
  - Illustrated using a case study: measures of social inequality in Key Stage 2 exam results (age 11)
- Would a standard SEP variable available to all ALSPAC researchers be useful?
  - If so, how should it be constructed?
- Input, feedback, discussion would be appreciated!
5 What is SEP?
- Extensive literature on theories of social stratification (Galobardes, Lynch and Davey Smith, 2007; Bradley and Corwyn, 2002).
- Socially derived economic factors that influence what positions individuals or groups hold within the multiple-stratified structure of society (Galobardes et al)
- In practice researchers have used a multitude of individual indicators to measure SEP, each of which captures a different aspect of stratification
- Composite SEP is a relative measure, whereas some indicators (income, education) measure absolute levels of resources. This may have implications when thinking about policy.
6 Why measure parental SEP?
- SEP as a summary measure of family background that defines sub-groups of the population.
  - Social mobility/life chances
  - Nature vs. nurture
  - Example: joint CMPO project on the role of attitudes and aspirations in explaining the educational deficits of children in poverty
- SEP as a way of capturing long-term access to resources over the life course, e.g. permanent income in economics
- To classify deprived or vulnerable groups in a way that captures the idea of multiple risks
- As a control for confounding influences (e.g. studying the effects of smoking)? Disaggregated sets of control variables may be more appropriate
7 SEP indicators in ALSPAC
- How is the indicator constructed from multiple pieces of information? (High frequency of measurement in ALSPAC)
- How is the indicator distributed? (E.g. discrete/continuous)
- For whom is it available? (Differential missingness)
- How well does it distinguish between high- and low-performing children? (KS2 is an example; relationships will differ with different outcomes)
8 The sample
- 11 071 children with
  - A valid Key Stage 2 score
  - Minimum of 2 (out of 10) non-missing SEP indicators (30% complete cases)
- Sample is 69% of the eligible birth cohort (15 994 in NPD)
- Key Stage 2 score derived from exam marks in English, maths and science in Year 6 (age 11). National tests compulsory in all state schools.
- Test scores are averaged and normalised to mean zero, standard deviation 1 on the full eligible population of 15 994
- The working sample is not randomly selected
  - Mean KS2 (SD)
  - Working sample (N = 11071): 0.11 (0.95)
  - <2 SEP indicators (N = 4923): -0.26 (1.05)
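The normalisation step on this slide can be sketched roughly as follows. The scores and the sample flags are made-up toy values, not ALSPAC data, and the function name `standardise` is an assumption for illustration. The point is only that the mean and SD come from the full eligible population, so a selected working sample need not have mean zero.

```python
from statistics import mean, pstdev

def standardise(scores):
    """Rescale scores to mean 0 and SD 1 over the full list supplied."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population SD: the eligible cohort is treated as the population
    return [(s - mu) / sigma for s in scores]

# Hypothetical average test marks for a tiny "eligible population".
raw = [52.0, 61.0, 47.0, 70.0, 58.0, 39.0]
# Hypothetical flag: True if the child has at least 2 non-missing SEP indicators.
in_working_sample = [True, True, False, True, True, False]

z = standardise(raw)
working = [v for v, keep in zip(z, in_working_sample) if keep]
excluded = [v for v, keep in zip(z, in_working_sample) if not keep]

# The full population has mean 0 by construction; the two sub-samples do not.
print(round(mean(working), 2), round(mean(excluded), 2))
```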
9 Household income
Measures: take-home weekly family income at 33, 47, 85 and 97 months and at 11 years
Proportion of valid responses in bands
Failure to update the bands means that the usefulness of the 85 and 97 month income measures is limited.
10 Household income
The age 11 income measure is better
11 Household income
- The SEP index uses
  - Log average real equivalised weekly take-home income at 33 and 47 months
    - Median income for band imputed using FES data for households containing a child of the cohort member's age, in the relevant year and income interval
    - Adjustment made for housing benefit income if respondent reports zero housing expenditures and lives in rented accommodation (predicted value from FES for HB recipients in the Southwest, varying with year, lone parent status and number of under 16s in household)
    - Expressed in 1995 prices using All Items RPI
    - Equivalised using modified OECD scale
    - Averaged and logged
  - Nominal banded income at 85 months
  - Nominal continuous income at 11 years, using band midpoints
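The last three steps of the preschool income measure (deflate, equivalise, then average and log) might look something like the sketch below. It assumes the conventional modified OECD weights (1.0 for the first adult, 0.5 for each further person aged 14 or over, 0.3 for each child under 14). The price index values, the household compositions, the function names and the decision to average over whichever waves are non-missing are all illustrative assumptions, not details taken from the slides.

```python
import math

def oecd_modified_scale(n_adults, n_children_under_14):
    """Modified OECD scale: 1.0 first adult, 0.5 each further adult, 0.3 each child."""
    return 1.0 + 0.5 * max(n_adults - 1, 0) + 0.3 * n_children_under_14

def real_equivalised(nominal, price_index, base_index, n_adults, n_children):
    """Convert nominal income to base-year prices, then divide by the scale."""
    real = nominal * base_index / price_index
    return real / oecd_modified_scale(n_adults, n_children)

def log_average_income(waves, base_index):
    """Log of the mean real equivalised income over the non-missing waves."""
    values = [
        real_equivalised(w["income"], w["rpi"], base_index, w["adults"], w["children"])
        for w in waves
        if w["income"] is not None
    ]
    if not values:
        return None  # no valid income report at either wave
    return math.log(sum(values) / len(values))

# Hypothetical household: imputed band medians at 33 and 47 months,
# with placeholder price index values (base year index = 100.0).
waves = [
    {"income": 250.0, "rpi": 101.5, "adults": 2, "children": 1},
    {"income": 270.0, "rpi": 104.0, "adults": 2, "children": 2},
]
print(log_average_income(waves, base_index=100.0))
```

Taking the log after averaging follows the order given on the slide ("Averaged and logged"); logging each wave first and then averaging would give a different number.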
12 Average KS2, by preschool income quintiles
13 Average KS2, by nominal income at age 7
14 Average KS2, by nominal income quintiles at age 11
15 Parental education
- Measures: mother and partner reports of both spouses' qualifications, antenatally and at 61 and 97 months.
- The SEP index uses maternal reports of own and partner's highest qualification at 32 weeks gestation.
- Issues
  - Non-response to the question is coded as no qualifications ("don't know", "no quals" and "no partner" were all possible responses)
  - Possible discrepancies between own and partner report
  - Possible changes in the identity of the partner over time
  - Possible changes in qualifications over time
16 Average KS2, by mother's highest qualification
17 Average KS2, by partner's highest qualification
18 Parental social class
- Measures: mother reports of own and partner's occupation, antenatally and at 8 and 97 months. Partner reports more frequent but not coded.
- The SEP index uses maternal reports of own and partner's social class at 32 weeks gestation.
  - Question related to occupation in current or last job
  - Occupations coded according to 1991 SOC classification
  - Used to derive Registrar General's Social Class; this is what is available in the datafiles. Hierarchical measure.
  - No other data on occupation is currently coded
19 Average KS2, by mother's social class
20 Average KS2, by partner's social class
21 Housing tenure
- Measures: mother reports of tenure at 8, 21, 33 and 61 months.
- The SEP index uses a derived variable
  - Always owner-occupier: mortgaged/owned outright/buying from council at all 4 dates
  - Ever in social housing: council rented/Housing Association rented at any of 4 dates
  - Other: not otherwise classified and at least one valid response (other responses: private rented furnished/unfurnished, other). Includes all people with a missing value who were never observed in social housing, as well as renters.
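One way to read the three-category rule above is sketched below. The response labels and the function name `classify_tenure` are hypothetical stand-ins rather than the actual ALSPAC codes, and returning `None` when there is no valid response at any date is an assumption about how fully missing cases are handled.

```python
OWNER = {"mortgaged", "owned outright", "buying from council"}
SOCIAL = {"council rented", "housing association rented"}

def classify_tenure(reports):
    """Derive the tenure category from reports at the 4 dates (None = missing)."""
    valid = [r for r in reports if r is not None]
    if not valid:
        return None  # assumption: no valid response at all stays missing
    if any(r in SOCIAL for r in valid):
        return "ever in social housing"
    # "Always" requires an owner-occupier response at every one of the dates,
    # so a case with any missing date cannot qualify.
    if len(valid) == len(reports) and all(r in OWNER for r in valid):
        return "always owner-occupier"
    return "other"

print(classify_tenure(["mortgaged", "mortgaged", "owned outright", "mortgaged"]))
print(classify_tenure(["private rented", None, "council rented", "mortgaged"]))
print(classify_tenure(["mortgaged", None, "mortgaged", "mortgaged"]))
```

The third call illustrates the point made on the slide: an owner with one missing report falls into "other" because they were not observed as an owner at all 4 dates.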
22 Average KS2, by housing tenure 8-61 months
23 Local deprivation/affluence
- Measures: ward-level Index of Multiple Deprivation (IMD), currently matched at birth, age 5 and age 8, but postcodes available on an annual basis
- The SEP index uses the (continuous) rank of the IMD for ward at birth
  - IMD provided by government statistics. Derived from data in 6 domains: income, education, employment, housing, health, access to services
  - Wards in England (approx. 5500 individuals each) ranked on the basis of deprivation from 1 to 8414. This allows definition of national quantiles.
  - Can be matched to ALSPAC via postcode data
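Because the rank is defined over all 8414 wards, national quantiles follow directly from the rank itself rather than from the distribution within the sample. A minimal sketch, with a hypothetical function name; the slides do not say which end of the ranking is the most deprived, so group 1 here simply means the lowest ranks.

```python
N_WARDS = 8414  # number of ranked wards, as given on the slide

def national_quantile(rank, n_groups=5, n_wards=N_WARDS):
    """Map a ward's national rank (1..n_wards) to a quantile group (1..n_groups)."""
    if not 1 <= rank <= n_wards:
        raise ValueError("rank outside the national range")
    # Integer ceiling of rank * n_groups / n_wards, avoiding float rounding.
    return (rank * n_groups + n_wards - 1) // n_wards

# Hypothetical ward ranks for a handful of cohort members.
for rank in (1, 1682, 1683, 4207, 8414):
    print(rank, national_quantile(rank))
```

Changing `n_groups` to 10 would give national deciles from the same ranks.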
24 Average KS2, by national quintiles of IMD
25 Subjective financial hardship
Measures: mother-completed financial difficulties questionnaires at 8, 21, 33, 61 and 85 months
Format: How difficult at the moment do you find it to afford these items: food, clothing, heating, rent/mortgage, things for child? Very (3), Fairly (2), Slightly (1), Not difficult (0). Responses to the 5 items at each date are summed to give a score between 0 and 15.
- The SEP index uses the mean score across the 5 dates
- The 61 and 85 month measures include questions on educational courses, medical care, child care and other things
- "Do not pay for this"/"DSS pays" options for rent and heating coded as 0
- The distribution of the resulting variable is highly skewed
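The scoring described on this slide could be sketched along these lines. The response labels, the function names and the treatment of missing data (a date with any missing item is dropped, and the mean is taken over the remaining dates) are assumptions for illustration only; the slides do not say how partially missing questionnaires were handled.

```python
ITEMS = ("food", "clothing", "heating", "rent/mortgage", "things for child")

# Item codes from the slide; the "do not pay / DSS pays" option is coded 0.
CODES = {
    "very": 3,
    "fairly": 2,
    "slightly": 1,
    "not difficult": 0,
    "do not pay / DSS pays": 0,
}

def wave_score(responses):
    """Sum of the 5 item codes at one date (0 to 15), or None if any item is missing."""
    if any(responses.get(item) is None for item in ITEMS):
        return None
    return sum(CODES[responses[item]] for item in ITEMS)

def mean_hardship(waves):
    """Mean of the valid per-date scores; None if no date has a valid score."""
    scores = [s for s in (wave_score(w) for w in waves) if s is not None]
    return sum(scores) / len(scores) if scores else None

# Hypothetical respondent observed at two of the dates.
wave_a = dict.fromkeys(ITEMS, "not difficult")
wave_b = {"food": "slightly", "clothing": "fairly", "heating": "do not pay / DSS pays",
          "rent/mortgage": "very", "things for child": "slightly"}
print(wave_score(wave_a), wave_score(wave_b), mean_hardship([wave_a, wave_b]))
```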
26 Average KS2, by quintiles of financial difficulties score
27 Multiple Imputation by Chained Regression
SEP indicators missing (out of 10)
- Iterative multivariable regression technique ("switching regression")
- Stata's ice command
  - Specify a prediction equation for each variable
  - Randomly allocate values to missing cases
  - Predict values for missing cases
  - Update RHS variables and repeat cycle (10 times)
  - Options allow choice of estimation method, passive imputation and substitution of RHS dummies, constrained intervals for predicted values
Current method:
- Imputation carried out using the 10 SEP variables only; does not use other information
- Only a single imputed dataset created
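The cycle listed above can be illustrated with a deliberately stripped-down sketch. This is not Stata's ice and does not reproduce its estimation options: it handles only two continuous variables, uses ordinary least squares as the prediction equation, and adds a random residual draw to each prediction. All names and data values are hypothetical.

```python
import random

def fit_line(xs, ys):
    """OLS intercept, slope and residual SD for y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / max(n - 2, 1)) ** 0.5
    return a, b, sd

def chained_imputation(data, cycles=10, seed=0):
    """data maps two variable names to lists, with None marking missing values."""
    rng = random.Random(seed)
    names = list(data)
    missing = {v: [i for i, x in enumerate(data[v]) if x is None] for v in names}
    filled = {v: list(data[v]) for v in names}
    # Start: randomly allocate observed values to the missing cases.
    for v in names:
        observed = [x for x in data[v] if x is not None]
        for i in missing[v]:
            filled[v][i] = rng.choice(observed)
    # Cycle: re-estimate each prediction equation on the cases where the
    # target is observed, using the current (partly imputed) predictor,
    # then refresh the imputed values of the target.
    for _ in range(cycles):
        for target in names:
            predictor = next(v for v in names if v != target)
            obs = [i for i, x in enumerate(data[target]) if x is not None]
            a, b, sd = fit_line([filled[predictor][i] for i in obs],
                                [filled[target][i] for i in obs])
            for i in missing[target]:
                filled[target][i] = a + b * filled[predictor][i] + rng.gauss(0, sd)
    return filled

# Hypothetical toy data: log income and mean hardship score.
toy = {
    "income": [5.1, 5.6, None, 6.0, 4.8, None, 5.9, 5.3],
    "hardship": [6.0, 3.5, 4.0, None, 7.5, 2.0, None, 5.0],
}
completed = chained_imputation(toy)
print(completed["income"])
```

Running the function several times with different seeds would give several completed datasets, in contrast to the single imputed dataset described under "Current method".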
28 The ice command
29 Prediction equations
30 Principal components analysis
- PCA provides a way of combining (weighting) the individual components into a single index
- PCA conducted on the 10x10 polychoric correlation matrix
  - Standard PCA techniques assume continuous, normally distributed variables.
  - Polychoric correlation can be used when there are binary and categorical components (e.g. education).
  - It assumes that ordinal variables are obtained by categorizing a normally distributed underlying variable.
- PCA extracts a single component that maximises the explained proportion of the variation in the (standardised) components
- Each component is assigned a scoring coefficient that is used as a weight in the construction of the SEP index
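A rough sketch of extracting a single component and its weights is given below. It substitutes an ordinary Pearson correlation matrix for the polychoric one used in the talk, works on three made-up indicators rather than ten, and finds the leading eigenvector by power iteration. Treating the unit-length eigenvector entries as the scoring coefficients is an assumption, as are all names and data values.

```python
from statistics import mean, pstdev

def standardise(col):
    m, s = mean(col), pstdev(col)
    return [(x - m) / s for x in col]

def correlation_matrix(z_cols):
    """Pearson correlations between already standardised columns."""
    n = len(z_cols[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z_cols] for zi in z_cols]

def first_component(corr, iterations=500):
    """Leading eigenvector (unit length) and eigenvalue by power iteration."""
    p = len(corr)
    v = [1.0 / p ** 0.5] * p
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p)) for i in range(p))
    return v, eigenvalue

# Hypothetical indicators for six families (not ALSPAC data).
indicators = {
    "log_income": [4.8, 5.1, 5.3, 5.6, 5.9, 6.0],
    "education": [1, 1, 2, 2, 3, 4],
    "imd_rank": [900, 2500, 2100, 4800, 6100, 7900],
}
z = [standardise(col) for col in indicators.values()]
weights, eigenvalue = first_component(correlation_matrix(z))

# Share of total variation explained: eigenvalue over the number of components.
share = eigenvalue / len(z)
# Index value for each family: weighted sum of its standardised indicators.
sep_index = [sum(w * col[i] for w, col in zip(weights, z)) for i in range(len(z[0]))]
print(dict(zip(indicators, weights)), share)
```

With ordinal inputs such as education bands, the Pearson matrix here would be replaced by the polychoric matrix; the eigenvector step itself is unchanged.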
31 Principal components analysis
Scoring coefficients
SEP index explains 46% of total variation in components
32 Average KS2, by quintiles of SEP index
33 Summary
- ALSPAC contains numerous indicators that can be used to construct an SEP index
- Indicators vary in
  - The type of resources they measure
  - The sections of the population they distinguish (e.g. tenure appears good at picking out the very disadvantaged, but does not discriminate at the top of the distribution)
  - The likelihood of non-response by different groups
- Issues that need to be considered when constructing an index
  - Which components should be included? (Should education be separate?)
  - How should observations at multiple dates/by multiple respondents be treated?
  - How should missing values be dealt with?
  - How should the components be combined?