Analysis of Cross-National Longitudinal Data, Marc Callens (CBGS, Brussels; KULeuven, Leuven)

1
Analysis of Cross-National Longitudinal Data
Marc Callens (CBGS, Brussels; KULeuven, Leuven)
EUROPEAN SCIENCE FOUNDATION Programme "Quantitative Methods in the Social Sciences" (QMSS)
Seminar: Theory and Practice in the Analysis of Cross-National Cross-Sectional Data
25-26 August 2005, University of Oxford, United Kingdom
2
Overview
  • Contextual regression
  • Multilevel logistic regression
  • - Models
  • - Estimation
  • Retrospective data
  • - The Impact of Education on Third Births (FFS)
  • Panel data
  • - Poverty Dynamics in Europe (ECHP)

3
I. Contextual regression
  • Callens, M. (2004), Regression Modelling of
    Cross-National Data, Brussels PSPC.

4
Contextual regression - 1
  • a nested data structure
  • - $j = 1, \ldots, N$ countries
  • - $i = 1, \ldots, n_j$ individuals
  • responses $y_{ij}$ depend on
  • - $x_{ij}$: individual-level explanatory variable
  • - $z_j$: country-level explanatory variable
  • the $y_{ij}$ are correlated within each country
  • ordinary multiple regression is not adequate here
  • - the independence assumption for the $y_{ij}$ is violated
  • - how to include the country-level explanatory variable $z_j$?

5
Contextual regression - 2
  • Key methodological issues
  • Small N
  • - technical problems (few degrees of freedom, ...)
  • - only simple models possible
  • Galton problem
  • - nations are not independent (cultural diffusion, ...)
  • Black box
  • - how to explain the impact of country-level variables?
  • Equivalence (see Harkness et al., 2003)
  • - how comparable are the data?

6
Contextual regression - 3
  • 1. non-hierarchical models (e.g., separate regressions, pooled regression)
  • - ignore the hierarchical data structure
  • 2. classical contextual models (e.g., ANCOVA)
  • - acknowledge the hierarchical data structure
  • - fixed intercepts $\alpha_j$ and slopes $\beta_j$
  • 3. modern multilevel models (e.g., multilevel analysis)
  • - acknowledge the hierarchical data structure
  • - random intercepts $\alpha_j$ and slopes $\beta_j$

7
Separate regressions - 1
  • N models (one for each country C): $y_{iC} = \alpha_C + \beta_C x_{iC} + e_{iC}$
  • parameters to estimate
  • - for each country a specific intercept $\alpha_C$
  • - for each country a specific slope $\beta_C$
  • lack of parsimony: what if N is large?
  • how to introduce country-level covariates $z_j$?
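
A minimal sketch of the separate-regressions approach, assuming made-up data for two hypothetical countries "A" and "B". The helper `ols_fit` and all numbers are illustrative and do not come from the presentation. Each country gets its own intercept and slope, so the parameter count grows with N.

```python
from statistics import mean

def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical toy data: country -> (x values, y values)
data = {
    "A": ([0, 1, 2, 3], [1.0, 3.0, 5.0, 7.0]),   # generated as y = 1 + 2x
    "B": ([0, 1, 2, 3], [4.0, 4.5, 5.0, 5.5]),   # generated as y = 4 + 0.5x
}

# One model per country: 2 parameters for each of the N countries
separate_fits = {country: ols_fit(x, y) for country, (x, y) in data.items()}
```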

8
Pooled regression - 1
  • one (global) model: $y_{ij} = \alpha + \beta x_{ij} + e_{ij}$
  • parameters to estimate
  • - one (global) intercept $\alpha$
  • - one (global) slope $\beta$
  • country membership is ignored
  • how to introduce country-level covariates $z_j$?
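
A minimal sketch of pooled regression on made-up data for two hypothetical countries (the data and the helper `ols_fit` are illustrative assumptions). All observations are stacked and country membership is discarded, so a single intercept and slope are estimated.

```python
from statistics import mean

def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical toy data: country -> (x values, y values)
data = {
    "A": ([0, 1, 2, 3], [1.0, 3.0, 5.0, 7.0]),
    "B": ([0, 1, 2, 3], [4.0, 4.5, 5.0, 5.5]),
}

# Stack all observations; the country label plays no role in the fit
pooled_x = [xi for x, _ in data.values() for xi in x]
pooled_y = [yi for _, y in data.values() for yi in y]
pooled_intercept, pooled_slope = ols_fit(pooled_x, pooled_y)
```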

9
Analysis of Covariance - 1
  • one (global) model: $y_{ij} = \alpha_j + \beta x_{ij} + e_{ij}$
  • countries enter the model as N-1 dummies
  • parameters to estimate
  • - for each country a specific intercept $\alpha_j$
  • - one global slope $\beta$
  • all countries have the same slope: unrealistic
  • how to introduce country-level covariates $z_j$?
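
A minimal sketch of the analysis-of-covariance fit on made-up data (the data and the function name `ancova_fit` are illustrative assumptions). It uses the within-country estimator, which gives the same common slope and country intercepts as least squares with country dummies, without building the dummy matrix explicitly.

```python
from statistics import mean

# Hypothetical toy data: country -> (x values, y values)
data = {
    "A": ([0, 1, 2, 3], [1.0, 3.0, 5.0, 7.0]),
    "B": ([0, 1, 2, 3], [4.0, 4.5, 5.0, 5.5]),
}

def ancova_fit(data):
    """Common slope with country-specific intercepts (within estimator)."""
    sxx = sxy = 0.0
    means = {}
    for country, (x, y) in data.items():
        x_bar, y_bar = mean(x), mean(y)
        means[country] = (x_bar, y_bar)
        # Accumulate deviations from each country's own means
        sxx += sum((xi - x_bar) ** 2 for xi in x)
        sxy += sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercepts = {c: y_bar - slope * x_bar for c, (x_bar, y_bar) in means.items()}
    return intercepts, slope

intercepts, common_slope = ancova_fit(data)
```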

10
Multilevel Regression - 1
  • one model: $y_{ij} = \alpha_j + \beta_j x_{ij} + e_{ij}$
  • $\alpha_j$ and $\beta_j$ are random variables, following a normal distribution
  • - random intercept: $\alpha_j \sim N(\alpha, \sigma_0^2)$
  • - random slope: $\beta_j \sim N(\beta, \sigma_1^2)$
  • parameters to estimate
  • - one average intercept $\alpha$
  • - one average slope $\beta$
  • - intercept variance $\sigma_0^2$
  • - slope variance $\sigma_1^2$
  • - intercept-slope covariance $c_{01}$
  • inclusion of country-level covariates $z_j$ is possible!
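
To make the random-coefficient idea concrete, here is a minimal simulation sketch: country intercepts and slopes are drawn from a bivariate normal distribution with the five parameters listed above. The function name, the parameter values and the seed are illustrative assumptions, not taken from the presentation.

```python
import math
import random

def draw_country_effects(n_countries, alpha, beta, s0, s1, c01, seed=0):
    """Draw (intercept_j, slope_j) pairs from a bivariate normal distribution.

    s0 and s1 are standard deviations (square roots of the intercept and
    slope variances); c01 is the intercept-slope covariance.
    """
    rng = random.Random(seed)
    rho = c01 / (s0 * s1)                      # implied correlation
    effects = []
    for _ in range(n_countries):
        z0, z1 = rng.gauss(0, 1), rng.gauss(0, 1)
        a_j = alpha + s0 * z0
        # Cholesky step: mixing z0 and z1 gives cov(a_j, b_j) = c01
        b_j = beta + s1 * (rho * z0 + math.sqrt(1 - rho ** 2) * z1)
        effects.append((a_j, b_j))
    return effects

# Hypothetical parameter values
effects = draw_country_effects(
    n_countries=2000, alpha=1.0, beta=0.5, s0=1.0, s1=0.5, c01=0.25
)
```

Whatever the number of countries, the model itself is described by the same five parameters; the country-specific values are draws, not separate parameters.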

11
Multilevel Regression - 2
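  • - Empty model: $y_{ij} = \alpha_j + e_{ij}$ (random intercept)
  • - Random intercept model: $y_{ij} = \alpha_j + \beta x_{ij} + e_{ij}$ (individual-level covariate)
  • - Random slope model: $y_{ij} = \alpha_j + \beta_j x_{ij} + e_{ij}$ (random slope)
  • - Extended random models: $y_{ij} = \alpha_j + \beta_j x_{ij} + \gamma z_j + \delta x_{ij} z_j + e_{ij}$ (country-level covariate)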
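
The extended model can be separated into a fixed and a random part. The derivation below is a sketch under assumed notation: $u_{0j}$ and $u_{1j}$ denote the country-level deviations of the intercept and slope from their averages, and $\gamma$ and $\delta$ are the coefficients of the country covariate and of its interaction with $x_{ij}$. These symbols are not spelled out on the slide.

```latex
\begin{align*}
\alpha_j &= \alpha + u_{0j}, \qquad \beta_j = \beta + u_{1j} \\
y_{ij} &= \alpha_j + \beta_j x_{ij} + \gamma z_j + \delta x_{ij} z_j + e_{ij} \\
       &= \underbrace{\alpha + \beta x_{ij} + \gamma z_j + \delta x_{ij} z_j}_{\text{fixed part}}
        + \underbrace{u_{0j} + u_{1j} x_{ij} + e_{ij}}_{\text{random part}}
\end{align*}
```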

12
Contextual regression summary
                Separate        Pooled   Ancova   Multilevel
models          N (countries)   1        1        1
parms           very large      large    large    small
variances       no              no       no       yes
country vars?   no              1        1        yes
complex?        no              no       no       yes
13
II. Multilevel logistic regression
  • Callens, M. and C. Croux (to appear), Performance
    of Likelihood-based Estimation Methods for
    Multilevel Binary Regression Models, Journal of
    Statistical Computation and Simulation.

14
Multilevel Logistic Regression
  • binary responses $y_{ij}$ depend on
  • - $x_{ij}$: individual-level explanatory variable
  • - $z_j$: group-level explanatory variable
  • a nested data structure with two levels
  • - level 2: N groups ($j = 1, \ldots, N$)
  • - level 1: in each group, $n_j$ individuals ($i = 1, \ldots, n_j$)
  • - in each group, responses $y_{ij}$ are correlated
  • the standard logistic regression model is not adequate
  • - the independence assumption is violated here
  • - inclusion of multiple group-level covariates?

15
Models for
  • standard logistic regression model
  • random logistic regression models
  • - random intercepts
  • - random slopes
  • extended random logistic regression models
  • - group-level explanatory variable
  • - cross-level interaction
16
multilevel discrete-time hazard models for event data
  • binary responses $y_{ijt}$ are event data
  • - $y_{ijt} = 1$: event occurrence for individual $i$ in group $j$ at time $t$
  • - $y_{ijt} = 0$: event non-occurrence for individual $i$ in group $j$ at time $t$
  • - (a third birth in 1995? in 1996? ...)
  • event data analysis may be complicated by censoring
  • - e.g., for some $i$, an event may occur after the survey
  • - hazard models can take censoring into account
  • recurrent events for $i$ are possible
  • - (e.g., becoming poor)

17
  • maximum likelihood estimation
  • - via equivalent model (Allison, 1982)
  • - requires data in person-period format
  • $p_{it} = \Pr(y_{it} = 1)$, where $y_{it}$ is the indicator for event occurrence in time period $t$
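
A minimal sketch of the person-period expansion that this estimation route requires. The input layout (one tuple per person with the last observed period and an event flag) and the function name are illustrative assumptions. Each person contributes one row per period at risk, and the binary indicator is 1 only in the period where the event occurs; censored persons contribute zeros only.

```python
def to_person_period(records):
    """Expand one record per person into one row per person-period.

    Each input record is (person_id, last_period, event): `event` is True
    if the event occurred in `last_period`, False if the person was
    censored after that period.
    """
    rows = []
    for person_id, last_period, event in records:
        for t in range(1, last_period + 1):
            y = 1 if (event and t == last_period) else 0
            rows.append({"id": person_id, "t": t, "y": y})
    return rows

# Hypothetical input: "a" has the event in period 3, "b" is censored after period 2
rows = to_person_period([("a", 3, True), ("b", 2, False)])
```

An ordinary logistic regression of `y` on period indicators and covariates, fitted to these rows, then estimates the discrete-time hazard.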
18
performance of estimation methods
  • performance of generally available estimation methods?
  • maximum likelihood estimation
  • - step 1: compute the likelihood L
  • - step 2: maximise L with respect to the model parameters
  • difficulty: the likelihood is an intractable integral
  • - random effects $u$: Normal distribution
  • - responses $y_{ij}$: Bernoulli distribution
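
For orientation, the likelihood of a random-intercept logistic model has the following general form. This is the textbook expression written in assumed notation (a single random intercept $u_j$ with variance $\sigma_0^2$), not a transcription of the slide. The integral over $u_j$ has no closed form, which is the difficulty named above.

```latex
\begin{align*}
\operatorname{logit}(p_{ij}) &= \alpha + \beta x_{ij} + u_j, \qquad u_j \sim N(0, \sigma_0^2) \\
L(\alpha, \beta, \sigma_0^2) &= \prod_{j=1}^{N} \int_{-\infty}^{\infty}
  \left[ \prod_{i=1}^{n_j} p_{ij}^{\,y_{ij}} (1 - p_{ij})^{1 - y_{ij}} \right]
  \frac{1}{\sqrt{2\pi\sigma_0^2}} \exp\!\left( -\frac{u_j^2}{2\sigma_0^2} \right) du_j
\end{align*}
```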
19
  • estimation → three methods
  • - penalised quasi-likelihood
  • - non-adaptive Gaussian quadrature
  • - adaptive Gaussian quadrature
  • how good? → four performance indicators
  • - numerical convergence
  • - bias
  • - mean-squared error (MSE)
  • - computational efficiency
  • simulation setting → fractional factorial experiment
  • - full experiment would take 6 months
  • - fractional experiment only 1 month
  • Key findings: penalised quasi-likelihood performs (surprisingly) well
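
A minimal sketch of non-adaptive Gauss-Hermite quadrature for one group's contribution to the likelihood, assuming a random-intercept logistic model. The function name and the example values are illustrative assumptions; only three quadrature nodes are used to keep the sketch short, whereas software implementations typically use more.

```python
import math

# 3-point Gauss-Hermite rule for integrals of exp(-x^2) * f(x)
SQRT_PI = math.sqrt(math.pi)
NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
WEIGHTS = (SQRT_PI / 6, 2 * SQRT_PI / 3, SQRT_PI / 6)

def expit(v):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-v))

def group_likelihood(y, x, alpha, beta, sigma):
    """Marginal likelihood of one group's binary responses.

    Integrates the Bernoulli likelihood over a random intercept
    u ~ N(0, sigma^2) with fixed (non-adaptive) quadrature nodes.
    """
    total = 0.0
    for node, weight in zip(NODES, WEIGHTS):
        u = math.sqrt(2.0) * sigma * node      # change of variable u = sqrt(2)*sigma*x
        conditional = 1.0
        for yi, xi in zip(y, x):
            p = expit(alpha + beta * xi + u)
            conditional *= p if yi == 1 else 1.0 - p
        total += weight * conditional
    return total / SQRT_PI

# Hypothetical group with three individuals
lik = group_likelihood(y=[1, 0, 1], x=[0.0, 1.0, 2.0], alpha=-0.5, beta=0.3, sigma=1.0)
```

The adaptive variant recentres and rescales the nodes for each group around the mode of the integrand, which this sketch does not attempt.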

20
III. Application for retrospective data
  • Callens, M. and C. Croux (to appear), The Impact
    of Education on Third Births. A Multilevel
    Discrete-Time Hazard Analysis, Journal of
    Applied Statistics.

21
III. Application for Retrospective Data
The Impact of Education on Third Births. A Multilevel Discrete-Time Hazard Analysis
Callens, M. and C. Croux, Journal of Applied Statistics (to appear)
31
  • key findings for multilevel models
  • - negative effect for education
  • - positive effect for Nordic countries

32
IV. Application for panel data
Comparative Poverty Dynamics: A Multilevel Discrete-Time Recurrent Hazard Analysis
Callens, M. and C. Croux (preprint)
33
poverty: theoretical perspectives
  • general poverty theory? (McKernan, 2002)
  • individual vs. structural perspectives (Iceland, 2003)
  • individual theories: the poor create their poverty
  • - life cycle hypothesis (e.g., Rowntree, 1901)
  • - life cycle events (e.g., a divorce)
  • - human capital theory (Becker, 1975)
  • - education, age, gender, ...

34
poverty: theoretical perspectives
  • structural theories: economic and social policy systems
  • skills mismatch hypothesis (Kasarda, 1990)
  • - deindustrialisation
  • - technological change
  • welfare state regimes (Esping-Andersen, 1999)
  • - level and design of welfare benefits
  • - relative role of state and market
  • - four types: social-democratic, conservative, liberal, southern

35
individual hypothesis for poverty entry
  • h1: demographic and labour market events
  • h2: for women, the effect of demographic events is stronger

h1
event          effect
marriage       -
divorce        +
employment     -
unemployment   +

h2 (women)
event          effect
marriage       - -
divorce        + +
36
structural hypothesis for poverty entry
  • ranking of welfare regimes by dominant income changes
  • - h3: non-earned income changes dominate
  • - h4: earned income changes dominate
  • 4 = most dynamic

                    dominant income changes
regime              non-earned   earned
southern            4            1
liberal             3            2
conservative        2            3
social-democratic   1            4
37
structural hypothesis for poverty entry
  • h5: skills mismatch

mismatch effect
deindustrialisation
technological change
38
two longitudinal EU databases (linked at NUTS 1 level)
  • individual-level panel data
  • - European Community Household Panel
  • - yearly individual and household panel (1994-98)
  • - income, employment, housing, healthcare, ...
  • regional-level time series
  • - REGIO database, New Cronos
  • - regional time series
  • - demography, unemployment, education, ...

39
poverty status in a year
  • compare income with relative poverty line
  • - income: equivalised individual income
  • - poverty line: country-specific poverty thresholds
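
A minimal sketch of how a yearly poverty status can be derived from a relative, country-specific poverty line. The threshold of 60% of the national median equivalised income is a common convention and is an assumption here, since the slide does not state which fraction is used; the function name and the data are also illustrative.

```python
from statistics import median

def poverty_flags(incomes_by_country, fraction=0.6):
    """Flag individuals whose equivalised income is below the poverty line.

    The line is `fraction` times the median equivalised income of the
    individual's own country (fraction=0.6 is an assumed convention).
    """
    flags = {}
    for country, incomes in incomes_by_country.items():
        threshold = fraction * median(incomes)
        flags[country] = [income < threshold for income in incomes]
    return flags

# Hypothetical equivalised incomes for one country in one year
flags = poverty_flags({"X": [10, 20, 30, 40, 100]})
```

Comparing the flags of consecutive years then yields the poverty entry events (not poor in one year, poor in the next).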
40
discrete-time hazard models
  • binary responses $y_{1ijt}$ are discrete-time event data
  • - $y_{1ijt} = 1$: poverty entry for individual $i$ in region $j$ in year $t$
  • - $y_{1ijt} = 0$: no poverty entry for individual $i$ in region $j$ in year $t$
  • - (poverty entry in 1995? in 1996? ...)
  • event data analysis may be complicated by censoring
  • - e.g., for some $i$, an event may occur after the survey
  • - hazard models can take censoring into account

41
  • discrete-time hazard
  • - $T_{ij}$ is the time of occurrence of event $y_{1ijt}$ for individual $i$ belonging to region $j$
  • - the hazard is conditional on being at risk
  • proportional odds model
  • - $\alpha_t$: a set of intercepts, one for each discrete-time unit
  • estimation via equivalent logistic regression model (Allison, 1982)
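
The two definitions on this slide can be written out as follows. This is the usual discrete-time formulation in assumed notation: $h_{ijt}$ for the hazard and $\beta$ for the covariate coefficients are symbols introduced here, not copied from the slide.

```latex
\begin{align*}
h_{ijt} &= \Pr(T_{ij} = t \mid T_{ij} \ge t) \\
\log\frac{h_{ijt}}{1 - h_{ijt}} &= \alpha_t + \beta' x_{ijt}
\end{align*}
```

The first line is the hazard conditional on being at risk; the second is the proportional odds model with one intercept $\alpha_t$ per discrete-time unit.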
42
multilevel discrete-time recurrent hazard
analysis
  • extension 1: poverty entry is a recurrent event
  • - here, an individual may experience two poverty entries in a row
  • - so, we simultaneously model $k = 2$ discrete-time hazards
  • - we allow for event-specificity for the baseline hazard, $k = 1, 2$
  • extension 2: dependent individuals → multilevel model
  • - dependency of individuals in a region is modelled by random intercepts
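
Combining the two extensions, the model can be sketched as below. This is a hedged reading of the slide in assumed notation: an event-specific baseline $\alpha_t^{(k)}$, a regional random intercept $u_j$ with variance $\sigma_u^2$, and coefficients $\beta$ and $\gamma$ shared across the two entries. Whether the covariate effects are shared or event-specific is not stated in the transcript.

```latex
\begin{align*}
\log\frac{h_{ijt}^{(k)}}{1 - h_{ijt}^{(k)}} &= \alpha_t^{(k)} + \beta' x_{ijt} + \gamma' z_{jt} + u_j,
  \qquad k = 1, 2 \\
u_j &\sim N(0, \sigma_u^2)
\end{align*}
```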
43
explanatory variables at individual level i
  • $x_{ijt}$: demographic events (changes in a year)
  • - marriage: from never married to married
  • - divorce: from married to divorced/separated
  • $x_{ijt}$: labour market events (changes in a year)
  • - employment: from unemployed to employed
  • - unemployment: from employed to unemployed

44
control variables at individual level i
  • $x_{ij}$: current status
  • - education: < sec, sec, third, at school
  • $x_{ijt}$
  • - age: 16-40, 41-50, 51-60, 60+
  • - civil status: married, sep./div., wid., unmarried
  • - cohabiting status: no, yes
  • - activity status: inactive, unemployed, working (15+)
  • - health status: bad, good
  • - household type: single, single+child, couple+child, other
  • - duration: -4, -3, -2, -1, 0, 1

45
explanatory variables at regional level j
  • $z_j$
  • - welfare regime
  • - - southern: es, it, gr, pt
  • - - liberal: uk, ie
  • - - conservative: be, fr, ge
  • - - social-democratic: dk (ref)
  • - deindustrialisation
  • - - employment in the service sector (relative to dk)
  • $z_{jt}$
  • - technological change
  • - - employment in R&D, business sector (relative to dk)

46
control variables at regional level j
  • $z_j$
  • - employment rate: working in total pop. (rel. to dk)
  • $z_{jt}$
  • - unemployment rate: unemployed in active pop. (rel. to dk)
  • - relative GDP: of EU-15 average (rel. to dk)
  • - GDP growth: log differences

47
poverty entry results at individual level
  • OR = odds ratio
  • significance levels: p < 0.05, p < 0.01, p < 0.001

               women                  men
event          effect   OR     p      effect   OR     p
marriage       (-)      0.75          (+)      1.30
divorce                 5.28          (+)      1.27
employment              1.57          (+)      1.20
unemployment            1.34                   1.77
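
A short sketch of how the reported odds ratios can be read, using the values for women from the table above. The helper names and the 5% baseline entry probability are illustrative assumptions; the baseline is not taken from the presentation.

```python
def percent_change_in_odds(odds_ratio):
    """Percentage change in the odds of the event implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0

def apply_odds_ratio(baseline_prob, odds_ratio):
    """Probability obtained after multiplying the baseline odds by the OR."""
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)

# Odds ratios for women as reported in the transcript
women_or = {"marriage": 0.75, "divorce": 5.28, "employment": 1.57, "unemployment": 1.34}
changes = {event: percent_change_in_odds(v) for event, v in women_or.items()}

# Hypothetical baseline: a 5% yearly poverty entry probability
p_after_divorce = apply_odds_ratio(0.05, women_or["divorce"])
```

An odds ratio multiplies the odds, not the probability, so the implied change in probability depends on the baseline chosen.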
48

key results for individual level
  • individual level: strong impact
  • for women
  • - demographic events > labour market events
  • for men
  • - labour market events > demographic events

49
poverty entry results at regional level
  • Welfare regimes
  • OR = odds ratio
  • significance levels: p < 0.05, p < 0.01, p < 0.001

                    women                  men
regime              effect   OR     p      effect   OR     p
southern            -        0.61          -        0.55
liberal             (+)      1.12          (-)      0.85
conservative        --       0.59          --       0.53
social-democratic   (ref)    1             (ref)    1
50
poverty entry results at regional level
  • skills mismatch
  • OR = odds ratio
  • significance levels: p < 0.05, p < 0.01, p < 0.001

                       women                  men
skills mismatch        effect   OR     p      effect   OR     p
deindustrialisation    (+)      1.00          (+)      1.01
technological change   (+)      1.02          (+)      1.11
51
key results for regional level
  • regional-level variation: small, but relevant
  • welfare regimes have an impact
  • but the impact is theoretically ambiguous

52
conclusion
  • individual > regional
  • women: demographic events
  • men: labour market events
  • welfare regime: important, but how?
