Title: SC968: Panel Data Methods for Sociologists
1. Introducing panel data
2. Overview
- Panel data
  - What it is
  - How to get to know the data
- Change over time
  - Tabulating
  - Calculating transition probabilities
3. What is panel data?
- A data set containing observations on multiple phenomena at a single point in time is called cross-sectional data
- A data set containing observations on a single phenomenon over multiple time periods is called time series data
- Observations on multiple phenomena over multiple time periods are panel data
- Cross-sectional and time series data are one-dimensional; panel data are two-dimensional
4. Using panel data in Stata
- Data on n cases, over t time periods, giving a total of n × t observations
- One record per observation
  - i.e. long format
- Stata tools for analyzing panel data begin with the prefix xt
- First, tell Stata that you have panel data using xtset
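To make the long format concrete, here is a minimal sketch in Python rather than Stata. The variable names (pid, wave, income) and all values are hypothetical; the point is only that each person-wave combination becomes its own record.

```python
# Hypothetical wide-format data: one row per person, one income column per wave.
wide = [
    {"pid": 1, "income1": 100, "income2": 110, "income3": 120},
    {"pid": 2, "income1": 200, "income2": 190, "income3": 210},
]

def wide_to_long(rows, stub, waves):
    """Return one record per (pid, wave) observation."""
    long_rows = []
    for row in rows:
        for wave in waves:
            long_rows.append(
                {"pid": row["pid"], "wave": wave, stub: row[f"{stub}{wave}"]}
            )
    return long_rows

long_rows = wide_to_long(wide, "income", waves=[1, 2, 3])

# n = 2 cases over t = 3 periods gives n x t = 6 observations.
for record in long_rows:
    print(record)
```

In this layout the pair (pid, wave) identifies each record, which is the structure that xtset expects.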
5. Complete and incomplete person-wave data
6. Telling Stata you have panel data
. xtset pid wave
       panel variable:  pid (unbalanced)
        time variable:  wave, 1 to 15, but with gaps
                delta:  1 unit
- pid: unique cross-wave identifier
- wave: time variable
7. Cases not observed for every time period
. xtset pid wave
       panel variable:  pid (unbalanced)
        time variable:  wave, 1 to 15, but with gaps
                delta:  1 unit
- delta: period between observations, in units of the time variable
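The ideas of an unbalanced panel and of gaps can be checked by hand. The following is a rough Python sketch, not a reproduction of what Stata does internally: the (pid, wave) pairs are invented, and a gap is assumed to mean a missing wave between a case's first and last observed wave, with one unit between waves.

```python
from collections import defaultdict

# Hypothetical long-format observations as (pid, wave) pairs.
observations = [
    (1, 1), (1, 2), (1, 3),   # observed at every wave
    (2, 1), (2, 3),           # wave 2 missing: a gap
    (3, 2), (3, 3),           # joins late, but has no gap
]

def describe_panel(obs):
    """Report whether the panel is balanced and which cases have gaps."""
    waves_by_pid = defaultdict(set)
    for pid, wave in obs:
        waves_by_pid[pid].add(wave)
    all_waves = set(wave for _, wave in obs)
    # Balanced: every case is observed at every wave that occurs in the data.
    balanced = all(waves == all_waves for waves in waves_by_pid.values())
    # Gap: fewer observed waves than the span from first to last wave (delta = 1).
    with_gaps = sorted(
        pid for pid, waves in waves_by_pid.items()
        if len(waves) < max(waves) - min(waves) + 1
    )
    return balanced, with_gaps

balanced, with_gaps = describe_panel(observations)
print("balanced:", balanced)
print("cases with gaps:", with_gaps)
```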
8. Describing the patterns in panel data
9. Examining change over two waves
10. Calculating transition probabilities
- The transition probability is the probability of moving from one state to another
- To calculate by hand from a cross-tabulation of the two waves:
  transition probability = cell count / row total
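The by-hand calculation amounts to dividing every cell by its row total. Below is a small Python sketch of that step; the counts are made up for illustration and the state labels (empl, unemp, olf) are simply borrowed as example categories.

```python
# Hypothetical cross-tabulation: state at wave t (rows) by state at wave t+1 (columns).
counts = {
    "empl":  {"empl": 90, "unemp": 4,  "olf": 6},
    "unemp": {"empl": 5,  "unemp": 10, "olf": 5},
    "olf":   {"empl": 10, "unemp": 5,  "olf": 85},
}

def transition_probabilities(table):
    """Divide each cell count by its row total."""
    probs = {}
    for origin, row in table.items():
        row_total = sum(row.values())
        probs[origin] = {dest: cell / row_total for dest, cell in row.items()}
    return probs

probs = transition_probabilities(counts)
for origin, row in probs.items():
    print(origin, {dest: round(p, 2) for dest, p in row.items()})
```

Each row of the result sums to 1, because every case in an origin state must end up in exactly one destination state.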
11. Transition probability matrix
12. Transition probability matrices in Stata
- Leaving out the if qualifier gives the mean transition probabilities across all waves, from t to t+1
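Pooling transitions over all waves can be sketched as follows, in Python rather than Stata. The records are invented, and the handling of gaps (a pair is skipped when wave t+1 is not observed) is an assumption of this sketch, not a statement about how any Stata command treats them.

```python
from collections import Counter, defaultdict

# Hypothetical long-format records: (pid, wave, state).
records = [
    (1, 1, "empl"), (1, 2, "empl"), (1, 3, "unemp"),
    (2, 1, "unemp"), (2, 2, "empl"), (2, 3, "empl"),
    (3, 1, "empl"), (3, 3, "empl"),  # wave 2 missing, so no t to t+1 pair
]

def pooled_transitions(rows):
    """Pool all observed t to t+1 pairs, then divide by row totals."""
    state_at = {(pid, wave): state for pid, wave, state in rows}
    counts = defaultdict(Counter)
    for (pid, wave), state in state_at.items():
        next_state = state_at.get((pid, wave + 1))
        if next_state is not None:  # assumed rule: skip pairs with a missing t+1
            counts[state][next_state] += 1
    return {
        origin: {dest: n / sum(row.values()) for dest, n in row.items()}
        for origin, row in counts.items()
    }

pooled = pooled_transitions(records)
print(pooled)
```

Restricting the input to records from two adjacent waves would give the transition matrix for that pair of waves only.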
13. Change in a categorical variable over time: a decision tree
[Figure: decision tree of transition probabilities between the states employed (empl), unemployed (unemp) and out of the labour force (olf)]
14. Change in a continuous variable over time
- Size transition matrix
- Quantile transition matrix
- Mean transition matrix
- Median transition matrix
15. Size transition matrix
- Absolute mobility
  - e.g. movement in and out of poverty
- Boundaries set exogenously, i.e. predetermined
  - e.g. poverty defined a priori as an income below 5,000
- Does not depend on the distribution under investigation
  - e.g. comparing mobility in the 1990s and 2000s
- Incorporates both movements in individuals' positions and economic growth
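A size transition matrix can be sketched in Python as below. The incomes and the second boundary (10,000) are hypothetical; only the 5,000 poverty line comes from the example above. The boundaries are fixed in advance and applied unchanged to both years.

```python
from bisect import bisect_right

# Hypothetical incomes for the same five people in two years.
income_y1 = [3000, 4500, 6000, 12000, 20000]
income_y2 = [5200, 4000, 6500, 9000, 25000]

# Boundaries fixed in advance (10,000 is an assumed second cut-off).
boundaries = [5000, 10000]

def size_class(income, cuts):
    """Class index: 0 means below the first boundary (income below 5,000)."""
    return bisect_right(cuts, income)

def size_transition_counts(origin, destination, cuts):
    """Cross-tabulate origin class (rows) by destination class (columns)."""
    k = len(cuts) + 1
    matrix = [[0] * k for _ in range(k)]
    for y1, y2 in zip(origin, destination):
        matrix[size_class(y1, cuts)][size_class(y2, cuts)] += 1
    return matrix

matrix = size_transition_counts(income_y1, income_y2, boundaries)
for row in matrix:
    print(row)
```

Because the cut-offs do not move, general income growth shows up as upward mobility in this kind of matrix.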
16. Quantile transition matrix
- Mobility as a relative concept
- Same number of individuals in each class
- Only records movements that involve reranking
- Cannot take account of economic growth, for example when comparing matrices
- Cannot draw a complete picture when comparing mobility across different cohorts, countries or welfare regimes
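The following Python sketch illustrates why a quantile transition matrix records only reranking. The incomes are invented, classes are assigned by rank so that each holds the same number of people, and ties are broken by input order, which is an arbitrary choice of this sketch.

```python
# Hypothetical incomes for the same six people in two years.
income_y1 = [10, 20, 30, 40, 50, 60]
income_y2 = [22, 21, 60, 80, 100, 120]  # everyone is richer in year 2

def quantile_classes(values, k):
    """Assign each value to one of k equal-sized classes by rank (0 = lowest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for rank, i in enumerate(order):
        classes[i] = rank * k // len(values)
    return classes

def quantile_transition_counts(origin, destination, k):
    """Cross-tabulate origin class (rows) by destination class (columns)."""
    c1 = quantile_classes(origin, k)
    c2 = quantile_classes(destination, k)
    matrix = [[0] * k for _ in range(k)]
    for a, b in zip(c1, c2):
        matrix[a][b] += 1
    return matrix

matrix = quantile_transition_counts(income_y1, income_y2, k=3)
for row in matrix:
    print(row)
```

All incomes rise, and the two poorest people even swap places, yet nobody changes class, so the matrix shows no mobility at all.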
17. Mean/median transition matrices
- Incorporate both absolute and relative approaches
- Class boundaries defined as percentages of the mean or median income of the origin and destination distributions
- Example
  - 25%, 50%, 75% of median income
  - Note that these are not the same as quartiles
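As a sketch of the median-based approach, the Python below computes boundaries at 25%, 50% and 75% of the median separately for the origin and destination years. The incomes are hypothetical, chosen so that every income doubles between the two years.

```python
from bisect import bisect_right
from statistics import median

# Hypothetical incomes for the same seven people; year 2 is exactly double year 1.
income_y1 = [10, 30, 50, 70, 100, 120, 200]
income_y2 = [20, 60, 100, 140, 200, 240, 400]

fractions = [0.25, 0.50, 0.75]

def median_boundaries(values, fracs):
    """Class boundaries as fractions of the median of this distribution."""
    m = median(values)
    return [f * m for f in fracs]

cuts_y1 = median_boundaries(income_y1, fractions)
cuts_y2 = median_boundaries(income_y2, fractions)

classes_y1 = [bisect_right(cuts_y1, y) for y in income_y1]
classes_y2 = [bisect_right(cuts_y2, y) for y in income_y2]

print("year 1 boundaries:", cuts_y1, "classes:", classes_y1)
print("year 2 boundaries:", cuts_y2, "classes:", classes_y2)
```

The boundaries move with the median, so uniform growth leaves everyone in the same class. The classes also hold unequal numbers of people, which is why these cut-offs differ from quartiles.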
18. Example: income 1991-1992
19. Category boundaries for each method
20. Warning!
- Measurement error
  - Causes an over-estimation of mobility
  - If mothers' and babies' weights are reported to the nearest half pound, this can affect which band an observation falls in
  - A respondent may describe their marital status as separated in year 1 and single in year 2
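The rounding problem can be shown with a tiny Python sketch. The band cut-off of 5.5 lb and the weight of 5.4 lb are hypothetical values chosen purely for illustration.

```python
# Hypothetical band boundary: weights below 5.5 lb fall in the "low" band.
LOW_WEIGHT_CUTOFF = 5.5

def round_to_half_pound(weight):
    """Report a weight to the nearest half pound."""
    return round(weight * 2) / 2

def band(weight):
    return "low" if weight < LOW_WEIGHT_CUTOFF else "normal"

true_weight = 5.4
reported_weight = round_to_half_pound(true_weight)

print("true:", true_weight, band(true_weight))
print("reported:", reported_weight, band(reported_weight))
```

The true weight has not changed, but the reported value lands in a different band. If one wave records the exact weight and another the rounded one, a transition matrix would count this as a move between bands.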
21. Finally
- Panel data are more challenging to understand and check
- Transition matrices are a good way to summarise mobility patterns
- Different methods of constructing matrices lead to distinct interpretations
- May need to take account of measurement error when modelling change