Title: Intermediate Social Statistics Lecture 5
1. Intermediate Social Statistics: Lecture 5
- Hilary Term 2006
- Dr David Rueda
2. Today: Models for Count Data, Poisson Regression and Negative Binomial Models
- Models for count data.
- Poisson regression: setup, interpretation and analysis.
- Negative binomial: setup, interpretation and analysis.
3. Models for Count Data (1)
- Count data simply refers to variables that can be measured by counts or summary frequency data.
- Event counts are variables that, for each observation, record the number of occurrences of an event in a fixed domain.
- The fixed domain for each observation can be time-related (day, month, year) or spatial (geographic unit, individual, etc.).
- The observations are non-negative and integer-valued, and generally contain a small number of meaningful values with a high proportion of zeros.
- Count data are abundant in many disciplines and in many applications in social science.
- In Political Science they are very common: in IR (conflict between countries, etc.), presidential vetoes per year, the number of parliamentary representatives who switch parties, the months a government lasts, etc.
- In all these cases, our observations would be non-negative and integer-valued, and often would contain a small number of meaningful values with a high proportion of zeros.
4. Models for Count Data (2)
- We model the data in order to describe and predict the counts.
- The preponderance of zeros, the small values and the discrete nature of the dependent variable make OLS estimation inappropriate.
- Instead we can use two different methods: Poisson regression and negative binomial regression.
5. Poisson Regression Setup (1)
- An interesting early application of the Poisson distribution: the number of soldiers kicked to death by horses in the Prussian army (Bortkewitsch 1898).
- In general, the dependent variable Y is the number of occurrences of an event we are interested in.
- The Poisson regression model specifies that each $y_i$ is drawn from a Poisson distribution with parameter $\lambda_i$ (which is related to the regressors $x_i$).
- What does this mean? It means the probability of observing $Y_i = k$ is
- $P(Y_i = k) = \frac{e^{-\lambda_i} \lambda_i^{k}}{k!}, \quad k = 0, 1, 2, \dots$
- The most common specification for $\lambda_i$ is the log-linear model
- $\ln \lambda_i = \beta' x_i$
- The expected number of events per period is
- $E[Y_i \mid x_i] = \lambda_i = e^{\beta' x_i}$
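As a rough illustration of the setup on this slide, the sketch below evaluates the Poisson probabilities and the log-linear mean in plain Python. The coefficient vector `beta` and the regressor values `x_i` are hypothetical numbers chosen only for the example; they are not estimates from any data in this lecture.

```python
import math

def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson variable with parameter lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def expected_count(beta, x):
    """E[Y | x] = exp(beta'x) under the log-linear specification."""
    return math.exp(sum(b * v for b, v in zip(beta, x)))

# Hypothetical values, for illustration only.
beta = [0.5, 0.2]   # intercept and one slope
x_i = [1.0, 3.0]    # the leading 1.0 multiplies the intercept

lam_i = expected_count(beta, x_i)   # exp(0.5 + 0.2 * 3)
for k in range(6):
    print(f"P(Y = {k}) = {poisson_pmf(k, lam_i):.4f}")
```

The probabilities over all k sum to one, and the single parameter `lam_i` fixes the whole distribution for observation i.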
6. Poisson Regression Setup (2)
- We use maximum likelihood to estimate the parameters of the regression model.
- Note that the mean in the log-linear model ($\ln \lambda_i = \beta' x_i$) is nonlinear in $x_i$, which means that the effect of a change in $x_i$ on the expected count will depend not only on $\beta$ (as in the classical linear regression), but also on the value of $x_i$.
- What do we gain from using a Poisson distribution?
- Imagine the probability of the number of soldiers kicked to death by horses in the Prussian army.
- The following is the plot of the Poisson probability mass function for four values of $\lambda$.
7. [Figure: Poisson probability mass function for four values of $\lambda$; plot not transcribed]
8. Poisson Regression Setup (2, continued)
- However, what are we assuming in a Poisson regression? Equidispersion.
- This means that the mean equals the variance: $E[Y_i \mid x_i] = Var[Y_i \mid x_i] = \lambda_i$.
- This may not be the case, and then a Poisson regression is not appropriate.
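The equidispersion property can be checked numerically. The sketch below sums over the Poisson probabilities for four values of the parameter and compares the resulting mean and variance. The four values are assumptions picked for illustration; the slide's plot does not state which values it used.

```python
import math

def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson variable with parameter lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def mean_and_variance(lam, k_max=100):
    """Mean and variance obtained by summing over k = 0..k_max."""
    probs = [poisson_pmf(k, lam) for k in range(k_max + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var

# Illustrative parameter values (assumed, not taken from the slide).
results = {lam: mean_and_variance(lam) for lam in (0.5, 1.0, 4.0, 10.0)}
for lam, (mean, var) in results.items():
    print(f"lambda = {lam:4.1f}   mean = {mean:.4f}   variance = {var:.4f}")
```

In each row the mean and the variance coincide with the parameter, which is exactly the restriction that real count data often violate.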
9. Poisson Regression Interpretation (1)
- In Stata, we will obtain a z test value and a p-value for a two-tailed significance test, reported as P > |z|. The null hypothesis is $b_i = 0$.
- Interpretation of the coefficients:
- A positive coefficient means that a one-unit increase in the independent variable increases the predicted number of events.
- Often we want to compare the rate at which events occur. We can do this by calculating incidence rate ratios (in Stata, irr). This means we can estimate the incidence rate ratio associated with a one-unit increase in an independent variable (keeping the rest constant).
- We can also compute percentage changes (more in the computer session).
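The quantities on this slide can be computed by hand from a coefficient and its standard error. The sketch below is a minimal illustration in Python rather than Stata; the coefficient `b` and standard error `se` are hypothetical, and the function names are made up for this example.

```python
import math

def wald_z_test(coef, std_err):
    """z statistic and two-tailed p-value for H0: coefficient = 0."""
    z = coef / std_err
    # 2 * (1 - Phi(|z|)) written with the complementary error function.
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_tailed

def incidence_rate_ratio(coef):
    """Multiplicative change in the expected count per one-unit increase."""
    return math.exp(coef)

def percent_change(coef):
    """Percentage change in the expected count per one-unit increase."""
    return 100 * (math.exp(coef) - 1)

b, se = 0.25, 0.10   # hypothetical coefficient and standard error
z, p = wald_z_test(b, se)
print(f"z = {z:.3f}, P>|z| = {p:.4f}")
print(f"IRR = {incidence_rate_ratio(b):.3f}")
print(f"percent change = {percent_change(b):.1f}%")
```

Because the model is log-linear, the incidence rate ratio is the same at every value of the regressors, even though the absolute change in the expected count is not.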
10. Poisson Regression Interpretation (2)
- More intuitive (as usual).
- We can also calculate the predicted number of events associated with a particular set of independent variable values. For example, the predicted number of events when $x_1 = 33$, $x_2 = 0$, $x_3 = 0$ and $x_4 = 0$:
- Predicted number of events $= \exp(b_0 + b_1 \cdot 33 + b_2 \cdot 0 + b_3 \cdot 0 + b_4 \cdot 0)$.
- For goodness of fit:
- The chi-square test tells us whether all the estimates in the model are jointly insignificant, i.e. whether all coefficients other than the constant are zero (the usual likelihood ratio test).
- Stata also provides a Pseudo R-squared.
- We can also perform a likelihood ratio chi-squared test comparing our model with a model taking into consideration all possible effects of the variables (more in the computer session).
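The predicted number of events for a given profile is just the exponential of the linear predictor. The sketch below works through the slide's example profile (first variable at 33, the others at 0). The intercept and slopes are hypothetical placeholders, not estimates from the lecture's data.

```python
import math

def predicted_count(intercept, slopes, values):
    """exp(b0 + b1*x1 + ... + bk*xk) for one set of regressor values."""
    linear_predictor = intercept + sum(b * x for b, x in zip(slopes, values))
    return math.exp(linear_predictor)

# Hypothetical coefficients, for illustration only.
b0 = 1.0
slopes = [0.01, 0.3, -0.2, 0.5]

pred_33 = predicted_count(b0, slopes, [33, 0, 0, 0])
pred_34 = predicted_count(b0, slopes, [34, 0, 0, 0])

print(f"predicted events at x1 = 33: {pred_33:.3f}")
print(f"predicted events at x1 = 34: {pred_34:.3f}")
# The ratio is exp(b1) wherever we start; the difference is not constant.
print(f"ratio = {pred_34 / pred_33:.4f}, difference = {pred_34 - pred_33:.4f}")
```

Variables set to 0 drop out of the linear predictor, so only the intercept and the first slope matter for this profile.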
11. Poisson Regression Analysis (1)
- Data we'll look at:
- Los Angeles High School data.
- 316 students at two Los Angeles high schools.
- Example taken from UCLA's Statistical Computing Resources: http://www.ats.ucla.edu/stat/stata/stat130/count2.htm
- Explanatory variables we are using:
- gender: female = 1, male = 2.
- mathpr: Math Exam Score (percentile rank).
- langpr: Language Exam Score (percentile rank).
- Dependent variable:
- daysabs: Number of days absent.
12. Poisson Regression Analysis (2)
- Theoretical claims:
- We think that (controlling for academic attainment) being male is associated with a higher number of days absent.
- We will test our hypotheses with a Poisson regression analysis.
- Should we do OLS?
- Let's see a histogram.
13. [Figure: histogram of daysabs (number of days absent); plot not transcribed]
14. Poisson Regression Analysis (2, continued)
- The data are strongly skewed to the right and there are a large number of 0s, so OLS would be inappropriate.
- Let's do a Poisson regression.
15. Poisson Regression Analysis (3)

Poisson regression                              Number of obs   =        316
                                                LR chi2(3)      =     175.27
                                                Prob > chi2     =     0.0000
Log likelihood = -1547.9709                     Pseudo R2       =     0.0536

-----------------------------------------------------------------------------
 daysabs |      Coef.   Std. Err.        z    P>|z|      [95% Conf. Interval]
---------+-------------------------------------------------------------------
  gender |  -.4009209    .0484122   -8.281    0.000     -.495807    -.3060348
 mathnce |  -.0035232    .0018213   -1.934    0.053     -.007093     .0000466
 langnce |  -.0121521    .0018348   -6.623    0.000    -.0157483    -.0085559
   _cons |   3.088587    .1017365   30.359    0.000     2.889187     3.287987
-----------------------------------------------------------------------------

- More interpretation in the computer session, but...
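Several numbers in the Poisson output above can be reproduced from one another, which is a useful way to see what they mean. The sketch below copies the reported values and recomputes the z statistics, the constant-only log likelihood and the Pseudo R2. It assumes the Pseudo R2 is McFadden's measure, one minus the ratio of the two log likelihoods.

```python
import math

# Values copied from the Poisson regression output.
log_lik_model = -1547.9709
lr_chi2 = 175.27
estimates = {   # name: (coefficient, standard error, reported z)
    "gender":  (-0.4009209, 0.0484122, -8.281),
    "mathnce": (-0.0035232, 0.0018213, -1.934),
    "langnce": (-0.0121521, 0.0018348, -6.623),
    "_cons":   (3.088587, 0.1017365, 30.359),
}

# LR chi2 = 2 * (LL_model - LL_null), so the constant-only log likelihood is:
log_lik_null = log_lik_model - lr_chi2 / 2
# McFadden's pseudo R-squared (assumed definition).
pseudo_r2 = 1 - log_lik_model / log_lik_null

# Each z value is the coefficient divided by its standard error.
z_values = {name: coef / se for name, (coef, se, _) in estimates.items()}

# Incidence rate ratio for a one-unit increase in gender (female=1 to male=2).
irr_gender = math.exp(estimates["gender"][0])

print(f"constant-only log likelihood = {log_lik_null:.4f}")
print(f"pseudo R2 = {pseudo_r2:.4f}")
for name, z in z_values.items():
    print(f"{name:8s} z = {z:.3f}")
print(f"IRR for gender = {irr_gender:.3f}")
```

The gender coefficient is negative, so the rate ratio is below one: in this fitted model, moving from gender = 1 to gender = 2 multiplies the expected number of days absent by roughly two thirds.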
16. Poisson Regression Analysis (4)
- Problems? In a Poisson distribution, the mean and the variance are the same.
- In a preliminary way, we can check this by summarizing our dependent variable:

                     number days absent
-------------------------------------------------------------
      Percentiles      Smallest
 1%            0              0
 5%            0              0
10%            0              0       Obs                 316
25%            1              0       Sum of Wgt.         316

50%            3                      Mean           5.810127
                        Largest       Std. Dev.      7.449003
75%            8             35
90%           14             35       Variance       55.48764
95%           23             41       Skewness       2.250587
99%           35             45       Kurtosis       8.949302

- The variance is nearly 10 times larger than the mean.
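The comparison of variance and mean can be written as a single ratio. The sketch below computes it from the summary statistics reported for daysabs, and also shows how the same check would look on raw data. The raw data are not reproduced in these slides, so `toy_counts` is a made-up list used only to exercise the function.

```python
import statistics

def dispersion_ratio(counts):
    """Sample variance divided by sample mean; about 1 under equidispersion."""
    return statistics.variance(counts) / statistics.mean(counts)

# Summary statistics reported for daysabs.
mean_daysabs = 5.810127
variance_daysabs = 55.48764
std_dev_daysabs = 7.449003

reported_ratio = variance_daysabs / mean_daysabs
print(f"variance / mean = {reported_ratio:.2f}")

# The same check on raw counts; this short list is invented for illustration.
toy_counts = [0, 0, 1, 3, 8, 14]
print(f"toy ratio = {dispersion_ratio(toy_counts):.2f}")
```

This is only a preliminary check, because it uses the unconditional mean and variance, while the Poisson regression assumption concerns the mean and variance given the regressors.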
17. Poisson Regression Analysis (5)
- In a more systematic way, we can test this with a likelihood ratio chi-squared test comparing our model with a model taking into consideration all possible effects of the variables.
- If the test is significant, the Poisson regression is not appropriate.
- Goodness of fit chi-2 = 2234.546
- Prob > chi2(312) = 0.0000
- The large value of the chi-square is another sign that the Poisson distribution is not a good choice.
- What do we do now?
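To see why a chi-square of 2234.546 on 312 degrees of freedom is so decisive, the sketch below recomputes the degrees of freedom and an approximate tail probability. The exact chi-squared tail needs a statistics library, so this example substitutes the Wilson-Hilferty normal approximation; that is a swapped-in approximation for illustration, not the calculation Stata performs.

```python
import math

def chi2_sf_wilson_hilferty(x, df):
    """Approximate upper-tail probability P(chi2 with df degrees > x).

    Wilson-Hilferty normal approximation; not the exact chi-squared tail.
    """
    c = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    return 0.5 * math.erfc(z / math.sqrt(2))

n_obs = 316
n_params = 4                 # three slopes plus the constant
df = n_obs - n_params        # matches the 312 in the output
gof_chi2 = 2234.546

ratio = gof_chi2 / df
p_approx = chi2_sf_wilson_hilferty(gof_chi2, df)

print(f"degrees of freedom = {df}")
print(f"chi2 / df = {ratio:.2f}")
print(f"approximate p-value = {p_approx:.3g}")
```

If the Poisson model fitted well, the statistic would be close to its degrees of freedom; here it is several times larger.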
18. Negative Binomial Setup (1)
- Negative binomial regression is used to estimate counts of an event when the event has overdispersion (extra-Poisson variation).
- Some details about the negative binomial distribution (in general):
- The number of successes is fixed and we're interested in the number of failures before reaching the fixed number of successes.
- The experiment consists of a sequence of independent trials.
- Each trial has two possible outcomes, S or F.
- The probability of success, $p = P(S)$, is constant from one trial to another.
- The experiment continues until a total of r successes have been observed.
- A random variable X which follows a negative binomial distribution is denoted $X \sim NB(r, p)$. Its probabilities are computed with the formula
- $P(X = x) = \binom{x + r - 1}{x} p^{r} (1 - p)^{x}, \quad x = 0, 1, 2, \dots$
- The expected value and the variance are
- $E[X] = \frac{r(1 - p)}{p}, \quad Var[X] = \frac{r(1 - p)}{p^{2}}$
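For the "failures before the r-th success" definition used on this slide, the standard probability function is C(x + r - 1, x) p^r (1 - p)^x, with mean r(1 - p)/p and variance r(1 - p)/p^2. The sketch below checks these closed forms numerically. The values of `r` and `p` are hypothetical.

```python
import math

def neg_binomial_pmf(x, r, p):
    """P(X = x): probability of x failures before the r-th success."""
    return math.comb(x + r - 1, x) * p ** r * (1 - p) ** x

r, p = 3, 0.4   # hypothetical number of successes and success probability

support = range(400)   # far enough into the tail for these values
probs = [neg_binomial_pmf(x, r, p) for x in support]
mean = sum(x * q for x, q in zip(support, probs))
var = sum((x - mean) ** 2 * q for x, q in zip(support, probs))

print(f"numerical mean     = {mean:.4f}, closed form = {r * (1 - p) / p:.4f}")
print(f"numerical variance = {var:.4f}, closed form = {r * (1 - p) / p ** 2:.4f}")
```

The variance equals the mean divided by p, so it always exceeds the mean. That is the property that makes this distribution suitable for overdispersed counts.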
19. Negative Binomial Setup (2)
- We assume that the model is the same as the one described in the Poisson regression case, except that the variation is greater.
- The log of the mean, $\mu$, is a linear function of some independent variables:
- $\log(\mu) = \text{intercept} + b_1 X_1 + b_2 X_2 + \dots + b_m X_m$
- This means that $\mu$ is the exponential function of the independent variables:
- $\mu = \exp(\text{intercept} + b_1 X_1 + b_2 X_2 + \dots + b_m X_m)$
- Before, we assumed that the distribution of Y (the number of occurrences of an event) was Poisson.
- A negative binomial distribution can be understood as a gamma mixture of Poisson random variables (for more details, see Long, Regression Models for Categorical and Limited Dependent Variables, Sage 1997).
- We use maximum likelihood to estimate the parameters of the regression model.
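The gamma-mixture idea can be illustrated by simulation: each observation gets its own Poisson rate, equal to a common mean times a gamma-distributed heterogeneity term with mean one. The sketch below assumes the mean-dispersion parameterisation in which the variance is mu + alpha * mu^2. The values of `mu` and `alpha` are hypothetical, and the Poisson sampler is a simple textbook method (Knuth's) written out because the standard library has none.

```python
import math
import random
import statistics

def draw_poisson(lam, rng):
    """Draw one Poisson(lam) value with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

def draw_gamma_poisson(mu, alpha, rng):
    """Poisson draw whose rate is mu times a mean-one gamma variable."""
    # gammavariate(shape, scale): shape 1/alpha and scale alpha give
    # mean 1 and variance alpha.
    heterogeneity = rng.gammavariate(1.0 / alpha, alpha)
    return draw_poisson(mu * heterogeneity, rng)

rng = random.Random(0)    # fixed seed so the run is repeatable
mu, alpha = 5.0, 1.0      # hypothetical mean and overdispersion

sample = [draw_gamma_poisson(mu, alpha, rng) for _ in range(20000)]
sample_mean = statistics.mean(sample)
sample_var = statistics.variance(sample)

print(f"sample mean     = {sample_mean:.2f} (theory: {mu:.2f})")
print(f"sample variance = {sample_var:.2f} (theory: {mu + alpha * mu ** 2:.2f})")
```

The mean stays at mu, as in the Poisson case, while the variance is inflated by the heterogeneity; as alpha shrinks toward 0 the mixture collapses back to the Poisson.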
20. Negative Binomial Interpretation (1)
- Same as with Poisson regression.
- In Stata, we will obtain a z test value and a p-value for a two-tailed significance test, reported as P > |z|. The null hypothesis is $b_i = 0$.
- Interpretation of the coefficients:
- A positive coefficient means that a one-unit increase in the independent variable increases the predicted number of events.
- Often we want to compare the rate at which events occur. We can do this by calculating incidence rate ratios (in Stata, irr). This means we can estimate the incidence rate ratio associated with a one-unit increase in an independent variable (keeping the rest constant).
- We can also compute percentage changes (more in the computer session).
21. Negative Binomial Interpretation (2)
- More intuitive (as usual).
- We can also calculate the predicted number of events associated with a particular set of independent variable values. For example, the predicted number of events when $x_1 = 33$, $x_2 = 0$, $x_3 = 0$ and $x_4 = 0$:
- Predicted number of events $= \exp(b_0 + b_1 \cdot 33 + b_2 \cdot 0 + b_3 \cdot 0 + b_4 \cdot 0)$.
- For goodness of fit:
- The chi-square test tells us whether all the estimates in the model are jointly insignificant, i.e. whether all coefficients other than the constant are zero (the usual likelihood ratio test).
- Stata also provides a Pseudo R-squared.
- We also get an estimate for a parameter measuring overdispersion (alpha).
- Stata provides a likelihood ratio test for this estimate. A significant p-value means that the data are not Poisson distributed (when the parameter is not significantly different from 0, NB and Poisson are equivalent).
22. Negative Binomial Analysis (1)
- Data we'll look at, same as before:
- Los Angeles High School data.
- 316 students at two Los Angeles high schools.
- Explanatory variables we are using:
- gender: female = 1, male = 2.
- mathpr: Math Exam Score (percentile rank).
- langpr: Language Exam Score (percentile rank).
- Dependent variable:
- daysabs: Number of days absent.
- This time we estimate a negative binomial model.
23. Negative Binomial Analysis (2)

Negative binomial regression                    Number of obs   =        316
                                                LR chi2(3)      =      20.74
                                                Prob > chi2     =     0.0001
Log likelihood = -880.87312                     Pseudo R2       =     0.0116

-----------------------------------------------------------------------------
 daysabs |      Coef.   Std. Err.        z    P>|z|      [95% Conf. Interval]
---------+-------------------------------------------------------------------
  gender |  -.4311844    .1396656   -3.087    0.002     -.704924    -.1574448
 mathnce |   -.001601      .00485   -0.330    0.741    -.0111067     .0079048
 langnce |  -.0143475    .0055815   -2.571    0.010    -.0252871     -.003408
   _cons |   3.147254    .3211669    9.799    0.000     2.517778     3.776729
---------+-------------------------------------------------------------------
/lnalpha |   .2533877    .0955362                       .0661402     .4406351
---------+-------------------------------------------------------------------
   alpha |   1.288383    .1230871   10.467    0.000     1.068377     1.553694
-----------------------------------------------------------------------------
Likelihood ratio test of alpha=0:  chi2(1) = 1334.20   Prob > chi2 = 0.0000
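The overdispersion results in this output can be traced back to numbers already reported. The sketch below copies the two log likelihoods and the /lnalpha row, then recomputes the likelihood ratio statistic for alpha = 0 and the alpha estimate with its interval. It also compares the gender standard errors from the two models. All inputs are taken from the Poisson and negative binomial output in these slides; the variable names are made up for the example.

```python
import math

# Log likelihoods reported for the two models.
log_lik_poisson = -1547.9709
log_lik_negbin = -880.87312

# The Poisson model is the negative binomial model with alpha = 0,
# so the likelihood ratio statistic is twice the log likelihood gap.
lr_alpha = 2 * (log_lik_negbin - log_lik_poisson)

# Stata estimates ln(alpha); alpha and its interval are the exponentials.
ln_alpha = 0.2533877
ln_alpha_ci = (0.0661402, 0.4406351)
alpha = math.exp(ln_alpha)
alpha_ci = tuple(math.exp(bound) for bound in ln_alpha_ci)

# Standard errors for gender under each model.
se_ratio_gender = 0.1396656 / 0.0484122

print(f"LR test of alpha = 0: chi2(1) = {lr_alpha:.2f}")
print(f"alpha = {alpha:.6f}, interval = ({alpha_ci[0]:.6f}, {alpha_ci[1]:.6f})")
print(f"NB / Poisson standard error for gender = {se_ratio_gender:.2f}")
```

The interval for alpha excludes 0 and the likelihood ratio statistic is very large, so the Poisson restriction is rejected. The negative binomial standard errors are also much larger than the Poisson ones, which is why mathnce, borderline in the Poisson model, is clearly insignificant here.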