Title: Power 14 Goodness of Fit
1Power 14Goodness of Fit Contingency Tables
2II. Goodness of Fit Chi Square
- Rolling a Fair Die
- The Multinomial Distribution
- Experiment 600 Tosses
3The Expected Frequencies
4The Expected Frequencies and Empirical Frequencies
[Figure: Empirical Frequency]
5Hypothesis Test
- Null H0: Distribution is Multinomial
- Statistic: $\sum_i (O_i - E_i)^2 / E_i$, observed minus expected, squared, divided by expected
- Set Type I Error at 5%, for example
- Distribution of Statistic is Chi Square
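One throw, side one comes up: multinomial distribution
$P(n_1, n_2, \ldots, n_6) = \frac{n!}{n_1!\,n_2!\cdots n_6!}\, p_1^{n_1} p_2^{n_2} \cdots p_6^{n_6}$
$P(n_1=1, n_2=0, n_3=0, n_4=0, n_5=0, n_6=0) = \frac{1!}{1!\,0!\,0!\,0!\,0!\,0!}\,(1/6)^1 (1/6)^0 (1/6)^0 (1/6)^0 (1/6)^0 (1/6)^0 = 1/6$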
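The multinomial probability above can be checked numerically. The following is a minimal sketch, not part of the original slides; the function name `multinomial_pmf` is a hypothetical helper introduced only for illustration.

```python
from math import factorial

def multinomial_pmf(counts, probs):
    """Probability of seeing `counts` in sum(counts) independent trials.

    counts[i] is how often outcome i occurred, probs[i] its probability.
    """
    n = sum(counts)
    coef = factorial(n)
    for c in counts:
        coef //= factorial(c)          # n! / (n1! n2! ... n6!)
    p = float(coef)
    for c, q in zip(counts, probs):
        p *= q ** c                    # p1^n1 * p2^n2 * ... * p6^n6
    return p

fair_die = [1 / 6] * 6
# One throw, side one comes up
one_throw = multinomial_pmf([1, 0, 0, 0, 0, 0], fair_die)
print(one_throw)
```

The same function accepts any vector of counts, for example the six face counts from 600 tosses.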
6Chi Square: $\chi^2 = \sum_i (O_i - E_i)^2 / E_i = 6.15$
7[Figure: Chi Square density for 5 degrees of freedom; 5% upper tail beyond the critical value 11.07]
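The test for the fair die can be sketched in a few lines. The observed counts below are hypothetical (the table of tosses behind the 6.15 on the slide is not reproduced here); the 600 tosses and the 5% critical value of 11.07 for 5 degrees of freedom are taken from the slides.

```python
# Hypothetical observed counts for 600 tosses of a die
# (NOT the counts that produce the 6.15 reported on the slide).
observed = [114, 92, 84, 101, 107, 102]
n_tosses = sum(observed)                  # 600
expected = [n_tosses / 6] * 6             # 100 per face if the die is fair

# Chi Square statistic: sum of (observed - expected)^2 / expected
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical_value = 11.07                    # 5% level, 5 degrees of freedom (from the slide)
reject_null = chi_square > critical_value

print(f"chi-square = {chi_square:.2f}, reject H0: {reject_null}")
```

With these illustrative counts the statistic falls below 11.07, so the null hypothesis of a fair die would not be rejected at the 5% level.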
8Contingency Table Analysis
- Tests for Association Vs. Independence For
Qualitative Variables
9Does Consumer Knowledge Affect Purchases? Frost
Free Refrigerators Use More Electricity
10Marginal Counts
11Marginal Distributions, f(x) and f(y)
12Joint Distribution Under Independence: f(x,y) = f(x) f(y)
13Expected Cell Frequencies Under Independence
14Observed Cell Counts
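15Contribution to Chi Square: $(\text{Observed} - \text{Expected})^2 / \text{Expected}$
Upper Left Cell: $(314 - 324)^2 / 324 = 100/324 = 0.31$
Chi Square $= 0.31 + 0.93 + 0.46 + 1.39 = 3.09$
Degrees of freedom: $(m-1)(n-1) = 1 \times 1 = 1$
16[Figure: Chi Square density for 1 degree of freedom; labels 5% and 5.02]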
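The contingency-table calculation can be sketched as follows. The slides' tables are not reproduced here, so the 2x2 counts below are inferred from the upper-left cell (314 observed, 324 expected) and the four reported contributions; the placement of the other three cells in rows and columns is an assumption. The p-value formula using `math.erfc` applies to 1 degree of freedom only.

```python
import math

# Observed counts inferred from the slide's contributions; the arrangement is assumed.
observed = [[314, 118],
            [226, 62]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Under independence f(x,y) = f(x) f(y), so expected count = row total * column total / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

contributions = [[(o - e) ** 2 / e for o, e in zip(o_row, e_row)]
                 for o_row, e_row in zip(observed, expected)]
chi_square = sum(sum(row) for row in contributions)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Upper-tail probability for a Chi Square variable with 1 degree of freedom
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.2f}, df = {df}, p-value = {p_value:.3f}")
```

A p-value above 0.05 means independence is not rejected at the 5% level.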
17Conclusion
- No association between consumer knowledge about
electricity use and consumer choice of a
frost-free refrigerator
18Using Goodness of Fit to Choose Between Competing
Probability Models
- Men on base when a home run is hit
19Men on base when a home run is hit
20Conjecture
21Average of men on base
Sum of products $= np = 0.298 + 0.250 + 0.081 = 0.63$
22Using the binomial: k = men on base, n = number of trials
- $P(k=0) = \frac{3!}{0!\,3!}\,(0.21)^0 (0.79)^3 = 0.493$
- $P(k=1) = \frac{3!}{1!\,2!}\,(0.21)^1 (0.79)^2 = 0.393$
- $P(k=2) = \frac{3!}{2!\,1!}\,(0.21)^2 (0.79)^1 = 0.105$
- $P(k=3) = \frac{3!}{3!\,0!}\,(0.21)^3 (0.79)^0 = 0.009$
23Assuming the binomial
- The probability of zero men on base is 0.493
- the total number of observations is 765
- so the expected number of observations for zero men on base is $0.493 \times 765 = 377.1$
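The binomial probabilities and the expected counts they imply can be reproduced with a short sketch. The values n = 3, np = 0.63 and 765 observations come from the slides; the variable names are illustrative.

```python
from math import comb

n_trials = 3                      # at most three men on base
mean_men = 0.63                   # average men on base, np
p = mean_men / n_trials           # 0.21
n_obs = 765                       # total number of home runs observed

# Binomial probability of k men on base, k = 0..3
probs = [comb(n_trials, k) * p ** k * (1 - p) ** (n_trials - k)
         for k in range(n_trials + 1)]

# Expected number of observations in each category
expected = [n_obs * q for q in probs]

for k, (q, e) in enumerate(zip(probs, expected)):
    print(f"k={k}: P={q:.3f}, expected={e:.1f}")
```

The slide multiplies the rounded probability 0.493 by 765, so its 377.1 differs slightly from the unrounded product computed here.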
24Goodness of Fit
25Chi Square, 3 degrees of freedom
[Figure: Chi Square density; 5% upper tail beyond the critical value 7.81]
26Conjecture: Poisson where $m = np = 0.63$
- $P(k=0) = e^{-m} m^k / k! = e^{-0.63} (0.63)^0 / 0! = 0.5326$
- $P(k=1) = e^{-m} m^k / k! = e^{-0.63} (0.63)^1 / 1! = 0.3355$
- $P(k=2) = e^{-m} m^k / k! = e^{-0.63} (0.63)^2 / 2! = 0.1057$
- $P(k \ge 3) = 1 - P(k=2) - P(k=1) - P(k=0)$
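The Poisson probabilities can be reproduced the same way. This is a minimal sketch using m = 0.63 and the 765 observations from the slides; `poisson_pmf` is a hypothetical helper name, and the last category is treated as three or more men on base, obtained as the remainder.

```python
from math import exp, factorial

m = 0.63          # Poisson mean, m = np
n_obs = 765       # total number of home runs observed

def poisson_pmf(k, mean):
    """Poisson probability of exactly k events: e^(-m) m^k / k!"""
    return exp(-mean) * mean ** k / factorial(k)

probs = [poisson_pmf(k, m) for k in range(3)]   # k = 0, 1, 2
probs.append(1 - sum(probs))                    # remainder: k >= 3

# Expected number of observations in each category
expected = [n_obs * q for q in probs]

for label, q, e in zip(["0", "1", "2", "3+"], probs, expected):
    print(f"k={label}: P={q:.4f}, expected={e:.1f}")
```

These expected counts are what the Poisson goodness-of-fit test compares with the observed frequencies.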
29Goodness of Fit
30Chi Square, 3 degrees of freedom
[Figure: Chi Square density; 5% upper tail beyond the critical value 7.81]
31Likelihood Functions
- Review OLS Likelihood
- Proceed in a similar fashion for the probit
32Likelihood function
- The joint density of the estimated residuals can be written as
- If the sample of observations on the dependent variable, y, and the independent variable, x, is random, then the observations are independent of one another. If the errors are also identically distributed with density f, i.e. i.i.d., then
33Likelihood function
- Continued: if i.i.d., then
- If the residuals are normally distributed
- This is one of the assumptions of linear regression: errors are i.i.d. normal
- then the joint distribution or likelihood function, L, can be written as
34Likelihood function
- and taking natural logarithms of both sides,
where the logarithm is a monotonically increasing
function so that if lnL is maximized, so is L
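The equations for these slides are not reproduced in the text, so the following is a sketch of the standard derivation they describe. It assumes the regression $y_i = a + b x_i + e_i$ with errors that are i.i.d. normal with mean zero; the symbol $\sigma^2$ for the error variance is an assumption, not notation taken from the slides.

```latex
\begin{align*}
f(\hat e_1, \hat e_2, \ldots, \hat e_n)
  &= \prod_{i=1}^{n} f(\hat e_i)
  && \text{(i.i.d.)} \\
L &= \prod_{i=1}^{n} \frac{1}{\sigma\sqrt{2\pi}}
     \exp\!\left(-\frac{\hat e_i^{2}}{2\sigma^{2}}\right),
  \qquad \hat e_i = y_i - \hat a - \hat b x_i
  && \text{(normal)} \\
\ln L &= -n \ln \sigma - \frac{n}{2}\ln(2\pi)
     - \frac{1}{2\sigma^{2}} \sum_{i=1}^{n} \bigl(y_i - \hat a - \hat b x_i\bigr)^{2}
\end{align*}
```

Only the last term of $\ln L$ depends on $\hat a$ and $\hat b$, and it enters with a negative sign, so maximizing $\ln L$ over them is the same as minimizing the sum of squared residuals.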
35Log-Likelihood
- Taking the derivative of lnL with respect to
either a-hat or b-hat yields the same estimators
for the parameters a and b as with ordinary least
squares, except now we know the errors are
normally distributed.
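The claim that maximum likelihood and ordinary least squares give the same estimates can be checked numerically. The data below are hypothetical, the error standard deviation `sigma` is held fixed at an arbitrary value, and the normal log-likelihood is the standard one for i.i.d. errors; none of these numbers come from the slides.

```python
import math

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

def log_likelihood(a, b, sigma):
    """Normal log-likelihood of the residuals y_i - a - b*x_i."""
    n = len(x)
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return -n * math.log(sigma) - 0.5 * n * math.log(2 * math.pi) - sse / (2 * sigma ** 2)

# Ordinary least squares estimates in closed form
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)
b_ols = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
a_ols = y_bar - b_ols * x_bar

sigma = 1.0   # arbitrary fixed value; the ranking below does not depend on it
at_ols = log_likelihood(a_ols, b_ols, sigma)

# Log-likelihood at nearby parameter values: each should be lower than at the OLS estimates
nearby = [log_likelihood(a_ols + da, b_ols + db, sigma)
          for da in (-0.1, 0.0, 0.1)
          for db in (-0.1, 0.0, 0.1)
          if (da, db) != (0.0, 0.0)]

print(f"a = {a_ols:.2f}, b = {b_ols:.2f}, lnL = {at_ols:.3f}")
print("OLS estimates maximize lnL on this grid:", all(at_ols > v for v in nearby))
```

A grid check is not a proof, but it illustrates that moving away from the least squares estimates in any direction lowers the log-likelihood.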
36Probit
- Example: expenditures on lottery as a % of household income
- $lottery_i = a + b \cdot income_i + e_i$
- if $lottery_i > 0$, i.e. $a + b \cdot income_i + e_i > 0$, then $Bern_i$, the yes-no indicator variable, is equal to one and $e_i > -a - b \cdot income_i$
- this determines a threshold for observation i in the distribution of the error $e_i$
- assume
37[Figure for observation i]
38[Figure for observation i: the area above the threshold is the probability of playing the lottery, Pyes]
39[Figure for observation i: the area above the threshold is the probability of playing the lottery, Pyes; Pno for observation i]
40Probit
- Likelihood function for the observed sample
- Log likelihood
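The probit log-likelihood can be sketched as follows. The slides' own expressions are not reproduced in the text, so this assumes the errors $e_i$ are standard normal (the usual probit assumption); the incomes, the yes-no indicators, and the trial values of a and b are all hypothetical.

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_log_likelihood(a, b, income, bern):
    """Sum over observations of ln(Pyes) if Bern_i = 1, else ln(Pno)."""
    total = 0.0
    for inc, played in zip(income, bern):
        threshold = -a - b * inc              # e_i must exceed this to play
        p_yes = 1.0 - norm_cdf(threshold)     # area above the threshold
        p_no = 1.0 - p_yes                    # area below the threshold
        total += math.log(p_yes) if played == 1 else math.log(p_no)
    return total

# Hypothetical sample: household income (in thousands) and lottery indicator
income = [20.0, 35.0, 50.0, 65.0, 80.0]
bern = [1, 1, 0, 1, 0]

# Log-likelihood at one trial pair of parameters (assumed values)
ll = probit_log_likelihood(0.5, -0.01, income, bern)
print(f"ln L = {ll:.4f}")
```

Estimation would search over a and b for the pair that makes this value as large as possible.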
43Probit
- Substituting these expressions for Pno and Pyes
in the ln Likelihood function gives the complete
expression.
46Outline
- I. Projects
- II. Goodness of Fit Chi Square
- III. Contingency Tables
47Part I Projects
- Teams
- Assignments
- Presentations
- Data Sources
- Grades
48Team One
- Project choice
- Data Retrieval
- Statistical Analysis
- PowerPoint Presentation
- Executive Summary
- Technical Appendix
- Graphics (Excel, Eviews, other)
49Assignments
- 1. Project choice Markus Ansmann
- 2. Data Retrieval Theodore Ehlert
- 3. Statistical Analysis David Sheehan
- 4. PowerPoint Presentation Qun Luo
- 5. Executive Summary Steven Comstock
- 6. Technical Appendix Alan Weinberg
- 7. Graphics Gregory Adams
50PowerPoint Presentations: Member 4
- 1. Introduction: Members 1, 2, 3
- What
- Why
- How
- 2. Executive Summary: Member 5
- 3. Exploratory Data Analysis: Members 3, 7
- 4. Descriptive Statistics: Members 3, 7
- 5. Statistical Analysis: Member 3
- 6. Conclusions: Members 3, 5
- 7. Technical Appendix, Table of Contents: Member 6
51Executive Summary and Technical Appendix
53Grades
54Data Sources
- FRED: Federal Reserve Bank of St. Louis, http://research.stlouisfed.org/fred/
- Business/Fiscal
- Index of Consumer Sentiment, Monthly (1952:11)
- Light Weight Vehicle Sales, Auto and Light Truck, Monthly (1976.01)
- Economagic, http://www.economagic.com/
- U S Dept. of Commerce, http://www.commerce.gov/
- Population
- Economic Analysis, http://www.bea.gov/
55Data Sources (Cont. )
- Bureau of Labor Statistics, http://stats.bls.gov/
- California Dept of Finance, http://www.dof.ca.gov/