Title: Chi-Squared Hypothesis Testing
1Chi-Squared Hypothesis Testing
- Using One-Way and Two-Way Frequency Tables of
Categorical Variables
2χ² Hypothesis Test
Goodness-of-Fit
Independence
Homogeneity
3Analyzing an Exam Question
- How does a teacher determine whether students were
clueless on an exam question or merely unprepared
for that particular exam question?
4Goodness-of-Fit Test
- If you need to test whether the categories of a
population are distributed evenly (or follow preset
proportions), then use the Goodness-of-Fit test.
- This requires a one-way frequency (count) table.
- A random sample is required for the counts.
- Expected cell counts must be greater than 5.
What's an expected cell count?
5Expected Cell Count?
A B C D E
68 53 78 42 59
- Suppose 300 students answered a multiple choice
question with the distribution shown above. Did
the students randomly select answers (i.e., are
the answers equally distributed)?
- The expected cell count for A is 300(1/5) = 60.
The same is true for B through E. If we assume
the answers are equally distributed (null
hypothesis), then we share the 300 responses
equally.
6Observed vs. Expected
A B C D E
Observed 68 53 78 42 59
Expected 60 60 60 60 60
- The observed values are the actual sampled counts
(occurrences).
- The expected values are the hypothesized outcomes
based on the null hypothesis.
- In this example, we are assuming that each answer
was equally likely to be selected by students.
7χ² Statistic
- The computer (or calculator) will calculate the
chi-squared statistic for you, and determine the
degrees of freedom and p-value.
What are degrees of freedom?
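For reference, the value the calculator reports is the standard Pearson chi-squared sum. The notation below is chosen here for illustration and is not taken from the slides: O_i and E_i are the observed and expected counts in category i, and k is the number of categories in a one-way table.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, \qquad df = k - 1
```

With five answer choices, for example, df = 5 - 1 = 4.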
8Chi-Squared Statistic and p-value
- χ² = 6.5, df = 4, P(χ² > 6.5) = .16479
9χ² Statistic
A B C D E
Observed 68 53 78 42 59
Expected 60 60 60 60 60
- H₀: π_A = π_B = π_C = π_D = π_E
- Hₐ: at least one π is different
- χ² = 12.7, df = 4, P(χ² > 12.7) = .0128
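The numbers on this slide can be reproduced without a calculator. The sketch below is a minimal illustration using only Python's standard library; the function names `chi_squared_statistic` and `upper_tail_even_df` are made up for this example, and the tail formula is a closed form that holds only when the degrees of freedom are even (as they are here).

```python
import math

def chi_squared_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def upper_tail_even_df(x, df):
    """P(chi-squared > x) when df is a positive even number.

    Closed form: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    terms = (half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * sum(terms)

observed = [68, 53, 78, 42, 59]                  # answers A through E
n = sum(observed)                                # 300 students
expected = [n / len(observed)] * len(observed)   # 60 per answer under the null

stat = chi_squared_statistic(observed, expected)
df = len(observed) - 1
p_value = upper_tail_even_df(stat, df)
print(round(stat, 1), df, round(p_value, 4))
```

For odd degrees of freedom a statistics library or a table would be needed instead of `upper_tail_even_df`.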
10Goodness-of-Fit Test
- What if the hypothesized proportions were not all
the same?
- Example
- Does the color of your car influence the chance
it will be stolen? Suppose it is known that all
cars in the world consist of 15% white, 30%
black, 35% red, 15% blue, and 5% other colors.
11Color of Stolen Car
White Black Red Blue Other
Observed 140 230 270 100 90
Expected 124.5 249.0 290.5 124.5 41.5
- H₀: π_W = .15, π_B = .30, π_R = .35, π_U = .15, π_E = .05
- Hₐ: at least one π is different
- χ² = 66.33, df = 4, P(χ² > 66.33) = 1.3 × 10⁻¹³
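When the hypothesized proportions differ, each expected count is the sample size times that category's proportion. The following sketch assumes the percentages stated for the car-color example (15% white, 30% black, 35% red, 15% blue, 5% other); the dictionary and variable names are illustrative only.

```python
# Hypothesized share of each color among all cars (from the example)
proportions = {"White": 0.15, "Black": 0.30, "Red": 0.35, "Blue": 0.15, "Other": 0.05}
# Observed counts of stolen cars by color
observed = {"White": 140, "Black": 230, "Red": 270, "Blue": 100, "Other": 90}

n = sum(observed.values())                       # 830 stolen cars in total
expected = {color: n * p for color, p in proportions.items()}

# Contribution of each color to the chi-squared statistic
contributions = {
    color: (observed[color] - expected[color]) ** 2 / expected[color]
    for color in observed
}
stat = sum(contributions.values())
largest = max(contributions, key=contributions.get)

print(round(stat, 2), largest)
```

Looking at the per-category contributions, rather than only the total, shows which category drives the result; here it is the "Other" category.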
12Two-Way Tables
- Homogeneity: tests for equal category proportions
for all populations (because separate random
samples were used to collect information).
- Independence: tests for independence (no
association) between 2 categorical variables.
Don't worry: same test!
13College Students Drinking Levels
- Data on drinking behavior were collected from
independently chosen random samples of male and
female students.
- Does there appear to be a gender difference with
respect to drinking behavior?
14Homogeneity Test
Observed counts (expected counts in parentheses)
            Gender
Drinking    Men            Women
None        140 (158.6)    186 (167.4)
Low         478 (554.0)    661 (585.0)
Moderate    300 (230.1)    173 (242.9)
High         63 (38.4)      16 (40.6)
15College Students Drinking Levels
- H₀: True proportions for the 4 drinking levels
are the same for males and females.
- Hₐ: At least one true proportion is different.
- χ² = 96.53, df = (4 - 1)(2 - 1) = 3
- P(χ² > 96.53) = 8.68 × 10⁻²¹
- Reject H₀: the data indicate that males and
females differ with respect to drinking levels.
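The statistic and degrees of freedom for a two-way table can be computed from the observed counts alone. The sketch below is one way to do it with plain Python lists; the function names `expected_counts` and `chi_squared_two_way` are hypothetical helpers written for this example, and the p-value is left to a calculator or statistics package.

```python
def expected_counts(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_squared_two_way(table):
    """Return (statistic, degrees of freedom, expected counts)."""
    expected = expected_counts(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, expected

# Rows: None, Low, Moderate, High; columns: Men, Women
drinking = [[140, 186], [478, 661], [300, 173], [63, 16]]
stat, df, expected = chi_squared_two_way(drinking)
print(round(stat, 2), df)
```

Because the same function works for any rows-by-columns table of counts, it applies to both the homogeneity and the independence settings.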
16Sexual Risk-Taking Factors Among Adolescents
- Each person in a random sample of sexually active
teens was classified according to gender and
contraceptive use.
- Is there a relationship between gender and
contraceptive use by sexually active teens?
17Independent (No Association) Test
Observed counts (expected counts in parentheses)
                         Gender
Contraceptive Use        Female       Male
Rarely/Never             210 (224)    350 (336)
Sometimes/Most Times     190 (204)    320 (306)
Always                   400 (372)    530 (558)
18Sexual Risk-Taking Factors Among Adolescents
- H₀: Gender and contraceptive use have no
association (independent).
- Hₐ: Gender and contraceptive use have an
association (dependent).
- χ² = 6.572, df = (3 - 1)(2 - 1) = 2
- P(χ² > 6.572) = .037
- Reject H₀ and conclude there is an association
between gender and contraceptive use.
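As a check on the arithmetic, the statistic can be written out cell by cell from the observed and expected counts in the contraceptive-use table. This is a worked illustration rather than part of the original slides:

```latex
\chi^2 = \frac{(210-224)^2}{224} + \frac{(350-336)^2}{336}
       + \frac{(190-204)^2}{204} + \frac{(320-306)^2}{306}
       + \frac{(400-372)^2}{372} + \frac{(530-558)^2}{558}
       \approx 6.572
```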
19Expected (Cell) Count for Two-Way Tables
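The formula for this slide did not survive in the text. The standard rule, written in notation assumed here (E_ij for the expected count in row i and column j), is:

```latex
E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{\text{grand total}}
```

For example, in the contraceptive-use table the Rarely/Never row totals 560, the Female column totals 800, and the grand total is 2000, so the expected count is 560 × 800 / 2000 = 224.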
20Conditions (Requirements) for χ² Test with 2-Way Tables
- Random Sample
- At least 80% of Expected Cell Counts are greater
than 5.
- All Expected Cell Counts and Observed values are
greater than or equal to 1.
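The two count conditions above are mechanical enough to check in code; whether the sample was random is a matter of study design and cannot be verified from the table. The sketch below uses a hypothetical helper, `count_conditions_met`, and the observed and expected counts from the contraceptive-use table.

```python
def count_conditions_met(observed, expected):
    """Check the two count conditions for a two-way chi-squared test.

    1. At least 80% of expected cell counts are greater than 5.
    2. Every expected count and every observed count is at least 1.
    """
    exp_cells = [e for row in expected for e in row]
    obs_cells = [o for row in observed for o in row]
    share_above_5 = sum(e > 5 for e in exp_cells) / len(exp_cells)
    all_at_least_1 = all(v >= 1 for v in exp_cells + obs_cells)
    return share_above_5 >= 0.80 and all_at_least_1

# Rows: Rarely/Never, Sometimes/Most Times, Always; columns: Female, Male
observed = [[210, 350], [190, 320], [400, 530]]
expected = [[224, 336], [204, 306], [372, 558]]
ok = count_conditions_met(observed, expected)
print(ok)
```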
21Titanic
- Moviemakers of Titanic imply that lower-class
passengers were treated unfairly.
- Was that accurate?
22Likelihood of Survival on Titanic?
Children Women Men
Observed 57 296 146
Expected 41.269 152.199 305.533
- H₀: π_C = 109/1318, π_W = 402/1318, π_M = 807/1318
- Hₐ: at least one π is different
- χ² = 225.16, df = 2, P(χ² > 225.16) ≈ 0.000
- Reject H₀ and conclude at least one proportion is
different.