Title: Chi-Square Part II
2Chi-Square Part II
- Let us see how this works in another example.
Attitudes towards Statistics       | Attitudes towards Research
                                   | Favorable | Neither favorable nor unfavorable | Unfavorable | Row Totals
Favorable                          |         9 |                                26 |          13 |         48
Neither favorable nor unfavorable  |        19 |                                75 |          83 |        177
Unfavorable                        |        16 |                                56 |         110 |        182
Col. Totals                        |        44 |                               157 |         206 |        407
3Chi-Square Part II
- It has been argued that people with favorable attitudes towards
research tend to have favorable attitudes towards statistics.
- Question: If we knew a respondent's attitude towards research,
could we predict that respondent's attitude towards statistics?
4Chi-Square Part II
- Step 1
- H0: Knowledge of attitudes toward research does
not help us predict attitudes towards statistics.
- Step 2
- H1: Knowledge of attitudes toward research does
help us predict attitudes towards statistics.
5Chi-Square Part II
- Step 3: Select a significance level. Let's use α = .05.
- We will use the chi-square test with 4 degrees of freedom.
- Why four? df = (r - 1) × (c - 1)
- We have 3 rows and 3 columns,
- so we get df = (3 - 1) × (3 - 1) = 2 × 2 = 4.
- This gives us a χ² critical of 9.488. Your book rounds the
χ² critical to 9.5.
- Step 4: Collect and summarize sample data.
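The degrees of freedom and the critical value quoted on this slide can be checked with a short sketch. It uses only the Python standard library and relies on the closed-form upper-tail probability of the chi-square distribution that holds when df is even, so it is an illustration for this case rather than a general-purpose routine. The function names are illustrative and do not come from the slides.

```python
import math


def degrees_of_freedom(rows: int, cols: int) -> int:
    """df = (r - 1) * (c - 1) for an r x c contingency table."""
    return (rows - 1) * (cols - 1)


def chi2_upper_tail_even_df(x: float, df: int) -> float:
    """P(X >= x) for a chi-square variable with an even number of df.

    For even df the tail has the closed form
    exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive, even df")
    half = x / 2.0
    terms = (half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * sum(terms)


def critical_value(alpha: float, df: int) -> float:
    """Find x whose upper-tail probability equals alpha, by bisection."""
    lo, hi = 0.0, 200.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_upper_tail_even_df(mid, df) > alpha:
            lo = mid  # tail still too large, move right
        else:
            hi = mid
    return (lo + hi) / 2.0


df = degrees_of_freedom(3, 3)
crit = critical_value(0.05, df)
print(df, round(crit, 3))
```

In practice a statistics package or a printed chi-square table would supply the critical value; the bisection is only there to keep the sketch self-contained.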
6Chi-Square Part II
- If we find a χ² greater than or equal to 9.5, we reject the null
hypothesis and conclude that attitudes towards research can predict
attitudes towards statistics.
- If we find a χ² less than 9.5, we fail to reject the null
hypothesis and conclude attitudes towards research cannot predict
attitudes towards statistics.
7Calculation of Expected Frequencies
- Expected frequency = (Row total) × (Column total) / (Grand total)
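As a minimal sketch, the expected-frequency formula can be applied to every cell of the attitudes table at once. The nested-list layout (rows are attitudes towards statistics, columns are attitudes towards research) and the function name are assumptions made for illustration. Printed values may differ from hand-worked figures in the last decimal place, depending on how those figures were rounded.

```python
# Observed counts: rows = attitudes towards statistics,
# columns = attitudes towards research.
observed = [
    [9, 26, 13],
    [19, 75, 83],
    [16, 56, 110],
]


def expected_frequencies(table):
    """Return (row total * column total) / grand total for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)
for row in expected:
    print([round(value, 2) for value in row])
```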
8Calculation of Expected Frequencies
- Cell a: Favorable attitudes towards both
research and statistics.
- (44 × 48) / 407 ≈ 5.18
9Calculation of Expected Frequencies
- Cell b: Neither favorable nor unfavorable
attitudes towards research, favorable attitudes
towards statistics.
- (157 × 48) / 407 ≈ 18.51
10Calculation of Expected Frequencies
- Cell c: Unfavorable attitudes towards research,
favorable attitudes towards statistics.
- (206 × 48) / 407 ≈ 24.29
11Calculation of Expected Frequencies
- Cell d: Favorable attitudes towards research,
neither favorable nor unfavorable attitudes
towards statistics.
- (44 × 177) / 407 ≈ 19.13
12Calculation of Expected Frequencies
- Cell e: Neither favorable nor unfavorable
attitudes towards both statistics and research.
- (157 × 177) / 407 ≈ 68.27
13Calculation of Expected Frequencies
- Cell f: Unfavorable attitudes towards research,
neither favorable nor unfavorable attitudes
towards statistics.
- (206 × 177) / 407 ≈ 89.58
14Calculation of Expected Frequencies
- Cell g: Favorable attitudes towards research,
unfavorable attitudes towards statistics.
- (44 × 182) / 407 ≈ 19.67
15Calculation of Expected Frequencies
- Cell h: Neither favorable nor unfavorable
attitudes towards research, unfavorable attitudes
towards statistics.
- (157 × 182) / 407 ≈ 70.20
16Calculation of Expected Frequencies
- Cell i: Unfavorable attitudes towards both
research and statistics.
- (206 × 182) / 407 ≈ 92.11
17So we set up our chi-square table
Cell  | f observed | f expected | f observed - f expected (i.e., RESIDUALS) | (f observed - f expected)² | (f observed - f expected)² / f expected
a     |          9 |       5.18 |                                      3.82 |                      14.59 |                                    2.82
b     |         26 |      18.51 |                                      7.49 |                      56.10 |                                    3.03
c     |         13 |      24.29 |                                    -11.29 |                     127.46 |                                    5.25
d     |         19 |      19.13 |                                     -0.13 |                       0.02 |                                   0.001
e     |         75 |      68.27 |                                      6.73 |                      45.29 |                                    0.66
f     |         83 |      89.58 |                                     -6.58 |                      43.30 |                                    0.48
g     |         16 |      19.67 |                                     -3.67 |                      13.47 |                                    0.68
h     |         56 |      70.20 |                                    -14.20 |                     201.64 |                                    2.87
i     |        110 |      92.11 |                                     17.89 |                     320.05 |                                    3.47
Total |        407 |     407.00 |                                      0.00 |                            |                                   19.26
18Hypothesis Testing with Chi-Square
- Step 5: Making a decision
- χ² observed = 19.26
- χ² critical = 9.488
- Decision: REJECT H0, and conclude that attitudes
towards research allow us to predict attitudes
towards statistics.
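The decision step can be sketched in a few lines of Python. The statistic is the sum of (f observed - f expected)² / f expected over all cells, and it is compared with the critical value of 9.488 for 4 degrees of freedom quoted earlier. The function and variable names are illustrative assumptions, and because the expected frequencies are not rounded here, the result can differ slightly from a hand calculation.

```python
observed = [
    [9, 26, 13],
    [19, 75, 83],
    [16, 56, 110],
]

CRITICAL_VALUE = 9.488  # chi-square critical value, alpha = .05, df = 4


def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - exp) ** 2 / exp
    return statistic


statistic = chi_square(observed)
decision = "reject H0" if statistic >= CRITICAL_VALUE else "fail to reject H0"
print(round(statistic, 2), decision)
```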
19Hypothesis Testing with Chi-Square
- Notes about chi-square
- (1) Σ (f observed - f expected) = 0.
- The RESIDUALS ALWAYS SUM TO ZERO.
- If Σ (f observed - f expected) does not equal
zero (within rounding error), you have made a
calculation error. Recheck your work.
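This check is easy to automate. The sketch below recomputes the residuals for the attitudes table and confirms that they sum to zero up to floating-point error. The tolerance of 1e-9 is an arbitrary assumption chosen for illustration.

```python
observed = [
    [9, 26, 13],
    [19, 75, 83],
    [16, 56, 110],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Residual = observed - expected, cell by cell.
residuals = [
    [obs - row_totals[i] * col_totals[j] / grand_total
     for j, obs in enumerate(row)]
    for i, row in enumerate(observed)
]

residual_sum = sum(sum(row) for row in residuals)
residuals_ok = abs(residual_sum) < 1e-9  # zero within rounding error
print(residuals_ok)
```

With hand-rounded expected frequencies the sum will usually be a few hundredths away from zero, which is the rounding error the slide refers to.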
20Hypothesis Testing with Chi-Square
- (2) The chi-square test itself cannot tell us
anything about directionality. One way to get
directionality in the chi-square is to look at
the (f observed - f expected) column. We see that
certain cells occur much less frequently than we
would expect.
21Hypothesis Testing with Chi-Square
- For example, cell c (unfavorable attitudes towards
research but favorable attitudes towards
statistics) occurs much less frequently than we
would expect on the basis of chance.
23Hypothesis Testing with Chi-Square
- We can also see that the three cells that capture
consistency of attitudes between research and
statistics (cell a: favorable attitudes for both;
cell e: neither favorable nor unfavorable attitudes
towards both; cell i: unfavorable attitudes for
both) all have positive values for
(f observed - f expected).
- Those three cells are consistent with the
(unstated and untested) hypothesis that
individuals tend to have similar attitudes for
both research and statistics.
24Hypothesis Testing with Chi-Square
- The χ² statistic alone says nothing about direction.
Examining the residuals (f observed - f expected)
lets us make a statement about the directionality
of the relationship. We could also compare the
column percentages as we move across categories
of the independent variable to gain insight into
directionality.
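A minimal sketch of the column-percentage approach follows. It assumes, as in the cross-tabulation, that attitudes towards research (the independent variable) define the columns, so each column is converted to percentages that sum to 100. The variable names are illustrative.

```python
# Rows = attitudes towards statistics (Favorable, Neither, Unfavorable).
# Columns = attitudes towards research (Favorable, Neither, Unfavorable).
observed = [
    [9, 26, 13],
    [19, 75, 83],
    [16, 56, 110],
]

col_totals = [sum(col) for col in zip(*observed)]

# Percentage of each research category falling in each statistics category.
column_percentages = [
    [100.0 * cell / col_totals[j] for j, cell in enumerate(row)]
    for row in observed
]

for row in column_percentages:
    print([round(pct, 1) for pct in row])
```

Reading across the first row shows how the share of respondents with favorable attitudes towards statistics changes as attitudes towards research move from favorable to unfavorable.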
25Hypothesis Testing with Chi-Square
- (3) In this example, why do we get statistical
significance? Cells d, e, f and g contribute very
little to the statistical significance of the
overall relationship. The individual chi-square
values for these four cells are all small (each
below 1). The overall relationship is
significant because of the other cells.
27Hypothesis Testing with Chi-Square
- Chi-square allows us to decompose the overall
relationship into its component parts. This
decomposition allows us to assess whether all
categories contribute to the significance of the
overall relationship.
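The decomposition can be sketched by computing each cell's share of the chi-square statistic and ranking the cells. Cells are labeled a through i row by row, matching the labeling used in the slides. The function name and the dictionary layout are assumptions made for illustration.

```python
observed = [
    [9, 26, 13],
    [19, 75, 83],
    [16, 56, 110],
]


def cell_contributions(table, labels="abcdefghi"):
    """Map each cell label to its (observed - expected)^2 / expected."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    n_cols = len(col_totals)
    contributions = {}
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            contributions[labels[i * n_cols + j]] = (obs - exp) ** 2 / exp
    return contributions


contributions = cell_contributions(observed)
# Largest contributors first.
ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
for label, value in ranked:
    print(label, round(value, 2))
```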
28Hypothesis Testing with Chi-Square
- Limitations of χ²
- So far we have stressed the virtues of χ², such
as weak assumptions and a statistical
significance test appropriate for nominal-level
data. This is why chi-square is so popular.
- There are two limitations of χ², one minor and
one major.
29Hypothesis Testing with Chi-Square
- Minor Limitation
- When the expected cell frequency is less than 5,
χ² rejects the null hypothesis too easily. (Note:
this means the EXPECTED frequency and NOT the
OBSERVED frequency.)
- Solution: Use Yates' correction.
- Yates' correction
- Take |f observed - f expected| - 0.5, that is,
reduce the absolute value of each residual by 0.5
before squaring.
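The sketch below shows one way to apply Yates' correction to a 2 × 2 table. The counts are hypothetical and were chosen only so that some expected frequencies fall below 5; they do not come from the slides. Flooring the adjusted difference at zero is also an assumption of this sketch, meant to keep the correction from overshooting when a residual is already smaller than 0.5.

```python
def chi_square_2x2(table, yates=False):
    """Chi-square for a 2 x 2 table, optionally with Yates' correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            diff = abs(obs - exp)
            if yates:
                # Reduce the absolute residual by 0.5 before squaring.
                diff = max(diff - 0.5, 0.0)
            statistic += diff ** 2 / exp
    return statistic


# Hypothetical counts; expected frequencies are 5.5 and 4.5.
hypothetical = [[3, 7], [8, 2]]
uncorrected = chi_square_2x2(hypothetical)
corrected = chi_square_2x2(hypothetical, yates=True)
print(round(uncorrected, 2), round(corrected, 2))
```

For these hypothetical counts the uncorrected statistic exceeds the df = 1 critical value of about 3.8 while the corrected one does not, which illustrates how the correction makes the test more conservative.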
30Hypothesis Testing with Chi-Square
- Major Limitation
- We have set up a null hypothesis that there is no
relationship between two variables and have tried
to reject this hypothesis.
- We refer to a relationship as being statistically
significant when we have established, subject to
the risk of type I error, that there is a
relationship between two variables.
- But does rejecting the null hypothesis mean the
relationship is significant in the sense of being
a strong or an important one?
- Not necessarily.
31Hypothesis Testing with Chi-Square
- Remember that statistical significance depends
on sample size.
- Let us say that you wanted to investigate the
relationship between gender and level of
tolerance. You had no money to investigate this
relationship, so you handed out questionnaires
around UML and found the following:
32Hypothesis Testing with Chi-Square
Attitudes towards racial tolerance | Gender
                                   | Males | Females | Row Totals
High                               |    24 |      26 |         50
Low                                |    26 |      24 |         50
Column Totals                      |    50 |      50 |        100
33Hypothesis Testing with Chi-Square
- Is there a significant relationship between
gender and attitudes towards racial tolerance?
- Let us use α = .05.
- We have one degree of freedom.
- χ² critical = 3.8. χ² observed = 0.16.
- Since χ² observed (0.16) < χ² critical (3.8), we
FAIL to reject the null hypothesis and conclude
that gender does not help us predict attitudes
towards racial tolerance.
34Now let us say you had an extremely ambitious
study and you found the following relationship
Attitudes towards racial tolerance | Gender
                                   | Males | Females | Row Totals
High                               |  2400 |    2600 |       5000
Low                                |  2600 |    2400 |       5000
Column Totals                      |  5000 |    5000 |      10000
35Hypothesis Testing with Chi-Square
- Is there a significant relationship between
gender and attitudes towards racial tolerance?
- Let us use α = .05.
- We have one degree of freedom.
- χ² critical = 3.8. χ² observed = 16.0.
- Since χ² observed (16.0) > χ² critical (3.8), we
easily reject the null hypothesis and conclude
that gender does help us predict attitudes
towards racial tolerance.
36Hypothesis Testing with Chi-Square
- χ² is sensitive to the number of cases in the
sample. Even though the proportions in the cells
remain unchanged, the new χ² is 100 times the old
χ² because we have 100 times the number
of cases.
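The effect of sample size can be reproduced with a short sketch that computes χ² for the small gender table and for the same table with every cell multiplied by 100. The function name is an illustrative assumption.

```python
def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - exp) ** 2 / exp
    return statistic


small = [[24, 26], [26, 24]]
# Same cell proportions, 100 times as many cases.
large = [[cell * 100 for cell in row] for row in small]

chi_small = chi_square(small)
chi_large = chi_square(large)
print(round(chi_small, 2), round(chi_large, 2), round(chi_large / chi_small, 1))
```

Multiplying every count by a constant multiplies each residual by that constant, so each squared residual grows by its square while each expected frequency grows only by the constant itself. That leaves χ² larger by exactly the scaling factor.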
37Hypothesis Testing with Chi-Square
- Corrections for the sample size problem
- Pearson's contingency coefficient (You can ask
for the Contingency Coefficient with SPSS
CROSSTABS output).
38Hypothesis Testing with Chi-Square
- C = √( χ² / (χ² + N) )
- where N = total number of cases in sample
- Problem with C: It cannot attain 1.0, even in a
perfect relationship.
- As the syllabus says, there is no ideal solution
to the sample size problem with chi-square.
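As a brief sketch, Pearson's contingency coefficient can be computed for the two gender tables from the χ² values and sample sizes already given (0.16 with N = 100, and 16.0 with N = 10000). The function name is an illustrative assumption.

```python
import math


def contingency_coefficient(chi2: float, n: int) -> float:
    """Pearson's C = sqrt(chi2 / (chi2 + N))."""
    return math.sqrt(chi2 / (chi2 + n))


c_small = contingency_coefficient(0.16, 100)
c_large = contingency_coefficient(16.0, 10000)
print(round(c_small, 3), round(c_large, 3))
```

Both tables give the same C because χ² and N grow by the same factor, so the coefficient reflects the strength of the relationship rather than the number of cases.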