Title: Chapter 13: Categorical Data Analysis
Statistics
- Chapter 13: Categorical Data Analysis
Where We've Been
- Presented methods for making inferences about the
population proportion associated with a two-level
qualitative variable (i.e., a binomial variable)
- Presented methods for making inferences about the
difference between two binomial proportions
Where We're Going
- Discuss qualitative (categorical) data with more
than two outcomes
- Present a chi-square hypothesis test for
comparing the category proportions associated
with a single qualitative variable, called a
one-way analysis
- Present a chi-square hypothesis test relating two
qualitative variables, called a two-way analysis
13.1 Categorical Data and the Multinomial Experiment
- Properties of the Multinomial Experiment
- The experiment consists of n identical trials.
- There are k possible outcomes (called classes,
categories, or cells) to each trial.
- The probabilities of the k outcomes, denoted by
$p_1, p_2, \ldots, p_k$, where $p_1 + p_2 + \cdots + p_k = 1$, remain
the same from trial to trial.
- The trials are independent.
- The random variables of interest are the cell
counts $n_1, n_2, \ldots, n_k$ of the number of
observations that fall into each of the k
categories.
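The properties above can be illustrated with a small simulation. This is a minimal sketch, not part of the original slides: the function name `simulate_multinomial`, the probabilities, and the seed are all hypothetical choices made for illustration.

```python
import random
from collections import Counter

def simulate_multinomial(n, probs, seed=None):
    """Run n independent, identical trials and return the k cell counts."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    rng = random.Random(seed)
    k = len(probs)
    # Each trial lands in exactly one of the k cells, with the same
    # probabilities on every trial.
    outcomes = rng.choices(range(k), weights=probs, k=n)
    tally = Counter(outcomes)
    return [tally.get(i, 0) for i in range(k)]

# Hypothetical probabilities for k = 3 categories and n = 150 trials.
counts = simulate_multinomial(150, [0.40, 0.35, 0.25], seed=1)
print(counts, sum(counts))
```

Whatever the seed, the cell counts always add up to n, because every trial falls into exactly one category.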
13.2 Testing Categorical Probabilities: One-Way Table
- Suppose three candidates are running for office,
and 150 voters are asked their preferences.
- Candidate 1 is the choice of 61 voters.
- Candidate 2 is the choice of 53 voters.
- Candidate 3 is the choice of 36 voters.
- Do these data suggest the population may prefer
one candidate over the others?
13.2 Testing Categorical Probabilities: One-Way Table
- Candidate 1 is the choice of 61 voters.
- Candidate 2 is the choice of 53 voters.
- Candidate 3 is the choice of 36 voters.
- n = 150
13.2 Testing Categorical Probabilities: One-Way Table
Reject the null hypothesis
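The arithmetic behind this conclusion can be sketched as follows. It assumes the null hypothesis of no preference ($p_1 = p_2 = p_3 = 1/3$) and a significance level of $\alpha = .05$; neither value survives on the slide itself, so both should be read as assumptions.

```latex
\begin{align*}
E_i &= n\,p_{i,0} = 150 \cdot \tfrac{1}{3} = 50, \qquad i = 1, 2, 3 \\
\chi^2 &= \frac{(61-50)^2}{50} + \frac{(53-50)^2}{50} + \frac{(36-50)^2}{50}
        = \frac{121 + 9 + 196}{50} = 6.52 \\
\chi^2_{.05} &\approx 5.99 \quad (k - 1 = 2 \text{ df}), \qquad 6.52 > 5.99
\end{align*}
```

Because the computed statistic exceeds the critical value, the data are consistent with the stated decision to reject the null hypothesis at that level.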
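13.2 Testing Categorical Probabilities: One-Way Table
- Test of a Hypothesis about Multinomial
Probabilities: One-Way Table
- $H_0$: $p_1 = p_{1,0}, p_2 = p_{2,0}, \ldots, p_k = p_{k,0}$
- where $p_{1,0}, p_{2,0}, \ldots, p_{k,0}$ represent the
hypothesized values of the multinomial
probabilities
- $H_a$: At least one of the multinomial probabilities
does not equal its hypothesized value
- Test statistic: $\chi^2 = \sum_{i=1}^{k} \frac{(n_i - E_i)^2}{E_i}$
- where $E_i = n p_{i,0}$ is the expected cell count
given the null hypothesis.
- Rejection region: $\chi^2 > \chi^2_\alpha$, where $\chi^2_\alpha$ has
$k - 1$ degrees of freedom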
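The one-way test statistic, the sum over cells of $(n_i - E_i)^2 / E_i$ with $E_i = n p_{i,0}$, can be sketched in a few lines of Python. The function name `one_way_chi_square` is hypothetical, and the equal-preference null hypothesis used in the usage lines is an assumption; the counts are the candidate data from the earlier slide.

```python
def one_way_chi_square(observed, hypothesized_probs):
    """Return (chi-square statistic, degrees of freedom) for a one-way table."""
    if len(observed) != len(hypothesized_probs):
        raise ValueError("need one hypothesized probability per cell")
    n = sum(observed)
    # Expected cell count under H0: E_i = n * p_{i,0}
    expected = [n * p for p in hypothesized_probs]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1

# Candidate preference counts; equal preference (1/3 each) is an assumed H0.
stat, df = one_way_chi_square([61, 53, 36], [1 / 3, 1 / 3, 1 / 3])
print(round(stat, 2), df)  # 6.52 2
```

The statistic is then compared with the upper-tail chi-square critical value for k - 1 degrees of freedom at the chosen significance level.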
13.2 Testing Categorical Probabilities: One-Way Table
- Conditions Required for a Valid $\chi^2$ Test:
One-Way Table
- A multinomial experiment has been conducted.
- The sample size n will be large enough so that,
for every cell, the expected cell count $E(n_i)$
will be equal to 5 or more.
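The sample-size condition is easy to check before running the test. The helper below is a hypothetical sketch; the second call uses made-up values purely to show a case that fails the condition.

```python
def expected_counts_large_enough(n, hypothesized_probs, minimum=5):
    """Return True if every expected cell count n * p_i is at least `minimum`."""
    return all(n * p >= minimum for p in hypothesized_probs)

# Candidate example under an assumed equal-preference null: each E_i is 50.
print(expected_counts_large_enough(150, [1 / 3, 1 / 3, 1 / 3]))  # True
# Hypothetical small sample: two cells have an expected count of about 3.
print(expected_counts_large_enough(30, [0.8, 0.1, 0.1]))         # False
```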
13.2 Testing Categorical Probabilities: One-Way Table
Example 13.2: Distribution of Opinions About
Marijuana Possession Before Television Series Has
Aired
Table 13.2: Distribution of Opinions About
Marijuana Possession After Television Series Has
Aired
13.2 Testing Categorical Probabilities: One-Way Table
Expected Distribution of 500 Opinions About
Marijuana Possession After Television Series has
Aired
Reject the null hypothesis
13.2 Testing Categorical Probabilities: One-Way Table
- Inferences can be made on any single proportion
as well.
- A 95% confidence interval can be computed for the
proportion of citizens in the viewing area with
no opinion.
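One common way to build such an interval is the large-sample formula $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. The sketch below assumes that formula with $z_{.025} = 1.96$; the function name and the counts passed to it are hypothetical, since the actual cell counts from Table 13.2 are not reproduced here.

```python
import math

def proportion_confidence_interval(successes, n, z=1.96):
    """Large-sample interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical counts for illustration only (not the values from Table 13.2).
low, high = proportion_confidence_interval(40, 400)
print(f"({low:.3f}, {high:.3f})")
```

The interval treats the category of interest as a "success" and all other categories as "failures", which reduces the multinomial setting to the binomial one covered earlier.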
13.3 Testing Categorical Probabilities: Two-Way Table
- Chi-square analysis can also be used to
investigate studies based on qualitative factors.
- Does having one characteristic make it more/less
likely to exhibit another characteristic?
13.3 Testing Categorical Probabilities: Two-Way Table
The columns are divided according to the
subcategories for one qualitative variable and
the rows for the other qualitative variable.
13.3 Testing Categorical Probabilities: Two-Way Table
- The results of a survey regarding marital status
and religious affiliation are reported below
(Example 13.3 in the text).
[Table: Marital Status by Religious Affiliation]
$H_0$: Marital status and religious affiliation are
independent
$H_a$: Marital status and religious affiliation are
dependent
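A test of independence compares each observed cell count with the count expected if the two variables were independent, $E_{ij} = (\text{row } i \text{ total})(\text{column } j \text{ total}) / n$, using $(r-1)(c-1)$ degrees of freedom. The sketch below assumes that standard formulation; the function name and the 2 x 3 table are hypothetical, because the survey counts from Example 13.3 are not reproduced here.

```python
def two_way_chi_square(table):
    """Return (statistic, df, expected) for a two-way table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Under independence, E_ij = (row i total) * (column j total) / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected

# Hypothetical 2 x 3 table of counts, for illustration only.
stat, df, expected = two_way_chi_square([[20, 30, 50], [30, 30, 40]])
print(round(stat, 4), df)
print(expected[0])
```

A large statistic relative to the chi-square critical value suggests the row and column classifications are dependent.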
13.3 Testing Categorical Probabilities: Two-Way Table
- The expected frequencies (see Figure 13.4) are
included below
[Table: expected frequencies, Marital Status by Religious Affiliation]
The chi-square value computed with SAS is 7.1355,
with p-value .1289. Even at the $\alpha = .10$ level,
we cannot reject the null hypothesis.
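The reported p-value can be sanity-checked without statistical software. For an even number of degrees of freedom, the chi-square upper-tail probability has the closed form $e^{-x/2}\sum_{j=0}^{df/2-1}(x/2)^j/j!$. The sketch below assumes 4 degrees of freedom, which is not stated on the slide; it is simply the value for which the closed form reproduces the reported figure. The function name is hypothetical.

```python
import math

def chi_square_upper_tail_even_df(x, df):
    """P(chi-square > x) for a positive even df, using the closed-form sum."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    # exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
    return math.exp(-half) * sum(
        half ** j / math.factorial(j) for j in range(df // 2)
    )

# Statistic reported on the slide; df = 4 is an assumption.
p_value = chi_square_upper_tail_even_df(7.1355, 4)
print(round(p_value, 4))  # 0.1289
```

Since the p-value exceeds .10, the check agrees with the slide's conclusion that the null hypothesis of independence cannot be rejected.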
13.4 A Word of Caution About Chi-Square Tests
Be sure