Title: Comparison and control: The difference between two proportions
1. Lecture 13
- Comparison and control: The difference between two proportions
2. Overview
- You can already calculate and interpret
- confidence intervals
- hypothesis tests
- for a single population proportion p
- e.g., proportion of applicants who get hired
- Today you'll learn the same thing
- for the difference between two population proportions (p1 - p2)
- e.g., difference between
- proportion p1 of female applicants who get hired
- proportion p2 of male applicants who get hired
- We'll also reinforce how to use a control group as a comparison
3. Sex discrimination
- In government
- "I'm not for women, frankly, in any job. I don't want any of them around. Thank God we don't have any in the Cabinet."
- "I don't think a woman should be in any government job whatever. I mean, I really don't. The reason why I do is mainly because they are erratic. And emotional."
- Richard Nixon, 37th President of the U.S., 1969-1974
- In orchestras
- "I just don't think women should be in an orchestra."
- Zubin Mehta
- conductor, LA Symphony, 1964-1978
- conductor, NY Philharmonic, 1978-1980
- There is one woman in the Vienna Philharmonic.
- out of 100 musicians
4. Difference in population proportions
- Over the last 50 years,
- both women (population 1) and men (population 2)
- have auditioned for symphony orchestras
- Some proportion of female auditioners has been hired: p1
- and some proportion of male auditioners has been hired: p2
- Let the null hypothesis be absence of discrimination
- State the hypotheses in formal symbols
- H0: p1 - p2 = 0 (no discrimination)
- H1: p1 - p2 < 0 (discrimination against women)
- One tail or two?
5. Difference in sample proportions (adapted from Goldin & Rouse 2000)
- old audition records of certain major orchestras
- summarized in a bivariate table
- basically two parallel frequency tables, one for women, one for men
- aka contingency table (contab), aka cross-tabulation
- Note: Table cells (white) sum to table margins (black)
- Questions:
- How many men were rejected?
- How many women tried out?
- How many musicians were hired?
- How many musicians tried out?
6. Bivariate table: Another view
Rejected: women 589/599 = 98.33%, men 1072/1102 = 97.28%, all 1661/1701 = 97.65%
98.33% + 1.67% = 100%, etc., but 98.33% + 97.28% ≠ 97.65%
- Questions:
- What % of women were hired?
- What % of men?
- What % overall?
- What was the difference in success between women and men?
7. Difference in sample proportions
- Let women be group 1, men group 2
- In the sample,
- p̂1 = .0167 of women were hired, vs.
- p̂2 = .0272 of men
- The difference in sample proportions was
- p̂1 - p̂2 = .0167 - .0272 = -.0105, or -1.05 percentage points
- Interpret that. Is it a big difference?
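As a quick check on the arithmetic, the sample proportions can be recomputed from the counts in the bivariate table. This is a minimal Python sketch: the variable names are illustrative, and the hired counts are derived here as "tried out minus rejected" rather than read from the slide.

```python
# Counts from the bivariate table: rejected and total auditioners per group.
women_rejected, women_total = 589, 599
men_rejected, men_total = 1072, 1102

# Hired = tried out - rejected (derived, not stated directly on the slide).
women_hired = women_total - women_rejected
men_hired = men_total - men_rejected

p1_hat = women_hired / women_total   # sample proportion of women hired
p2_hat = men_hired / men_total       # sample proportion of men hired
diff = p1_hat - p2_hat               # difference in sample proportions

print(f"p1_hat = {p1_hat:.4f}, p2_hat = {p2_hat:.4f}, diff = {diff:.4f}")
```

A negative difference means that, in this sample, a smaller share of women than of men was hired.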
8. Difference in population proportions
- The difference in sample proportions
- p̂1 - p̂2 = .0167 - .0272 = -.0105
- suggests that women were less successful.
- But what about the difference in population proportions?
- p1 - p2
- Let's test the hypotheses
- H0: p1 - p2 = 0 (no discrimination)
- H1: p1 - p2 < 0 (discrimination against women)
- One tail or two?
9. Comparison of formulas
10. Steps of a hypothesis test (review)
- State null and research hypotheses
- Collect sample
- Calculate test statistic
- If the null hypothesis were true, how extreme/unusual would the test statistic be?
- Interpret
11. Example 1
- Hypotheses.
- women are population 1, men population 2
- p1, p2 are the proportions hired in each population
- H0: p1 - p2 = 0 (no discrimination)
- H1: p1 - p2 < 0 (discrimination against women)
- Sample (after hypotheses)
- p̂1 = .0167, p̂2 = .0272
- N1 = 599, N2 = 1102
12. Example 1: 3. Test statistic
13. Example 1: 4. If H0 were true, would the sample be extreme/unusual?
What is the p value?
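Steps 3 and 4 can be sketched in Python. The slide's formula did not survive in this transcript, so the sketch assumes the unpooled standard error sqrt(p̂1(1 - p̂1)/N1 + p̂2(1 - p̂2)/N2). The function name `two_proportion_z` is hypothetical, and the hired counts (10 of 599 women, 30 of 1102 men) are derived from the bivariate table.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 - p2 = 0, using an unpooled standard error.

    The unpooled form is an assumption; the lecture's own formula is not
    shown in the transcript.
    """
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return (p1_hat - p2_hat) / se

# Hired counts derived from the table: 599 - 589 women, 1102 - 1072 men.
z = two_proportion_z(10, 599, 30, 1102)

# H1 is p1 - p2 < 0, so the p value is the lower-tail area below z.
p_value = NormalDist().cdf(z)

print(f"z = {z:.2f}, one-tailed p = {p_value:.3f}")
```

Under these assumptions z comes out near -1.47 with a lower-tail p value of about .07, which agrees with the p < .075 reported in the interpretation. A pooled standard error, another common choice for this test, would give a somewhat larger p value.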
14. Example 1: 5. Interpretation
- We have borderline evidence that women were less successful than men (p < .075).
- If the male and female populations were equally successful (i.e., if H0 were true),
- less than 7.5% of samples would show men with this large a lead (or larger).
15. Example 1b: 95% confidence interval, guess
- In Example 1, comparing men and women, we didn't quite reject
- H0: p1 - p2 = 0
- Will a 95% confidence interval for p1 - p2 contain 0?
16. Example 1b: 95% confidence interval, calculation
Interpretation: In the population of auditioners, we're 95% sure that women were between 2.46 percentage points less likely than men and .36 percentage points more likely than men to be hired.
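The interval can be sketched in Python as point estimate plus or minus z* times the standard error. As before, the unpooled standard error is an assumption (the calculation slide is not in the transcript), `diff_ci` is a hypothetical helper name, and the hired counts are derived from the bivariate table.

```python
from math import sqrt
from statistics import NormalDist

def diff_ci(x1, n1, x2, n2, confidence):
    """Confidence interval for p1 - p2 (unpooled standard error assumed)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # Critical value that leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se

# Women: 10 hired of 599; men: 30 hired of 1102.
low, high = diff_ci(10, 599, 30, 1102, 0.95)
print(f"95% CI for p1 - p2: {low:.4f} to {high:.4f}")
```

With these inputs the interval runs from roughly -.0246 to roughly +.0035, matching the slide's interpretation up to rounding.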
17. Example 1b: Confidence interval & hypothesis test
- Consistent with the hypothesis test,
- the interval contains 0, though just barely.
- The value we didn't quite reject,
- H0: p1 - p2 = 0,
- is just barely in the range of plausible values for p1 - p2
[Figure: the CI drawn on a p1 - p2 axis, marking the point estimate and the H0 value 0]
18. Hypothesis tests & confidence intervals: Relationship (review)
- A confidence interval is a range of plausible values
- for p1 - p2
- A hypothesis test evaluates whether
- 0 is a plausible value for p1 - p2
- If the test rejects H0: p1 - p2 = 0
- then 0 will not be in the confidence interval
- If the test accepts H0: p1 - p2 = 0
- then 0 will be in the confidence interval
- but 0 is only one of many plausible values
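The relationship can be checked numerically. The sketch below assumes a two-sided test at level alpha paired with a (1 - alpha) confidence interval, both built from the same standard error. The helper names are hypothetical, and the standard error of .0072 is an assumed, rounded value for Example 1.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1

def two_sided_p(diff, se):
    """Two-sided p value for H0: p1 - p2 = 0."""
    return 2 * STD_NORMAL.cdf(-abs(diff / se))

def ci_contains_zero(diff, se, confidence):
    """True if the confidence interval for p1 - p2 includes 0."""
    z_star = STD_NORMAL.inv_cdf(1 - (1 - confidence) / 2)
    return diff - z_star * se <= 0 <= diff + z_star * se

# Example 1 difference; the standard error is an assumed, rounded value.
diff, se = -0.0105, 0.0072

results = []
for confidence in (0.80, 0.90, 0.95, 0.99):
    alpha = 1 - confidence
    rejects = two_sided_p(diff, se) < alpha
    contains = ci_contains_zero(diff, se, confidence)
    results.append((confidence, rejects, contains))
    print(f"{confidence:.0%} CI: test rejects H0 = {rejects}, CI contains 0 = {contains}")
```

At every level, the test rejects exactly when the interval excludes 0. The lecture's tests are one-tailed, and a one-tailed test at level alpha lines up with a (1 - 2 alpha) interval rather than a (1 - alpha) one, so the match between a one-tailed p value and a 95% interval is only approximate.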
19. Discussion
- Suppose men were more successful than women.
- Would that prove discrimination?
20. Control group
- Suppose men were more successful than women.
- Would that prove discrimination?
- Between 1952 and 1986
- after the auditions we've looked at
- the orchestras switched to a blind audition format
- Let's compare
- women in blind auditions (control group)
- women in nonblind auditions
- This isolates the effect of judges knowing the players' sex
21. Example 2
- Hypotheses.
- women in nonblind auditions are group 1
- women in blind auditions are group 2
- p1, p2 are the proportions hired in each group
- H0: p1 - p2 = 0 (no discrimination)
- H1: p1 - p2 < 0 (discrimination against women)
- Sample data (after hypotheses)
- p̂1 = .0167, p̂2 = .0270
- N1 = 599, N2 = 445
22. Example 2: 3. Test statistic
23. Example 2: 4. Hypothesis test
What is the p value? Would we reject H0?
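The same two steps for Example 2 can be sketched from the rounded sample proportions on the slide. The unpooled standard error is again an assumption, since the test-statistic slide is not in the transcript.

```python
from math import sqrt
from statistics import NormalDist

# Sample values from the Example 2 slide.
p1_hat, n1 = 0.0167, 599   # women in nonblind auditions
p2_hat, n2 = 0.0270, 445   # women in blind auditions (control group)

# Unpooled standard error of p1_hat - p2_hat (an assumed formula).
se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
z = (p1_hat - p2_hat) / se

# H1 is p1 - p2 < 0, so the p value is the lower-tail area below z.
p_value = NormalDist().cdf(z)

print(f"z = {z:.2f}, one-tailed p = {p_value:.3f}")
```

Under this assumption the p value falls between .10 and .15, so H0 would not be rejected at the usual .05 level.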
24. Example 2: Interpretation of test
- 5. Interpretation
- The evidence is not very convincing that blinding improved women's prospects.
- Does this mean there was no discrimination?
25. Example 2b: 80% confidence interval, guess
- In Example 2, comparing blind and nonblind auditions, we accepted
- H0: p1 - p2 = 0
- with .15 > p > .10
- Will an 80% confidence interval contain 0?
26. Example 2b: 80% confidence interval, calculation
Interpretation: In the population of female auditioners, we're 80% sure that those in nonblind auditions were between 2.22 percentage points less likely and .16 percentage points more likely to be hired.
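A minimal sketch of the 80% interval, assuming the unpooled standard error and using the rounded sample proportions from the Example 2 slide. The only change from a 95% interval is the critical value, about 1.28 instead of 1.96.

```python
from math import sqrt
from statistics import NormalDist

p1_hat, n1 = 0.0167, 599   # women in nonblind auditions
p2_hat, n2 = 0.0270, 445   # women in blind auditions (control group)
confidence = 0.80

# Unpooled standard error of the difference (an assumed formula).
se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

# Critical value leaving 10% in each tail of the standard normal.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

diff = p1_hat - p2_hat
low, high = diff - z_star * se, diff + z_star * se
print(f"80% CI for p1 - p2: {low:.4f} to {high:.4f}")
```

The endpoints land near -.0222 and +.0016, in line with the slide's interpretation.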
27. Example 2b: Confidence interval & hypothesis test
- Consistent with the hypothesis test,
- the interval does contain 0, though just barely.
- The value we accepted, H0: p1 - p2 = 0,
- is (just) in the range of plausible values for p1 - p2
- but it's just one plausible value
28. Discrimination in orchestras: Conclusion
- We have borderline evidence that women don't get hired as much as men
- but we can't convince a skeptic it was because of discrimination
- We will revisit this issue in Lecture 14
29. Summary
- For the difference between two population proportions (p1 - p2)
- you've learned
- confidence intervals
- hypothesis tests
- and their relationship
- If you reject H0: p1 - p2 = 0
- then the CI won't contain p1 - p2 = 0
- If you accept H0: p1 - p2 = 0
- then the CI will contain p1 - p2 = 0
- The converse is also true.
- The most useful comparison is often with a control group
- which is like the group of interest
- except for the factor you're most interested in
30. Summary: Last 2 lectures
- Procedures for comparing population proportions
- are like those for population means, except
- slightly different standard error formulas
- different distributions
- proportions use standard normal
- means use t
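For reference, the two procedures can be set side by side. The lecture's own formula slides are not reproduced in this transcript, so the block below is only a sketch of the usual unpooled textbook forms. The symbols for the sample means and standard deviations are assumptions, and the lecture on means may have used a pooled variant.

```latex
% Two proportions: compare z to the standard normal distribution
z = \frac{\hat{p}_1 - \hat{p}_2}{\mathrm{SE}}, \qquad
\mathrm{SE} = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{N_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{N_2}}

% Two means: compare t to a t distribution
t = \frac{\bar{x}_1 - \bar{x}_2}{\mathrm{SE}}, \qquad
\mathrm{SE} = \sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}
```

In both cases the statistic is the observed difference divided by its standard error. Only the standard error formula and the reference distribution change.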