Title: Chapter 13 Comparing Two Population Parameters
1Chapter 13: Comparing Two Population Parameters
- AP Statistics
- Hamilton and Mann
2Lipitor or Pravachol
- Which drug is more effective at lowering bad cholesterol?
- To figure this out, researchers designed a study they called PROVE-IT.
- They used 4000 people with heart disease as subjects. These people were randomly assigned to one of two treatment groups: Lipitor or Pravachol.
- At the end of the study, researchers compared the mean bad cholesterol levels for the two groups. For Pravachol it was 95 mg/dl versus 62 mg/dl for Lipitor. Is this difference statistically significant?
- This is a question about comparing two means.
3Lipitor or Pravachol
- The researchers also compared the proportion of subjects in each group who died, had a heart attack, or suffered other serious consequences within two years.
- For Pravachol, the proportion was 0.263 and for Lipitor it was 0.224. Is this a statistically significant difference?
- This is a question about comparing two proportions.
4Success vs. Failure in Business
- How do small businesses that fail differ from small businesses that succeed?
- Business school researchers compared the asset-liability ratios of two samples of firms started in 2000: one sample of failed businesses and one of firms that were still going after two years.
- This observational study compares two random samples, one from each of two different populations.
5Two-Sample Problems
- Comparing two populations or two treatments is
one of the most common situations encountered in
statistical practice. We call such situations
two-sample problems.
6Two-Sample Problems
- A two-sample problem can arise from a randomized comparative experiment that randomly divides subjects into two groups and exposes each group to a different treatment, like the PROVE-IT study.
- Comparing random samples separately selected from two populations, like the successful and failed small businesses, is also a two-sample problem.
- Unlike the matched pairs designs studied earlier, there is no matching of units in the two samples, and the two samples can be of different sizes.
- Inference procedures for two-sample data differ from those for matched pairs.
7Comparing Means and Proportions
- Who is more likely to binge drink: male or female college students?
- This is a two-sample problem because we are comparing the population of male college students to the population of female college students.
- To conduct this study, the Harvard School of Public Health surveyed random samples of male and female undergraduates at four-year colleges and universities about their drinking behaviors.
- This observational study was designed to compare the proportion of undergraduate males who binge drink with the proportion of undergraduate females who binge drink.
8Comparing Means and Proportions
- A bank wants to know which of two incentive plans will most increase the use of its credit cards.
- We are comparing the effect of two different treatments here, so it is a two-sample problem.
- The bank offers each incentive to a random sample of credit card customers and compares the amount charged during the following six months.
- This is a randomized experiment designed to compare the mean amount spent under each of the two incentive treatments.
9Chapter 13 Section 1
- Comparing Two Means
- HW 13.1, 13.2, 13.4, 13.6, 13.8, 13.10, 13.11,
13.14, 13.16
10Comparing Two Means
- We can examine two-sample data graphically by comparing dotplots or stemplots (for small samples) and boxplots or histograms (for large samples).
- Now we will apply the ideas of formal inference in this setting.
- When both population distributions are symmetric, and especially when they are approximately Normal, a comparison of the mean responses in the two populations is the most common goal of inference.
12Notation
Population | Variable | Mean (parameter) | Standard Deviation (parameter) | Sample Size | Mean (statistic) | Standard Deviation (statistic)
1 | x1 | µ1 | σ1 | n1 | $\bar{x}_1$ | s1
2 | x2 | µ2 | σ2 | n2 | $\bar{x}_2$ | s2
- There are four unknown parameters: the two means and the two standard deviations.
- We want to compare the two population means, either by giving a confidence interval for their difference µ1 - µ2 or by testing the hypothesis of no difference, H0: µ1 = µ2.
- We use the sample means and standard deviations to estimate the unknown parameters.
13Calcium and Blood Pressure
- Does increasing the amount of calcium in our diet reduce blood pressure?
- An examination of a large number of people revealed a relationship between calcium intake and blood pressure. The relationship was strongest for black men. As a result, researchers designed a randomized comparative experiment.
- The subjects were 21 healthy black men. A randomly chosen group of 10 of the men received calcium supplements for 12 weeks. The other 11 men received a placebo pill that looked similar for the 12 weeks.
14Calcium and Blood Pressure
- The response variable is the decrease in systolic blood pressure for a subject after 12 weeks. An increase appears as a negative response.
- Group 1 will be the calcium group and Group 2 will be the placebo group. Here are the data.
Group 1 (Calcium): 7 -4 18 17 -3 -5 1 10 11 -2
Group 2 (Placebo): -1 12 -1 -3 3 -5 5 2 -11 -1 -3
- Here are the summary statistics.
Group | Treatment | n | $\bar{x}$ | s
1 | Calcium | 10 | 5.000 | 8.743
2 | Placebo | 11 | -0.273 | 5.901
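The summary statistics can be checked directly from the raw data. The following is a minimal sketch using only Python's standard `statistics` module; the helper name `summarize` is an illustrative choice, not something from the slides.

```python
from statistics import mean, stdev

# Decrease in systolic blood pressure after 12 weeks (data from the slide).
calcium = [7, -4, 18, 17, -3, -5, 1, 10, 11, -2]
placebo = [-1, 12, -1, -3, 3, -5, 5, 2, -11, -1, -3]

def summarize(sample):
    """Return (n, sample mean, sample standard deviation)."""
    return len(sample), mean(sample), stdev(sample)

n1, xbar1, s1 = summarize(calcium)
n2, xbar2, s2 = summarize(placebo)

print(f"Calcium: n={n1}, mean={xbar1:.3f}, s={s1:.3f}")
print(f"Placebo: n={n2}, mean={xbar2:.3f}, s={s2:.3f}")
```

Note that `stdev` divides by n - 1, which matches the sample standard deviation s used throughout the chapter.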
15Calcium and Blood Pressure
- Notice that the calcium group experienced a drop in blood pressure, $\bar{x}_1 = 5.000$, while the placebo group shows a small increase, $\bar{x}_2 = -0.273$. Is this good evidence that calcium decreases blood pressure in the entire population of healthy black men more than a placebo does?
- This example fits the two-sample setting because we have a separate sample from each treatment and we have not attempted to match them.
- Since we are testing a claim, we will conduct a significance test and follow the Inference Toolbox.
16Calcium and Blood Pressure
- Step 1: Hypotheses. We write the hypotheses in terms of the mean decreases we would see in the entire population: µ1 for black men taking calcium for 12 weeks and µ2 for black men taking the placebo for 12 weeks. There are two equivalent ways to write the hypotheses: H0: µ1 = µ2 versus Ha: µ1 > µ2, or H0: µ1 - µ2 = 0 versus Ha: µ1 - µ2 > 0.
17Calcium and Blood Pressure
- Step 2: Conditions. We do not know the name of the test, but we know the conditions we must check to compare two means.
- SRS: The 21 subjects are not an SRS. Therefore, we may not be able to generalize our findings to all healthy black men. Since we randomly assigned treatments, however, any differences can be attributed to the treatments themselves.
- Normality: Since we have small samples, we must look at a boxplot and histogram for both samples. There are no serious problems (outliers or serious departures from Normality).
- Independence: Since we randomized the treatments, we can safely assume that the calcium and placebo groups are two independent samples.
18Calcium and Blood Pressure
- The natural estimator of the difference µ1 - µ2 is the difference between the sample means, $\bar{x}_1 - \bar{x}_2 = 5.000 - (-0.273) = 5.273$.
- This statistic measures the average advantage of calcium over the placebo. In order to use it, however, we need to know about its sampling distribution. In other words, we need to know what the mean and standard deviation of the difference would be if we took repeated samples many times.
19The Two-Sample z Statistic
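- Here are the facts about the sampling distribution of the difference $\bar{x}_1 - \bar{x}_2$ between the two sample means of independent SRSs: its mean is $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$ and its variance is $\sigma^2_{\bar{x}_1 - \bar{x}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$.
- Therefore, $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$.
- If both populations are Normal, then the distribution of $\bar{x}_1 - \bar{x}_2$ is also Normal, with mean $\mu_1 - \mu_2$ and standard deviation $\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$.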
20Two-Sample z Statistic
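- When the statistic $\bar{x}_1 - \bar{x}_2$ has a Normal distribution, we can standardize it to obtain a standard Normal z statistic: $z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$.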
21Two-Sample z Statistic
- In the very unlikely case that we know both population standard deviations, the two-sample z statistic is what we would use to conduct inference about µ1 - µ2.
- Since we rarely know one, much less two, population standard deviations, we are going to move immediately to the more useful t procedures.
22Two-Sample t Procedures
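- Because we don't know the population standard deviations, we estimate them with the standard deviations from our two samples.
- The result is the standard error, or estimated standard deviation, of the difference in sample means: $SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$.
- We then standardize our estimate. The result is the two-sample t statistic: $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$.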
23Two-Sample t Procedures
- The statistic t has the same interpretation as any z or t statistic: it says how far $\bar{x}_1 - \bar{x}_2$ is from its mean in standard deviation units.
- The two-sample t statistic has approximately a t distribution. It does not have exactly a t distribution even if the populations are both exactly Normal. The approximation is very close though.
- There is a catch: we must use a messy formula to calculate the degrees of freedom. Often, the degrees of freedom are not whole numbers.
24Two-Sample t Procedures
- There are two practical options for using the two-sample t procedures:
- Option 1: With technology, use the statistic t with accurate critical values from the approximating t distribution.
- Option 2: Without technology, use the statistic t with critical values from the t distribution with degrees of freedom equal to the smaller of n1 - 1 and n2 - 1. These procedures are always conservative for any two Normal populations.
- Technology will obviously use Option 1.
- We are going to start by looking at how to do Option 2.
26Two-Sample t Procedures
- These two-sample t procedures always err on the safe side, reporting higher P-values and lower confidence than may actually be true. The gap between what is reported and the truth is quite small unless the sample sizes are both small and unequal.
- As the sample sizes increase, probability values based on t with degrees of freedom equal to the smaller of n1 - 1 and n2 - 1 become more accurate.
- Let's complete our calcium and blood pressure problem from earlier.
27Calcium and Blood Pressure
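- Here are the summary statistics again.
Group | Treatment | n | $\bar{x}$ | s
1 | Calcium | 10 | 5.000 | 8.743
2 | Placebo | 11 | -0.273 | 5.901
- Step 3: Calculations. $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{5.000 - (-0.273)}{\sqrt{\frac{8.743^2}{10} + \frac{5.901^2}{11}}} = \frac{5.273}{3.2878} = 1.604$
- Since it is a one-sided test, we are looking for the probability of t being 1.604 or greater when we have 9 degrees of freedom (the smaller of n1 - 1 = 9 and n2 - 1 = 10). From the table, it is between 0.05 and 0.10.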
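The table-based (conservative) calculation can be sketched in a few lines of Python. This is only an illustration of the method without technology: the two critical values are the df = 9 entries copied from a standard t table, and names such as `T_CRIT_DF9` and `p_between` are made up for this example.

```python
import math

# Summary statistics from the slide (group 1 = calcium, group 2 = placebo).
n1, xbar1, s1 = 10, 5.000, 8.743
n2, xbar2, s2 = 11, -0.273, 5.901

# Standard error of the difference in sample means, then the t statistic.
se = math.sqrt(s1**2 / n1 + s2**2 / n2)
t_stat = (xbar1 - xbar2) / se

# Conservative degrees of freedom: the smaller of n1 - 1 and n2 - 1.
df = min(n1 - 1, n2 - 1)

# Upper-tail critical values for df = 9, copied from a standard t table.
T_CRIT_DF9 = {0.10: 1.383, 0.05: 1.833}

# Bracket the one-sided P-value between neighbouring table entries.
if T_CRIT_DF9[0.10] <= t_stat < T_CRIT_DF9[0.05]:
    p_between = (0.05, 0.10)
else:
    p_between = None  # t falls outside the two entries listed here

print(f"SE = {se:.4f}, t = {t_stat:.3f}, df = {df}, P between {p_between}")
```

Because t lies between the two table entries, the one-sided P-value lies between the corresponding tail probabilities, which is all the table can tell us.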
28Calcium and Blood Pressure
- Step 4: Interpretation
- The experiment provides some evidence that calcium reduces blood pressure, but the evidence falls short of the traditional 5% and 1% levels of significance. We would fail to reject H0 at both significance levels.
29Creating a Confidence Interval
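- We can estimate the difference in mean decreases in blood pressure for the hypothetical calcium and placebo populations using a two-sample t interval.
- We have already checked all of the conditions.
- Recall the summary statistics:
Group | Treatment | n | $\bar{x}$ | s
1 | Calcium | 10 | 5.000 | 8.743
2 | Placebo | 11 | -0.273 | 5.901
- With 9 degrees of freedom, the 90% critical value is t* = 1.833, so the interval is $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = 5.273 \pm 1.833(3.2878) = 5.273 \pm 6.026$, or about -0.753 to 11.299.
- Since the 90% confidence interval includes 0, we cannot reject H0: µ1 - µ2 = 0 against the two-sided alternative at the α = 0.10 level of significance.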
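As a rough check on the interval, here is a short Python sketch of the conservative two-sample t interval. The critical value 1.833 is the 90% table value for 9 degrees of freedom; the variable names are illustrative.

```python
import math

# Summary statistics from the slide (group 1 = calcium, group 2 = placebo).
n1, xbar1, s1 = 10, 5.000, 8.743
n2, xbar2, s2 = 11, -0.273, 5.901

# 90% critical value for df = min(n1 - 1, n2 - 1) = 9, from a t table.
T_STAR = 1.833

diff = xbar1 - xbar2
se = math.sqrt(s1**2 / n1 + s2**2 / n2)
margin = T_STAR * se

lower, upper = diff - margin, diff + margin
print(f"90% CI for mu1 - mu2: ({lower:.2f}, {upper:.2f})")

# The interval contains 0, so a two-sided test at the 0.10 level
# would not reject H0: mu1 - mu2 = 0.
contains_zero = lower < 0 < upper
```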
30Sample Size Matters
- Sample sizes strongly influence the P-value of a test.
- A result that fails to be significant at a specified level α in a small sample may be significant in a larger sample.
- For instance, the difference of 5.273 in the mean decrease in systolic blood pressure between our two groups was not significant. In a larger study with more subjects, researchers obtained a P-value of 0.008.
31Robustness Again
- The two-sample t procedures are more robust than the one-sample t procedures, particularly when the distributions are not symmetric.
- When the sizes of the two samples are equal and the two populations being compared have distributions with similar shapes, probability values from the t table are quite accurate for a broad range of distributions for samples as small as 5. When the populations have different shapes, larger samples are needed.
32Robustness Again
- As a guide to practice, adapt the guidelines on p. 655 for the use of one-sample t procedures to two-sample t procedures by replacing "sample size" with the "sum of the sample sizes", as long as both samples are at least 5.
- These guidelines err on the side of safety, especially when the two samples are of equal size.
- Whenever possible, try to make both samples the same size. Two-sample procedures are most robust against non-Normality when the sample sizes are equal, and that is also when the conservative P-values are most accurate.
33Software Approximations for the DF
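- The t procedures remain exactly as before except that we use the t distribution with df given by the following formula to give critical values and find P-values: $df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$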
34Calcium and Blood Pressure
- Here are the summary statistics again.
Group | Treatment | n | $\bar{x}$ | s
1 | Calcium | 10 | 5.000 | 8.743
2 | Placebo | 11 | -0.273 | 5.901
- For improved accuracy, let's calculate the df given by the formula on the prior slide.
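The software approximation for the degrees of freedom is easy to evaluate by hand or in code. Below is a minimal Python sketch; the function name `welch_df` is an assumption made for this example (the approximation is commonly called the Welch or Satterthwaite df).

```python
def welch_df(s1, n1, s2, n2):
    """Approximate df for the two-sample t statistic (unpooled)."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Calcium (group 1) and placebo (group 2) summary statistics from the slide.
df = welch_df(8.743, 10, 5.901, 11)
print(f"df = {df:.2f}")
```

For these data the result is a little under 16, noticeably more than the conservative choice of 9, which is why software reports a somewhat smaller P-value.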
36- Notice that the P-value here is 0.064 compared to
the 0.0716 we got from the conservative approach.
37Degrees of Freedom
- The formula from the box will always give us df at least as large as the smaller of n1 - 1 and n2 - 1 and never bigger than n1 + n2 - 2.
- The number of degrees of freedom is generally not a whole number. Since the table only has whole numbers, we will need to use technology to do these calculations easily.
- Let's do the Calcium and Blood Pressure problem on the calculator!
- We should use the calculator to do these calculations from now on!
38DDT Poisoning
- Poisoning by the pesticide DDT causes convulsions in humans and other mammals. Researchers seek to understand how the convulsions are caused. In a randomized comparative experiment, they compared 6 white rats poisoned with DDT with a control group of 6 unpoisoned rats. Electrical measurements of nerve activity are the main clue to the nature of DDT poisoning. When a nerve is stimulated, its electrical response shows a sharp spike followed by a much smaller second spike. The experiment found that the second spike is larger in rats fed DDT than in normal rats.
39DDT Poisoning
- The researchers measured the height (or amplitude) of the second spike as a percent of the first spike when a nerve in the rat's leg was stimulated.
- For the poisoned rats the results were: 12.207 16.869 25.050 22.429 8.456 20.589
- For the control group the results were: 11.074 9.686 12.064 9.351 8.182 6.642
- Let's use the calculator to conduct a significance test at the 0.05 significance level to determine if there is a difference.
40DDT Poisoning
- Step 1: Hypotheses
- We want to compare the mean height µ1 of the second-spike electrical response in the population of rats fed DDT with the mean height µ2 of the second-spike electrical response in the population of normal rats: H0: µ1 = µ2 versus Ha: µ1 ≠ µ2, or H0: µ1 - µ2 = 0 versus Ha: µ1 - µ2 ≠ 0.
41DDT Poisoning
- Step 2: Conditions. Since both population standard deviations are unknown, we need to conduct a two-sample t test.
- SRS: By randomly assigning the rats to the treatments, we can conclude that differences are a result of the treatment. The researchers are willing to assume that the two samples of rats represent an SRS.
- Normality: We don't know if the populations are Normal and do not have a large enough sample. We must look at a boxplot and histogram. There are no outliers or heavy skewness.
- Independence: Due to the random assignment, the researchers can treat the two groups as independent.
42DDT Poisoning
- Step 3: Calculations. $t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{17.600 - 9.500}{\sqrt{\frac{6.340^2}{6} + \frac{1.950^2}{6}}} \approx 2.99$
- Since it is a two-sided hypothesis, we must find the probability that t is less than -2.99 or greater than 2.99.
- The degrees of freedom are df = 5.9 and the P-value from the t(5.9) distribution is 0.0246.
- Step 4: Conclusion
- Since 0.0246 is less than the significance level of 0.05, we reject the null hypothesis. There is sufficient evidence to conclude that the mean height of the second-spike electrical response in rats fed DDT differs from that of normal rats.
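For readers without the calculator at hand, the DDT test can be reproduced from the raw data with the Python standard library alone. This is a sketch, not the calculator's algorithm: the tail probability is found by numerically integrating the t density with Simpson's rule, and the helper names (`welch_t`, `t_pdf`, `t_upper_tail`) are invented for this example.

```python
import math
from statistics import mean, variance

poisoned = [12.207, 16.869, 25.050, 22.429, 8.456, 20.589]
control = [11.074, 9.686, 12.064, 9.351, 8.182, 6.642]

def welch_t(a, b):
    """Unpooled two-sample t statistic and its approximate df."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

t_stat, df = welch_t(poisoned, control)
p_two_sided = 2 * t_upper_tail(abs(t_stat), df)
print(f"t = {t_stat:.2f}, df = {df:.1f}, P = {p_two_sided:.4f}")
```

Doubling the upper-tail area reflects the two-sided alternative; the result should agree with the slide's values up to rounding.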
46Pooled Two-Sample t Procedures
- Do not use them.
- If a printout says "pooled", do not use that. Instead use the one that says "unpooled".
- On the calculator, always choose No for Pooled.
- If you want more information, you can read about it on p. 800.
47Chapter 13 Section 2
- Comparing Two Proportions
- HW 13.26, 13.27, 13.28, 13.29, 13.30, 13.32,
13.33, 13.38
48Prayer and In Vitro Pregnancy
- Some women want to have children but cannot for medical reasons. One option for these women is in vitro fertilization. About 28% of women who undergo in vitro fertilization get pregnant. Can praying for these women help increase the pregnancy rate?
- Researchers developed an experiment to help answer this question. (Why not just survey women who have already gone through in vitro to find out if a higher percentage of women who were prayed for got pregnant?)
49Prayer and In Vitro Pregnancy
- A large group of women who were about to undergo in vitro fertilization served as the subjects. Each subject was randomly assigned to the treatment group (prayed for by people who did not know them) or a control group (no prayer).
- The results: 44 of the 88 women (50%) got pregnant in the treatment (prayer) group, while only 21 out of 81 got pregnant in the control group.
- This seems like a large difference, but is it statistically significant?
50Two-Sample Proportions
- We will use notation that is similar to what we used for two-sample means. We still want to compare two groups, Population 1 and Population 2.
- Here is the notation:
Population | Population Proportion | Sample Size | Sample Proportion
1 | p1 | n1 | $\hat{p}_1$
2 | p2 | n2 | $\hat{p}_2$
- We compare the populations by doing inference about the difference p1 - p2 between the population proportions.
- The statistic that estimates this difference is $\hat{p}_1 - \hat{p}_2$.
51Does Preschool Help?
- To study the long-term effects of preschool programs for poor children, the High/Scope Educational Research Foundation has followed two groups of Michigan children since early childhood.
- Group 1 (control group): 61 children from population 1, poor children with no preschool.
- Group 2 (treatment group): 62 children from population 2, poor children who attended preschool as 3- and 4-year-olds.
- Both groups were from the same area and had similar backgrounds.
- So our sample sizes are n1 = 61 and n2 = 62.
52Does Preschool Help?
- One response variable of interest is the need for social services as adults. In the past ten years, 49 of the control sample and 38 of the preschool sample had needed social services. So the sample proportions are $\hat{p}_1 = 49/61 = 0.803$ and $\hat{p}_2 = 38/62 = 0.613$.
- To see if the study provides significant evidence that preschool reduces the later need for social services, we are going to create a 95% confidence interval.
53Does Preschool Help?
- To estimate how large the reduction is, we give a confidence interval for the difference.
- Both the test and the confidence interval start with the difference in the sample proportions, $\hat{p}_1 - \hat{p}_2 = 0.803 - 0.613 = 0.190$.
- This means we need to know the sampling distribution of $\hat{p}_1 - \hat{p}_2$.
- So let's look at that now!
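54Sampling Distribution of $\hat{p}_1 - \hat{p}_2$
- Both $\hat{p}_1$ and $\hat{p}_2$ are random variables because their values would vary if we took repeated samples of the same size.
- In Chapter 7, we learned that if X and Y are any two random variables, then $\mu_{X - Y} = \mu_X - \mu_Y$.
- In Chapter 9, we learned that $\mu_{\hat{p}} = p$ and $\sigma_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}}$.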
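55Sampling Distribution of $\hat{p}_1 - \hat{p}_2$
- Using all of this information, we can find the mean and standard deviation of $\hat{p}_1 - \hat{p}_2$. The mean is $\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$.
- If the two sample proportions are independent, their variances add: $\sigma^2_{\hat{p}_1 - \hat{p}_2} = \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}$.
- Thus $\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$.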
56Sampling Distribution of $\hat{p}_1 - \hat{p}_2$
- As far as the shape, the distribution will be approximately Normal when the distributions of both $\hat{p}_1$ and $\hat{p}_2$ are approximately Normal.
- In other words, the counts n1p1, n1(1 - p1), n2p2, and n2(1 - p2) must all be large enough.
- Actually, we are safe performing significance tests about p1 - p2 as long as all of these values are greater than 5.
- The distribution of $\hat{p}_1 - \hat{p}_2$ is on the next graph.
57Sampling Distribution of $\hat{p}_1 - \hat{p}_2$
58Sampling Distribution of $\hat{p}_1 - \hat{p}_2$
- The standard deviation of $\hat{p}_1 - \hat{p}_2$ involves the unknown parameters p1 and p2.
- Just like in Chapter 12, we must replace these by estimates in order to do inference.
- Just like in Chapter 12, we do this a bit differently for confidence intervals and significance tests.
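59Confidence Intervals for p1 - p2
- To obtain a confidence interval, replace p1 and p2 in the expression for $\sigma_{\hat{p}_1 - \hat{p}_2}$ with the sample proportions.
- The result is the standard error of the statistic $\hat{p}_1 - \hat{p}_2$: $SE = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$.
- The confidence interval again has the form estimate ± z* SE, that is, $(\hat{p}_1 - \hat{p}_2) \pm z^* \cdot SE$.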
61Does Preschool Help?
- Here is a summary of the information from the preschool problem we discussed earlier.
Population | Population Description | Sample Size | Sample Proportion
1 | Control | n1 = 61 | $\hat{p}_1 = 0.803$
2 | Preschool | n2 = 62 | $\hat{p}_2 = 0.613$
- We set up our hypotheses earlier, so we have already done Step 1. Here are the hypotheses as a reminder: H0: p1 = p2 versus Ha: p1 > p2, or H0: p1 - p2 = 0 versus Ha: p1 - p2 > 0.
62Does Preschool Help
- Step 2: Conditions. We are going to construct a two-proportion z interval.
- SRS: We were not told how the children were selected, so we must be cautious when drawing conclusions.
- Normality: Since the counts of successes and failures in both samples (49 and 12 in the control group, 38 and 24 in the preschool group) are all at least 5, we can assume Normality.
- Independence: We are fairly certain that there are at least 610 poor children who did not attend preschool and 620 poor children who did attend preschool in our populations of interest.
63Does Preschool Help
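- Step 3: Calculations. $(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} = (0.803 - 0.613) \pm 1.96\sqrt{\frac{(0.803)(0.197)}{61} + \frac{(0.613)(0.387)}{62}} = 0.190 \pm 0.157$, or 0.033 to 0.347.
- Step 4: Interpretation
- We are 95% confident that the percent needing social services is between 3.3 and 34.7 percentage points lower among those who attended preschool. The interval is wide because of the small sample sizes. Also, our results may be questionable because the samples may not have been SRSs.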
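The two-proportion z interval for the preschool data can be sketched as follows, using `statistics.NormalDist` from the Python standard library to obtain z* rather than a table. Variable names are illustrative.

```python
import math
from statistics import NormalDist

# Counts needing social services (from the slides).
x1, n1 = 49, 61  # control group
x2, n2 = 38, 62  # preschool group

p1_hat, p2_hat = x1 / n1, x2 / n2

# Critical value for 95% confidence (about 1.96).
z_star = NormalDist().inv_cdf(0.975)

# Standard error uses each sample proportion separately (no pooling).
se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

diff = p1_hat - p2_hat
lower, upper = diff - z_star * se, diff + z_star * se
print(f"95% CI for p1 - p2: ({lower:.3f}, {upper:.3f})")
```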
66Significance Tests for p1 - p2
- An observed difference in sample proportions may reflect a real difference in the populations, or it may just be due to the variation of random sampling.
- Significance tests help us to determine if the difference we see is really there or just chance variation.
- The null hypothesis will always say that there is no difference between the two populations. Hence H0: p1 = p2.
- The alternative hypothesis will always say what kind of difference we expect.
67Significance Tests for p1 - p2
- To conduct a significance test, we must standardize $\hat{p}_1 - \hat{p}_2$ to get a z statistic.
- If H0 is true, all the observations in both samples come from a single population.
- So, instead of estimating p1 and p2 separately, we combine the two samples and use the overall sample proportion to estimate the single population parameter p.
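68Significance Tests for p1 - p2
- We call this single proportion the combined sample proportion. It is $\hat{p}_c = \frac{X_1 + X_2}{n_1 + n_2}$, where X1 and X2 are the counts of successes in the two samples.
- Now, we use $\hat{p}_c$ in place of both $\hat{p}_1$ and $\hat{p}_2$ in the expression for the standard error of $\hat{p}_1 - \hat{p}_2$.
- This yields a z statistic that has the standard Normal distribution when H0 is true: $z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_c(1 - \hat{p}_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$.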
70Cholesterol and Heart Attacks
- High levels of cholesterol in the blood are associated with higher risk of heart attacks. Does using a drug to lower blood cholesterol reduce heart attacks?
- The Helsinki Heart Study looked at this question by randomly assigning middle-aged men to one of two treatments: 2051 men took the drug gemfibrozil to reduce their cholesterol levels, and a control group of 2030 men took a placebo.
- During the next 5 years, 56 men in the gemfibrozil group and 84 men in the control group had heart attacks.
71Cholesterol and Heart Attacks
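- Is the apparent benefit of gemfibrozil statistically significant?
- To answer this question, we need to conduct a significance test.
- To conduct a significance test we need $\hat{p}_c$. So let's find it: $\hat{p}_c = \frac{56 + 84}{2051 + 2030} = \frac{140}{4081} = 0.0343$.
Population | Population Description | Sample Size | Sample Proportion
1 | Gemfibrozil | n1 = 2051 | $\hat{p}_1 = 56/2051 = 0.0273$
2 | Control | n2 = 2030 | $\hat{p}_2 = 84/2030 = 0.0414$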
72Cholesterol and Heart Attacks
- Step 1: Hypotheses. We want to use this comparative randomized experiment to draw conclusions about p1, the proportion of middle-aged men who would suffer heart attacks after taking gemfibrozil, and p2, the proportion of middle-aged men who would suffer heart attacks if they only took a placebo. We hope to show that gemfibrozil reduces heart attacks, so we have a one-sided alternative: H0: p1 = p2 versus Ha: p1 < p2.
73Cholesterol and Heart Attacks
- Step 2: Conditions. We are going to conduct a two-proportion z test.
- SRS: Since the data come from a comparative randomized experiment, we meet this condition. This will allow us to conclude that the treatment caused the differences we observe. Since the men in the experiment were not randomly selected, we may not be able to generalize our results to the population of all middle-aged men.
- Normality: We must use $\hat{p}_c$ to check for Normality since we are assuming that both proportions are the same. So $n_1\hat{p}_c = 70.4$, $n_1(1 - \hat{p}_c) = 1980.6$, $n_2\hat{p}_c = 69.6$, and $n_2(1 - \hat{p}_c) = 1960.4$ are all greater than 5.
- Independence: Due to the random assignment of men, the two groups of men can be viewed as independent samples.
74Cholesterol and Heart Attacks
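- Step 3: Calculations. $z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_c(1 - \hat{p}_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.0273 - 0.0414}{\sqrt{(0.0343)(0.9657)\left(\frac{1}{2051} + \frac{1}{2030}\right)}} \approx -2.47$
- We believed the drug would decrease heart attacks, so we need the probability that z is less than or equal to -2.47.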
75Cholesterol and Heart Attacks
- Step 4: Interpretation. Since our P-value (0.0068) is less than 0.01, our results are significant at the α = 0.01 significance level. So there is strong evidence that gemfibrozil reduced the rate of heart attacks.
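The pooled two-proportion z test for the Helsinki Heart Study counts can be sketched in Python as follows. The function name `two_proportion_z` is an assumption for this example, and the Normal tail area comes from the standard library's `NormalDist` rather than Table A.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2 using the combined sample proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_c = (x1 + x2) / (n1 + n2)  # combined (pooled) sample proportion
    se = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

# Heart attacks: 56 of 2051 on gemfibrozil, 84 of 2030 on placebo.
z = two_proportion_z(56, 2051, 84, 2030)

# One-sided alternative Ha: p1 < p2, so the P-value is the lower tail.
p_value = NormalDist().cdf(z)
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

For a "greater than" alternative the P-value would instead be `1 - NormalDist().cdf(z)`, and for a two-sided alternative it would be twice the smaller tail.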
76Don't Drink the Water
- The movie A Civil Action tells the story of a
legal battle that took place in the small town of
Woburn, Massachusetts. A town well that supplied
water to East Woburn residents was contaminated
by industrial chemicals. During the period that
residents drank the water from this well, a
sample of 414 births showed 16 birth defects. On
the west side of Woburn, a sample of 228 babies
born during the same time period revealed 3 with
birth defects. The plaintiffs suing the
companies responsible for the contamination
claimed that these data show that the rate of
birth defects was significantly higher in East
Woburn, where the contaminated well water was in
use. How strong is the evidence supporting the
claim? What decision should the judge make?
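77Don't Drink the Water
- To conduct a significance test we need $\hat{p}_c$. So let's find it: $\hat{p}_c = \frac{16 + 3}{414 + 228} = \frac{19}{642} = 0.0296$.
Population | Population Description | Sample Size | Sample Proportion
1 | East Woburn | n1 = 414 | $\hat{p}_1 = 16/414 = 0.0386$
2 | West Woburn | n2 = 228 | $\hat{p}_2 = 3/228 = 0.0132$
- Step 1: Hypotheses. We are interested in seeing if the proportion of birth defects is higher in East Woburn than in West Woburn: H0: p1 = p2 versus Ha: p1 > p2.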
78Don't Drink the Water
- Step 2: Conditions. We are going to conduct a two-proportion z test.
- SRS: We don't know that they are SRSs, but we will treat them as SRSs.
- Normality: We must check our rules using $\hat{p}_c$. Since $n_1\hat{p}_c = 12.3$, $n_1(1 - \hat{p}_c) = 401.7$, $n_2\hat{p}_c = 6.7$, and $n_2(1 - \hat{p}_c) = 221.3$ are each larger than 5, the sampling distribution is approximately Normal.
- Independence: We must assume that both populations are at least 10 times as large as the sample of babies.
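The Normality check for a two-proportion z test uses the combined proportion, because H0 assumes a single common p. A small Python sketch of that check for the Woburn counts is below; the helper name `pooled_counts` is hypothetical.

```python
def pooled_counts(x1, n1, x2, n2):
    """Expected successes and failures in each sample under H0: p1 = p2."""
    p_c = (x1 + x2) / (n1 + n2)  # combined sample proportion
    return [n1 * p_c, n1 * (1 - p_c), n2 * p_c, n2 * (1 - p_c)]

# Birth defects: 16 of 414 births in the east, 3 of 228 in the west.
counts = pooled_counts(16, 414, 3, 228)

# Rule used in these slides: every expected count must exceed 5.
normality_ok = all(c > 5 for c in counts)

print([round(c, 1) for c in counts], normality_ok)
```

The smallest expected count here is only a little above 5, so the Normal approximation is usable by this rule but not by a wide margin.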
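79Don't Drink the Water
- Step 3: Calculations. $z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_c(1 - \hat{p}_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.0386 - 0.0132}{\sqrt{(0.0296)(0.9704)\left(\frac{1}{414} + \frac{1}{228}\right)}} \approx 1.82$
- The P-value is the probability that z would be 1.82 or greater.
- Step 4: Interpretation
- Since the P-value (0.0344) is smaller than the usual significance level of 0.05, we reject the null hypothesis and conclude that there is reason to believe that the proportion of birth defects was higher in East Woburn.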