Title: On Thursday, I'll provide information about the project
1. Announcements
- On Thursday, I'll provide information about the project
  - Due on Friday after last class.
  - Proposal will be due two weeks from today (April 15th)
  - You're encouraged (but not required) to work in groups of three people
- Homework
  - Due next Tuesday
  - On web tonight
5. Hypothesis Testing: 20,000 Foot View
- Set up the hypothesis to test and collect data.
  - Hypothesis to test: $H_0$
- Assuming that the hypothesis is true, are the observed data likely?
  - Data are deemed unlikely if the test statistic is in the extreme of its distribution when $H_0$ is true.
- If not, then the alternative to the hypothesis must be true.
  - Alternative to $H_0$ is $H_A$
- P-value describes how likely the observed data are assuming $H_0$ is true (i.e. the answer to Q2 above).
  - Unlikely if p-value $< \alpha$
6. Large Sample Test for a Proportion: Taste Test Data
- 33 people drink two unlabeled cups of cola (1 is Coke and 1 is Pepsi)
- $\hat{p}$ = proportion who correctly identify drink = 20/33 = 61%
- Question: is this statistically significantly different from 50% (random guessing) at $\alpha$ = 10%?
7. Large Sample Test for a Proportion: Taste Test Data
- $H_0: p = 0.5$; $H_A: p \neq 0.5$
- Test statistic: $z = (\hat{p} - 0.5)/\sqrt{\hat{p}(1-\hat{p})/n} = (0.61 - 0.5)/\sqrt{0.61 \times 0.39/33} = 1.25$ (computed with the unrounded $\hat{p} = 20/33$)
- Reject if $|z| > z_{0.10/2} = 1.645$
- It's not, so there's not enough evidence to reject $H_0$.
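8. Large Sample Test for a Proportion: Taste Test Data
- P-value
  - = $\Pr\left( |\hat{P} - p|/\sqrt{\hat{P}\hat{Q}/n} > |\hat{p} - p|/\sqrt{\hat{p}\hat{q}/n} \right)$ when $H_0$ is true (capital letters are the random quantities, lowercase the observed values; $\hat{Q} = 1 - \hat{P}$, $\hat{q} = 1 - \hat{p}$)
  - = $\Pr\left( |\hat{P} - 0.5|/\sqrt{\hat{P}\hat{Q}/n} > 1.25 \right)$ when $H_0$ is true
  - = $2\Pr(Z > 1.25)$ where $Z \sim N(0,1)$
  - = 21%
- i.e. How likely is a test statistic at least as extreme as 1.25 when the true p = 50%?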
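The taste-test calculation can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `one_proportion_z_test` is made up for illustration, and the standard error uses the sample proportion, as on these slides.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p0):
    """Two-sided large-sample z-test for a single proportion.

    The standard error is built from the sample proportion p_hat.
    Returns the z statistic and the two-sided p-value.
    """
    p_hat = successes / n
    std_err = sqrt(p_hat * (1 - p_hat) / n)
    z = (p_hat - p0) / std_err
    # Two-sided p-value: probability in both tails beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Taste test data from the slides: 20 of 33 correct, H0: p = 0.5, alpha = 0.10.
z, p_value = one_proportion_z_test(successes=20, n=33, p0=0.5)
z_crit = NormalDist().inv_cdf(1 - 0.10 / 2)  # upper alpha/2 critical value

print(f"z = {z:.2f}, critical value = {z_crit:.3f}, p-value = {p_value:.2f}")
print("reject H0" if abs(z) > z_crit else "do not reject H0")
```

Working from the unrounded proportion 20/33 rather than 0.61 is what reproduces the test statistic quoted on the slide.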
9. Minitab
- Minitab computes the test statistic as $z = (\hat{p} - 0.5)/\sqrt{0.5(1-0.5)/n} = (0.61 - 0.5)/\sqrt{0.25/33} = 1.22$
- Since $0.25 \ge p(1-p)$ for any $p$, this is more conservative (larger denominator, smaller test statistic). Either way is fine.
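To see the effect of the two choices of standard error side by side, here is a small sketch on the taste-test data. It only illustrates the arithmetic; it is not a reproduction of Minitab's output, and the variable names are chosen for this example.

```python
from math import sqrt

n, p_hat, p0 = 33, 20 / 33, 0.5

# Standard error built from the sample proportion.
se_sample = sqrt(p_hat * (1 - p_hat) / n)
# Standard error built from the hypothesized proportion p0 = 0.5.
se_null = sqrt(p0 * (1 - p0) / n)

z_sample = (p_hat - p0) / se_sample
z_null = (p_hat - p0) / se_null

print(f"SE from p_hat: {se_sample:.4f}, z = {z_sample:.2f}")
print(f"SE from p0:    {se_null:.4f}, z = {z_null:.2f}")
```

Because p(1-p) is largest at p = 0.5, the second standard error can never be smaller than the first, so its test statistic can never be larger in absolute value.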
10. Difference between two means
- PCB Data
  - Sample 1: Treatment = PCB 156
  - Sample 2: Treatment = PCB 156 + estradiol
  - Response: estrogen produced by cells
- Question: Can we conclude that average estrogen produced in sample 1 is different from the average in sample 2 (at $\alpha$ = 0.05)?
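11.
- $H_0: \mu_1 - \mu_2 = 0$; $H_A: \mu_1 - \mu_2 \neq 0$
- Test statistic
  - (Estimate - value under $H_0$)/Std Dev(Estimate)
  - $z = (\bar{x}_1 - \bar{x}_2)/\sqrt{s_1^2/n_1 + s_2^2/n_2}$
- Reject if $|z| > z_{\alpha/2}$
- P-value
  - $2\Pr\left( Z > |\bar{x}_1 - \bar{x}_2|/\sqrt{s_1^2/n_1 + s_2^2/n_2} \right)$ where $Z \sim N(0,1)$.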
12.
  Treatment    n     mean    s
  PCB156       96    1.93    1.00
  PCB156E      64    2.16    1.01
- $z = -0.23/\sqrt{1.00^2/96 + 1.01^2/64} = -1.42$, so $|z| = 1.42$
- $z_{\alpha/2} = z_{0.05/2} = z_{0.025} = 1.96$
- So don't reject.
- P-value = $2\Pr(Z > 1.42)$ = 16% = Pr(|Test statistic| > 1.42 when $H_0$ is true)
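The PCB comparison can be reproduced from the summary statistics alone. This is a minimal sketch with a hypothetical helper, `two_sample_z_test`, written with the Python standard library under the large-sample normal approximation used on these slides.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z_test(mean1, sd1, n1, mean2, sd2, n2, null_diff=0.0):
    """Two-sided large-sample z-test for a difference between two means."""
    std_err = sqrt(sd1**2 / n1 + sd2**2 / n2)
    z = (mean1 - mean2 - null_diff) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Summary statistics from the slide: PCB156 (n=96) versus PCB156E (n=64).
z, p_value = two_sample_z_test(1.93, 1.00, 96, 2.16, 1.01, 64)
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)  # about 1.96 for alpha = 0.05

print(f"z = {z:.2f}, critical value = {z_crit:.2f}, p-value = {p_value:.2f}")
print("reject H0" if abs(z) > z_crit else "do not reject H0")
```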
13. In General, Large Sample 2 sided Tests
- Test statistic
  - $z$ = (Estimate - value under $H_0$)/Std Dev(Estimate)
- Reject if $|z| > z_{\alpha/2}$
- P-value = $2\Pr(Z > |z|)$ where $Z \sim N(0,1)$.
14. Large Sample Hypothesis Tests: summary for means
- Single mean
  - Hypotheses: $H_0: \mu = k$; $H_A: \mu \neq k$
  - Test (level 0.05): Reject $H_0$ if $|\bar{x} - k|/(s/\sqrt{n}) > 1.96$
  - p-value = $2\Pr\left( Z > |\bar{x} - k|/(s/\sqrt{n}) \right)$ where $Z \sim N(0,1)$
- Difference between two means
  - Hypotheses: $H_0: \mu_1 - \mu_2 = D$; $H_A: \mu_1 - \mu_2 \neq D$
  - Let $d = \bar{x}_1 - \bar{x}_2$
  - Let SE = $\sqrt{s_1^2/n_1 + s_2^2/n_2}$
  - Test (level 0.05): Reject $H_0$ if $|d - D|/\text{SE} > 1.96$
  - p-value = $2\Pr(Z > |d - D|/\text{SE})$ where $Z \sim N(0,1)$
15. Large Sample Hypothesis Tests: summary for proportions
- Single proportion
  - Hypotheses: $H_0$: true $p = k$; $H_A: p \neq k$
  - Test (level 0.05): Reject $H_0$ if $|\hat{p} - k|/\sqrt{\hat{p}(1-\hat{p})/n} > 1.96$
  - p-value = $2\Pr\left( Z > |\hat{p} - k|/\sqrt{\hat{p}(1-\hat{p})/n} \right)$ where $Z \sim N(0,1)$
- Difference between two proportions
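  - Hypotheses: $H_0: p_1 - p_2 = 0$; $H_A: p_1 - p_2 \neq 0$
  - Let $d = \hat{p}_1 - \hat{p}_2$
  - Let $\hat{p}$ = total successes/$(n_1 + n_2)$
  - Let SE = $\sqrt{\hat{p}(1-\hat{p})/n_1 + \hat{p}(1-\hat{p})/n_2}$
  - Test (level 0.05): Reject $H_0$ if $|d|/\text{SE} > 1.96$
  - p-value = $2\Pr(Z > |d|/\text{SE})$ where $Z \sim N(0,1)$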
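As an illustration of the two-proportion case, here is a sketch of the pooled test, assuming the null hypothesis is that the two proportions are equal (the pooled estimate only makes sense under that assumption). The counts are invented for the example and are not data from the course; `two_proportion_z_test` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(successes1, n1, successes2, n2):
    """Two-sided large-sample z-test of H0: p1 = p2 with a pooled standard error."""
    d = successes1 / n1 - successes2 / n2
    # Pooled proportion: total successes over total sample size.
    pooled = (successes1 + successes2) / (n1 + n2)
    std_err = sqrt(pooled * (1 - pooled) / n1 + pooled * (1 - pooled) / n2)
    z = d / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts, made up for illustration only.
z, p_value = two_proportion_z_test(successes1=45, n1=100, successes2=30, n2=100)

print(f"z = {z:.2f}, p-value = {p_value:.3f}")
print("reject H0 at level 0.05" if abs(z) > 1.96 else "do not reject H0 at level 0.05")
```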
16. Hypothesis tests versus confidence intervals
The following is discussed in the context of tests / CIs for a single mean, but it's true for all the confidence intervals / tests we have done.
- A two sided level $\alpha$ hypothesis test, $H_0: \mu = k$ vs $H_A: \mu \neq k$, is rejected if and only if k is not in a $1-\alpha$ confidence interval for the mean.
- A one sided level $\alpha$ hypothesis test, $H_0: \mu \le k$ vs $H_A: \mu > k$, is rejected if and only if a level $1-2\alpha$ confidence interval is completely to the right of k (i.e. its lower endpoint is greater than k).
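The two-sided equivalence can be checked directly: the test decision and the "is k outside the interval?" check always agree. The sketch below uses made-up sample means, standard deviation, and sample size purely for illustration; `two_sided_test_and_ci` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_test_and_ci(x_bar, s, n, k, alpha=0.05):
    """Return (reject H0: mu = k?, (1 - alpha) CI for mu, is k outside the CI?)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    std_err = s / sqrt(n)
    reject = abs(x_bar - k) / std_err > z_crit
    ci = (x_bar - z_crit * std_err, x_bar + z_crit * std_err)
    k_outside_ci = not (ci[0] <= k <= ci[1])
    return reject, ci, k_outside_ci


# Hypothetical sample means; s = 50, n = 100, and k = 300 are also illustrative.
results = []
for x_bar in (290.0, 305.0, 325.0):
    reject, ci, outside = two_sided_test_and_ci(x_bar, s=50.0, n=100, k=300.0)
    results.append((reject, outside))
    print(f"x_bar = {x_bar}: CI = ({ci[0]:.1f}, {ci[1]:.1f}), "
          f"reject = {reject}, k outside CI = {outside}")
```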
17. Hypothesis tests versus confidence intervals
- The previous slide said that confidence intervals can be used to do hypothesis tests.
- CIs are better since they contain more information.
- Fact: Hypothesis tests and p-values are very commonly used by scientists who use statistics.
- Advice
  - Use confidence intervals to do hypothesis testing
  - Know how to compute and interpret p-values
18. Type 1 and Type 2 Errors

                         Action
  Truth        Fail to Reject H0    Reject H0
  H0 True      correct              Type 1 error
  HA True      Type 2 error         correct

Significance level $\alpha$ = Pr(Making type 1 error)
Power = 1 - Pr(Making type 2 error)
19. In terms of our folate example, suppose we repeated the experiment and sampled 333 new people
- Pr(Type 1 error) = Pr(reject $H_0$ when mean is 300) = $\Pr(|Z| > z_{0.025})$ = $\Pr(Z > 1.96) + \Pr(Z < -1.96)$ = 0.05 = $\alpha$
When the mean is 300, Z, the test statistic, has a standard normal distribution.
Note that the test is designed to have type 1 error $\alpha$.
20.
- Power = Pr(reject $H_0$ when mean is not 300). For example, when the true mean is 310:
  - = Pr(reject $H_0$ when mean is 310)
  - = $\Pr\left( |\bar{X} - 300|/(193.4/\sqrt{333}) > 1.96 \right)$
  - = $\Pr\left( (\bar{X} - 300)/10.6 > 1.96 \right) + \Pr\left( (\bar{X} - 300)/10.6 < -1.96 \right)$
  - = $\Pr(\bar{X} > 320.8) + \Pr(\bar{X} < 279.2)$
  - = $\Pr\left( (\bar{X} - 310)/10.6 > (320.8 - 310)/10.6 \right) + \Pr\left( (\bar{X} - 310)/10.6 < (279.2 - 310)/10.6 \right)$
  - = $\Pr(Z > 1.02) + \Pr(Z < -2.90)$ where $Z \sim N(0,1)$
  - = 0.15 + 0.00 = 0.15
In other words, if the true mean is 310, there's an 85% chance that we will not detect it. If 310 is scientifically significantly different from 300, then this means that our experiment was wasted in some sense.
As n increases, power goes up. As the standard deviation of x decreases, power goes up. As $\alpha$ increases, power goes up.
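The power calculation for the folate example can be sketched in a few lines. This assumes the normal approximation used throughout these slides; `two_sided_power` is a hypothetical helper. Because it avoids intermediate rounding, its result should land close to, but not exactly on, the hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_power(true_mean, null_mean, sd, n, alpha=0.05):
    """Probability that a two-sided level-alpha z-test rejects H0: mu = null_mean."""
    std_err = sd / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Rejection region for the sample mean: beyond these two cutoffs.
    upper_cut = null_mean + z_crit * std_err
    lower_cut = null_mean - z_crit * std_err
    # Sampling distribution of the sample mean when the true mean is true_mean.
    sampling_dist = NormalDist(mu=true_mean, sigma=std_err)
    return (1 - sampling_dist.cdf(upper_cut)) + sampling_dist.cdf(lower_cut)


# Folate example from the slides: H0 mean 300, true mean 310, s = 193.4, n = 333.
power = two_sided_power(true_mean=310, null_mean=300, sd=193.4, n=333)
print(f"power when the true mean is 310: {power:.3f}")

# When the true mean equals the null mean, the rejection probability is alpha.
print(f"rejection probability under H0: {two_sided_power(300, 300, 193.4, 333):.3f}")
```

Calling the function with a larger `n`, a smaller `sd`, or a larger `alpha` shows each of the three effects on power listed above.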
21. Picture for Power
[Figure: Power, Pr(Reject $H_0$ when it's false), plotted against the True Mean (260 to 340) for n = 333 and $\alpha$ = 0.05.]
As n increases and/or $\alpha$ increases and/or std dev decreases, these curves become steeper.
22. Power calculations are a very important part of planning any experiment
- Given
  - a certain level of $\alpha$
  - preliminary estimate of std dev (of the x's that go into $\bar{x}$)
  - difference that is of interest
- Compute required n in order for power to be at least 85% (or some other percentage...)
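One way to avoid the algebra is a direct search: increase n until the power reaches the target. The sketch below assumes a two-sided z-test and reuses the folate numbers from the earlier power calculation (std dev 193.4, difference of interest 10) as illustrative inputs. The function names are hypothetical, and this is not how any particular software package computes sample size.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_power(difference, sd, n, alpha=0.05):
    """Power of a two-sided level-alpha z-test when the true mean is off by `difference`."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    shift = difference / (sd / sqrt(n))  # true difference in standard-error units
    std_normal = NormalDist()
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


def required_sample_size(difference, sd, alpha=0.05, target_power=0.85):
    """Smallest n whose power reaches target_power (simple linear search)."""
    n = 2
    while two_sided_power(difference, sd, n, alpha) < target_power:
        n += 1
    return n


n_needed = required_sample_size(difference=10, sd=193.4)
print(f"required n for 85% power: {n_needed}")
print(f"power at that n: {two_sided_power(10, 193.4, n_needed):.3f}")
```

The linear search is slow for very small differences but keeps the logic transparent; a closed-form approximation or a bisection search would be faster.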
23. Power calculations are an integral part of planning any experiment
- Bad News: Algebraically messy (but you should know how to do them)
- Good News: Minitab can be used to do them
  - Stat > Power and Sample Size
  - Inputs
    - required power
    - difference of interest
  - Output
    - Result: required sample size
  - Options: Change $\alpha$, one sided versus 2 sided tests