Title: Power
Chapter 8
Chapter 4 Flashback. . . .
- A Type I error is rejecting the null hypothesis when it is really true.
- The probability of making a Type I error is denoted as α.
- A Type II error is failing to reject a null hypothesis that is really false.
- The probability of making a Type II error is denoted as β.
- In this chapter, you'll often see these outcomes represented with distributions.
Distributions Representing the Various Outcomes
- To make these representations clear, let's first consider the situation where H0 is, in fact, true.
[Figure: distribution under H0, showing the region of correct failure to reject and the alpha region (Type I error).]
- Now assume that H0 is false (i.e., that some treatment has an effect on our dependent variable, shifting the mean to the right).
[Figure: distribution under H0 and distribution under H1, with regions labeled Alpha, Type II error, and Power (correct rejection).]
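The two situations above can also be checked by simulation. The following is a minimal sketch, not part of the original slides: it assumes a one-tailed (upper) z-test with a known population standard deviation, and the specific numbers (a mean of 100, a standard deviation of 15, samples of 25, a shift of 7.5 points) are hypothetical values chosen only for illustration.

```python
import random
from statistics import NormalDist, fmean


def rejection_rate(true_mean, mu0=100.0, sigma=15.0, n=25,
                   alpha=0.05, n_sims=5000, seed=0):
    """Fraction of simulated experiments that reject H0: mean = mu0.

    Assumes a one-tailed (upper) z-test with known sigma, matching the
    picture in which the treatment shifts the mean to the right.
    """
    rng = random.Random(seed)
    # Sample means above this cutoff fall in the rejection (alpha) region.
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * sigma / n ** 0.5
    rejections = 0
    for _ in range(n_sims):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        if fmean(sample) > cutoff:
            rejections += 1
    return rejections / n_sims


# H0 true: every rejection is a Type I error, so the rate should sit near alpha.
type_i_rate = rejection_rate(true_mean=100.0)

# H0 false (mean shifted right by half a standard deviation):
# every rejection is correct, so the rate estimates power; the rest is beta.
power = rejection_rate(true_mean=107.5)
beta = 1 - power

print(f"Type I error rate: {type_i_rate:.3f}")
print(f"Power: {power:.3f}, Type II error rate: {beta:.3f}")
```

Drawing whole samples rather than sample means directly is slower, but it keeps the simulation close to what an actual experiment does.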
Definition of Power
- Thus, power can be defined as follows:
- Assuming some manipulation affects the dependent variable, power is the probability that the sample mean will be sufficiently different from the mean under H0 to allow us to reject H0.
- As such, the power of an experiment depends on three (or four) factors:
- Alpha
- The distance between the means under H0 and H1 (μ1 - μ2)
- The standard error of the mean, which is a function of N and the population variance
Alpha
- As the alpha cutoff is moved to the left (for example, if one used an alpha of 0.10 instead of 0.05), beta would decrease and power would increase, but the probability of making a Type I error would also increase.
μ1 - μ2
- The further that H1 is shifted away from H0, the more power (and lower beta) an experiment will have.
Standard Error of the Mean
- The smaller the standard error of the mean (i.e.,
the less the two distributions overlap), the
greater the power. As suggested by the CLT, the
standard error of the mean is a function of the
population variance and N. Thus, of all the
factors mentioned, the only one we can really
control is N.
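The effect of each factor can be sketched numerically. This example is an illustration rather than the slides' own method: it assumes a one-tailed (upper) z-test with known population standard deviation, and the function name `power_one_tailed` and all numeric inputs are hypothetical.

```python
from statistics import NormalDist


def power_one_tailed(mu0, mu1, sigma, n, alpha=0.05):
    """Power of an upper-tailed z-test of H0: mean = mu0 when the true mean is mu1."""
    se = sigma / n ** 0.5  # standard error of the mean
    # Sample means above this cutoff lead to rejecting H0.
    cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Area of the H1 sampling distribution that lies beyond the cutoff.
    return 1 - NormalDist(mu1, se).cdf(cutoff)


baseline = power_one_tailed(mu0=100, mu1=105, sigma=15, n=20)
larger_alpha = power_one_tailed(100, 105, 15, 20, alpha=0.10)  # cutoff moves left
larger_shift = power_one_tailed(100, 110, 15, 20)              # H1 further from H0
larger_n = power_one_tailed(100, 105, 15, 80)                  # smaller standard error
smaller_sd = power_one_tailed(100, 105, 7.5, 20)               # smaller standard error

for label, value in [("baseline", baseline),
                     ("alpha = 0.10", larger_alpha),
                     ("larger shift", larger_shift),
                     ("N = 80", larger_n),
                     ("smaller SD", smaller_sd)]:
    print(f"{label:>14}: power = {value:.3f}")
```

Each change raises power relative to the baseline, but only the change in N is one the experimenter can usually make without paying for it in Type I error.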
Effect Size (d)
- Most power calculations use a term called effect size, which is actually a measure of the degree to which the H0 and H1 distributions overlap.
- As such, effect size is sensitive to both the difference between the means under H0 and H1 and the standard deviation of the parent populations.
- Specifically:
  $$d = \frac{\mu_1 - \mu_0}{\sigma}$$
  where μ0 and μ1 are the means under H0 and H1, and σ is the population standard deviation.
- In English then, d is the number of standard deviations separating the mean of H0 and the mean of H1.
- Note: N has not been incorporated in the above formula. You'll see why shortly.
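As a small illustration of d as "number of standard deviations between the two means", here is a sketch with a hypothetical helper `effect_size` and made-up numbers. Note that N does not appear anywhere in it.

```python
def effect_size(mean_h0, mean_h1, sigma):
    """Distance between the H0 and H1 means, in population standard deviations."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (mean_h1 - mean_h0) / sigma


# Hypothetical numbers: a 5-point shift with a population SD of 10.
d = effect_size(mean_h0=100, mean_h1=105, sigma=10)

# Same shift, but twice the SD: the distributions overlap more, so d is smaller.
d_noisier = effect_size(mean_h0=100, mean_h1=105, sigma=20)

print(d, d_noisier)
```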
Estimating the Effect Size
- As d forms the basis of all calculations of power, the first step in these calculations is to estimate d.
- Since we do not typically know how big the effect will be a priori, we must make an educated guess on the basis of:
- 1) Prior research.
- 2) An assessment of the size of the effect that would be important.
- 3) General rule (small effect: d = 0.2; medium effect: d = 0.5; large effect: d = 0.8).
Bringing N back into the Picture
- The calculation of d took into account 1) the difference between the means of H0 and H1 and 2) the standard deviation of the population.
- However, it did not take into account the third variable that affects the overlap of the two distributions: N.
- This was done purposefully so that we have one term that represents the relevant variables we, as experimenters, can do nothing about (d) and another representing the variable we can do something about: N.
- The statistic we use to recombine these factors is called delta (δ) and is computed as follows:
  $$\delta = d \cdot f(N)$$
  where the specific f(N) differs depending on the type of t-test you are computing the power for.
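Power Calcs for One Sample t
- In the context of a one sample t-test, the f(N) alluded to above is simply:
  $$f(N) = \sqrt{N}$$
- Thus, when calculating the power associated with a one sample t, you must go through the following steps:
- 1) Estimate d, or calculate it using:
  $$d = \frac{\mu_1 - \mu_0}{\sigma}$$
- 2) Calculate delta using:
  $$\delta = d\sqrt{N}$$
- 3) Find the power associated with that value of delta at the chosen alpha.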
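The one-sample procedure can be sketched in a few lines. This is an approximation under stated assumptions, not the lookup used in the course: delta is taken as d times the square root of N, and power is then read off the standard normal distribution for a two-tailed test at alpha = 0.05. The function names and the example values (d = 0.5, N = 16) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def delta_one_sample(d, n):
    """Recombine effect size and sample size for a one-sample test."""
    return d * sqrt(n)


def power_from_delta(delta, alpha=0.05):
    """Approximate two-tailed power for a given delta (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Probability of landing in either rejection tail when H1 is true.
    return z.cdf(delta - z_crit) + z.cdf(-delta - z_crit)


# Hypothetical inputs: a medium effect tested with 16 participants.
delta = delta_one_sample(d=0.5, n=16)
power = power_from_delta(delta)
print(f"delta = {delta:.2f}, approximate power = {power:.3f}")
```

Because the normal distribution stands in for a printed power table here, results can differ slightly from table values.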
- Example:
- Say I find a new stats textbook and, after looking at it, I think it will raise the average mark of the class by about 8 points. From previous classes, I am able to estimate the population standard deviation as 15. If I now test out the new text by using it with 20 new students, what is my power to reject the null hypothesis (that the new students' marks are the same as the old students' marks)?
- How many new students would I have to test to bring my power up to .90?
- Note: Don't worry about the bit on noncentrality parameters in the book.
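The textbook example can be worked through numerically. The sketch below is one way to do it, under assumptions the slides do not state: a two-tailed test at alpha = 0.05 and a normal approximation in place of a power table. The helper names `approx_power` and `n_for_power` are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def approx_power(d, n, alpha=0.05):
    """Approximate two-tailed power of a one-sample test (normal approximation)."""
    delta = d * sqrt(n)
    z_crit = Z.inv_cdf(1 - alpha / 2)
    return Z.cdf(delta - z_crit) + Z.cdf(-delta - z_crit)


def n_for_power(d, target_power, alpha=0.05):
    """Smallest N whose approximate power reaches target_power.

    Solves delta = z_crit + z_power for N, ignoring the negligible
    probability of rejecting in the wrong tail.
    """
    delta_needed = Z.inv_cdf(1 - alpha / 2) + Z.inv_cdf(target_power)
    return ceil((delta_needed / d) ** 2)


d = 8 / 15                       # expected gain of 8 points, population SD of 15
power_20 = approx_power(d, 20)   # power with 20 new students
n_needed = n_for_power(d, 0.90)  # students needed for power of .90

print(f"d = {d:.3f}")
print(f"Approximate power with N = 20: {power_20:.3f}")
print(f"N needed for power of .90: {n_needed}")
```

Under this approximation the power with 20 students comes out near 0.66, and about 37 students are needed to reach .90. Answers read from a printed table may differ by a student or so because of rounding.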