Title: Notes on Exam I
1Notes on Exam I
- Don't confuse ???with ?? .
- ??????????????????????in?confidence intervals
- When do we accept the null hypothesis?
- s and s² have N-1 in the denominator.
- 1-tail means that we suspect a direction and we
are testing for that direction.
- t in SPSS
2Chapter 8
- Statistical power and effect size
3What is statistical power?
- Statistical power is the ability to avoid type II
errors
- Recall that a type II error is one where we
reject the alternative hypothesis, although it is
true
- I.e., being too conservative in drawing
conclusions
4What is statistical power?
- Another way of looking at it is that statistical
power is the ability to detect effects.
- Definition of an effect: a difference due to some
treatment or subpopulation selection criteria
5The down side of statistical power?
- Being able to detect small effects is good.
- If you have a lot of power, you can detect small
effects.
- Once detected as significant, a small effect can
be misinterpreted as large.
- Remember: significant ≠ large.
- This often happens in the popular press.
- This is why consumers of statistical information
need to know something about statistics.
- Power is always good, so long as you know what
you are doing.
6Statistical power and the alternative hypothesis
distribution
- The key to understanding statistical power is
the alternative hypothesis distribution (AHD).
- Normally we look at the null hypothesis
distribution (NHD).
- Recall that the NHD is the same as the
distribution of sample means assuming no
difference between populations.
- The null hypothesis distribution is easier to
deal with because we know its mean.
- Comparing a population to a subpopulation, the
mean of the NHD is μp - μs = 0.
- Comparing two subpopulations, the mean of the
NHD is μ1 - μ2 = 0.
7Statistical power and the alternative hypothesis
distribution
- The alternative hypothesis is trickier since the
alternative hypothesis is vague.
- H0: μ1 = μ2 or μ1 - μ2 = 0
- HA: μ1 ≠ μ2 or μ1 < μ2 or μ1 > μ2
- This doesn't say anything specific about the
mean of the distribution of the sample means when
HA is true.
8Statistical power and the alternative hypothesis
distribution
- As you now know, mathematicians do some strange
things.
- So, let's do like the mathematicians and pretend
for a moment that we do know the means of our
subpopulations.
- Further, we can standardize by considering our
AHD as the distribution of z or t.
- In this example we will focus on the t for the
difference of sample means.
9Statistical power and the alternative hypothesis
distribution
- Assume HA: μ1 ≠ μ2 or μ1 < μ2 or μ1 > μ2
10Statistical power and the alternative hypothesis
distribution
- HA: Men are taller than women.
- μm = 69 > μw = 65
- σm = σw = 3
- N = 4
- The probability of accepting the null when you
shouldn't is β.
- The probability of not making this mistake is
1 - β = statistical power.
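As a rough check on the height example, the sketch below simulates many small experiments and counts how often a two-sample t-test rejects the null. Several things here are assumptions not stated on the slide: that N = 4 means four people per group, that both populations are normal with a common standard deviation of 3, and that the test is one-tailed at α = .05 (critical t of 1.943 for 6 degrees of freedom). The function name is made up for illustration.

```python
import random
import statistics

# One-tailed critical t for alpha = .05 with df = 6 (assumed test; only valid for n = 4 per group).
T_CRIT = 1.943

def simulate_power(mu_m=69.0, mu_w=65.0, sigma=3.0, n=4, trials=5000, seed=1):
    """Estimate power as the fraction of simulated experiments that reject H0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        men = [rng.gauss(mu_m, sigma) for _ in range(n)]
        women = [rng.gauss(mu_w, sigma) for _ in range(n)]
        # With equal group sizes, the pooled variance is the mean of the two sample variances.
        pooled_var = (statistics.variance(men) + statistics.variance(women)) / 2
        std_error = (pooled_var * 2 / n) ** 0.5
        t = (statistics.fmean(men) - statistics.fmean(women)) / std_error
        if t > T_CRIT:
            rejections += 1
    return rejections / trials

power = simulate_power()
beta = 1 - power  # estimated probability of a type II error
print(f"power ~ {power:.2f}, beta ~ {beta:.2f}")
```

With samples this small, a sizeable share of the simulated experiments fail to reject the null even though the alternative is true by construction.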
11The alternative hypothesis distribution as the
distribution of t
- It has a maximum at the expected value of t
- This example focuses on the two sample t-test
- The shape is assumed to be normal.
- Is the t distribution normal?
- Statistical power calculations are rough
estimates.
12Large expected t (δ) good, small expected t (δ)
bad
13Analysis for a special case
- Special case
- Both samples are the same size, N1 = N2
- Both standard deviations are the same, σ1 = σ2
- Most other cases will follow a similar pattern.
- Once again, statistical power is a matter of
rough approximation.
14The parts of δ
- We want δ to be large
- But what contributes to the size of δ?
- Let's rearrange the factors of δ
- In particular, we know that large N is good.
- So, it should be in the numerator.
15The parts of δ
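The formula for this slide did not survive extraction. A possible reconstruction, assuming the special case of slide 13 (n subjects in each group, a common standard deviation σ) and taking δ to be the expected value of the two-sample t, is:

```latex
\delta
  = \frac{\mu_1 - \mu_2}{\sqrt{\dfrac{\sigma^2}{n} + \dfrac{\sigma^2}{n}}}
  = \frac{\mu_1 - \mu_2}{\sigma\sqrt{2/n}}
  = \sqrt{\frac{n}{2}} \cdot \frac{\mu_1 - \mu_2}{\sigma}
```

Written this way, n sits in the numerator of the first factor, and the second factor does not depend on n at all.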
16Now we have two parts of δ
17Now we have two parts of δ
- The factor that depends on n
- We already know that we get more statistical
power when n is large
18Now we have two parts of δ
- And the rest
- This other part will be called d
- Large d means a large δ
19The anatomy of effect size d
- The numerator is the difference of the means.
- If there is a large difference, it is easier to
detect.
- The denominator is the standard deviation of each
population.
- Variance makes a difference harder to detect.
- What can we do to affect the size of d?
20Your intuition and effect size
- Distributions have more or less overlap as the
difference of means and the standard deviation
change.
21What do particular effect sizes mean?
- d < .2
- Not worth investigating
- d = .8
- A large effect but not obvious without statistics
- d > 1.33
- So obvious, an experiment is not needed
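To put numbers on the overlap intuition, the sketch below computes how much two normal curves overlap for the three d values above. It relies on the standard library's `statistics.NormalDist.overlap`. Treating both populations as normal with unit standard deviation is an assumption made here, and `overlap_for_d` is an illustrative name.

```python
from statistics import NormalDist

def overlap_for_d(d):
    """Overlapping area (0 to 1) of two unit-variance normal curves whose means differ by d."""
    return NormalDist(0.0, 1.0).overlap(NormalDist(d, 1.0))

for d in (0.2, 0.8, 1.33):
    print(f"d = {d}: overlap = {overlap_for_d(d):.2f}")
```

Under these assumptions the overlap shrinks from about 92% at d = .2 to about 69% at d = .8 and about half at d = 1.33.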
22What the parts of δ tell us
- N tells us how hard we have worked to find a
difference between two populations
- d tells us how much difference there actually is
23Studies in conflict!
- When one study appears to overturn the results of
a previous study, should we be shocked?
- Sometimes, sometimes not
- If there is not much effect size, it is no big
thing
- If there is a big effect size, start looking for
an explanation
24How do we know how big the effect size is?
- We usually don't have μ or σ
- So if you are reading someone's paper, how do you
know if the effect size is large or small?
- The author will usually report t and N, among
other things
- If t is large and N is small, d is probably large
- If t is small and N is large, d is probably small
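For a two-sample t-test with the same number of subjects in each group, t and d are linked by t = d·sqrt(n/2), so a reader can back out a rough d from a reported t. The sketch below assumes equal group sizes and that the reported N is the per-group size. The function name and the example numbers are hypothetical.

```python
import math

def d_from_t(t, n_per_group):
    """Rough effect size from a two-sample t with equal group sizes: d = t * sqrt(2 / n)."""
    return t * math.sqrt(2.0 / n_per_group)

# Hypothetical reports: the same t from a small study and from a large one.
small_study = d_from_t(2.5, 8)
large_study = d_from_t(2.5, 200)
print(round(small_study, 2), round(large_study, 2))
```

The same t corresponds to a much larger d when it comes from the small study than when it comes from the large one.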
25Back to δ and power
- How to calculate?
- The problem is
- Each AHD has a unique δ
- Each AHD is a different t dist.
- This leads to many possible dist.
- One can solve this problem with a computer
- Without a computer, one can at least eliminate
all the different t distributions by estimating
them using a normal distribution
- See table A.3
- Each normal distribution in A.3 refers to a
different δ
- This is just a rough estimate
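The normal-distribution shortcut can be sketched in a few lines. This is not the textbook table itself. It treats the AHD as a normal curve centered on δ with standard deviation 1 and measures the area beyond the critical z. The default two-tailed α of .05 and the function name are assumptions made for illustration.

```python
from statistics import NormalDist

def approx_power(delta, alpha=0.05, two_tailed=True):
    """Normal-approximation power: area of N(delta, 1) beyond the critical z."""
    tail = alpha / 2 if two_tailed else alpha
    z_crit = NormalDist().inv_cdf(1 - tail)
    # The small rejection area in the opposite tail is ignored in this rough estimate.
    return 1 - NormalDist(mu=delta, sigma=1).cdf(z_crit)

for delta in (1.0, 2.0, 2.8, 3.5):
    print(f"delta = {delta}: power ~ {approx_power(delta):.2f}")
```

Because the true AHD is a t distribution rather than a normal one, these values are only rough, especially for small samples.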
26Exercises
27Setting upper and lower bounds on the sample size
- What is the smallest practical sample size
possible? 1?
- When is our sample so big that there is no point
in making it any bigger? 100,000?
- We want to know N
- Solve for N
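The formula for this step is missing from the extracted slide. Assuming the relationship δ = d·sqrt(n/2) for the equal-n, equal-σ special case, with n the size of each group, solving for the sample size would give:

```latex
\delta = d\sqrt{\frac{n}{2}}
\quad\Longrightarrow\quad
n = 2\left(\frac{\delta}{d}\right)^{2}
```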
28Setting upper and lower bounds on the sample size
- But we don't know δ or d :-(
- Fortunately, experience with many previous t
tests has told us what a healthy amount of power
is
- Experience also tells us for what values of d
(effect size) our two sample distributions are
well separated
- Another example of intuition in mathematics
29Setting the upper useful bounds on the sample size
- Experience tells us that .8 is a healthy amount
of power (only a 20% chance of type II error)
- We also know power is related to δ.
- Reverse power table A.4 gives the relationship.
- .8 power translates into a particular δ using
reverse power table A.4
- The smallest effect size d that would still be
interesting is .2
30Setting the upper useful bound on the sample size
- Plugging these numbers in gives
31Setting the lower useful bound on the sample size
- Experience tells us that .7 is the least amount
of power we would find acceptable
- .7 power translates into a particular δ using
reverse power table A.4
- Any d larger than .8 would be so large a
difference that we would hardly need to do an
experiment
32Setting the lower useful bound on the sample size
- Plugging these numbers in gives
33Setting the upper and lower bounds on the sample
size
- Thus, almost any experiment can be run with a
sample size n between 20 and 400
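The bounds of roughly 20 and 400 can be reproduced with a short calculation. This sketch does not use table A.4. Instead it substitutes the normal approximation δ ≈ z(1 - α/2) + z(power) with an assumed two-tailed α of .05, together with n = 2(δ/d)² for equal group sizes. The function names are illustrative.

```python
from statistics import NormalDist

def delta_for_power(power, alpha=0.05):
    """Approximate delta needed for a given power (two-tailed test, normal approximation)."""
    z = NormalDist().inv_cdf
    return z(1 - alpha / 2) + z(power)

def n_per_group(power, d, alpha=0.05):
    """Per-group sample size from n = 2 * (delta / d) ** 2."""
    return 2 * (delta_for_power(power, alpha) / d) ** 2

upper = n_per_group(power=0.8, d=0.2)  # healthy power, smallest interesting effect
lower = n_per_group(power=0.7, d=0.8)  # minimum acceptable power, largest effect worth testing
print(f"lower bound ~ {lower:.0f}, upper bound ~ {upper:.0f}")
```

Under these assumptions the results come out near 19 and 392, which round to the 20 and 400 quoted above.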
34Exercises