Title: Statistical Tests
1Statistical Tests
- How to tell if something (or somethings) is
different from something else
2Populations vs. Samples
- Remember that a population is all the possible
members of a category that we could measure
- Examples
- the heights of every male or every female
- the temperature on every day since the beginning
of time
- Every person who ever has, and ever will, take a
particular drug
3Populations vs. Samples
- So a population is kind of abstract; typically
you couldn't ever hope to measure the entire
population
- Notable exceptions include
- Standardized tests (mean IQ is 100 with std. dev.
of 15)
- Special populations such as rare diseases or
isolated groups of people
4Populations vs. Samples
- A sample is some subset of a population
- Examples
- The heights of 10 students picked at random
- The participants in a drug trial
5Populations vs. Samples
- The notation
- Sample statistics are usually regular letters
like $s$ and $\bar{x}$
- Population statistics are usually Greek letters
like
$\mu$ - the population mean
$\sigma$ - the population standard deviation
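As a rough illustration of the notation, the sketch below computes the sample statistics (the sample mean and s) for a made-up sample of 10 student heights. The numbers are hypothetical, not real data. It uses Python's standard `statistics` module, where `stdev` is the sample formula (divides by n - 1) and `pstdev` is the population formula (divides by n).

```python
import statistics

# Hypothetical heights (cm) of 10 students picked at random; made-up data.
heights = [160, 165, 170, 172, 168, 175, 180, 158, 169, 173]

x_bar = statistics.fmean(heights)  # sample mean
s = statistics.stdev(heights)      # sample standard deviation (divides by n - 1)

# If these 10 students were the entire population, the population
# formula (divides by n) would apply instead.
sigma_if_population = statistics.pstdev(heights)

print(f"x-bar = {x_bar:.1f}")
print(f"s = {s:.2f}")
print(f"sigma (if this were the whole population) = {sigma_if_population:.2f}")
```

The sample standard deviation comes out slightly larger than the population version because of the n - 1 divisor.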
6Populations vs. Samples
- Test your intuition
- Under what circumstances does the mean of a
sample equal the mean of the population from
which it was drawn?
- What about the standard deviation?
- What if your sample was very small relative to
the population?
7Populations vs. Samples
- Test your intuition
- Most importantly: what if you took more than one
sample?
8Central Limit Theorem
- There is a distribution of sample means
9Central Limit Theorem
- There is a distribution of sample means
[Figure: the population of IQ scores, mean 100]
10Central Limit Theorem
- There is a distribution of sample means
[Figure: your sample, mean 95, against the population of IQ scores, mean 100]
11Central Limit Theorem
- There is a distribution of sample means
[Figure: your sample, mean 103, against the population of IQ scores, mean 100]
12Central Limit Theorem
- There is a distribution of sample means
[Figure: your sample, mean 99, against the population of IQ scores, mean 100]
13Central Limit Theorem
- There is a distribution of sample means
- This is the sampling distribution of the mean
14Central Limit Theorem
- What is the mean of the sampling distribution of
the mean?
- The mean of the sampling distribution approaches the
mean of the population with many resamplings
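One way to get a feel for this is to simulate it. The sketch below assumes IQ scores are normally distributed with mean 100 and standard deviation 15, and that each sample holds 30 people (the sample size is an assumption made for illustration). It draws many samples, records each sample mean, and then averages those means.

```python
import random
import statistics

POP_MEAN, POP_SD = 100, 15  # IQ population (assumed normal for this sketch)
N = 30                      # people per sample (an illustrative choice)
RESAMPLES = 10_000

rng = random.Random(42)     # fixed seed so the run is repeatable

def sample_mean(n):
    """Draw n IQ scores from the population and return their mean."""
    return statistics.fmean([rng.gauss(POP_MEAN, POP_SD) for _ in range(n)])

# Each entry is the mean of one sample; together they approximate
# the sampling distribution of the mean.
sample_means = [sample_mean(N) for _ in range(RESAMPLES)]
grand_mean = statistics.fmean(sample_means)

print("first few sample means:", [round(m, 1) for m in sample_means[:4]])
print("mean of all sample means:", round(grand_mean, 2))
```

Individual sample means wander above and below 100, but their average should settle very close to the population mean.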
15Central Limit Theorem
- What is the standard deviation of the sampling
distribution of the mean?
- The standard error of the mean:
$\sigma_{\bar{x}} = \sigma / \sqrt{n}$
Notice it will always be less than the standard
deviation of the population (for any sample of
more than one observation)!
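A small sketch can compare the standard error formula, sigma divided by the square root of n, against a brute-force estimate. It assumes the IQ population is normal with mean 100 and standard deviation 15 and uses samples of 30; the simulation settings are illustrative choices.

```python
import math
import random
import statistics

POP_MEAN, POP_SD, N = 100, 15, 30

# Formula: standard error of the mean = sigma / sqrt(n)
standard_error = POP_SD / math.sqrt(N)

# Brute-force check: the standard deviation of many simulated sample means.
rng = random.Random(1)
sample_means = [
    statistics.fmean([rng.gauss(POP_MEAN, POP_SD) for _ in range(N)])
    for _ in range(5_000)
]
simulated_se = statistics.stdev(sample_means)

print(f"formula:   {standard_error:.3f}")
print(f"simulated: {simulated_se:.3f}")
```

Both numbers should be far smaller than the population standard deviation of 15, because averaging 30 scores smooths out much of the person-to-person variation.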
16Central Limit Theorem
- What is the shape of the sampling distribution of
the mean?
- Central Limit Theorem: the sampling distribution
of the mean is approximately normal, for reasonably
large samples, regardless of the shape of the
underlying distribution!
- This means you can use the Z transform and use
the Z table
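To see the theorem at work on a population that is clearly not normal, the sketch below draws samples from a heavily skewed exponential distribution. The distribution, the sample size of 50, and the number of resamples are all assumptions made for illustration. It then checks one signature of a normal curve: roughly 68% of values fall within one standard deviation of the mean.

```python
import math
import random
import statistics

rng = random.Random(7)
N = 50             # sample size (assumed "large enough" for this sketch)
RESAMPLES = 5_000

# Exponential population with rate 1: mean 1, standard deviation 1, strongly skewed.
POP_MEAN, POP_SD = 1.0, 1.0
standard_error = POP_SD / math.sqrt(N)

sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(N)])
    for _ in range(RESAMPLES)
]

# Count how many sample means land within one standard error of the
# population mean; a normal curve would put about 68% of them there.
within_one_se = sum(
    abs(m - POP_MEAN) <= standard_error for m in sample_means
) / RESAMPLES

print(f"fraction of sample means within 1 standard error: {within_one_se:.3f}")
```

Even though individual exponential values are lopsided, the means of samples of 50 should behave much like a normal distribution on this check.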
17The Logic of Statistical Tests
19Statistical Tests
- Consider a simple example
- you are testing the hypothesis that eating
walnuts makes people smarter by feeding walnuts
to a group of 30 subjects and then testing their
IQ
- If you are right, then eating walnuts will make
the average IQ of your subjects higher than
the average IQ of all people (the population)
since, mostly, those other people don't eat
walnuts much
20Statistical Tests
- Consider a simple example
- Put another way
- Is this sample (entirely) of walnut eaters
different from the population of mostly
non-walnut-eaters?
22Types of Errors
- There are two mistakes you could make
- Type I error or False-Positive: you decide the
walnut treatment works when it doesn't really
- Type II error or False-Negative: you decide the
walnuts don't work when really they do
23Types of Successes
- There are two ways to succeed
- Hit or True-Positive: you decide the walnuts do
make people smarter and, in fact, they really do
- Correct-Rejection or True-Negative: you decide
the walnuts don't work and, in fact, they really
don't
24Outcome Matrix
                     Actual Situation
Your Conclusion      Works            Doesn't Work
Works                True-Positive    Type I
Doesn't Work         Type II          True-Negative
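The matrix can also be written as a small lookup. The function name and argument names below are hypothetical, chosen only for this sketch; the four labels are the ones used in the slides.

```python
def classify_outcome(you_conclude_works: bool, actually_works: bool) -> str:
    """Return the outcome-matrix cell for one experiment."""
    if you_conclude_works and actually_works:
        return "True-Positive (hit)"
    if you_conclude_works and not actually_works:
        return "Type I error (false positive)"
    if not you_conclude_works and actually_works:
        return "Type II error (false negative)"
    return "True-Negative (correct rejection)"

# Walk through all four combinations of conclusion and reality.
for conclusion in (True, False):
    for truth in (True, False):
        print(f"conclude works={conclusion}, actually works={truth}: "
              f"{classify_outcome(conclusion, truth)}")
```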
25Statistical Tests
- Consider a simple example
- Your subjects turn out to have a mean IQ of 107.5
(1/2 S.D. above the mean of the population) after
eating walnuts
27Statistical Tests
- What are two reasons why the mean IQ of your
subjects might be greater than the mean of the
population?
- you happened to pick 30 very smart people (e.g.
university students)
- WARNING: Type I error is possible!
- the walnuts worked
29Statistical Tests
- Usually we are worried about making a type I
error so we need to know
- What fraction of all possible groups of 30
subjects would have a mean IQ of 107.5 or more?
- In other words, we are interested not in the
distribution of IQ scores themselves, but rather
in the distribution of mean IQ scores for groups
of 30 subjects
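This question can be approached by brute force before any formula is used. The sketch below assumes ordinary (non-walnut) IQ scores are normal with mean 100 and standard deviation 15. It forms many random groups of 30 and counts how often a group's mean is at least as high as the 107.5 observed in the walnut subjects. The number of simulated groups is an arbitrary choice.

```python
import random
import statistics

POP_MEAN, POP_SD = 100, 15  # ordinary population (assumed normal)
GROUP_SIZE = 30
OBSERVED_MEAN = 107.5       # mean IQ of the walnut subjects
N_GROUPS = 20_000           # number of simulated groups (arbitrary)

rng = random.Random(2024)   # fixed seed so the run is repeatable

def group_mean():
    """Mean IQ of one random group drawn from the ordinary population."""
    return statistics.fmean(
        [rng.gauss(POP_MEAN, POP_SD) for _ in range(GROUP_SIZE)]
    )

# Count the groups whose mean is at least as extreme as the one observed.
hits = sum(group_mean() >= OBSERVED_MEAN for _ in range(N_GROUPS))
fraction = hits / N_GROUPS

print(f"{hits} of {N_GROUPS} groups ({fraction:.2%}) "
      f"had a mean of at least {OBSERVED_MEAN}")
```

Only a small fraction of a percent of groups should clear that bar, which is the intuition the Z test makes exact.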
30The Z Test
- as it is more formally known
31Example Z Test
- Using our example in which we are testing the
hypothesis that walnuts make people smarter
- null hypothesis is that they don't
$\bar{x} = 107.5$
$\mu = 100$
$\sigma = 15$
32Example Z Test
- Using our example in which we are testing the
hypothesis that walnuts make people smarter (null
hypothesis was that they don't)
- We want to know how many standard errors from the
mean (of the sampling distribution of means) is
107.5?
$\bar{x} = 107.5$
$\sigma = 15$
33Example Z Test
Here's what we've got
$\bar{x} = 107.5$
$\sigma = 15$
$n = 30$
Here's what we can compute
$\sigma_{\bar{x}} = \sigma / \sqrt{n}$
That's what we're after so that we can use the
Z table
34Example Z Test
Here's what we've got
$\bar{x} = 107.5$
$\sigma = 15$
$n = 30$
Here's what we can compute
$\sigma_{\bar{x}} = \sigma / \sqrt{n} = 15 / \sqrt{30} \approx 2.739$
Which is much less than 15!
35Example Z Test
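Here's what we've got
$\bar{x} = 107.5$
$\sigma = 15$
$n = 30$
Here's what we can compute
$z = (\bar{x} - \mu) / \sigma_{\bar{x}} = (107.5 - 100) / (15 / \sqrt{30}) \approx 2.739$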
36Example Z Test
Here's what we've got
$\bar{x} = 107.5$
$\sigma = 15$
$n = 30$
Thus
107.5 isn't half a standard deviation from the
sampling distribution mean!
It's actually more than two and a half standard
deviations from the sampling distribution mean!
37Example Z Test
Here's what we've got
$\bar{x} = 107.5$
$\sigma = 15$
$n = 30$
Looking up 2.739 in the Z table reveals that only
.0031, or 0.31%, of the means in the sampling
distribution of mean IQs (for groups of 30 people
each) would be equal to or greater than
107.5!
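The same calculation can be sketched in a few lines, with Python's standard `statistics.NormalDist` standing in for the printed Z table. The variable names are illustrative; the inputs are the sample mean of 107.5, population mean of 100, population standard deviation of 15, and sample size of 30 from the example.

```python
import math
from statistics import NormalDist

x_bar = 107.5  # sample mean of the walnut subjects
mu = 100       # population mean
sigma = 15     # population standard deviation
n = 30         # number of subjects

standard_error = sigma / math.sqrt(n)     # SD of the sampling distribution
z = (x_bar - mu) / standard_error         # how many standard errors above mu
p_one_tailed = 1 - NormalDist().cdf(z)    # area in the upper tail beyond z

print(f"standard error = {standard_error:.3f}")
print(f"z = {z:.3f}")
print(f"one-tailed p = {p_one_tailed:.4f}")
```

The results should agree with the Z table lookup to within table precision. Note that the standard error and z happen to be numerically equal here only because of the particular numbers in this example.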
39Example Z Test
- What this means is that you have only a 0.31%
chance of making a type I error if you conclude
that walnuts made your subjects smarter!
- Put another way, if this sample of IQs really were
taken from the regular population, there would be
only a 0.31% chance of a mean this high, so we
conclude that walnut eaters are different
43Alpha
- Is 0.31% small enough? What risk of making a Type
I error is too great?
- There is no absolute answer; it depends entirely
on the circumstances
- 5%, or probability (p) = .05, is generally accepted
- This rate of making Type I errors (i.e. the number of
Type I errors per 100 experiments in which there is
no real effect) is called the
Alpha Level
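To make the idea of an error rate concrete, the sketch below simulates many experiments in which the treatment does nothing, runs a one-tailed Z test on each at an alpha level of .05, and counts how often the test wrongly declares an effect. It assumes a normal IQ population with mean 100 and standard deviation 15 and groups of 30; the number of simulated experiments is arbitrary.

```python
import math
import random
import statistics
from statistics import NormalDist

POP_MEAN, POP_SD, N = 100, 15, 30
ALPHA = 0.05
N_EXPERIMENTS = 5_000  # arbitrary number of simulated experiments

standard_error = POP_SD / math.sqrt(N)
# z value that cuts off the top 5% of the normal curve (about 1.645).
z_critical = NormalDist().inv_cdf(1 - ALPHA)

rng = random.Random(5)
false_positives = 0
for _ in range(N_EXPERIMENTS):
    # The null hypothesis is true: subjects come from the ordinary population.
    sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    z = (statistics.fmean(sample) - POP_MEAN) / standard_error
    if z > z_critical:
        false_positives += 1  # we would wrongly conclude the treatment works

type_i_rate = false_positives / N_EXPERIMENTS
print(f"critical z = {z_critical:.3f}")
print(f"Type I error rate = {type_i_rate:.3f}")
```

The observed rate should hover near 5 in 100, which is what choosing an alpha level of .05 means in practice.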
44Statistical Significance
- So we conclude that walnuts have a statistically
significant effect on IQ with a probability of a
Type I error of less than 5%
- In a research article we might say "the effect of
walnuts on IQ was significant (one-tailed Z test,
p = .0031)"