Title: Hypothesis Testing
1. Hypothesis Testing
- An Inference Procedure
- We will study procedures for both the unknown population mean of a quantitative variable and the unknown population proportion of a qualitative variable.
2. Background
There are times we would like to know about the unknown mean in a population. But it is often expensive and too time consuming to investigate the whole population, so a sample is taken. The method of confidence intervals is based on the idea that a point estimate would, in theory, vary from sample to sample, so from the one sample we do take we build in that variability and are then a certain percent confident our interval contains the unknown value. Hypothesis testing relies on some of the same ideas used in confidence intervals, but here there is at least a starting point for the unknown value. The starting point can come from past work or from a belief one has about a process.
3. Example
Our book has an example about a company that puts cereal in boxes. The label of each box says there are 368 grams of cereal in the box. Does each box have exactly 368 grams? Probably not, because maybe a few extra flakes fall into one box and a few less into another. But in the grand scheme of things the process is filling the boxes to 368 grams on average. (Now, if one box had 268 grams and another had 468 grams for an average of 368, we would have a problem, but not the kind we are talking about here.)
4. Null Hypothesis
From the cereal example we would say the null hypothesis is that the mean amount of cereal put in the boxes is 368 grams, and in shorthand notation we would write H0: µ = 368. The H0 stands for the null hypothesis, and the null hypothesis is always one of status quo. Here this basically means that if the company believes it is putting 368 grams in each box, then we will not, on face value, object to that assertion. The mu, µ, signals that we are making a hypothesis about the population of all boxes. Of course we will only take a sample, but our hypothesis is about the population mean.
5. Alternative Hypothesis
In hypothesis testing there will always be a mutually exclusive alternative hypothesis to the null. In the cereal example the alternative hypothesis may be that the cereal boxes are not being filled to an average of 368 grams, and we would write this as H1: µ ≠ 368. The general process of hypothesis testing starts with the null and alternative hypotheses. Then a sample is collected and analyzed. The analysis will have one either continue to believe in the null hypothesis, and thus fail to reject the null, or reject the null and conclude the alternative is the one to go with. Note that in the cereal example, if the null is rejected the firm had better find out why the machine is not filling the cereal boxes properly and get that situation fixed.
6. Analogy
Here is a story about hypothesis tests. It is not really statistics, but an idea to consider. Say I have two decks of cards. One deck is a regular deck: spades, hearts, diamonds and clubs. The other deck is special: 4 sets of hearts. Now I take out one of the decks, but you do not know which one. In the language of statistics, the null hypothesis will be that I took out the regular deck. You will accept the null hypothesis unless an event occurs that has a really low probability. If a really low probability event occurs, you will reject the null hypothesis and go with the alternative hypothesis. So, I take out a deck and deal you five cards: a royal flush in hearts! You would reject the null hypothesis of a regular deck and go with the alternative that the deck I pulled out is the special one, because a royal flush in hearts has a low probability with a regular deck.
7. Sampling Distribution
You may recall that when we have a quantitative variable and the population standard deviation of the variable is known, the distribution of the sample mean 1) is normal, 2) has the same mean as the mean of the variable in the population, and 3) has standard error equal to the population standard deviation divided by the square root of the sample size. When the population standard deviation is not known we rely on the sample standard deviation, and the distribution of the sample mean is a t distribution. In what follows we assume the population standard deviation is known, but the ideas we bring up are also relevant later.
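To make the standard error concrete, here is a minimal Python sketch (standard library only) using the cereal numbers that appear later in these notes: a hypothesized mean of 368 grams, a known population standard deviation of 15, and a sample size of 25. The variable names are just illustrative.

    import math

    mu0 = 368        # hypothesized population mean (grams)
    sigma = 15       # known population standard deviation (grams)
    n = 25           # sample size

    # Standard error of the sample mean: sigma / sqrt(n)
    standard_error = sigma / math.sqrt(n)
    print(standard_error)  # 3.0 -- the sample mean is normal with mean 368 and standard error 3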
8. Regions of Sampling Distribution
(Figure: sampling distribution of the sample mean, centered at µ.)
Imagine that this slide has animation: the two arrows both start out at the center, and as they move outward they push the vertical lines with them. Using the cereal
example, the center of the distribution is
thought to be at 368. As we move in either
direction from the center we have sample means
that are possible when the mean really is 368.
But at some point, as we move out, we start to wonder whether our one sample mean really came from a distribution with mean equal to 368.
9. Regions of Sampling Distribution
In the process of hypothesis testing the area of
the sampling distribution is divided up into
regions. The nonrejection region is the area
in the middle of the distribution. These values
are relatively close to the center. So if we get
a sample value in this area we do not have enough
evidence to reject the null hypothesis. The
tail areas that I have on the previous screen
are considered rejection regions. While sample
mean values could occur in these regions when in
fact the true mean is 368, the probability is low
and thus this raises suspicion about the null
hypothesized value and leads us to reject the
null. (Could I deal you a royal flush in hearts from a regular deck? Yes, but the chance is small, and it is much better under the alternative hypothesis.)
10. Critical Values
The values of x bar that occur where the arrows
are pushed out are called critical values of x
bar. Note that the critical values are not
determined from the sample. The null
hypothesized value is also NOT determined from
the sample. Remember the null hypothesis value
is determined from past work or knowledge of some
process. The critical values are picked based on
some additional ideas I want to explore next.
11. Type I Error
A Type I error is a situation where you reject the null hypothesis, H0, when it is true and should not be rejected. The probability of making a Type I error is called alpha and is often referred to as the level of significance. In the cereal example, if we reject the null hypothesis we will have to shut down production and investigate the production process to see why it is not putting in the correct amount of cereal. There is a consequence to rejecting a true null hypothesis, and depending on the nature of the consequence we pick the value of alpha. Traditional values of alpha are .01, .05 and .10. The choice of alpha will be part of determining the critical x bar values.
12. Type II Error
A Type II error is a situation where the null hypothesis is not rejected when it should be, because the null is false. The probability of making a Type II error is called beta, ß. A Type II error also has consequences. In the cereal example, if we do not reject the null when we should, we could either be giving more cereal than we say we are (and thus not charging for it, even though we certainly have costs in making it), or giving less than we say we are and thus cheating customers. In an introductory statistics class such as ours we typically focus on the Type I error.
13. Critical Value Approach
(Figure: sampling distribution of the sample mean centered at µ = µ0, with the "Do not reject" region in the middle, rejection regions of area alpha/2 in each tail, and the lower and upper critical values at the boundaries.)
14. Critical Value Approach
The null and alternative hypotheses can be stated in a generic way as H0: µ = µ0 versus H1: µ ≠ µ0, where µ0 is a specific number. In our cereal example we would have H0: µ = 368 versus H1: µ ≠ 368. When the alternative has a not-equal sign we have what is called a two-tailed test, because if we are off in either direction we are concerned. In this case we divide the alpha value in half and make the areas of our two rejection regions add up to alpha. If alpha = .05, we would have .025 in each tail of the distribution.
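As a minimal sketch of how the critical values come from alpha, here is a short Python example using the standard library's NormalDist; the variable names are just illustrative.

    from statistics import NormalDist

    alpha = 0.05
    z = NormalDist()   # standard normal distribution

    # Two-tailed test: put alpha/2 of the area in each tail
    lower_critical = z.inv_cdf(alpha / 2)        # about -1.96
    upper_critical = z.inv_cdf(1 - alpha / 2)    # about  1.96
    print(round(lower_critical, 2), round(upper_critical, 2))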
15. Critical Value Approach
Our context here is that we know the population standard deviation, so we use the Z table (the standard normal table). While my graph a few slides back is of x bar, we translate to Z values. With alpha = .05, and thus .025 in each tail, the lower critical Z = -1.96 and the upper critical Z = 1.96. We would reject the null if the Zstat from our sample is less than -1.96 or greater than 1.96. Now, let's say we take a sample of 25 observations, we get a mean of 372.5 grams, and we know the population standard deviation is 15. The Zstat = (372.5 - 368)/(15/sqrt(25)) = 4.5/3 = 1.50. This means we cannot reject the null. The data support the claim that the filling process is OK!
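Here is a minimal Python sketch of that calculation and decision rule, using the cereal numbers from the example; the variable names are just illustrative.

    import math
    from statistics import NormalDist

    mu0, sigma, n = 368, 15, 25   # hypothesized mean, known std. dev., sample size
    x_bar = 372.5                 # sample mean
    alpha = 0.05

    z_stat = (x_bar - mu0) / (sigma / math.sqrt(n))   # (372.5 - 368) / 3 = 1.50

    upper_critical = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96
    if abs(z_stat) > upper_critical:
        print("Reject H0")
    else:
        print("Do not reject H0")   # this branch runs: |1.50| < 1.96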
16. p-Value Approach
The critical value approach had you set up rejection regions first and only work with the sample at the end. In the p-value approach you work with the sample almost as soon as you can. Remember we had a sample mean of 372.5, and the Zstat for this is 1.50. A Z of 1.50 has area .9332 to the left and .0668 to the right. The area to the right is the upper tail associated with the actual sample mean. In the critical value approach we had .025 in the upper tail, so the .0668 suggests our sample mean is in the do-not-reject region. With a two-tail test we look at the Zstat from the sample and the negative of the Zstat, here -1.50. Then with alpha = .05 we can see our two tail areas add up to .1336.
17. p-Value Approach
The p-value for a sample mean is the probability in the tail given the null hypothesis is true. If we have a two-tail test we just double the one-tail value to get the p-value. Then if the p-value is greater than alpha we do not reject the null, but if the p-value is less than alpha we reject the null, because we know the Zstat is more extreme than the critical values. If the p-value is low, then H0 must go. Note that in our work a low p-value will be defined from problem to problem.
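A minimal Python sketch of the two-tail p-value calculation for the cereal example, again using the standard library's NormalDist; names are illustrative.

    from statistics import NormalDist

    z_stat = 1.50
    alpha = 0.05

    upper_tail = 1 - NormalDist().cdf(z_stat)   # about .0668
    p_value = 2 * upper_tail                    # two-tail test: about .1336

    if p_value < alpha:
        print("Reject H0")
    else:
        print("Do not reject H0")   # .1336 > .05, so we do not reject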
18. Problem 22, page 285
With a .01 level of significance we have .005 as
the area in each tail. We would reject the null
if 1) The Zstat is less than -2.58, or 2) The
Zstat is greater than 2.58.
(Figure: standard normal curve with area .005 in each tail, beyond the critical values -2.58 and 2.58.)
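As a quick check, a short Python sketch (standard library) recovers the ±2.58 critical values from alpha = .01:

    from statistics import NormalDist

    alpha = 0.01
    upper_critical = NormalDist().inv_cdf(1 - alpha / 2)
    print(round(upper_critical, 2))   # 2.58; the lower critical value is -2.58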
19. Problem 24, page 285
If the Zstat = 2.00 (computed from a sample mean), then the area to the right is 1 - .9772 = .0228, and we double this for a two-tail test to get a p-value = .0456.
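And a matching Python sketch for this p-value; note the exact value is about .0455, while .0456 comes from the rounded table area of .0228.

    from statistics import NormalDist

    z_stat = 2.00
    upper_tail = 1 - NormalDist().cdf(z_stat)   # about .0228
    p_value = 2 * upper_tail                    # two-tail test: about .0455
    print(round(p_value, 4))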