Title: Analyze Phase Introduction to Hypothesis Testing
1. Analyze Phase: Introduction to Hypothesis Testing
2. Hypothesis Testing (ND)
Welcome to Analyze
X Sifting
Inferential Statistics
Hypothesis Testing Purpose
Tests for Central Tendency
Intro to Hypothesis Testing
Tests for Variance
Hypothesis Testing ND P1
ANOVA
Hypothesis Testing ND P2
Hypothesis Testing NND P1
Hypothesis Testing NND P2
Wrap Up Action Items
3. Six Sigma Goals and Hypothesis Testing
- Our goal is to improve our Process Capability. This translates to the need to move the process Mean (or proportion) and reduce the Standard Deviation.
- Because it is too expensive or too impractical (not to mention theoretically impossible) to collect population data, we will make decisions based on sample data.
- Because we are dealing with sample data, there is some uncertainty about the true population parameters.
- Hypothesis Testing helps us make fact-based decisions about whether the population parameters really differ or whether the differences are just due to expected sample variation.
4. Purpose of Hypothesis Testing
- The purpose of appropriate Hypothesis Testing is to integrate the Voice of the Process with the Voice of the Business to make data-based decisions to resolve problems.
- Hypothesis Testing can help avoid the high costs of experimental efforts by using existing data. This can be likened to:
- Local store costs versus mini bar expenses.
- There may be a need to eventually use experimentation, but careful data analysis can indicate a direction for experimentation if necessary.
- The probability of occurrence is based on a pre-determined statistical confidence.
- Decisions are based on:
- Beliefs (past experience)
- Preferences (current needs)
- Evidence (statistical data)
- Risk (acceptable level of failure)
5. The Basic Concept for Hypothesis Tests
- Recall from the discussion on classes and causes of distributions that a data set may seem Normal, yet still be made up of multiple distributions.
- Hypothesis Testing can help establish a statistical difference between factors from different distributions.
Did my sample come from this population? Or
this? Or this?
6. Significant Difference
- Are the two distributions significantly different from each other? How sure are we of our decision?
- How does the number of observations affect our confidence in detecting the population Mean?
[Figure: distributions of Sample 1 and Sample 2]
7. Detecting Significance
- Statistics provide a methodology to detect differences.
- Examples might include differences in suppliers, shifts or equipment.
- Two types of significant differences occur and must be well understood: practical and statistical.
- Failure to tie these two differences together is one of the most common errors in statistics.
H0: The sky is not falling. Ha: The sky is falling.
8. Practical vs. Statistical
- Practical Difference: The difference which results in an improvement of practical or economic value to the company.
- Example: an improvement in yield from 96 to 99 percent.
- Statistical Difference: A difference or change to the process that probably (with some defined degree of confidence) did not happen by chance.
- Examples might include differences in suppliers, markets or servers.
We will see that it is possible to realize a
statistically significant difference without
realizing a practically significant difference.
9. Detecting Significance
- During the Measure Phase, it is important that the nature of the problem be well understood.
- In understanding the problem, the practical difference to be achieved must match the statistical difference.
- The difference can be either a change in the Mean or in the variance.
- Detection of a difference is then accomplished using statistical Hypothesis Testing.
[Figure panels: Mean Shift; Variation Reduction]
10. Hypothesis Testing
- A hypothesis is an a priori theory relating to differences between variables.
- A statistical test or Hypothesis Test is performed to prove or disprove the theory.
- A Hypothesis Test converts the practical problem into a statistical problem.
- Since relatively small sample sizes are used to estimate population parameters, there is always a chance of collecting a non-representative sample.
- Inferential statistics allows us to estimate the probability of getting a non-representative sample.
11. DICE Example
- We could throw a die a number of times and track how many times each face occurred. With a standard die, we would expect each face to occur 1/6 or 16.67% of the time.
- If we threw the die 5 times and got 5 ones, what would you conclude? How sure can you be?
- Pr (1 one) = 0.1667; Pr (5 ones) = (0.1667)^5 = 0.00013
- There are approximately 1.3 chances out of 10,000 that we could have gotten 5 ones with a standard die.
- Therefore, we would say we are willing to take a chance of less than 0.1% of being wrong about our hypothesis that the die was loaded, since the results do not come close to our predicted outcome.
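The arithmetic in the die example can be checked with a short sketch. The function name `prob_all_same_face` is illustrative only, and the calculation assumes a fair six-sided die with independent throws.

```python
from fractions import Fraction


def prob_all_same_face(throws: int, faces: int = 6) -> Fraction:
    """Probability that a fair die shows one specified face on every throw.

    Assumes independent throws and equally likely faces.
    """
    return Fraction(1, faces) ** throws


# Five ones in five throws of a standard die.
p_five_ones = prob_all_same_face(5)
print(p_five_ones)         # exact fraction
print(float(p_five_ones))  # decimal form, roughly 1.3 chances in 10,000
```

Using `Fraction` keeps the result exact, so the rounding in 0.1667 does not affect the answer.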
12. Hypothesis Testing
[Figure: DECISIONS, with Type I Error (α), Type II Error (β) and Sample Size (n)]
13. Statistical Hypotheses
- A hypothesis is a predetermined theory about the nature of, or relationships between, variables. Statistical tests can prove (with a certain degree of confidence) that a relationship exists.
- We have two alternatives for a hypothesis:
- The null hypothesis H0 assumes that there are no differences or relationships. This is the default assumption of all statistical tests.
- The alternative hypothesis Ha states that there is a difference or relationship.
P-value > 0.05: H0, there is no difference or relationship.
P-value < 0.05: Ha, there is a difference or relationship.
Making a decision does not FIX a problem, taking
action does.
14. Steps to Statistical Hypothesis Test
- State the Practical Problem.
- State the Statistical Problem.
- H0: ___ = ___
- Ha: ___ ≠, >, < ___
- Select the appropriate statistical test and risk levels.
- α = .05
- β = .10
- Establish the sample size required to detect the difference.
- State the Statistical Solution.
- State the Practical Solution.
Noooot THAT practical solution!
15. How Likely is Unlikely?
- Any differences between observed data and claims made under H0 may be real or due to chance.
- Hypothesis Tests determine the probabilities of these differences occurring solely due to chance and call them P-values.
- The α level of a test (level of significance) represents the yardstick against which P-values are measured, and H0 is rejected if the P-value is less than the alpha level.
- The most commonly used levels are 5%, 10% and 1%.
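As a minimal sketch of measuring a P-value against the alpha level, the code below uses a two-sided one-sample z-test. The choice of test, the assumption that the population Standard Deviation is known, and the example numbers are all illustrative assumptions rather than anything specified in this deck.

```python
from math import sqrt
from statistics import NormalDist


def z_test_p_value(sample_mean: float, h0_mean: float, sigma: float, n: int) -> float:
    """Two-sided P-value for a one-sample z-test (sigma assumed known)."""
    z = (sample_mean - h0_mean) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare the P-value with the alpha level; the decision is about H0."""
    return "Reject H0" if p_value < alpha else "Fail to reject H0"


# Hypothetical data: 30 observations averaging 10.5 against an H0 Mean of 10.0.
p = z_test_p_value(sample_mean=10.5, h0_mean=10.0, sigma=1.0, n=30)
print(round(p, 4), decide(p, alpha=0.05))
```

The same `decide` function can be called with alpha set to 0.10 or 0.01 to see how the chosen level changes the decision.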
16. Hypothesis Testing Risk
- The alpha risk or Type I Error (generally called the Producer's Risk) is the probability that we could be wrong in saying that something is different. It is an assessment of the likelihood that the observed difference could have occurred by random chance. Alpha is the primary decision-making tool of most statistical tests.
17. Alpha Risk
- Alpha (α) risks are expressed relative to a reference distribution.
- Distributions include:
- t-distribution
- z-distribution
- χ²-distribution
- F-distribution
The α-level is represented by the clouded areas. Sample results in this area lead to rejection of H0.
18. Hypothesis Testing Risk
- The beta risk or Type II Error (also called the Consumer's Risk) is the probability that we could be wrong in saying that two or more things are the same when, in fact, they are different.
19. Beta Risk
- Beta Risk is the probability of failing to reject the null hypothesis when a difference exists.
[Figure: Distribution if H0 is true (α = 0.05, H0 value, Accept H0 region); Distribution if Ha is true (β = Pr(Type II error)); Critical value of test statistic]
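The relationship between the critical value and Beta Risk can be sketched numerically. The example assumes a one-sided (upper-tail) z-test with a known Standard Deviation, which is a simplification chosen for illustration. The function name `beta_risk` and the example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def beta_risk(delta: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Pr(Type II error) for an upper-tail one-sample z-test.

    delta is the true shift in the Mean under Ha; sigma is assumed known.
    """
    std = NormalDist()
    z_crit = std.inv_cdf(1 - alpha)  # critical value when H0 is true
    # Chance the test statistic stays below the critical value when Ha is true.
    return std.cdf(z_crit - delta * sqrt(n) / sigma)


for n in (10, 30):
    print(n, round(beta_risk(delta=1.0, sigma=2.0, n=n), 3))
```

With the shift held fixed, the larger sample gives a smaller Beta Risk. When the shift is zero, the function returns 1 - α, the chance of correctly failing to reject H0.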
20. Distinguishing between Two Samples
- Recall from the Central Limit Theorem that as the number of individual observations increases, the Standard Error decreases.
- In this example, when n = 2 we cannot distinguish the difference between the Means (> 5% overlap, P-value > 0.05).
- When n = 30, we can distinguish between the Means (< 5% overlap, P-value < 0.05). There is a significant difference.
[Figure: Theoretical Distribution of Means when n = 2, δ = 5, S = 1; Theoretical Distribution of Means when n = 30, δ = 5, S = 1]
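The effect of sample size on the Standard Error can be shown with a small sketch. It uses the usual relationship Standard Error = S / √n and takes S = 1 as an assumed value for illustration.

```python
from math import sqrt


def standard_error(s: float, n: int) -> float:
    """Standard Error of the Mean for sample Standard Deviation s and size n."""
    return s / sqrt(n)


# Compare the two sample sizes discussed on this slide, assuming S = 1.
for n in (2, 30):
    print(f"n = {n:2d}: Standard Error = {standard_error(1.0, n):.4f}")
```

The distribution of Means is much narrower at n = 30 than at n = 2, which is why the same difference between Means becomes easier to detect.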
21. Delta Sigma: The Ratio between δ and S
- Delta (δ) is the size of the difference between two Means or one Mean and a target value.
- Sigma (S) is the sample Standard Deviation of the distribution of individuals of one or both of the samples under question.
- When δ/S is large, we don't need statistics because the differences are so large.
- If the variance of the data is large, it is difficult to establish differences. We need larger sample sizes to reduce uncertainty.
We want to be 95% confident in all of our estimates!
22. Typical Questions on Sampling
- Question: How many samples should we take?
- Answer: Well, that depends on the size of your delta and Standard Deviation.
- Question: How should we conduct the sampling?
- Answer: Well, that depends on what you want to know.
- Question: Was the sample we took large enough?
- Answer: Well, that depends on the size of your delta and Standard Deviation.
- Question: Should we take some more samples just to be sure?
- Answer: No, not if you took the correct number of samples the first time!
23. The Perfect Sample Size
- The minimum sample size required to provide exactly 5% overlap (risk) in order to distinguish the Delta.
- Note: If you are working with Non-normal Data, multiply your calculated sample size by 1.1.
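The sample size calculation used on this slide is not recoverable from the text, so the sketch below substitutes a common textbook approximation for a two-sided one-sample test of a Mean: n = ((z for α/2 + z for β) × S / δ)². This formula is an assumption and may not match the deck's own method. The 1.1 multiplier for Non-normal Data is taken from the note above, and the function name `sample_size` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size(delta: float, s: float, alpha: float = 0.05,
                beta: float = 0.10, non_normal: bool = False) -> int:
    """Approximate sample size to detect a shift of delta (textbook z formula).

    Assumption: two-sided one-sample test of a Mean with Standard Deviation s.
    """
    std = NormalDist()
    z_alpha = std.inv_cdf(1 - alpha / 2)
    z_beta = std.inv_cdf(1 - beta)
    n = ((z_alpha + z_beta) * s / delta) ** 2
    if non_normal:
        n *= 1.1  # multiplier from the slide's note on Non-normal Data
    return ceil(n)  # round up to a whole number of observations


print(sample_size(delta=1.0, s=1.0))                   # δ/S = 1
print(sample_size(delta=0.5, s=1.0))                   # smaller δ/S needs more data
print(sample_size(delta=1.0, s=1.0, non_normal=True))
```

Halving the ratio of δ to S roughly quadruples the required sample size, which matches the point that a small delta relative to the Standard Deviation is harder to detect.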
24. Hypothesis Testing Roadmap
25. Hypothesis Testing Roadmap
26. Hypothesis Testing Roadmap
27. Common Pitfalls to Avoid
- While using Hypothesis Testing, the following facts should be borne in mind at the conclusion stage:
- The decision is about H0 and NOT Ha.
- The conclusion statement is whether the contention of Ha was upheld.
- The null hypothesis (H0) is on trial.
- When a decision has been made:
- Nothing has been proved.
- It is just a decision.
- All decisions can lead to errors (Types I and II).
- If the decision is to Reject H0, then the conclusion should read: "There is sufficient evidence at the α level of significance to show that [state the alternative hypothesis Ha]."
- If the decision is to Fail to Reject H0, then the conclusion should read: "There isn't sufficient evidence at the α level of significance to show that [state the alternative hypothesis]."
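The two conclusion statements above can be generated from a P-value and an alpha level. This is a sketch only; the function name `conclusion` and the example hypothesis text are hypothetical.

```python
def conclusion(p_value: float, alpha: float, ha_statement: str) -> str:
    """Word the conclusion about Ha based on the decision made about H0."""
    if p_value < alpha:
        # Decision: Reject H0.
        return (f"There is sufficient evidence at the {alpha} level of "
                f"significance to show that {ha_statement}.")
    # Decision: Fail to Reject H0. Nothing has been proved either way.
    return (f"There isn't sufficient evidence at the {alpha} level of "
            f"significance to show that {ha_statement}.")


print(conclusion(0.01, 0.05, "the two suppliers have different Means"))
print(conclusion(0.20, 0.05, "the two suppliers have different Means"))
```

Both statements are phrased in terms of Ha, even though the decision itself is made about H0.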
28. Summary
- At this point, you should be able to:
- Articulate the purpose of Hypothesis Testing
- Explain the concepts of the Central Tendency
- Be familiar with the types of Hypothesis Tests