Title: STAT 3120 Statistical Methods I
- Lecture 3
- Hypothesis Testing
STAT 3120 Hypothesis Testing
- Some Limitations of Confidence Interval Analysis
  - Descriptive, not decision-oriented
  - Provides no analysis of potential errors (impact/likelihood)
- Hypothesis Testing is Decision-Oriented
  - Is a population parameter less than, equal to, or greater than a specific value (with decision-making implications)?
  - Guides the user to a particular choice
  - Provides the user with quantitative information regarding probabilities of different outcomes
- Highlights that Two Different Decision-Making Errors are Possible
  - Type I or α-error
  - Type II or β-error
  - α, the level of significance, predefines the standard of proof (risk)
  - 1 − β indicates the power of the test (the probability of finding a significant effect)
- p-Value (Prob-Value) Aids in Interpreting Results
  - Strength of the evidence
- The first step in hypothesis testing is to develop a claim to be tested. For example:
  - Drug A will lower an individual's cholesterol level by a minimum of 10 points.
    <Drug A will not lower an individual's cholesterol level by a minimum of 10 points.>
  - Trashbag A has greater tensile strength than Trashbag B.
    <Trashbag A does not have greater tensile strength than Trashbag B.>
  - Automobile A will average at least 35 miles to the gallon.
    <Automobile A will not average at least 35 miles to the gallon.>
- Note that every possible claim has an opposite statement; the two statements together must be mutually exclusive and collectively exhaustive.
- When conducting hypothesis testing, the process of testing includes five distinct parts:
  - Statement of the claim: this is referred to as the alternative hypothesis and is designated H1 or Ha
  - Statement of the opposite of the claim: this is referred to as the null hypothesis and is designated H0
  - Calculation of the appropriate test statistic
  - Identification of the rejection region (for H0)
  - Assessment of the results and drawing of conclusions
- Develop the appropriate hypothesis statements to test the claims:
  - The Coca-Cola Marketing Department wants to run a TV ad: "Diet Coke tastes better than Diet Pepsi." Assume some kind of taste scale of 1-10 (10 is the best).
  - A pharmaceutical company wants to claim in its marketing materials that a particular drug will lower cholesterol by at least 30.
  - A manufacturer of PVC pipes wants to become a supplier to a large civil engineering firm. The manufacturer claims that it can manufacture pipes to within 1 mm of the engineering firm's specifications.
  - The Milemaster Tire company has a new tire that it claims will go 100,000 before the treads wear out.
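As a sketch of what such statements can look like, here is one possible formulation for two of the claims, following the convention from the earlier slide that the claim becomes the alternative hypothesis. The symbols are assumptions introduced only for illustration: \mu_{Coke} and \mu_{Pepsi} stand for the mean taste scores, and \mu for the mean tread life of the new tire.

```latex
\[
\begin{array}{lll}
\text{Diet Coke ad:}    & H_0:\ \mu_{\text{Coke}} \le \mu_{\text{Pepsi}} & H_a:\ \mu_{\text{Coke}} > \mu_{\text{Pepsi}} \\
\text{Milemaster tire:} & H_0:\ \mu \le 100{,}000                        & H_a:\ \mu > 100{,}000
\end{array}
\]
```

In each pair the two statements are mutually exclusive and collectively exhaustive, and the equality sits in the null hypothesis.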
- When conducting a hypothesis test, you should always develop the 2x2 matrix below, which compares the statistically supported decision to the true state of nature:
                 | True state of nature
Decision         | H0 is true | H0 is false
Reject H0        |            |
Do not reject H0 |            |
- Descriptions of errors
  - Type I error: reject the null hypothesis when the null is actually true (α-error)
  - Type II error: fail to reject ("accept") the null hypothesis when it is false (β-error)
  - The significance level, α, is the maximum risk of making a Type I error that we are prepared to live with. In other words, probability of a Type I error = α (usually set at .05 or less)
  - Probability of a Type II error = β (not typically controlled)
  - For a fixed sample size, the Type I and Type II error rates cannot both be reduced simultaneously
Type of error | Decision                     | True state    | Consequences/costs
Type I        | Reject null                  | Null is true  | Market tire, but should not have done so; may cause customer dissatisfaction, loss, claims for refund
Type II       | Accept (fail to reject) null | Null is false | Fail to market a good product; opportunity cost
Decision                             | Tire is no good (H0 is true)                                   | Tire meets expectations (H0 is false)
Market tire (reject H0)              | Customer dissatisfaction, lose market share: Type I error (α) | Gain market share: valid decision
Don't market tire (do not reject H0) | No change: valid decision                                      | Fail to capitalize on good product, opportunity cost: Type II error (β)
The calculated probability of the "gain market share" outcome (rejecting H0 when H0 is false) is 1 − β. This is known as the power of the test.
- Descriptions of each outcome in English
  - Reject H0 when H0 is true: typically the worst possible mistake. This decision asserts that an effect is present when it is not (a false positive).
  - Reject H0 when H0 is false: the power of the test. There was an effect present and it was detected.
  - Do not reject H0 when H0 is true: a push. There was no effect present and this was correctly determined.
  - Do not reject H0 when H0 is false: lost opportunity. There was an effect present and it was not detected (a false negative).
- In a medical context, the false negative is often considered to be the worse mistake.
- After the hypothesis statements have been developed, and the Type I and Type II errors have been evaluated, we establish the alpha level: the highest probability we are willing to assume of committing a Type I error. This alpha value corresponds to a critical, or cutoff, Z-score.
- One-tailed tests (alpha / Z-score)
  - .01 / 2.33
  - .05 / 1.645
  - .10 / 1.28
- Two-tailed tests (alpha / Z-score)
  - .01 / 2.575
  - .05 / 1.96
  - .10 / 1.645
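The cutoff values in the lists above come from the standard normal distribution, and they can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `critical_z` is an assumption made for illustration, not something from the course materials.

```python
from statistics import NormalDist


def critical_z(alpha: float, tails: int = 1) -> float:
    """Return the positive critical Z-score for a given alpha.

    tails=1 puts all of alpha in one tail; tails=2 splits it evenly
    between the two tails (alpha / 2 in each).
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    # Find the Z value with area (1 - alpha / tails) to its left.
    return NormalDist().inv_cdf(1 - alpha / tails)


for alpha in (0.01, 0.05, 0.10):
    print(f"alpha={alpha:.2f}  "
          f"one-tailed={critical_z(alpha, 1):.3f}  "
          f"two-tailed={critical_z(alpha, 2):.3f}")
```

The computed values agree with the table to within rounding; for example, the two-tailed cutoff at alpha = .01 is about 2.576, which the table lists as 2.575.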
- At this point, we calculate the ACTUAL Z-score and compare it to the CRITICAL Z-score.
Example from the book (pg. 211): A corporation manages a fleet of company cars. A random sample of 40 cars is examined. The mean and standard deviation for the sample are 2,752 and 350 miles, respectively. Records for previous years indicated that the average miles driven was 2,600. Use the sample data to test the claim that the current mean is different from the previous mean. Use alpha = .05.
Z = (2752 - 2600)/(350/SQRT(40)) = 2.75
How does this compare to a critical Z of 1.96? What is your decision? What is the implication if you are wrong?
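The fleet example can be checked numerically. The sketch below is one way to do it in Python with the standard library; the function name `z_statistic` is hypothetical, and the sample standard deviation is used in place of the population value, as in the slide's calculation.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean: float, null_mean: float,
                sample_sd: float, n: int) -> float:
    """Z = (sample mean - hypothesized mean) / (s / sqrt(n))."""
    return (sample_mean - null_mean) / (sample_sd / sqrt(n))


alpha = 0.05
z = z_statistic(sample_mean=2752, null_mean=2600, sample_sd=350, n=40)

# Two-tailed test ("different from"), so alpha is split across both tails.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > z_crit

print(f"actual Z = {z:.2f}, critical Z = {z_crit:.2f}, reject H0: {reject_h0}")
```

Because the actual Z exceeds the critical Z, the null hypothesis is rejected; if that decision is wrong, the error made is a Type I error.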
The Z-scores can be compared directly. However, we typically translate these Z-scores into probabilities:
Sample conditions             | p-Value                                                                                              | Significance level (α)
Z-score                       | Associated with the actual (calculated) Z-score                                                      | Associated with the critical Z-score
Range                         | 0 < p < 1                                                                                            | 0 < α < 1
Definition                    | Probability, if H0 is true, of a result at least as extreme as the one observed (the smallest α at which H0 would be rejected) | Maximum probability of making a Type I error
How determined and when known | From sample data, after analysis                                                                     | Set before analysis
Decision Rule: If the p-value is less than the established alpha value, REJECT the null hypothesis and proceed with the claim. The potential error is a Type I error. If the p-value is greater than the established alpha value, we FAIL TO REJECT the null hypothesis and maintain the status quo. The potential error is a Type II error.
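The decision rule translates directly into code. The following sketch converts a Z-score into a p-value and applies the rule; the helper names `p_value` and `decide` are assumptions for illustration, and the Z-score of 2.75 is taken from the fleet example earlier in the lecture.

```python
from statistics import NormalDist


def p_value(z: float, tails: int = 2) -> float:
    """p-value for a Z-score under the standard normal distribution.

    For tails=1 this assumes the observed Z falls in the direction
    stated by the alternative hypothesis.
    """
    upper_tail_area = 1 - NormalDist().cdf(abs(z))
    return tails * upper_tail_area


def decide(p: float, alpha: float) -> str:
    """Apply the decision rule: reject H0 when p < alpha."""
    return "reject H0" if p < alpha else "fail to reject H0"


p = p_value(2.75, tails=2)
print(f"p-value = {p:.4f}")
print("alpha = .05  ->", decide(p, 0.05))    # possible error: Type I
print("alpha = .001 ->", decide(p, 0.001))   # possible error: Type II
```

The same sample can lead to different decisions under different alpha values, which is why alpha is set before the analysis rather than after the p-value is known.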
Fun and Exciting SAS and SPSS exercises for all
to enjoy!
A second calculation that is conducted in hypothesis testing is the evaluation of the power of the test. Recall that power is the probability of correctly rejecting the null hypothesis when it should be rejected; in other words, the probability of detecting a true effect (1 − probability of a Type II error).
Question: Why not set both alpha and power at acceptable levels?
Answer: Because the Type I and Type II errors are inversely related. For a given sample size and effect size, as the probability of a Type I error becomes more restricted, the probability of a Type II error increases. Since power is 1 − (probability of a Type II error), increasing the Type II error rate decreases the power of the test; the Type II error rate and power ARE directly linked.
Statistical power is the probability of rejecting the null hypothesis when it should be rejected. In other words, it is the probability that, if a true difference exists, it will be discovered. Statistical power is heavily used in medicine, clinical psychology, and biology. Typically, a test is expected to have a statistical power of 80% or greater.
- Power is a function of three factors:
  - Effect size: i.e., the difference between the two groups or measurements. As the effect size goes up, the power increases.
  - Alpha: as the chance of finding an incorrect significant effect (Type I error) is reduced, the probability of correctly finding an effect is also reduced. Typically, alpha is set to .01 (most conservative, and lowers power), .05, or .10 (most risk tolerant, and increases power).
  - Sample size: increased sample sizes will always produce greater power. But increasing the sample size can also produce too much power: smaller and smaller effects will be found to be significant until, at a large enough sample size, any effect is considered to be significant.
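When conducting a one-tailed test, the power calculation is executed as:
$$1 - \beta(\mu_a) = 1 - P\left[ z < z_\alpha - \frac{|\mu_0 - \mu_a|}{\sigma_{\bar{y}}} \right], \qquad \sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}}$$
A two-tailed test is conducted similarly, with the alpha value associated with the z-score divided by two (that is, $z_{\alpha/2}$ in place of $z_\alpha$).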
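The power formula can be sketched in Python as follows. The function name `z_test_power` and the alternative mean of 2,750 are assumptions chosen for illustration (the other numbers are borrowed from the fleet example), and the two-tailed version uses the same approximation as the slide, ignoring the negligible rejection probability in the far tail.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(mu0: float, mua: float, sigma: float, n: int,
                 alpha: float = 0.05, tails: int = 1) -> float:
    """Power = 1 - P[z < z_crit - |mu0 - mua| / (sigma / sqrt(n))]."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / tails)   # z_alpha or z_(alpha/2)
    shift = abs(mu0 - mua) / (sigma / sqrt(n))       # effect in standard-error units
    beta = std_normal.cdf(z_crit - shift)            # probability of a Type II error
    return 1 - beta


base = z_test_power(mu0=2600, mua=2750, sigma=350, n=40)
print(f"baseline (n=40, alpha=.05, one-tailed): {base:.3f}")
print(f"larger effect (mua=2800):  {z_test_power(2600, 2800, 350, 40):.3f}")
print(f"stricter alpha (.01):      {z_test_power(2600, 2750, 350, 40, alpha=0.01):.3f}")
print(f"larger sample (n=80):      {z_test_power(2600, 2750, 350, 80):.3f}")
print(f"two-tailed:                {z_test_power(2600, 2750, 350, 40, tails=2):.3f}")
```

The printed values show the three factors at work: power rises with effect size and sample size, and falls when alpha is made stricter or split across two tails.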
Fun in-class examples: 5.8, 5.9, 5.10
A power curve is a plot of differing values of μa versus the calculated power. The slope of the curve becomes steeper as the sample size increases; examine the curves on pgs. 218/219.
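To see the effect of sample size on the curve without the textbook at hand, the points of a power curve can be tabulated directly. This is a rough sketch for a one-tailed z-test; the values of the null mean, sigma, the grid of alternative means, and the two sample sizes are all assumptions chosen for illustration, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_tailed_power(mu0, mua, sigma, n, alpha=0.05):
    """Power of a one-tailed z-test against the alternative mean mua."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    return 1 - std_normal.cdf(z_alpha - abs(mu0 - mua) / (sigma / sqrt(n)))


def power_curve(mu0, sigma, n, alternatives, alpha=0.05):
    """Return (mua, power) pairs: the points of the power curve."""
    return [(mua, one_tailed_power(mu0, mua, sigma, n, alpha))
            for mua in alternatives]


alternatives = range(2600, 2851, 50)   # candidate true means (assumed grid)
small = power_curve(2600, 350, 40, alternatives)
large = power_curve(2600, 350, 160, alternatives)

print("  mua   n=40  n=160")
for (mua, p_small), (_, p_large) in zip(small, large):
    print(f"{mua:5d}  {p_small:.3f}  {p_large:.3f}")
```

At μa = μ0 the power equals alpha for both sample sizes; as μa moves away from μ0, the n = 160 column climbs toward 1 much faster, which is the steeper curve described above.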
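If you are starting an experiment from scratch, how do you determine the appropriate sample size? For a one-tailed test:
$$n = \frac{\sigma^2 (z_\alpha + z_\beta)^2}{\Delta^2}$$
where Δ is the difference from the null value to be detected. A two-tailed test is conducted similarly, with the alpha value associated with the z-score divided by two (that is, $z_{\alpha/2}$ in place of $z_\alpha$). See Example 5.11.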
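The sample size formula is straightforward to evaluate in code. The sketch below rounds up to the next whole observation; the function name `required_n` and the inputs (sigma = 350, a difference of 100 to detect, 90% power) are assumptions for illustration and are not taken from Example 5.11.

```python
from math import ceil
from statistics import NormalDist


def required_n(sigma: float, delta: float, alpha: float = 0.05,
               power: float = 0.90, tails: int = 1) -> int:
    """n = sigma^2 * (z_alpha + z_beta)^2 / delta^2, rounded up."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / tails)  # z_alpha or z_(alpha/2)
    z_beta = std_normal.inv_cdf(power)               # beta = 1 - power
    return ceil(sigma ** 2 * (z_alpha + z_beta) ** 2 / delta ** 2)


n_one = required_n(sigma=350, delta=100)
n_two = required_n(sigma=350, delta=100, tails=2)
print(f"one-tailed: n = {n_one}")
print(f"two-tailed: n = {n_two}")
```

The two-tailed design needs more observations for the same alpha and power because its critical value is larger.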
Secret weapon for all of that computation: PROC POWER in SAS.
proc power;
  /* One-sample t test: solve for power (power = .) given n */
  onesamplemeans test=t
    mean=7 stddev=3
    ntotal=50 power=.;
run;
The same procedure for a two-sample comparison of means:
proc power;
  /* Two-sample t test: solve for power (power = .) given n per group */
  twosamplemeans test=diff
    meandiff=7 stddev=12
    npergroup=50 power=.;
run;
The same procedure for paired means:
proc power;
  /* Paired t test: solve for the number of pairs (npairs = .) at power 0.9 */
  pairedmeans test=diff
    pairedmeans=8 | 15 corr=0.4
    pairedstddevs=(7 12)
    npairs=. power=0.9;
run;