Title: Schaum
1 Schaum's Outline: Probability and Statistics
Chapter 7: Hypothesis Testing
Presented by Professor Carol Dahl
Examples by Alfred Aird, Kira Jeffery, Catherine Keske, Hermann Logsend, Yris Olaya
2 Outline of Topics
Topics Covered
- Statistical Decisions
- Statistical Hypotheses
- Null Hypotheses
- Tests of Hypotheses
- Type I and Type II Errors
- Level of Significance
- Tests Involving the Normal Distribution
- One and Two Tailed Tests
- P Value
3 Outline of Topics (Continued)
- Special Tests of Significance
- Large Samples
- Small Samples
- Estimation Theory/Hypotheses Testing Relationship
- Operating Characteristic Curves and Power of a Test
- Fitting Theoretical Distributions to Sample Frequency Distributions
- Chi-Square Test for Goodness of Fit
4 The Truth Is Out There: The Importance of Hypothesis Testing
- Hypothesis testing
- helps evaluate models based upon real data
- enables one to build a statistical model
- enhances your credibility as
- analyst
- economist
5 Statistical Decisions
- Innocent until proven guilty principle
- Want to prove someone is guilty
- Assume the opposite or status quo: innocent
- Ho: Innocent
- H1: Guilty
- Take subsample of possible information
- If evidence is not consistent with innocence, reject Ho
- Otherwise the person is pronounced not guilty, which is not the same as innocent
6 Statistical Decisions
- Status quo: innocence = null hypothesis
- Evidence = sample result
- Reasonable doubt = confidence level
7 Statistical Decisions
- E.g., tantalum ore deposit
- Feasible if quality > 0.0600 g/kg with 99% confidence
- 100 samples collected at random from large deposit
- Sample distribution
- mean of 0.071 g/kg
- variance 0.0025, standard deviation 0.05 g/kg
8 Statistical Decisions
- Should the deposit be developed?
- Evidence: 0.071 (sample mean)
- Reasonable doubt: 99%
- Status quo: do not develop the deposit
- Ho: µ ≤ 0.0600
- H1: µ > 0.0600
9 Statistical Hypothesis
- General Principles
- Inferences about population using sample statistic
- Prove A is true by assuming it isn't true
- Results of experiment (sample) compared with model
- If results of model unlikely, reject model
- If results explained by model, do not reject
10 Statistical Hypothesis
- Event A fairly likely: model would be retained
- Event B unlikely: model would be rejected
11 Statistical Decisions
- Should the deposit be developed?
- Evidence: 0.071 (sample mean)
- Reasonable doubt: 99%
- Status quo: do not develop the deposit
- Ho: µ = 0.0600
- H1: µ > 0.0600
- How likely is Ho given a sample mean of 0.071?
12 Need Sampling Statistic
- Need statistic with
- population parameter
- estimate for population parameter
- its distribution
13 Need Sampling Statistic
- Population Normal - Two Choices
- Small Sample (n < 30)
- Known Variance: N(0,1)
- Unknown Variance: t(n-1)
14 Need Sampling Statistic
- Population Not Normal
- Large Sample
- Known Variance: N(0,1)
- Unknown Variance: N(0,1)
- Doesn't matter whether the variance is known or not
- If the population is finite and sampling is without replacement, an adjustment is needed
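The adjustment for a finite population is commonly the finite population correction, sqrt((N - n)/(N - 1)), applied to the standard error of the mean. The sketch below assumes that this is the adjustment the slide has in mind. The population size N = 500 and the function name are hypothetical, chosen only for illustration.

```python
import math

def standard_error(sigma, n, population_size=None):
    """Standard error of the sample mean, optionally with the finite population correction."""
    se = sigma / math.sqrt(n)
    if population_size is not None:
        # Correction for sampling without replacement from a finite population
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se

se_infinite = standard_error(sigma=0.05, n=100)
se_finite = standard_error(sigma=0.05, n=100, population_size=500)  # N = 500 is hypothetical
print(f"no correction: {se_infinite:.5f}, with correction: {se_finite:.5f}")
```

The correction always shrinks the standard error, and it matters less as N grows relative to n.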
15 Normal Distribution
[Figure: standard normal curve, X ~ N(0,1), µ = 0; about 68% of the area lies within 1 SD of the mean, 95% within 2 SD, 99.7% within 3 SD]
16 Statistical Decisions
- Should the deposit be developed?
- Evidence: 0.071 (sample mean)
- 0.0025 (sample variance)
- 0.05 g/kg (sample standard deviation)
- Reasonable doubt: 99%
- Status quo: do not develop the deposit
- Ho: µ = 0.0600
- H1: µ > 0.0600
- One tailed test
- How likely is Ho given a sample mean of 0.071?
17 Hypothesis Test
- Evidence: 0.071 (sample mean)
- 0.05 g/kg (sample standard deviation)
- Reasonable doubt: 99%
- Status quo: do not develop the deposit
- Ho: µ = 0.0600
- H1: µ > 0.0600
18 Statistical Hypothesis
E.g., Z = (0.071 - 0.0600) / (0.05 / √100) = 2.2
Critical value Zc = 2.33 (one tailed, 99%)
Since 2.2 < 2.33, conclusion: don't reject Ho, don't develop deposit
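The tantalum calculation can be reproduced with a minimal sketch like the one below. It uses only the figures from the slides (sample mean 0.071, hypothesized mean 0.0600, standard deviation 0.05, n = 100, 99% confidence). The function name is illustrative, not part of the original material.

```python
import math
from statistics import NormalDist

def one_tailed_z_test(sample_mean, mu0, sd, n, confidence):
    """Upper tailed Z test: returns (z, critical value, reject Ho?)."""
    z = (sample_mean - mu0) / (sd / math.sqrt(n))
    z_crit = NormalDist().inv_cdf(confidence)  # e.g. 0.99 gives about 2.33
    return z, z_crit, z > z_crit

z, z_crit, reject = one_tailed_z_test(0.071, 0.0600, 0.05, 100, 0.99)
print(f"Z = {z:.2f}, Zc = {z_crit:.2f}, reject Ho: {reject}")
```

Because Z falls just short of the critical value, the status quo (do not develop) stands.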
19 Null Hypothesis
- Hypotheses cannot be proven
- reject or fail to reject
- based on likelihood of event occurring
- null hypothesis is not accepted
20 Test of Hypotheses: Maple Creek Mine and Potaro Diamond Field in Guyana
- Mine potential for producing large diamonds
- Experts want to know true mean carat size produced
- True mean said to be 4 carats
- Experts want to know if true with 95% confidence
- Random sample taken
- Sample mean found to be 3.6 carats
- Based on sample, is 4 carats true mean for mine?
21 Tests of Hypotheses
- Tests referred to as
- Tests of Hypotheses
- Tests of Significance
- Rules of Decision
22 Types of Errors
- Ho: µ = 4 (suppose this is true)
- H1: µ ≠ 4
- Two tailed test
- Choose α = 0.05
- Sample n = 100 (assume X is normal), σ = 1
23 Type I Error (α): Reject True
- Ho: µ = 4, suppose true
[Figure: sampling distribution under Ho with rejection area α/2 in each tail]
24 Type II Error (β): Accept False
- Ho: µ = 4 not true
- µ = 6 true
- X̄ - 4 has mean 2, not mean 0
[Figure: area β]
25 Lower Type I: What Happens to Type II?
[Figure: area β]
26 Higher µ: What Happens to Type II?
- Ho: µ = 4 not true
- µ = 7 true
- X̄ - 4 has mean 3, not mean 0
[Figure: area β]
27 Type I and Type II Errors
- Two types of errors can occur in hypothesis testing
- To reduce errors, increase sample size when possible

                   Ho True            Ho False
Reject Ho          Type I Error       Correct Decision
Do Not Reject Ho   Correct Decision   Type II Error
28 To Reduce Errors
- Increase sample size when possible
- Population, n = 5, 10, 20
29 Error Examples
- Type I Error: rejecting a true null hypothesis
- Convicting an innocent person
- Rejecting that true mean carat size is 4 when it is 4
- Type II Error: not rejecting a false null hypothesis
- Setting a guilty person free
- Not rejecting that mean carat size is 4 when it is not
30 Level of Significance (α)
- α: max probability we're willing to risk a Type I Error
- tail area of probability density function
- If Type I Errors cost high, choose a low α
- α defined before hypothesis test conducted
- α typically defined as 0.10, 0.05 or 0.01
- α = 0.10 for 90% confidence of correct test decision
- α = 0.05 for 95% confidence of correct test decision
- α = 0.01 for 99% confidence of correct test decision
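To make the link between the level of significance and the rejection region concrete, the sketch below computes standard normal critical values for the three usual choices of α. It assumes a Z test; the function name is illustrative.

```python
from statistics import NormalDist

def critical_values(alpha):
    """Return (one tailed, two tailed) standard normal critical values for a given alpha."""
    std_normal = NormalDist()
    one_tailed = std_normal.inv_cdf(1 - alpha)      # all of alpha in one tail
    two_tailed = std_normal.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
    return one_tailed, two_tailed

for alpha in (0.10, 0.05, 0.01):
    one, two = critical_values(alpha)
    print(f"alpha = {alpha:.2f}: one tailed Zc = {one:.3f}, two tailed Zc = {two:.3f}")
```

A lower α pushes the critical value further into the tail, so stronger evidence is needed before Ho is rejected.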
31 Diamond Hypothesis Test Example
- Ho: µ = 4
- H1: µ ≠ 4
- Choose α = 0.01 for 99% confidence
- Sample n = 100, σ = 1, X̄ = 3.6
- -Zc = -2.575, Zc = 2.575
[Figure: standard normal curve with rejection area 0.005 in each tail, beyond -2.575 and 2.575]
32 Example Continued
- Observed not significantly different from expected
- Fail to reject null hypothesis
- At 99% confidence, cannot reject that the true mean is 4 carats
33 Tests Involving the t Distribution
- Billy Ray has inherited large, 25,000 acre homestead
- Located on outskirts of Murfreesboro, Arkansas, near
- Crater of Diamonds State Park
- Prairie Creek Volcanic Pipe
- Land now used for
- agriculture
- recreation
- No official mining has taken place
34 Case Study in Statistical Analysis: Billy Ray's Inheritance
- Billy Ray must now decide upon land usage
- Options
- Exploration for diamonds
- Conservation
- Land biodiversity and recreation
- Agriculture and recreation
- Land development
35 Consider Costs and Benefits of Mining
- Cost and Benefits of Mining
- Opportunity cost
- Excessive diamond exploration damages land's value
- Exploration and Mining Costs
- Benefit
- Value of mineral produced
36 Consider Costs and Benefits of Mining
- Cost and Benefits of Mining
- Sample for geologic indicators for diamonds
- kimberlite or lamproite
- larger sample more likely to represent true population
- larger sample will cost more
37 How to Decide: One Tailed or Two Tailed
- One tailed test
- Do we change status quo only if it's bigger than null?
- Do we change status quo only if it's smaller than null?
- Two tailed test
- Change status quo if it's bigger or if it's smaller
38 Tests of Mean
- Normal or t?
- population normal, known variance, small sample: Normal
- population normal, unknown variance, small sample: t
- large sample: Normal
39 Difference Between Normal and t
- t has fatter tails than the normal bell curve
40 Hypothesis and Sample
- Need at least 30 g/m³ to mine
- Null hypothesis Ho: µ = 30
- Alternative hypothesis H1: ?
- Sample data n = 16 (holes drilled)
- X close to normal
- X̄ = 31 g/m³
- variance of the mean (s²/n) = 0.268
41 Normal or t?
- One tailed
- Null hypothesis Ho: µ = 30
- Alternative hypothesis H1: µ > 30
- Sample data n = 16 (holes drilled)
- X̄ = 31 g/m³
- variance s² = 4.29
- standard deviation s = 2.07 g/m³
- small sample, estimated variance, X close to normal
- not exactly t but close if X close to normal
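42 Tests Involving the t Distribution
[Figure: t distribution with 16 - 1 = 15 degrees of freedom, centered at 0; 5% reject region in the upper tail beyond tc = 1.75]
43 Tests Involving the t Distribution
- t(n-1) = (X̄ - µ) / (s/√n) = (31 - 30) / (2.07/√16) = 1.93
[Figure: t distribution with 15 degrees of freedom, centered at 0; 5% reject region in the upper tail beyond tc = 1.75]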
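The one tailed t test on the drilling sample can be sketched as below. The critical value 1.75 is taken directly from the slide (5% upper tail, 15 degrees of freedom) rather than computed. The function and constant names are illustrative.

```python
import math

def t_statistic(sample_mean, mu0, s, n):
    """t statistic for a test on the mean with estimated standard deviation s."""
    return (sample_mean - mu0) / (s / math.sqrt(n))

T_CRITICAL = 1.75  # from the slide: one tailed, 5%, 16 - 1 = 15 degrees of freedom

t = t_statistic(sample_mean=31, mu0=30, s=2.07, n=16)
reject = t > T_CRITICAL
print(f"t = {t:.2f}, tc = {T_CRITICAL}, reject Ho: {reject}")
```

Here the statistic lies beyond the critical value, so Ho: µ = 30 is rejected in favor of µ > 30 at the 5% level.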
44 Wells Produce Oil
- X = API gravity
- approximately normal with mean 37°
- periodically test to see if the mean has changed
- too heavy or too light: revise contract
- Ho
- H1
- Sample of 9 wells, X̄ = 38°, s = 2
- What is test statistic?
- Normal or t?
45 Two Tailed t Test on Mean
[Figure: t distribution centered at 0 with reject regions of α/2 in each tail, beyond -tc and tc]
46 Two Tailed t Test on Mean
- Ho: µ = 37
- H1: µ ≠ 37
- Sample of 9 wells, X̄ = 38°, s = 2, α = 10%
- t(n-1) = (X̄ - µ) / (s/√n) = (38 - 37) / (2/√9) = 1.5
47 P-values: One Tailed Test
- Level of significance for a sample statistic under null
- Smallest α for which statistic would reject null
- t(16-1) = (X̄ - µ) / (s/√n) = (31 - 30) / (2.07/√16) = 1.93
- P ≈ 0.04, e.g. TDIST(1.93,15,1)
48 P-value: Two Tailed Test
- Ho: µ = 37
- H1: µ ≠ 37
- Sample of 9 wells, X̄ = 38°, s = 2, α = 10%
- t(n-1) = (X̄ - µ) / (s/√n) = (38 - 37) / (2/√9) = 1.5
- TDIST(1.5,8,2) = 0.172
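The p-values quoted from the spreadsheet can be approximated without any statistics library by integrating the t density numerically. This is a sketch using Simpson's rule, not the method the spreadsheet function uses internally. The function names and the step count are assumptions made for illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

p_one_tailed = t_upper_tail(1.93, 15)     # drilling example, one tailed
p_two_tailed = 2 * t_upper_tail(1.5, 8)   # oil wells example, two tailed
print(f"one tailed p = {p_one_tailed:.3f}, two tailed p = {p_two_tailed:.3f}")
```

The one tailed value is below 0.05, so the drilling test rejects at the 5% level. The two tailed value is above 0.10, so the oil wells test does not reject at the 10% level.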
49 Formal Representation of p-Values
- p-Value < α: Reject Ho
- p-Value > α: Fail to reject Ho
50 More Tests
- Survey: ranking refinery managers
- Daily refinery production
- Sample two refineries: 40 and 35 observations, in 1000 b/cd
- First refinery: mean 74, stand. dev. 8
- Second refinery: mean 78, stand. dev. 7
- Questions
- difference of means?
- variances?
- differences of variances?
- Again Statistics Can Help!
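51 Differences of Means
- Ho: µ1 - µ2 = 0
- H1: µ1 - µ2 ≠ 0
- X1 and X2 normal, known variance
- or large sample, known variance
- α = 10%
[Figure: standard normal curve with 5% rejection area in each tail, beyond -Zc and Zc]
52 Differences of Means
- Ho: µ1 - µ2 = 0
- H1: µ1 - µ2 ≠ 0
- n1 = 40, n2 = 35
- X̄1 = 74, σ1 = 8
- X̄2 = 78, σ2 = 7
[Figure: standard normal curve with 5% rejection area in each tail, beyond -Zc = -1.645 and Zc = 1.645]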
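A minimal sketch of the two tailed test on the difference of means, using the refinery figures and the critical value 1.645 from the slide. It assumes the standard statistic Z = (X̄1 - X̄2) / sqrt(σ1²/n1 + σ2²/n2); the function name is illustrative.

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Z statistic for Ho: mu1 - mu2 = 0 with known (or large sample) variances."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

Z_CRITICAL = 1.645  # from the slide: alpha = 10%, 5% in each tail

z = two_sample_z(74, 8, 40, 78, 7, 35)
reject = abs(z) > Z_CRITICAL
print(f"Z = {z:.2f}, reject Ho: {reject}")
```

With these inputs the statistic falls in the lower rejection region, which points to a difference in mean production between the two refineries at the 10% level.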
53 Difference of Means
- X normal
- Unknown but equal variances
- Do above test with
54 Variance Test (χ² distribution)
Two tailed
[Figure: χ² density with rejection area α/2 in each tail]
55 Variance Test (χ² distribution)
One tailed
[Figure: χ² density with rejection area α in the upper tail]
56 Hypothesis Test on Variance
- Suppose best practice in refinery is σ = 6.5
- Does refinery 2 have different variability than best practice?
- Ho: σ² = 6.5²
- H1: σ² ≠ 6.5²
- Example: 2nd refinery, n - 1 = 34, standard deviation 7
57 Hypothesis Test on Variance
- Ho: σ² = 6.5²
- H1: σ² ≠ 6.5²
- Example: 2nd refinery, n - 1 = 34, standard deviation 7, α = 10%
58 Hypothesis Test on Variance
- Suppose best practice in refinery
- Ho: σ² = 6.5²
- H1: σ² ≠ 6.5²
- Example: 2nd refinery, n - 1 = 34, standard deviation 7
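59 Variance Test (χ² distribution)
Two tailed
[Figure: χ² density with 34 degrees of freedom; rejection area 0.05 in each tail, below 21.664 and above 48.602]
60 Variance Test (χ² distribution)
- More variance than best practice?
- Ho: σ² = 6.5²
- H1: σ² > 6.5²
One tailed
[Figure: χ² density with rejection area 0.10 in the upper tail]
61 Variance Test (χ² distribution)
- More variance than best practice?
- Ho: σ² = 6.5²
- H1: σ² > 6.5²
One tailed
[Figure: χ² density with rejection area 0.10 in the upper tail, beyond chiinv(0.10,34) = 44.903]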
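The variance test can be sketched as below, assuming the usual statistic χ² = (n - 1)s²/σ0² and a best-practice standard deviation of 6.5. The critical values are the ones quoted on the slides for 34 degrees of freedom. Names are illustrative.

```python
def variance_chi2(sample_sd, df, sigma0):
    """Chi-square statistic (n - 1) * s^2 / sigma0^2 for a test on a variance."""
    return df * sample_sd ** 2 / sigma0 ** 2

# Critical values quoted on the slides for 34 degrees of freedom
TWO_TAILED_LOWER, TWO_TAILED_UPPER = 21.664, 48.602  # 5% in each tail
ONE_TAILED_UPPER = 44.903                            # 10% in the upper tail

chi2 = variance_chi2(sample_sd=7, df=34, sigma0=6.5)
reject_two_tailed = chi2 < TWO_TAILED_LOWER or chi2 > TWO_TAILED_UPPER
reject_one_tailed = chi2 > ONE_TAILED_UPPER
print(f"chi2 = {chi2:.2f}, two tailed reject: {reject_two_tailed}, one tailed reject: {reject_one_tailed}")
```

Under these assumptions the statistic stays inside the acceptance region in both versions of the test, so the sample does not show refinery 2 to differ from best practice.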
62 Testing if Variances the Same: F Distribution
- 2 samples of size n1 and n2
- sample variances s1², s2²
- Ho: σ1² = σ2² => Ho: σ2²/σ1² = 1
- H1: σ1² ≠ σ2² => H1: σ2²/σ1² ≠ 1
63 Testing if Variances the Same: F Distribution
- Ho: σ1²/σ2² = 1
- H1: σ1²/σ2² ≠ 1
Two tailed
[Figure: F density with rejection area α/2 in each tail]
64 Testing if Variances the Same: F Distribution
- Ho: σ2²/σ1² = 1
- H1: σ2²/σ1² > 1
One tailed
[Figure: F density with rejection area α = 10% in the upper tail]
65 Example: Testing if Variances the Same
- 2 samples of size n1 = 40 and n2 = 35
- sample variances s1² = 8², s2² = 7²
- Ho: σ1²/σ2² = 1
- H1: σ1²/σ2² ≠ 1
- Critical values: 0.579, 1.749
- Test statistic: 8²/7² = 1.306
66 Testing if Variances the Same: F Distribution
- Ho: σ1²/σ2² = 1
- H1: σ1²/σ2² ≠ 1
Two tailed
[Figure: F density with rejection area 0.05 in each tail]
Finv(0.05,39,34) = 1.749
Finv(0.95,39,34) = 0.579
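The F test on the two refinery variances reduces to a ratio and a comparison against the two critical values quoted above. A minimal sketch, with the critical values hard-coded from the slide rather than computed:

```python
def variance_ratio(sd1, sd2):
    """F statistic s1^2 / s2^2 for testing equality of two variances."""
    return sd1 ** 2 / sd2 ** 2

# Critical values from the slide: Finv(0.95,39,34) and Finv(0.05,39,34)
F_LOWER, F_UPPER = 0.579, 1.749

f_stat = variance_ratio(8, 7)
reject = f_stat < F_LOWER or f_stat > F_UPPER
print(f"F = {f_stat:.3f}, reject Ho: {reject}")
```

The ratio lies between the two critical values, so equal variances cannot be rejected at the 10% level.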
67 Testing if Variances the Same: F Distribution
- Ho: σ2²/σ1² = 1
- H1: σ2²/σ1² > 1
One tailed
[Figure: F density with rejection area 0.10 in the upper tail]
Finv(0.10,39,34) = 1.544
68 Power of a Test
- Type II error
- β = P(Fail to reject Ho | H1 is true)
- Power = 1 - β
70 Power of a Test
- Researcher controls level of significance, α
- Increase α: what happens to β?
71 Raise Type I (α): What Happens to Type II (β)?
- Ho: µ = 4 not true
- µ = 6 true
- X̄ - 4 has mean 2, not mean 0
[Figure: area β]
72 Higher α: What Happens to Type II?
[Figure: area β]
- Trade-off: raising α reduces β; reducing α increases β
73 Operating Characteristic Curve
- Can graph β against µ
- Called the operating characteristic curve
- Useful in experimental design
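The trade-off between α and β, and the points of an operating characteristic curve, can be sketched for the two tailed Z test of Ho: µ = 4. The slides give σ = 1 without saying whether it refers to X or to X̄, so the sketch assumes a standard error of X̄ equal to 1. That assumption, and the function name, are not from the source.

```python
from statistics import NormalDist

def type_ii_error(mu0, mu_true, se, alpha):
    """beta = P(fail to reject Ho) for a two tailed Z test when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = (mu_true - mu0) / se  # how far the true mean sits from mu0, in standard errors
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)

SE = 1.0  # assumed standard error of the sample mean

# Points on the operating characteristic curve: beta against the true mean
for mu_true in (4.5, 5, 6, 7):
    beta = type_ii_error(4, mu_true, SE, alpha=0.05)
    print(f"true mean {mu_true}: beta = {beta:.3f}, power = {1 - beta:.3f}")

# Raising alpha lowers beta for the same true mean
beta_05 = type_ii_error(4, 6, SE, alpha=0.05)
beta_10 = type_ii_error(4, 6, SE, alpha=0.10)
```

β falls as the true mean moves away from 4 and as α is raised, which matches the pattern the figures on these slides illustrate.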
74 Operating Characteristic Curve
75 Fitting a Probability Distribution
- Is electricity demand a log-normal distribution?
- Observed Mean 18.42
- Observed Variance 43
- Observations 20
76 Fitting a Probability Distribution
- Does electricity demand follow a normal distribution?
- Observed Mean 18.42
- Observed Variance 43
- Observations 20
77 You Can Test Your Model Graphically
1. Order observations from smallest Y1 to largest Yn
2. Compute cumulative frequency distribution
3. Plot ordered observations versus Pi on special probability sheet
4. If straight line within critical range, can't reject normal
78 You Can Test Your Model Graphically
Yi Pi Yi Pi
9.26 0.05 17.27 0.55
9.83 0.10 18.18 0.60
12.85 0.15 20.28 0.65
13.11 0.20 20.30 0.70
13.23 0.25 20.88 0.75
13.90 0.30 21.98 0.80
14.18 0.35 23.31 0.85
15.99 0.40 24.35 0.90
16.47 0.45 30.24 0.95
17.24 0.50 35.68 1.00
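The first two steps of the graphical check (ordering the observations and computing the cumulative frequency) can be sketched as follows. The data are the 20 demand observations from the table, and Pi = i/n is the convention the table appears to use. Variable names are illustrative.

```python
demand = [9.26, 9.83, 12.85, 13.11, 13.23, 13.90, 14.18, 15.99, 16.47, 17.24,
          17.27, 18.18, 20.28, 20.30, 20.88, 21.98, 23.31, 24.35, 30.24, 35.68]

ordered = sorted(demand)  # step 1: order from smallest to largest
n = len(ordered)

# step 2: cumulative frequency Pi = i / n for the i-th ordered observation
plot_points = [(y, round((i + 1) / n, 2)) for i, y in enumerate(ordered)]

for y, p in plot_points:
    print(f"{y:6.2f}  {p:.2f}")
```

The (Yi, Pi) pairs are what would be plotted on normal probability paper in step 3.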
79 Or Use the Graph/Probability Plot Option in Minitab
80 Statistical Test of Distribution
- Ho: Xe ~ N(µ, σ²)
- H1: Xe does not follow N(µ, σ²)
- Order data
- Estimate sample mean and variance
- Observed Mean 18.42
- Observed Variance 43
- Observations 20
- χ² statistic: goodness of fit of model
81 Statistical Test of Distribution
Again order sample. Create m = 5 categories
9.26 17.27
9.83 18.18
12.85 20.28
13.11 20.30
13.23 20.88
13.90 21.98
14.18 23.31
15.99 24.35
16.47 30.24
17.24 35.68
< 10
10-15
15-20
20-25
> 25
82 Statistical Test of Distribution
9.26 17.27
9.83 18.18
12.85 20.28
13.11 20.30
13.23 20.88
13.90 21.98
14.18 23.31
15.99 24.35
16.47 30.24
17.24 35.68
Actual frequencies
< 10    2
10-15   5
15-20   5
20-25   6
> 25    2
83 Statistical Test of Distribution
Frequencies
        actual  expected
< 10    2       Normdist(10,18.42,6.56,1)*20
10-15   5       (Normdist(15,18.42,6.56,1) - Normdist(10,18.42,6.56,1))*20
15-20   5       (Normdist(20,18.42,6.56,1) - Normdist(15,18.42,6.56,1))*20
20-25   6
> 25    2
84 Statistical Test of Distribution
Frequencies
        Observed  Expected
< 10    2         1.99
10-15   5         4.03
15-20   5         5.88
20-25   6         4.94
> 25    2         3.16
85 χ² Goodness of Fit Test
- Is based on
- χ² = Σ (oi - ei)²/ei, summed over i = 1, ..., m
- df = m - k - 1
- k = number of parameters replaced by estimates
- oi = observed frequency, ei = expected frequency
86 Statistical Test of Distribution
Frequencies
        oi   ei
< 10    2    1.99
10-15   5    4.03
15-20   5    5.88
20-25   6    4.94
> 25    2    3.16
χ² = Σ (oi - ei)²/ei
= (2-1.99)²/1.99 + (5-4.03)²/4.03 + (5-5.88)²/5.88 + (6-4.94)²/4.94 + (2-3.16)²/3.16 = 1.02
87 Statistical Test of Distribution
- Ho: X ~ N(µ, σ²)
- H1: X does not follow N(µ, σ²)
- df = m - k - 1 = 5 - 2 - 1 = 2
- χ² = Σ (oi - ei)²/ei = 1.02
- CHIINV(0.05,2) = 5.99
- Since 1.02 < 5.99, cannot reject Ho
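The whole goodness of fit calculation can be reproduced from the 20 observations. The sketch below uses the rounded mean 18.42 and standard deviation 6.56 from the slides, the five categories defined earlier, and the quoted critical value 5.99. Because it works with unrounded expected frequencies, the statistic may differ slightly in the second decimal from a hand calculation. Variable names are illustrative.

```python
from statistics import NormalDist

demand = [9.26, 9.83, 12.85, 13.11, 13.23, 13.90, 14.18, 15.99, 16.47, 17.24,
          17.27, 18.18, 20.28, 20.30, 20.88, 21.98, 23.31, 24.35, 30.24, 35.68]

fitted = NormalDist(mu=18.42, sigma=6.56)  # estimates taken from the slides
edges = [10, 15, 20, 25]                   # category boundaries: <10, 10-15, 15-20, 20-25, >25
n = len(demand)

# Observed frequency in each category
bounds = [float("-inf")] + edges + [float("inf")]
observed = [sum(lo <= x < hi for x in demand) for lo, hi in zip(bounds, bounds[1:])]

# Expected frequency: n times the normal probability of each category
cdf = [0.0] + [fitted.cdf(e) for e in edges] + [1.0]
expected = [n * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 2 - 1  # m categories, k = 2 estimated parameters
CHI2_CRITICAL = 5.99        # CHIINV(0.05,2) as quoted on the slide
reject = chi2 > CHI2_CRITICAL
print(f"observed = {observed}, chi2 = {chi2:.2f}, df = {df}, reject Ho: {reject}")
```

The statistic is far below the critical value, so normality of electricity demand is not rejected.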
88 Outline of Topics (Continued)
- Estimation Theory/Hypotheses Testing Relationship
- Operating Characteristic Curves and Power of a Test
- Fitting Theoretical Distributions to Sample Frequency Distributions
- Chi-Square Test for Goodness of Fit
89 Sum Up Chapter 7
- Hypothesis testing
- null vs alternative
- null with equal sign
- null often status quo
- alternative often what want to prove
- type I error vs type II error
- probability of type I error called level of significance
- P values
- 1 - β = power of test
- probability of rejecting false null
- one tailed vs two tailed
90 Sum Up Chapter 7
- Hypothesis tests
- mean: Normal test
- population normal, known variance
- large sample
- mean: t test
- population normal, unknown variance,
- small sample
91 Sum Up Chapter 7
Normal and t
92 Sum Up Chapter 7
- Hypothesis tests
- difference of means: Normal test
- population normal, known variance
93 Sum Up Chapter 7
- Hypothesis tests
- variance
- Are variances equal?
94 Sum Up Chapter 7
χ² and F
95 Sum Up Chapter 7
- How is random variable distributed?
- normal: graph cumulative frequency distribution
- special paper
- straight line
- Statistical
- χ²(k-m-1) = Σ (oi - ei)²/ei, with k categories and m estimated parameters; always 1 tailed
96 End of Chapter 7!