Title: Confidence, Prediction, and Tolerance Intervals
1. Confidence, Prediction, and Tolerance Intervals
- Engineering Experimental Design
- Valerie L. Young
2. What is a Confidence Interval for the Mean?
- A way of expressing the uncertainty in x̄ as an
estimate of μ
- x̄ = sample mean
- μ = population mean
- A 95% CI says that
- about 95% of the time, if you estimate an
interval for μ this way, the true value of μ will
be inside the interval
- if you collected data this way many times and
computed an interval each time, about 95 of every
100 intervals would contain μ
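The long-run reading of a 95% CI can be checked with a small simulation. This is a minimal sketch, not part of the original slides: the population (normal, mean 10, standard deviation 2), the sample size of 50, and the function name `ci_coverage` are all assumptions chosen for illustration.

```python
import random
import statistics


def ci_coverage(mu=10.0, sigma=2.0, n=50, trials=2000, z=1.96, seed=1):
    """Fraction of z-based 95% intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.mean(sample)
        half = z * statistics.stdev(sample) / n ** 0.5
        if xbar - half <= mu <= xbar + half:
            hits += 1
    return hits / trials


coverage = ci_coverage()
print(f"intervals containing mu: {coverage:.1%}")
```

The printed fraction should land near 95%. It is the interval that changes from sample to sample, while μ stays fixed.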
3. What is a Prediction Interval?
- A way of expressing the uncertainty in x̄ as an
estimate of what the next measured value will be
- x̄ = sample mean
- A 95% PI means that about 95% of the time, the
next measurement you make will be inside this
interval
4. What is a Tolerance Interval?
- A way of determining a range that (with a certain
confidence level) will contain a certain
percentage of the population
- An 80% TI with 95% confidence says that about
95% of the time, an interval built this way will
contain at least 80% of the population
5. In this course . . .
- Confidence intervals are the most important of
the three
- Prediction intervals will be revisited when we
learn to place uncertainties on values predicted
using a model determined by regression
- Tolerance intervals will not be tested on an exam
6. Why the Distinction Between Confidence Interval
and Prediction Interval?
- It is much easier to predict what will happen on
average, in the long run, than it is to predict
what will happen in any particular measurement.
- Based on the heights of students in this class,
what is the average height of an OU student?
- Based on the heights of students in this class,
how tall will the next person through the door
be?
- A prediction interval must be wider than a
confidence interval to allow for this additional
uncertainty.
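The extra width shows up directly in the half-width formulas that the worked examples in these slides apply. Written out as a sketch (the notation is an assumption: s is the sample standard deviation, n the sample size, ν = n − 1 the degrees of freedom, and α the area in one tail):

```latex
\text{CI half-width} = t_{\alpha,\nu}\,\frac{s}{\sqrt{n}},
\qquad
\text{PI half-width} = t_{\alpha,\nu}\, s\,\sqrt{1 + \frac{1}{n}}
```

As n grows, the CI half-width shrinks toward zero, while the PI half-width levels off near t·s, because a single new measurement still scatters by about s no matter how well the mean is known.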
7. When Do I Do What?
8. When Do I Do What?
9. Why the Distinction Between Large and Small
Samples for Confidence Intervals?
- For large samples, the distribution of sample
means is approximately normal, regardless of what
the original population distribution looks like.
So, you can use the standard normal (z)
distribution.
- For small samples, you only get a normal
distribution of sample means if the population
distribution is normal, and you have to correct s
as an estimate of σ by taking into account n,
which is what the t distribution does.
(As n increases, the t critical value decreases
toward z.)
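To make the large-sample versus small-sample distinction concrete, the sketch below compares the z critical value with the t critical values quoted in the examples in these slides. The t values are copied from the slides rather than computed, and the variable names are illustrative only.

```python
from statistics import NormalDist

# Two-sided 95% interval: 2.5% of the area in each tail.
z_crit = NormalDist().inv_cdf(1 - 0.025)

# t critical values quoted in these slides, keyed by degrees of freedom.
t_crit = {4: 2.776, 9: 2.262, 60: 2.000}

for dof, t in sorted(t_crit.items()):
    print(f"dof={dof:>2}  t={t:.3f}  t/z={t / z_crit:.3f}")
print(f"z = {z_crit:.3f}")
```

With 4 degrees of freedom the interval is roughly 40% wider than a z-based one. By 60 degrees of freedom the penalty is about 2%.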
10. Normality and Prediction Intervals
- The formulas given here for prediction intervals
are only valid for normally-distributed data.
- Should use a normal probability plot to check
- Will assume normal distribution when I ask for
prediction interval in this class
- Nelson, Coffin & Copeland discuss how to handle
non-normal data, should you need to in the future.
11. Example 1
- You measure the zinc concentration in the livers
of 56 fish and find that the mean of these 56
values is 9.15 μg Zn / g liver and the standard
deviation is 1.27 μg Zn / g liver. What is the
concentration of zinc in fish liver?
Based on a problem in Devore & Farnum, Applied
Statistics for Engineers and Scientists, 1999
12. Example 1
- You measure the zinc concentration in the livers
of 56 fish and find that the mean of these 56
values is 9.15 μg Zn / g liver and the standard
deviation is 1.27 μg Zn / g liver. What is the
concentration of zinc in fish liver?
- n = 56, so this is a large sample and we can use
z
- For 95% confidence, we want 5% of the area in
the 2 tails, or 2.5% of the area in each
- 1 − 0.025 = 0.975
- z(area = 0.975) = 1.96
- CI = 1.96 × 1.27 / sqrt(56) = 0.3326
- The mean zinc concentration is 9.2 ± 0.3 μg Zn /
g liver (95% confidence interval)
- Note: if you use t-critical (ν = 55), then you get
t-critical = 2.000 and CI = 0.3394. No important
difference.
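The arithmetic in Example 1 can be reproduced with the Python standard library. This is a sketch using the numbers given above. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, mean, s = 56, 9.15, 1.27          # values from Example 1

z = NormalDist().inv_cdf(1 - 0.025)  # 2.5% of the area in each tail
half_width = z * s / sqrt(n)

print(f"z = {z:.2f}")
print(f"95% CI: {mean:.2f} +/- {half_width:.4f} ug Zn / g liver")
```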
13. Example 2
- You measure the zinc concentration in the livers
of 56 fish and find that the mean of these 56
values is 9.15 μg Zn / g liver and the standard
deviation is 1.27 μg Zn / g liver. What
concentration of zinc do you expect to find in
the next fish liver you eat?
14. Example 2
- You measure the zinc concentration in the livers
of 56 fish and find that the mean of these 56
values is 9.15 μg Zn / g liver and the standard
deviation is 1.27 μg Zn / g liver. What
concentration of zinc do you expect to find in
the next fish liver you eat?
- n = 56, so we need t-critical for ν = 60 (closest
we can get to 55) and α = 0.025 (half of 0.05).
- t-critical = 2.000 (Table B2 in text)
- PI = 2.000 × 1.27 × sqrt(1 + (1/56)) = 2.563
- Concentration of zinc in next fish will be 9.2 ±
2.6 μg Zn / g liver (95% prediction interval)
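A short sketch of the Example 2 calculation, with the confidence-interval half-width alongside for comparison. The function name `prediction_half_width` is illustrative, and the t value is the table value quoted above rather than something computed here.

```python
from math import sqrt


def prediction_half_width(t_crit, s, n):
    """Half-width of a prediction interval for one new measurement."""
    return t_crit * s * sqrt(1 + 1 / n)


# Example 2 inputs; 2.000 is the table value for 60 degrees of freedom.
pi_half = prediction_half_width(t_crit=2.000, s=1.27, n=56)
ci_half = 2.000 * 1.27 / sqrt(56)

print(f"PI half-width: {pi_half:.3f}")
print(f"CI half-width: {ci_half:.3f}")
print(f"ratio PI/CI:   {pi_half / ci_half:.1f}")
```

With 56 fish, the mean is pinned down to within about 0.3, yet the next individual fish is uncertain by several times that amount.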
15. Example 3
- You measure the zinc concentration in the livers
of 56 fish and find that the mean of these 56
values is 9.15 μg Zn / g liver and the standard
deviation is 1.27 μg Zn / g liver. What range
of zinc concentrations will describe the livers
of 90% of this type of fish?
16. Example 3
- You measure the zinc concentration in the livers
of 56 fish and find that the mean of these 56
values is 9.15 μg Zn / g liver and the standard
deviation is 1.27 μg Zn / g liver. What range
of zinc concentrations will describe the livers
of 90% of this type of fish?
- n = 56 and p = 0.90, so r = 1.6585 and u = 1.1787
(Table B.12 in text; use the value for n = 60)
- TI = 1.6585 × 1.1787 × 1.27 = 2.483
- 90% of these fish are likely to have 9.2 ± 2.5
μg Zn / g liver in their livers (95% confidence
level)
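The tolerance-interval arithmetic is a product of the two tabulated factors and s. A minimal sketch follows. The function name is illustrative, and r and u are simply the table values quoted above.

```python
def tolerance_half_width(r, u, s):
    """Half-width of a two-sided tolerance interval from tabulated r and u."""
    return r * u * s


mean = 9.15
ti_half = tolerance_half_width(r=1.6585, u=1.1787, s=1.27)
print(f"half-width: {ti_half:.3f}")
print(f"range: {mean - ti_half:.2f} to {mean + ti_half:.2f} ug Zn / g liver")
```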
17. Example 4
- Measurements of stabilized viscosity were made on
five asphalt specimens, resulting in values of
2781, 2900, 3013, 2856, and 2888 cP. What is the
viscosity of the asphalt?
18. Example 4
- Measurements of stabilized viscosity were made on
five asphalt specimens, resulting in values of
2781, 2900, 3013, 2856, and 2888 cP. What is the
viscosity of the asphalt?
- Mean = 2887.60 cP, sample std dev = 84.03 cP
- n = 5, so use t-critical for ν = 4 and α = 0.025
- t-critical = 2.776 (Table B2 in text)
- CI = 2.776 × 84.03 / sqrt(5) = 104
- The mean viscosity of the asphalt is 2890 ± 100
cP (95% confidence interval)
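Starting from the raw measurements, the Example 4 numbers can be reproduced as follows. This is a sketch: the t value is the table value quoted above, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

viscosity_cp = [2781, 2900, 3013, 2856, 2888]

n = len(viscosity_cp)
xbar = mean(viscosity_cp)
s = stdev(viscosity_cp)   # sample standard deviation (n - 1 in the denominator)
t_crit = 2.776            # table value for 4 degrees of freedom, 0.025 per tail

ci_half = t_crit * s / sqrt(n)
print(f"mean = {xbar:.2f} cP, s = {s:.2f} cP, CI half-width = {ci_half:.0f} cP")
```

Note that `statistics.stdev` is the sample standard deviation. `statistics.pstdev` divides by n instead and would give a smaller value.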
19. Example 5
- Measurements of stabilized viscosity were made on
five asphalt specimens, resulting in values of
2781, 2900, 3013, 2856, and 2888 cP. If you
measured a sixth specimen, what would you expect
its viscosity to be?
20. Example 5
- Measurements of stabilized viscosity were made on
five asphalt specimens, resulting in values of
2781, 2900, 3013, 2856, and 2888 cP. If you
measured a sixth specimen, what would you expect
its viscosity to be?
- Mean = 2887.60 cP, sample std dev = 84.03 cP
- n = 5, so use t-critical for ν = 4 and α = 0.025
- t-critical = 2.776 (Table B2 in text)
- PI = 2.776 × 84.03 × sqrt(1 + 1/5) = 256 cP
- The viscosity of the next asphalt will be 2890 ±
260 cP (95% prediction interval)
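The same five measurements give the prediction interval for a sixth specimen. This sketch reports the interval endpoints as well as the half-width. The t value is again the quoted table value, not something computed here.

```python
from math import sqrt
from statistics import mean, stdev

viscosity_cp = [2781, 2900, 3013, 2856, 2888]
n = len(viscosity_cp)
t_crit = 2.776  # table value for n - 1 = 4 degrees of freedom

pi_half = t_crit * stdev(viscosity_cp) * sqrt(1 + 1 / n)
low = mean(viscosity_cp) - pi_half
high = mean(viscosity_cp) + pi_half
print(f"95% PI: {low:.0f} to {high:.0f} cP (half-width {pi_half:.0f} cP)")
```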
21. Example 6
- Measurements of stabilized viscosity were made on
five asphalt specimens, resulting in values of
2781, 2900, 3013, 2856, and 2888 cP. Give the
range of viscosities within which you would
expect 75% of specimens from this manufacturer
to lie.
22. Example 6
- Measurements of stabilized viscosity were made on
five asphalt specimens, resulting in values of
2781, 2900, 3013, 2856, and 2888 cP. Give the
range of viscosities within which you would
expect 75% of specimens from this manufacturer
to lie.
- Mean = 2887.60 cP, sample std dev = 84.03 cP
- n = 5. This sample is not large enough to derive a
tolerance interval. At least two more specimens
must be tested.
23. Example 7
- In order to learn which brand of battery lasts
longest in Kasey the Kinderbot, you purchase 50
Duralife batteries and 45 Rayolife batteries.
You find that the Duralife batteries last an
average of 4.15 hours (standard deviation 0.92
hours). You find that the Rayolife batteries
last an average of 4.53 hours (standard deviation
0.84 hours). How much longer do Rayolife
batteries last?
24. Example 7
- You find that the 50 Duralife batteries last an
average of 4.15 hours (standard deviation 0.92
hours). You find that the 45 Rayolife batteries
last an average of 4.53 hours (standard deviation
0.84 hours). How much longer do Rayolife
batteries last?
- The difference between the means is 0.38.
- 95% CI for the difference is 1.96 ×
sqrt((0.92²/50) + (0.84²/45)) = 0.3539
- Rayolife batteries last 0.38 ± 0.35 hours longer
than Duralife batteries (95% confidence level).
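The two-sample calculation can be sketched as below, using the means and standard deviations from the problem statement (4.15 ± 0.92 hours for 50 Duralife, 4.53 ± 0.84 hours for 45 Rayolife). The function name `diff_half_width` is illustrative.

```python
from math import sqrt


def diff_half_width(crit, s1, n1, s2, n2):
    """Half-width of a CI for the difference between two independent means."""
    return crit * sqrt(s1**2 / n1 + s2**2 / n2)


diff = 4.53 - 4.15
half = diff_half_width(crit=1.96, s1=0.92, n1=50, s2=0.84, n2=45)
print(f"difference = {diff:.2f} +/- {half:.4f} hours")
print("interval excludes zero:", diff - half > 0)
```

The interval only just excludes zero, so the evidence that Rayolife lasts longer is real but not strong at these sample sizes.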
25. Example 8
- You want to learn which brand of battery lasts
longest in Kasey the Kinderbot, but you're on a
budget, so you purchase 10 Duralife batteries and
10 Rayolife batteries. You find that the
Duralife batteries last an average of 4.15 hours
(standard deviation 0.92 hours). You find that
the Rayolife batteries last an average of 4.53
hours (standard deviation 0.84 hours). How much
longer do Rayolife batteries last?
26. Example 8
- You're on a budget, so you purchase 10 Duralife
batteries and 10 Rayolife batteries. You find
that the Duralife batteries last an average of
4.15 hours (standard deviation 0.92 hours). You
find that the Rayolife batteries last an average
of 4.53 hours (standard deviation 0.84 hours).
How much longer do Rayolife batteries last?
- CI = 2.262 × sqrt((0.92²/10) + (0.84²/10)) = 0.891
hours
- Based on samples of 10, the difference in
lifetime between the battery brands (0.38 hours)
is smaller than this half-width, so it is not
significant at the 95% confidence level.
- Note that if the sample sizes are not the same,
there is a complex formula that must be used to
calculate the degrees of freedom.
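The slides do not name the degrees-of-freedom formula. One widely used choice is the Welch-Satterthwaite approximation, and the sketch below assumes that is what is meant. The function name `welch_dof` and the unequal sample size of 6 are hypothetical, chosen only for illustration. The critical value 2.262 used above matches the standard t-table entry for 9 degrees of freedom.

```python
from math import sqrt


def welch_dof(s1, n1, s2, n2):
    """Welch-Satterthwaite approximation to the degrees of freedom."""
    a, b = s1**2 / n1, s2**2 / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))


# Example 8 as given: 10 batteries of each brand, t-critical from the slide.
half = 2.262 * sqrt(0.92**2 / 10 + 0.84**2 / 10)
dof_equal = welch_dof(0.92, 10, 0.84, 10)

# Hypothetical unequal case: only 6 Rayolife batteries.
dof_unequal = welch_dof(0.92, 10, 0.84, 6)

print(f"half-width = {half:.3f} hours")
print(f"Welch dof (10 vs 10): {dof_equal:.1f}")
print(f"Welch dof (10 vs 6):  {dof_unequal:.1f}")
```

The result is usually not an integer. A common convention is to round it down before looking up t-critical in a table.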