Title: Modeling Process Quality
Chapter 2
2-1. Describing Variation
- Graphical displays of data are important tools for investigating samples and populations.
- Displays can include stem-and-leaf plots, histograms, box plots, and dot diagrams.
- Graphical displays give an indication of the overall distribution of the data.
2-1.1 The Stem-and-Leaf Plot
- 17 | 558
- 18 | 357
- 19 | 00445589
- 20 | 1399
- 21 | 00238
- 22 | 005
- 23 | 5678
- 24 | 1555899
- 25 | 158
- The numbers on the left are the stems
- The values on the right are the leaves
- The smallest number in this set of data is 175
- The median is 211
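A minimal sketch of how the display above can be rebuilt from the raw observations, assuming the stem is the tens part and the leaf is the units digit. The list `data` is read back from the display, and the function name `stem_and_leaf` is illustrative only.

```python
from collections import defaultdict

# The 40 observations read back from the display (stem = tens, leaf = units).
data = [175, 175, 178, 183, 185, 187, 190, 190, 194, 194, 195, 195, 198, 199,
        201, 203, 209, 209, 210, 210, 212, 213, 218, 220, 220, 225,
        235, 236, 237, 238, 241, 245, 245, 245, 248, 249, 249, 251, 255, 258]


def stem_and_leaf(values):
    """Return {stem: leaves}, with leaves sorted and joined into one string."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # split each value at the units digit
        rows[stem].append(str(leaf))
    return {stem: "".join(leaves) for stem, leaves in sorted(rows.items())}


display = stem_and_leaf(data)
for stem, leaves in display.items():
    print(f"{stem} | {leaves}")
```

Sorting before splitting is what makes this an ordered display.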
More about it
- The previous display is actually an ordered stem-and-leaf display
- Ordered from lowest to highest
- Terms used in describing data can be introduced using the example
- Percentiles
- kth percentile
More about it
- 50th percentile is at 211
- 20 observations are below and 20 observations are above
- So, it's also the sample median
- 10th percentile is at 184
- 4 observations are below and 36 observations are above
More about it
- Quartiles
- 1st quartile
- 25% of the observations are below the value
More about it
- 1st quartile is at 194.5
- 10 observations are below and 30 observations are above
- $(194 + 195)/2 = 194.5$
- 3rd quartile is at 239.5
- 30 observations are below and 10 observations are above
- $(238 + 241)/2 = 239.5$
More about it
- Interquartile range, or IQR, is 45
- $Q_3 - Q_1 = 239.5 - 194.5 = 45$
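The quartile arithmetic above can be sketched as follows, assuming the convention these slides appear to use: each quartile is the median of the lower or upper half of the ordered data. Other quantile conventions, such as linear interpolation, can give slightly different values. The names `STEMS` and `quartiles` are illustrative.

```python
from statistics import median

# Observations rebuilt from the stem-and-leaf display (stem = tens, leaf = units).
STEMS = {17: "558", 18: "357", 19: "00445589", 20: "1399", 21: "00238",
         22: "005", 23: "5678", 24: "1555899", 25: "158"}
data = sorted(10 * stem + int(leaf) for stem, leaves in STEMS.items() for leaf in leaves)


def quartiles(sorted_values):
    """Median-of-halves quartiles: Q1 and Q3 are the medians of the two halves."""
    half = len(sorted_values) // 2
    lower, upper = sorted_values[:half], sorted_values[-half:]
    return median(lower), median(sorted_values), median(upper)


q1, q2, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q2, q3, iqr)
```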
2-1.2 The Frequency Distribution and Histogram
- Frequency Distribution
- Arrangement of data by magnitude
- More compact than a stem-and-leaf display
- Graphs of observed frequencies are called histograms.
2-1.2 The Frequency Distribution and Histogram
[Figure: histogram]
Suggestions for histograms
- Use between 4 and 20 bins
- Guideline is $\sqrt{n}$ bins
- Make the bins of uniform width
- Start the lower limit for the first bin just slightly below the smallest data value
- < 5 overflow
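A rough sketch of these suggestions applied to the 40 observations from the stem-and-leaf plot. Two choices here are assumptions, not part of the slides: "slightly below" is taken as half a unit below the minimum, and the uniform bin width is rounded up to a whole number.

```python
import math

# Observations rebuilt from the stem-and-leaf display (stem = tens, leaf = units).
STEMS = {17: "558", 18: "357", 19: "00445589", 20: "1399", 21: "00238",
         22: "005", 23: "5678", 24: "1555899", 25: "158"}
data = [10 * stem + int(leaf) for stem, leaves in STEMS.items() for leaf in leaves]

n = len(data)
n_bins = round(math.sqrt(n))                      # sqrt(40) is about 6.3, so 6 bins
lower = min(data) - 0.5                           # assumption: half a unit below the minimum
width = math.ceil((max(data) - lower) / n_bins)   # assumption: whole-number uniform width

# Tally each observation into its bin [lower + i*width, lower + (i+1)*width).
counts = [0] * n_bins
for x in data:
    counts[int((x - lower) // width)] += 1

for i, c in enumerate(counts):
    lo = lower + i * width
    print(f"[{lo:.1f}, {lo + width:.1f})  {'*' * c}  ({c})")
```

The printed rows form a text histogram; the counts in `counts` are the frequency distribution.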
Graphical Displays
- What is the overall shape of the data?
- Are there any unusual observations?
- Where is the center or average of the data located?
- What is the spread of the data? Are the data spread out or close to the center?
2-1.3 Numerical Summary of Data
- Important summary statistics for a distribution of data can include
- Sample mean, $\bar{x}$
- We often write Xbar
- Sample variance, $S^2$
- Sample standard deviation, $S$
- Sample median, $M$
2-1.3 Numerical Summary of Data
- For the data shown in the previous histogram and stem-and-leaf plot, the summary statistics are
- n = 40, Mean = 215.50, Median = 211.00, Var = 634.7, StDev = 25.19
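As a cross-check, the summary statistics can be recomputed from the observations in the stem-and-leaf plot with Python's standard `statistics` module. This is only a sketch; `variance` and `stdev` use the sample (n - 1) definitions.

```python
from statistics import mean, median, stdev, variance

# Observations rebuilt from the stem-and-leaf display (stem = tens, leaf = units).
STEMS = {17: "558", 18: "357", 19: "00445589", 20: "1399", 21: "00238",
         22: "005", 23: "5678", 24: "1555899", 25: "158"}
data = [10 * stem + int(leaf) for stem, leaves in STEMS.items() for leaf in leaves]

summary = {
    "n": len(data),
    "mean": mean(data),
    "median": median(data),
    "var": variance(data),   # sample variance, divisor n - 1
    "stdev": stdev(data),    # square root of the sample variance
}
for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
```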
Use Excel
Understanding variance
- Sample 1
- 1, 3, 5
- Mean = 3
- $S^2 = [1^2 + 3^2 + 5^2 - 3(3^2)]/2 = 4$
- $S = 2$
Understanding variance
- Sample 2
- 1, 5, 9
- Mean = 5
- $S^2 = 16$
- $S = 4$
Understanding variance
- Sample 3
- 101, 103, 105
- Mean = 103
- $S^2 = 4$
- $S = 2$
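The three samples can be checked with a short sketch that uses the same computational form as Sample 1, $S^2 = (\sum x_i^2 - n\bar{x}^2)/(n - 1)$. The function name `sample_variance` is illustrative.

```python
def sample_variance(xs):
    """S^2 via the computational form (sum of x^2 - n * mean^2) / (n - 1)."""
    n = len(xs)
    m = sum(xs) / n
    return (sum(x * x for x in xs) - n * m * m) / (n - 1)


samples = {
    "Sample 1": [1, 3, 5],
    "Sample 2": [1, 5, 9],
    "Sample 3": [101, 103, 105],
}
results = {name: sample_variance(xs) for name, xs in samples.items()}
for name, s2 in results.items():
    print(f"{name}: S^2 = {s2}, S = {s2 ** 0.5}")
```

Samples 1 and 3 have the same variance because shifting every value by 100 changes the mean but not the spread; doubling the spacing, as in Sample 2, quadruples the variance.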
2-1.4 The Box Plot
- The box plot is a graphical display that provides important quantitative information about a data set. Some of this information is
- Location or central tendency
- Spread or variability
- Departure from symmetry
- Identification of outliers
Example 2-2 Data on hole diameters
- 120.5, 120.9, 120.3, 121.3, 120.4, 120.2, 120.1, 120.5, 120.7, 121.1, 120.9, 120.8
- Minimum: 120.1
- Maximum: 121.3
- 1st quartile: 120.35
- 3rd quartile: 120.9
- Median: 120.6
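The five values a box plot is drawn from can be reproduced with a small sketch. It assumes the median-of-halves quartile convention, which matches the numbers in this example; the helper name `five_number_summary` is hypothetical.

```python
from statistics import median

diameters = [120.5, 120.9, 120.3, 121.3, 120.4, 120.2, 120.1,
             120.5, 120.7, 121.1, 120.9, 120.8]


def five_number_summary(values):
    """Minimum, quartiles (medians of the lower and upper halves), and maximum."""
    xs = sorted(values)
    half = len(xs) // 2
    return {
        "min": xs[0],
        "q1": median(xs[:half]),
        "median": median(xs),
        "q3": median(xs[-half:]),
        "max": xs[-1],
    }


summary = five_number_summary(diameters)
for name, value in summary.items():
    print(f"{name:>6}: {value:.2f}")
```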
2-1.4 The Box Plot
2-1.5 Sample Computer Output
2-1.6 Probability Distributions
- Definitions
- Sample: A collection of measurements selected from some larger source or population.
- Probability Distribution: A mathematical model that relates the value of the variable with the probability of occurrence of that value in the population.
- Random Variable: A variable that can take on different values in the population according to some random mechanism.
2-1.6 Probability Distributions
- Two Types of Probability Distributions
- Continuous: When a variable being measured is expressed on a continuous scale, its probability distribution is called a continuous distribution. The probability distribution of piston-ring diameters is continuous.
- Discrete: When the parameter being measured can only take on certain values, such as the integers 0, 1, 2, ..., the probability distribution is called a discrete distribution. The distribution of the number of nonconformities would be a discrete distribution.
2-2 Important Discrete Distributions
- 2-2.1 The Hypergeometric Distribution
- 2-2.2 The Binomial Distribution
- 2-2.3 The Poisson Distribution
Hypergeometric distribution
- Finite population of N items
- D (where $D \le N$) have a characteristic of interest
- Sample of size n is taken
- Probability that x of n have the characteristic of interest
- Concepts are used in acceptance sampling
Hypergeometric distribution
- Example
- Lot contains 100 items
- 5 of the lot are nonconforming
- Sample 10 from the lot
- Probability that not more than one is nonconforming
- $P(x \le 1) = P(x = 0) + P(x = 1)$
Hypergeometric distribution
- Example, cont.
- $P(x = 0) = \dfrac{5!}{0!\,5!} \cdot \dfrac{95!}{10!\,85!} \Big/ \dfrac{100!}{10!\,90!} = .584$
- $P(x = 1) = \dfrac{5!}{1!\,4!} \cdot \dfrac{95!}{9!\,86!} \Big/ \dfrac{100!}{10!\,90!} = .339$
- $P(x \le 1) = .584 + .339 = .923$
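The factorial ratios in this example are binomial coefficients, so the calculation can be sketched with `math.comb`. The function name `hypergeom_pmf` and its argument names are illustrative, chosen to mirror N, D, and n on the slides.

```python
from math import comb


def hypergeom_pmf(x, N, D, n):
    """P(x items of interest in a sample of n drawn without replacement
    from a lot of N items, D of which have the characteristic)."""
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)


p0 = hypergeom_pmf(0, N=100, D=5, n=10)
p1 = hypergeom_pmf(1, N=100, D=5, n=10)
p_at_most_one = p0 + p1
print(round(p0, 3), round(p1, 3), round(p_at_most_one, 3))
```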
2-2.2 The Binomial Distribution
- A quality characteristic follows a binomial distribution if
- 1. All trials are independent.
- 2. Each outcome is either a success or a failure.
- 3. The probability of success on any trial is given as p. The probability of a failure is 1 - p.
- 4. The probability of a success is constant.
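2-2.2 The Binomial Distribution
- The binomial distribution with parameters $n \ge 0$ and $0 < p < 1$ is
- $p(x) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, 1, \ldots, n$
- The mean and variance of the binomial distribution are
- $\mu = np$ and $\sigma^2 = np(1 - p)$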
Example
- The probability that the Braves win a game at home against the Mets is 0.52
- What is the probability that the Braves win exactly 2 of 3 in the next home stand with the Mets?
- $\dfrac{3!}{2!\,1!}(.52)^2(.48) = .389$
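A small sketch of the same calculation, treating the three games as independent trials with constant win probability (the binomial conditions). The function name `binom_pmf` is illustrative.

```python
from math import comb


def binom_pmf(x, n, p):
    """P(exactly x successes in n independent trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


p_two_of_three = binom_pmf(2, n=3, p=0.52)
pmf = [binom_pmf(x, n=3, p=0.52) for x in range(4)]  # full distribution, x = 0..3
print(round(p_two_of_three, 3))
```

Summing `pmf` over x = 0, 1, 2, 3 gives 1, which is a quick sanity check on the formula.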
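2-2.3 The Poisson Distribution
- The Poisson distribution is
- $p(x) = \dfrac{e^{-\lambda}\lambda^x}{x!}, \quad x = 0, 1, \ldots$
- where the parameter $\lambda > 0$. The mean and variance of the Poisson distribution are
- $\mu = \lambda$ and $\sigma^2 = \lambda$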
2-2.3 The Poisson Distribution
- The Poisson distribution is useful in quality engineering
- Typical model for the number of defects or nonconformities that occur in a unit of product.
- Any random phenomenon that occurs on a per unit basis is often well approximated by the Poisson distribution.
Example
- The number of surface blemishes on the door of a new Lexus 400 is distributed as a Poisson random variable with a mean of 1.2
- What is the probability that there are 2 or more blemishes on a door?
Example, cont.
- $P(x \ge 2) = 1 - P(x = 0) - P(x = 1)$
- $P(x = 0) = e^{-1.2}(1.2)^0/0! = .301$
- $P(x = 1) = e^{-1.2}(1.2)^1/1! = .361$
- $P(x \ge 2) = 1 - .301 - .361 = .338$ (.337 if the intermediate values are not rounded)
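The blemish example can be sketched directly from the Poisson probability function with mean 1.2. The function name `poisson_pmf` is illustrative. Carrying full precision instead of three-decimal intermediate values gives a result near 0.337.

```python
from math import exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)


p0 = poisson_pmf(0, 1.2)
p1 = poisson_pmf(1, 1.2)
p_two_or_more = 1 - p0 - p1  # complement of "0 or 1 blemishes"
print(round(p0, 3), round(p1, 3), round(p_two_or_more, 4))
```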
2-3 Important Continuous Distributions
- 2-3.1 The Normal Distribution
- 2-3.2 The Exponential Distribution
2-3.1 The Normal Distribution
- The normal distribution is an important continuous distribution.
- Symmetric, bell-shaped
- Mean, $\mu$
- Standard deviation, $\sigma$
- $N(\mu, \sigma^2)$
2-3.1 The Normal Distribution
- For a population that is normally distributed
- approx. 68% of the data will lie within 1 standard deviation of the mean
- approx. 95% of the data will lie within 2 standard deviations of the mean, and
- approx. 99.7% of the data will lie within 3 standard deviations of the mean.
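These three percentages can be checked numerically. The sketch below relies on the identity $P(|Z| < k) = \mathrm{erf}(k/\sqrt{2})$ for a standard normal Z; the function name `prob_within` is illustrative.

```python
from math import erf, sqrt


def prob_within(k):
    """P(|Z| < k) for a standard normal Z, via Phi(k) - Phi(-k) = erf(k / sqrt(2))."""
    return erf(k / sqrt(2))


coverage = {k: prob_within(k) for k in (1, 2, 3)}
for k, prob in coverage.items():
    print(f"within {k} standard deviation(s): {100 * prob:.2f}%")
```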
2-3.1 The Normal Distribution
- Standard normal distribution
- Many situations will involve data that are normally distributed. We will often want to find probabilities of events occurring or percentages of nonconformities, etc. A standardized normal random variable is
- $Z = (x - \mu)/\sigma$
2-3.1 The Normal Distribution
- Standard normal distribution
- Z is normally distributed with mean 0 and standard deviation 1.
- Use the standard normal distribution to find probabilities when the original population or sample of interest is normally distributed.
- Tabulated.
2-3.1 The Normal Distribution
- Example 2-5
- The tensile strength of paper is modeled by a normal distribution with a mean of 35 lbs/in² and a standard deviation of 2 lbs/in².
- What is the probability that the tensile strength of a selected item is less than 40 lbs/in²?
- If the specifications require the tensile strength to exceed 30 lbs/in², what is the probability that a selected item is scrapped?
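Example, cont.
- Determine $P(x < 40)$
- $Z = (40 - 35)/2 = 2.5$
- So, $\Phi(2.5) = .99379$
- And, $P(x < 40) = .99379$
- Determine $P(x \le 30)$, the probability that an item fails the specification and is scrapped
- $Z = (30 - 35)/2 = -2.5$
- Note, $\Phi(-2.5) = 1 - \Phi(2.5)$
- So, $P(x \le 30) = 1 - .99379 = .00621$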
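Instead of a table lookup, the same probabilities can be sketched with `statistics.NormalDist` from the Python standard library. The variable names are illustrative; an item is treated as scrapped when its strength does not exceed 30.

```python
from statistics import NormalDist

strength = NormalDist(mu=35, sigma=2)  # tensile strength model, lbs/in^2

p_below_40 = strength.cdf(40)  # Phi((40 - 35) / 2) = Phi(2.5)
p_scrapped = strength.cdf(30)  # Phi((30 - 35) / 2) = Phi(-2.5)
print(round(p_below_40, 5), round(p_scrapped, 5))
```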
Another example
- Given $X \sim N(10, 9)$
- Find a such that $P(x > a) = .05$
- From Appendix II, $Z = 1.645$
- $Z = (a - \mu)/\sigma$
- $a = Z\sigma + \mu$
- So, $a = 3(1.645) + 10 = 14.935$
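The inverse lookup can be sketched with `NormalDist.inv_cdf`, which plays the role of reading the table backwards. Note that N(10, 9) specifies the variance, so the standard deviation passed in is 3. The result differs from 14.935 only in the fourth decimal because the table value 1.645 is rounded.

```python
from statistics import NormalDist

x = NormalDist(mu=10, sigma=3)      # N(10, 9): variance 9, standard deviation 3

z = NormalDist().inv_cdf(0.95)      # standard normal value with 5% in the upper tail
a = x.inv_cdf(0.95)                 # equivalent to z * sigma + mu
print(round(z, 3), round(a, 3))
```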
Linear combinations
- If independent, normally distributed random variables are added, the result is a normally distributed random variable whose mean is the sum of the individual means and whose variance is the sum of the individual variances
Linear combinations
- If $y = a_1x_1 + a_2x_2 + \cdots + a_nx_n$
- Then, $\mu_y = a_1\mu_1 + a_2\mu_2 + \cdots + a_n\mu_n$
- And $\sigma_y^2 = a_1^2\sigma_1^2 + a_2^2\sigma_2^2 + \cdots + a_n^2\sigma_n^2$
Example
- Four rods with the following distributions $N(\mu, \sigma^2)$ in cm are laid end to end
- N(40, 4), N(36, 3), N(42, 7), N(100, 11)
- What is the distribution of the combination?
- N(218, 25), $\mu_y = 218$ cm, $\sigma_y^2 = 25$, $\sigma_y = 5$
Example, cont.
- What is the probability that the assembly will be longer than 220 cm?
- $Z = (220 - 218)/5 = .4$
- So, $P(y > 220) = 1 - .65542 = .34458$
- What is the probability that the assembly will be between 216 and 220 cm?
- $1 - .34458 - .34458 = .31084$
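The rod example can be sketched as a linear combination with all coefficients equal to 1, assuming the four rod lengths are independent. The names `rods`, `coeffs`, and `assembly` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

rods = [(40, 4), (36, 3), (42, 7), (100, 11)]  # (mean, variance) of each rod, in cm
coeffs = [1, 1, 1, 1]                          # rods laid end to end: a plain sum

mu_y = sum(a * mu for a, (mu, _) in zip(coeffs, rods))
var_y = sum(a**2 * var for a, (_, var) in zip(coeffs, rods))
assembly = NormalDist(mu=mu_y, sigma=sqrt(var_y))

p_longer = 1 - assembly.cdf(220)
p_between = assembly.cdf(220) - assembly.cdf(216)
print(mu_y, var_y, round(p_longer, 5), round(p_between, 5))
```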
Central limit theorem
- If $x_1, x_2, \ldots, x_n$ are independent random variables with mean $\mu_i$ and variance $\sigma_i^2$, and if $y = x_1 + x_2 + \cdots + x_n$, then
- $(y - \sum\mu_i)/\sqrt{\sum\sigma_i^2}$ is distributed N(0, 1) as n approaches infinity
- The further the deviation from a symmetric, unimodal distribution, the larger the value of n required to achieve normality
Example
- If six U(0, 1) random numbers are added, the mean is subtracted, and the result is divided by $\sqrt{.5}$, an approximately N(0, 1) random variable is generated
- .2345, .1987, .7762, .8150, .5337, .3462
- $(2.9043 - 3)/.7071 = -.1353$
- Note, $\sigma_i^2 = 1/12$
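A sketch of this calculation for the six numbers given. Each U(0, 1) value has mean 1/2 and variance 1/12, so the sum of six has mean 3 and variance 6/12 = .5. The function name `standardize_uniform_sum` is illustrative.

```python
from math import sqrt


def standardize_uniform_sum(u_values):
    """Standardize a sum of U(0, 1) values: subtract n/2, divide by sqrt(n/12)."""
    n = len(u_values)
    return (sum(u_values) - n * 0.5) / sqrt(n / 12)


u = [0.2345, 0.1987, 0.7762, 0.8150, 0.5337, 0.3462]
total = sum(u)
z = standardize_uniform_sum(u)
print(round(total, 4), round(z, 4))
```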
2-3.2 The Exponential Distribution
- The exponential distribution is widely used in the field of reliability engineering.
- The exponential distribution is
- $f(x) = \lambda e^{-\lambda x}, \quad x \ge 0$
- The mean and variance are
- $\mu = 1/\lambda$ and $\sigma^2 = 1/\lambda^2$
2-4 Some Useful Approximations
- In certain quality control problems, it is sometimes useful to approximate one probability distribution with another. This is particularly useful if the original distribution is difficult to manipulate analytically.
- Some approximations
- Binomial approximation to the hypergeometric
- Poisson approximation to the binomial
- Normal approximation to the binomial
Binomial approximation to the hypergeometric
- If n/N (the sampling fraction) is small (say $n/N < .1$), then the binomial distribution with $p = D/N$ and n is a good approximation to the hypergeometric
Example
- A lot of 200 units contains 5 nonconforming units
- What is the probability that a sample of 10 will contain no nonconforming units?
- $N = 200$, $n = 10$, $n/N = .05 < .1$
- Using the hypergeometric
- $P(x = 0) = \binom{5}{0}\binom{195}{10} \Big/ \binom{200}{10}$
- $P(x = 0) = .7717$
Example, cont.
- Using the binomial approximation with $p = D/N = 5/200 = .025$ and $n = 10$
- $P(x = 0) = \binom{10}{0}(.025)^0(.975)^{10} = .7763$
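The exact and approximate answers in this example can be compared side by side with a short sketch. The variable names are illustrative.

```python
from math import comb

N, D, n = 200, 5, 10  # lot size, nonconforming units in the lot, sample size

# Exact: hypergeometric probability of zero nonconforming units in the sample.
exact = comb(D, 0) * comb(N - D, n) / comb(N, n)

# Approximation: binomial with p = D/N, reasonable because n/N = .05 is small.
p = D / N
approx = comb(n, 0) * p**0 * (1 - p) ** n

print(round(exact, 4), round(approx, 4))
```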
Poisson approximation to the binomial
- The binomial approaches the Poisson when p approaches 0 and n approaches infinity with $\lambda = np$ constant
- Good when $p < .1$
Example
- Using the data from the last example with $p = .025$ and $n = 10$, $\lambda = np = .25$
- Then, $p(0) = e^{-.25}(.25)^0/0! = .7788$
- (Compare to .7763 from the last example)
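A sketch comparing the Poisson value with the binomial value it approximates, using n = 10 and p = .025 from this example. The variable names are illustrative.

```python
from math import exp, factorial

n, p = 10, 0.025
lam = n * p  # Poisson parameter, lambda = np = .25

poisson_p0 = exp(-lam) * lam**0 / factorial(0)  # Poisson P(x = 0)
binomial_p0 = (1 - p) ** n                      # binomial P(x = 0), for comparison

print(round(poisson_p0, 4), round(binomial_p0, 4))
```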
Normal approximation to the binomial
- If the number of trials, n, is large (say $n > 10$), and p is close to ½, the normal can be used to approximate the binomial, with a continuity correction
Example
- $p = .5$
- $n = 20$
- $np = 10$
- $np(1 - p) = 5$, $\sqrt{np(1 - p)} = 2.236$
- $p(8) = ?$
- Using the binomial
- $p(8) = \binom{20}{8}(.5)^8(.5)^{12} = .12$
Example, cont.
- Using the normal approximation
- $P(7.5 < a < 8.5) = \Phi((8.5 - 10)/2.236) - \Phi((7.5 - 10)/2.236)$
- $= \Phi(-.671) - \Phi(-1.118) = \Phi(1.118) - \Phi(.671)$ by symmetry
- $= .8682 - .7489 = .1193$
- (Compare to .12)
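The comparison above can be sketched as follows. The continuity correction replaces the single point x = 8 with the interval from 7.5 to 8.5 under the normal curve. The variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 20, 0.5

# Exact binomial probability of exactly 8 successes.
exact = comb(n, 8) * p**8 * (1 - p) ** (n - 8)

# Normal approximation with mean np and standard deviation sqrt(np(1 - p)),
# using the continuity correction: P(7.5 < a < 8.5).
approx_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
approx = approx_dist.cdf(8.5) - approx_dist.cdf(7.5)

print(round(exact, 4), round(approx, 4))
```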
Suggestion
- Work enough odd-numbered exercises so that you understand this chapter
- If it says "Derive", you can skip it
- If it says "Continuation of", work the previous exercise as well
End