Title: ISE-180 Engineering Statistics
1 ISE-180 Engineering Statistics
2What is statistics
- To guess is cheap. To guess wrongly is expensive - Chinese proverb
- There are three kinds of lies: lies, d--n lies, and statistics - Benjamin Disraeli, British PM
- First get your facts, then you can distort them at your leisure - Mark Twain
- Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write - H. G. Wells
3What is statistics
- In our world, through conversation with others, through books, TV, and sports, we are continually confronted with collections of facts or data
- STATISTICS is the branch of mathematics concerned with the collection, analysis, interpretation, and presentation of masses of numerical data
- Statistics is used to understand variability.
4What is statistics
- Frequently we wish to acquire information or draw conclusions about a population (all individuals or objects of a particular type)
- Many times, the data at our disposal is a sample (a portion or subset of a population)
- The field of statistics can be broken into 2 branches
- Descriptive statistics - concerned with the organization, summarization, and presentation of data
- Inferential statistics - used to draw conclusions about a population when the data consist of a sample
5What is statistics
- Probability - a bridge between the 2 branches, since it studies random variation
- Inferential statistics - using sample data to draw conclusions about the entire population (i.e. making an inference about a group based on a sample)
- Stem-and-leaf display - split each xi into 2 parts: a stem consisting of one or more leading digits and a leaf consisting of the remaining digits.
6Descriptive statistics
7Descriptive statistics
8Descriptive statistics
9Descriptive Statistics
- Histograms
- A graphical summary tool that sorts data into cells. It is especially useful for finding population tendencies (location and dispersion). It requires multiple (20-30) observations to allow process responses to exhibit their tendencies. Note also that data specifics are lost within the cell boundaries.
10Descriptive Statistics
- An example of this is the representation of 36 air pollution readings (units: 0.01 ppm of ozone).
11Descriptive Statistics
- As a set of readings, it is not easy to see any
clear trend to the data. So, we can construct a
frequency distribution using the classes 2.0 - <3.0, 3.0 - <4.0, etc., up to 8.0 - <9.0 in order
to get a sense of the distribution. The tallies
are tabulated below
12Descriptive Statistics
- These lead to the frequency distribution
13Descriptive Statistics
- And also lead to the histogram
14Descriptive Statistics
- The drawback is that within each cell, we lose
the data point values contained by the cell. For
example, we would not be able to see from the
graphical representation that the 3s cell data
all lie within the range of 3.0 - 3.4.
Therefore, cell selection can be an art form
requiring some care.
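The tallying described above can be sketched in a few lines. This is a minimal example; the readings below are hypothetical stand-ins, since the slide's 36 ozone values did not survive conversion.

```python
# Assign each reading to a class of the form lo <= x < hi, as on the slide.
# The readings are illustrative, not the original air-pollution data.
readings = [2.3, 3.1, 3.4, 3.0, 4.7, 4.7, 5.2, 5.8, 6.1, 6.6, 7.4, 8.2]

edges = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]  # classes 2.0-<3.0 ... 8.0-<9.0
counts = [0] * (len(edges) - 1)
for x in readings:
    for i, (lo, hi) in enumerate(zip(edges, edges[1:])):
        if lo <= x < hi:
            counts[i] += 1
            break
```

The counts list is exactly the frequency distribution that the histogram bars display.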
15Descriptive statistics
16Descriptive statistics
17Descriptive statistics
18Descriptive statistics
19Descriptive Statistics
- Continuing with the data set from before,
- Mean - the average of all observations
- Sample Mean
- Median - the data point at which half of the observations are higher and half are lower in value
- Mode - the most common data point. There may be multiple modes in a distribution. Here the most frequently occurring value is 4.7 (5 times).
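The three location measures can be computed directly with Python's standard library. The data below are illustrative stand-ins, not the slide's data set; they are chosen so the mode is 4.7, as on the slide.

```python
import statistics

# Illustrative data only; mode chosen to echo the slide's 4.7.
data = [2.8, 3.4, 4.7, 4.7, 4.7, 5.1, 6.0]

mean = statistics.mean(data)      # arithmetic average of all observations
median = statistics.median(data)  # middle value when sorted
mode = statistics.mode(data)      # most frequently occurring value
```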
20Descriptive statistics
21Descriptive statistics
22Descriptive statistics
23Descriptive statistics
24Descriptive Statistics
- Continuing with the example data from before
25Exploratory data analysis
26Exploratory data analysis
27Probability
28Properties of Probability
- An example would be the use of batting averages
in baseball. If there is a .300 hitter at the
plate, the chance of a base hit is 0.300 ( P(A))
for each official at-bat. This is expected over
many at-bats, but does not guarantee a base hit
at any given at-bat.
29Probability Theory
30Probability Theory
31Probability Theory
32Properties of Probability
33Properties of Probability
34Properties of Probability
35Enumeration or Counting Technologies
36Enumeration or Counting Technologies
37Enumeration or Counting Technologies
38Enumeration or Counting Technologies
39Enumeration or Counting Technologies
40Conditional Probability
41Conditional Probability
42Conditional Probability
- Continuing the baseball analogy, managers
frequently attempt to play the percentages by
setting up relatively favorable match-ups. For
example, if a particular hitter owns the
pitcher, is really hot at the time, or bats from
the correct side of the plate for the situation,
the manager can expect better odds for success
than if these facts were ignored (again, over the
long run).
43Conditional Probability
44Independence
45Independence
- The classic example is a coin flip: with an honest coin, one flip does not give us any extra information about the next flip or series.
- Other examples abound in industrial and other settings. We use the assumption of independence to ease the estimation of processes and expected results.
46Independence
47Discrete Random Variables
48Random Variables
49Random Variables
50Random Variable Definition
51Random Variable Definition
52Discrete Random Variables
53Probability Distribution for Discrete R.V.
54Probability Distribution for Discrete R.V.
55Probability Distribution for Discrete R.V.
- An example: a biotech firm specializing in test
kits tracks sales and has developed the following
distribution for forecasting
56Probability Distribution for Discrete R.V.
- The probability mass function can be described
by
57Parameter of a probability distribution
58Parameter of a probability distribution
59Cumulative Distribution Function
60Cumulative Distribution Function
61Cumulative Distribution Function
- Continuing from the data provided, the CDF for
the forecast distribution would be
62Expected value of a Discrete R.V.
63Expected value of a Discrete R.V.
64Expected Value for a Discrete R.V.
- Here are two examples, noting that E(X) doesn't have to be an allowed distribution value.
- First, the expected value of one die roll would be
- Continuing the example of the sales forecast
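As a quick sketch, the die calculation E(X) = sum of x * p(x) can be checked directly:

```python
# Expected value of one fair-die roll: each face 1..6 has probability 1/6.
pmf = {x: 1 / 6 for x in range(1, 7)}
expected = sum(x * p for x, p in pmf.items())  # 3.5, not an attainable face
```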
65Expected Value for a Discrete R.V.
- A chip maker tracks the number of large particles
on a chip as part of the inspection process. 100
sets of data were taken to get an estimate of the
probability of each number of particles and the
expected long-run number of particles.
66Expected Value for a Discrete R.V.
- The long-term expected value can then be
determined from the observed distribution using
the summation described previously
67Rules of Expected Value
68Expected Value of a Function
69Variance of a Discrete RV
70Variance of a Discrete RV
71Variance of a Discrete R.V.
- Continuing with the chip example, the variance is
calculated using the estimate for the mean and
the observations
72Variance of a Discrete R.V.
- Similarly, the variances for the forecast and die situations can also be calculated, and are
- Die: Variance = 2.917, s = 1.708
- Forecast: Variance = 0.7875, s = 0.8874
- This information is handy for determining the
likelihood of an observation occurring.
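The die figures quoted above (Variance = 2.917, s = 1.708) follow from Var(X) = sum of (x - mean)^2 * p(x); a minimal check:

```python
import math

# Variance and standard deviation of the fair-die distribution.
pmf = {x: 1 / 6 for x in range(1, 7)}
mean = sum(x * p for x, p in pmf.items())                    # 3.5
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())  # 35/12 ≈ 2.917
s = math.sqrt(variance)                                      # ≈ 1.708
```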
73Rules of Variance
74Common Families of Discrete Probability
Distribution
75Bernoulli Distribution
76Binomial Distribution
77Binomial Distribution
78Binomial Distribution
79Binomial Distribution
80Binomial Distribution
- A process uses an inspection plan that calls for a sample of 5 units to be checked before shipment. One failure rejects the lot. If the lot is 10% defective, what is the probability of lot rejection?
- In this example, p = 0.10 and n = 5, giving the following pmf
81Binomial Distribution
- This yields the following distribution table for
the possible results
- There is a 59% chance that failing lots would be sent out.
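A short sketch of the binomial calculation behind this table (p = 0.10, n = 5):

```python
import math

# Binomial pmf: P(X = k) = C(n, k) * p**k * (1 - p)**(n - k)
n, p = 5, 0.10
pmf = {k: math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)}

p_accept = pmf[0]        # lot ships only if all 5 units pass: 0.9**5 ≈ 0.59
p_reject = 1 - p_accept  # ≈ 0.41 chance the sample catches the bad lot
```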
82Geometric Probability Distribution
83Geometric Probability Distribution
84Geometric Probability Distribution
85Poisson Probability Distribution
86Poisson Probability Distribution
87Poisson Probability Distribution
88Poisson Probability Distribution
- A manufacturer checks for contamination on their
storage disks. The mean value is 0.1
contaminants per square centimeter, with a disk
surface of 100 square centimeters. What is the
probability of five or more contaminants on the
disks? - The expected value per disk is
- 100 0.10 10 contaminants per disk
89Poisson Probability Distribution
- The question asked for 5 or more, which can be
calculated by difference, i.e. 1 - [P(0) + P(1) + P(2) + P(3) + P(4)]
- lambda = 10, so the basic relation is
90Poisson Probability Distribution
- Tabulating the results and subtracting from 1
gives
- This totals 0.0292528, leaving 0.9707472 as the probability that 5 or more contaminants will be found
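The Poisson arithmetic above can be verified directly; a minimal sketch with lambda = 10:

```python
import math

# Poisson pmf: P(X = k) = exp(-lam) * lam**k / k!
lam = 10.0

def poisson_pmf(k):
    return math.exp(-lam) * lam**k / math.factorial(k)

p_less_than_5 = sum(poisson_pmf(k) for k in range(5))  # P(0) + ... + P(4)
p_at_least_5 = 1 - p_less_than_5                       # ≈ 0.97075, per the slide
```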
91Poisson Probability Distribution
92Tchebysheffs Theorem
93Continuous Random Variables
94Continuous Random Variables
95Continuous Random Variables
96Continuous Random Variables
97Continuous Random Variable
- An example would be the following random
variable, distributed as follows
98Continuous Random Variable
- Its density function would be
99Continuous Random Variables
100Cumulative Distribution Function
101Cumulative Distribution Function
102Cumulative Distribution Function
- The c.d.f. for the previous example is shown
below
103Expected Value for a Continuous R.V.
104Expected Value for a Continuous R.V.
105Expected Value for a Continuous R.V.
- Continuing with the previous example
distribution, and integrating from the allowed
values from 0 to 2, the expected value is
106Variance of a Continuous R.V.
107Variance of a Continuous R.V.
- Continuing as before, and integrating from 0 to 2
as before, the variance of the distribution is
108Variance of a Continuous R.V.
109Common families of Continuous Distribution
110Common Families of Continuous Distributions
111The Uniform Distribution
112The Uniform Distribution
113The Uniform Distribution
114Exponential Distribution
115Exponential Distribution
116Exponential Distribution
- An example: A production line has the potential to break down, with an average time between breakdown events of 10 months (i.e. lambda = 0.10/month). What is the probability of the time between breakdowns being one year or less?
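A quick check of this exponential calculation (lambda = 0.10/month, t = 12 months):

```python
import math

# Exponential CDF: F(t) = 1 - exp(-lam * t)
lam = 0.10                              # breakdowns per month
t = 12.0                                # one year, in months
p_within_year = 1 - math.exp(-lam * t)  # ≈ 0.699
```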
117Normal Distribution
118Normal Distribution
119Normal Distribution
120Normal Distribution
121Normal Distribution
122Normal Distribution
123Normal Distribution
125Normal Distribution
- An example of converting to the standard normal distribution is given by data from a dry plasma etch study (Lynch and Markle, 1997). The data are in angstroms, from the trial before process improvement. The mean is 564.11 and the standard deviation is 10.747 angstroms.
126Normal Distribution
- This example demonstrates the values yielded by the normalization equation for Z. The values obtained can then be used to compare the likelihood of occurrence against other data. This is done in hypothesis testing and related methods (SPC, etc.)
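The normalization itself is one line. A sketch using the slide's mean and standard deviation; the observation x = 575 is an assumed illustrative value, not from the study.

```python
# Standardize an observation from the plasma-etch data: z = (x - mean) / s.
mean, s = 564.11, 10.747   # summary statistics from the slide

def z_score(x):
    # z counts how many standard deviations x sits from the mean
    return (x - mean) / s

z = z_score(575.0)   # an assumed observation, about 1 sd above the mean
```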
127Normal approximates to Binomial
128Multivariate/ Joint Probability Distributions
129Sampling distribution
130Sampling distribution
131Sampling distribution
132Central limit theorem
133Central limit theorem
134Central limit theorem
135Central Limit Theorem
- Using the exponential distribution and random
number generator, it is possible to plot the
resulting frequency distributions of data.
Notice the trend towards normality.
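A sketch of the simulation the slide describes: draw many exponential samples and look at the distribution of their means. The sample size of 30 and lambda = 1.0 are assumed values, not from the slide.

```python
import random
import statistics

# Means of repeated exponential samples cluster near the population mean
# 1/lam and become more symmetric (normal-looking) as n grows.
random.seed(0)
lam = 1.0
sample_means = [
    statistics.mean(random.expovariate(lam) for _ in range(30))
    for _ in range(2000)
]

grand_mean = statistics.mean(sample_means)  # near 1/lam = 1.0
spread = statistics.stdev(sample_means)     # near 1/(lam * sqrt(30)) ≈ 0.18
```

Plotting a histogram of `sample_means` reproduces the trend toward normality the slide points out.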
136Central Limit Theorem
137T-distribution
- Use of the t-distribution is similar to the use of the standard normal distribution, except that the degrees of freedom must be accounted for. Estimating the true process mean µ by the experimental mean causes the loss of one degree of freedom in estimating the true process standard deviation σ by s.
138T-Distribution
139T-Distribution
140Parameter estimation
Statistical inference - the process by which information from sample data is used to draw conclusions about the population from which the sample was selected.
141Parameter estimation
142Parameter estimation
143Parameter estimation
144Confidence Interval
145Confidence Interval
- Interpreting a confidence interval: θ is covered by the interval with confidence 100(1-α)%. If many samples are taken and a 100(1-α)% CI is calculated for each, then 100(1-α)% of them will contain/cover the true value of θ.
- Note: the larger (wider) a CI, the more confident we are that the interval contains the true value of θ.
- But the longer it is, the less we know about θ, due to variability or uncertainty - need to balance.
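The plasma-etch interval used later in the deck (roughly 555.85 to 572.37) can be reproduced from the slide's summary statistics; a minimal sketch:

```python
import math

# 95% t interval: mean ± t * s / sqrt(n), with mean = 564.11, s = 10.747,
# n = 9 from the slide and t(0.025, 8) = 2.306 from a t table.
mean, s, n = 564.11, 10.747, 9
t_crit = 2.306

half_width = t_crit * s / math.sqrt(n)
ci = (mean - half_width, mean + half_width)   # roughly (555.85, 572.37)
```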
146Confidence Interval
147Confidence Interval
148Confidence Interval
149Confidence Interval
150Confidence Interval
153Confidence Interval
154Sample Size Needed
- Suppose we desire a confidence interval of a specified width
- Based on a preliminary sample of size n0, we have an estimate of S2 and a confidence interval
155Sample Size needed
- Find n such that
- If n is large,
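A sketch of the large-n sample-size calculation, solving z * s / sqrt(n) <= E for n. The pilot s reuses the etch value seen earlier in the deck; the target half-width E = 5.0 is an assumed value.

```python
import math

# n >= (z * s / E)**2, rounded up to a whole observation.
z = 1.96      # large-sample 95% confidence
s = 10.747    # preliminary estimate of the standard deviation (pilot sample)
E = 5.0       # desired half-width of the interval (assumed)

n_needed = math.ceil((z * s / E) ** 2)
```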
156Hypothesis Tests - Review
- Hypothesis Tests
- Objective: This section is devoted to enabling us to
- Construct and test a valid statistical hypothesis
- Conduct comparative statistical tests (t-tests)
- Relate alpha and beta risk to sample size
- Conceptually understand analysis of variance (ANOVA)
- Interpret the results of various statistical tests - t-tests, F-tests, chi-square tests
- Understand the foundation for full and fractional factorial designs
- Compute confidence intervals to assess degree of improvement
157Hypothesis Tests
- Hypotheses defined
- Used to infer population characteristics from observed data
- Hypothesis test - a series of procedures that allows us to make inferences about a population by analyzing samples
- Key question: was the observed outcome the result of chance variation, or was it an unusual event?
- Hint: Frequency - Area - Probability
158Hypothesis Tests
- Hypothesis - definition of terms
- Null hypothesis (H0): statement of no change or difference. This statement is tested directly, and we either reject H0 or we do not reject H0
- Alternative hypothesis (H1): the statement that must be true if H0 is rejected.
159Hypothesis Tests
- Definition of terms
- Type I error: the mistake of rejecting H0 when it is true
- Type II error: the mistake of failing to reject H0 when it is false
- Alpha risk (α): probability of a type I error
- Beta risk (β): probability of a type II error
- Test statistic: the sample value used in making the decision about whether or not to reject H0
160Hypothesis Tests
- Definition of terms
- Critical region: the area under the curve corresponding to test-statistic values that lead to rejection of H0
- Critical value: the value that separates the critical region from those values that do not lead to rejection of H0
- Significance level: the probability of rejecting H0 when it is true
- Degrees of freedom: referred to as d.f. or ν, and equal to n - 1
161Hypothesis Tests
- Definition of terms
- Type I error: producer's risk
- Type II error: consumer's risk
- Tests are set up so that type I is the more serious error (taking action when none is required)
- Levels for α and β must be established before the test is conducted
162Hypothesis Tests
- Hypothesis - definition of terms
- Degrees of freedom
- Degrees of freedom are a way of counting the information in an experiment. In other words, they relate to sample size. More specifically, d.f. = n - 1
- A degree of freedom corresponds to the number of values that are free to vary in the sample. If you have a sample with 20 data points, each of the data points provides a distinct piece of information. The data set is described completely by these 20 values. If you calculate the mean for this set of data, no new information is created, because the mean was implied by all of the information in the 20 data points.
163Hypothesis Tests
- Hypothesis - definition of terms
- Degrees of freedom
- Once the mean is known, though, all of the information in the data set can be described with any 19 data points. The information in a 20th data point is now redundant, because the 20th data point has lost the freedom to have any value besides the one imposed on it by the mean
- We have one fewer degree of freedom than the sample size because estimating the mean consumes one piece of information.
164Hypothesis Tests
- If the population variance is unknown, use the sample s to approximate the population standard deviation, since s closely approximates σ when n > 30. Thus solve the problem as before, using s
- With smaller sample sizes, we have a different problem, but it is solved in the same manner: instead of using the z distribution, we use the t distribution
165Hypothesis Tests
- Use the t distribution when
- The sample is small (n < 30)
- The parent population is essentially normal
- The population variance (σ²) is unknown
- As n decreases, variation within the sample increases, so the distribution becomes flatter.
166Methods to Test a Statistical Hypothesis
167Methods to Test a Statistical Hypothesis
168Relationship Between Hypothesis Tests and
Confidence Intervals
169Relationship Between Hypothesis Tests and
Confidence Intervals
170Relationship between Hypothesis Tests and
Confidence Intervals
- Using the data from the plasma etch study, can a true process mean of 530 angstroms be expected at a 95% confidence level?
- The 95% confidence interval (developed earlier in detail) runs from 555.85 to 572.37. Since 530 is not included in this interval, the null hypothesis of µ = 530 is rejected.
171Confidence Interval
172Prediction Interval
173Prediction Interval
174Prediction Interval
175P - Values
- The P value is the smallest level of significance that leads to rejection of the null hypothesis with the given data. It is the probability attached to the value of the Z statistic obtained under experimental conditions. It depends on the type of test (two-sided, upper-, or lower-tail) selected to analyze data significance.
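For a two-sided z test, the P value is 2 * [1 - Phi(|z|)], where Phi is the standard normal CDF; a minimal sketch using the error function:

```python
import math

# Phi(z) = 0.5 * (1 + erf(z / sqrt(2))) is the standard normal CDF.
def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_value_two_sided(z):
    # Probability of a |Z| at least this extreme under the null hypothesis.
    return 2.0 * (1.0 - norm_cdf(abs(z)))

p = p_value_two_sided(1.96)   # ≈ 0.05, the conventional cutoff
```

One-sided (upper- or lower-tail) tests drop the factor of 2 and use the appropriate tail.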
176Confidence Interval
177P-value
178P-value
179Hypothesis Tests
- Comparing the means of two samples - steps
- Understand the word problem by writing out the null and alternative hypotheses
- Select the alpha risk level and find the critical value
- Draw a graph of the relation
- Insert the data into the formula
- Interpret and conclude
180Test for comparing two means
181Tests for Comparing Two Means
- A company wanted to compare the production from two process lines to see if there was a statistically significant difference in the outputs, which would then require separate tracking. The line data are as follows
- A: 15 samples, mean of 24.2, and variance of 10
- B: 10 samples, mean of 23.9, and variance of 20
- 95% confidence
182P - Values
- Using the data developed from the process line
example, but with line A having a mean of 27.2,
instead of 24.2, the P-value would be
183Test for comparing two means
184Test for comparing two means
185Tests for Comparing Two Means
- A process improvement by exercising equipment was attempted for an etch line. Given that the true variances are unknown but equal, determine whether a statistically significant difference exists at the 95% confidence level.
- Before: mean = 564.108, standard deviation = 10.7475, number of observations = 9
- Exercise: mean = 561.263, standard deviation = 7.6214, number of observations = 9
186Tests for Comparing Two Means
- Since the variances are equal, the pooled variance is used for creating the confidence interval. If zero is included, there is no statistically significant difference.
- There are 16 degrees of freedom, and at the α/2 = 0.025 level, the critical value for t is 2.120.
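The pooled calculation can be sketched directly from the slide's summary statistics:

```python
import math

# Pooled two-sample t statistic; data from the slide.
m1, s1, n1 = 564.108, 10.7475, 9   # before
m2, s2, n2 = 561.263, 7.6214, 9    # exercise

# Pooled variance weights each sample variance by its degrees of freedom.
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
t_stat = (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

# 16 degrees of freedom; the slide's critical value at alpha/2 = 0.025 is 2.120.
significant = abs(t_stat) > 2.120   # False: no significant difference
```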
187Test for comparing two means
188Tests for Comparing Two Means
- An etch process was improved by recalibration of equipment. The values for a determination of statistically significant improvement at the 95% confidence level are given as follows
- Before: mean = 564.108, standard deviation = 10.7475, number of observations = 9
- Calibrated: mean = 552.340, standard deviation = 2.3280, number of observations = 9
- The null hypothesis is that µb - µc = 0
189Tests for Comparing Two Means
- The first task is to determine the number of
degrees of freedom and the appropriate critical
value.
190Tests for Comparing Two Means
- For 9 degrees of freedom and α/2 = 0.025, the critical value for t is 2.262
191Test for comparing two means
192Tests for Comparing Two Means
193Tests for Comparing Two Means
- Two materials were compared in a wear test as a
paired, randomized experiment. The coded data
are tabulated below, as well as the differences
for each pairing.
194Tests for Comparing Two Means
- The mean, standard deviation, and 95% confidence interval are constructed below, with nine degrees of freedom. If the interval does not contain zero, the null hypothesis of no difference is rejected.
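A sketch of the paired procedure; the ten differences below are illustrative stand-ins (the slide's coded wear data did not survive conversion), with t(0.025, 9) = 2.262 as quoted earlier.

```python
import math
import statistics

# A paired comparison reduces to a one-sample t interval on the differences.
diffs = [2, -1, 3, 0, 1, 2, -2, 1, 3, 1]   # illustrative paired differences

n = len(diffs)
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)
t_crit = 2.262                              # t(0.025, 9), nine degrees of freedom

half = t_crit * s_d / math.sqrt(n)
ci = (d_bar - half, d_bar + half)
contains_zero = ci[0] <= 0 <= ci[1]         # if False, reject "no difference"
```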
195Test for Comparing Two Variances
- The variances can also be used to determine the
likelihood of observations being part of the same
population. For variances, the test used is the
F-test since the ratio of variances will follow
the F-distribution. This test is also the basic
test used in the ANOVA method covered in the next
section.
196F - Test
- The F distribution is developed from three
parameters a (level of confidence), and the two
degrees of freedom for the variances under
comparison. The null hypothesis is typically one
where the variances are equal, which would yield
an allowed set of values that F can be and still
not reject the null hypothesis.
197F - Test
- If the observed value of F is outside of this
range, the null hypothesis is rejected and the
observation is statistically significant. - Tables for the F distribution are found in texts
with a statistical interest. Normally, the ratio
is tabulated with the higher variance in the
denominator, the lower variance in the numerator.
198F - Test
- Compare two sample variances
- Compare two newly drawn samples to determine if a difference exists between the variances of the samples. (Up until now we have compared samples to populations, and sample means)
- For two normally distributed populations with equal variances, σ1² = σ2²
- We can compare the two variances such that s1²/s2² = Fmax, where s1² > s2²
199F - Test
- Compare two sample variances
- F tests for equality of the variances and uses the F-distribution.
- This works just like the method used with the t distribution: a critical value is compared to the test statistic.
- If the two variances are equal, F = s1²/s2² = 1; thus we compare ratios of variances.
200F - Test
- Compare two sample variances
- If the two variances are equal, F = s1²/s2² = 1; thus, we compare ratios of variances
- A large F leads to the conclusion that the variances are very different.
- A small F (close to 1) leads to the conclusion that the variances do not differ significantly. Thus for the F test:
- H0: σ1² = σ2², H1: σ1² ≠ σ2²
201F - Test
- Compare two sample variances
- F tables
- Several exist, depending on alpha level.
- Using F tables requires 3 pieces of information:
- The chosen alpha risk
- Degrees of freedom (n1 - 1) for the numerator term
- Degrees of freedom (n2 - 1) for the denominator term
202Test for comparing two variances
203F - Test
- An etching process of semiconductor wafers is
used with two separate gas treatment mixtures on
20 wafers each. The standard deviations of
treatments 1 and 2 are 2.13 and 1.96 angstroms,
respectively. Is there a significant difference
between treatments at the 95% confidence level?
204F - Test
- A 95% confidence level implies α = 0.05, and since this is a two-tailed test, α/2 = 0.025 is used for F. There are 19 degrees of freedom for each set of data.
- Therefore the null hypothesis of no difference in treatments cannot be rejected.
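A sketch of the F ratio for this example; the critical value near 2.5 for F(0.025, 19, 19) is an assumed table lookup, not given on the slide.

```python
# F ratio for the two gas-treatment variances (s1, s2 from the slide),
# with the larger variance placed in the numerator so F >= 1.
s1, s2 = 2.13, 1.96
F = max(s1, s2) ** 2 / min(s1, s2) ** 2     # ≈ 1.18

# F(0.025, 19, 19) is roughly 2.5 per standard tables (assumed value here);
# since F falls below it, equal variances cannot be rejected.
reject = F > 2.5
```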
205Test for comparing two proportions
206Test for comparing two means
207Test for comparing two means
208Test for comparing two means
209Test for comparing two means
210Test for comparing two means - Paired
211Test for comparing two means - Paired
212Test for comparing two means - Paired
213Test for comparing two variances
214Test for comparing two variances
215ANOVA
- Analysis of Variance
- Employs the F distribution to compare ratios of sample variances when the true population variance is unknown
- Compares the variance between samples to the variance within samples (variance of means compared to mean of variances). If the variance of the means is significantly greater than the mean of the variances, the effect is significant
- Can be used with more than 2 groups at a time
- Requires independent samples and normally distributed populations.
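A minimal one-way ANOVA sketch of this between-versus-within comparison; the three groups below are illustrative, not data from the slides.

```python
import statistics

# One-way ANOVA by hand: between-group vs within-group sums of squares.
groups = [
    [92.0, 91.8, 92.3, 92.1],
    [93.1, 93.4, 92.9, 93.2],
    [91.5, 91.2, 91.7, 91.6],
]

k = len(groups)
n = sum(len(g) for g in groups)
grand = statistics.mean(x for g in groups for x in g)
group_means = [statistics.mean(g) for g in groups]

ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, group_means))
ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

ms_between = ss_between / (k - 1)   # variance of the means (scaled by n per group)
ms_within = ss_within / (n - k)     # mean of the within-group variances
F = ms_between / ms_within          # large F => the group effect is significant
```

Comparing F to the tabulated F(α, k-1, n-k) completes the test, exactly as in the two-variance case.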
216ANOVA
217ANOVA
- ANOVA concepts
- All we are saying is:
- The assumption that the population variances are equal or nearly equal allows us to treat the samples from many different populations as if they in fact belonged to a single large population. If that is true, then the variance among samples should nearly equal the variance within the samples
- H0: µ1 = µ2 = µ3 = µ4 = ... = µk
218ANOVA
- ANOVA Steps
- Understand word problem by writing out null and
alternative hypotheses - Select alpha risk level and find critical value
- Run the experiment
- Insert data into Anova formula
- Draw graph of relation
- Interpret and conclude
219ANOVA
- Analysis of Variance (ANOVA) is a powerful tool
for determining significant contributors towards
process responses. The process is a vector
decomposition of a response, and can be modified
for a wide variety of models. - Can be used with more than 2 groups at a time
- Requires independent, normally distributed
samples.
220ANOVA
- ANOVA decomposition is an orthogonal vector
breakdown of a response (which is why
independence is required), so for a process with
factors A and B, as tabulated below
221ANOVA
- The ANOVA values are given by
222ANOVA
- In this case, we use it to demonstrate how the
deposition of oxide on wafers as described by
Czitrom and Reese (1997) can be decomposed into
significant factors, checking the wafer type and
furnace location. The effects will be removed in
sequence and verified for significance using the
F-test. The proposed model is - YM W F R, where
- M is the grand mean, W is the effect of a given
wafer type, F is the effect of a particular
furnace location, and R is the residual. Y
denotes the observed value.
223ANOVA
- F-test in ANOVA
- The estimator for a given variance in ANOVA is
the mean sum of squares (MS). For a given
factor, its MS can be calculated as noted before,
and the ratio of the factor MS and residual MS
compared to the F-distribution for significance.
To do so, the level of significance α must be
defined to establish the Type I error limit.
224ANOVA
- In this example, the level of significance is
selected at 0.10, yielding the following table of
upper bounds for the F-test. In all cases, the
higher variance (usually the factor) is divided
by the lower variance (usually the residual).
225ANOVA
- The set of means observed in the process, broken down by wafer type and furnace location, are tabulated below (in mils; 1 mil = 0.001 inch). The grand mean is 92.1485.
226ANOVA
- The sum of squares about the grand mean is found
by adding the squares of all of the deviations in
the 12 inner cells, and totals 6.7819. One
degree of freedom is expended in fixing the mean,
and 11 are left.
227ANOVA
- Determining the sum of squares for the wafer types is done by multiplying the squared difference between a type mean and the grand mean by the number of times that type appears in the data (e.g. 4 × each squared difference). This is done for all types, and totals 3.4764, with 2 degrees of freedom.
228ANOVA
- The residual sum of squares totals 3.3145 with
nine degrees of freedom, and indicates that the
wafer type may not be the only significant
factor. The significance of the wafer type is
verified using the F-test -
229ANOVA
- As noted before, a residual sum of squares of
about 50 percent of the total sum of squares may
indicate the presence of a significant factor.
The effect of furnace location is then removed
from the data and tested for significance as
before. The furnace location sum of squares
totals 2.7863, with three degrees of freedom. -
230ANOVA
- The remaining residual sum of squares totals
0.5282 with six degrees of freedom. Repeating
the F-tests as before for both factors yields
231ANOVA
- These are very significant values at the α = 0.02
level. The resulting ANOVA table shows all of
the factors combined
232ANOVA
- The last task is to verify that the residuals
follow a normal distribution about the expected
value from the model. This is done easily using
a normal plot and checking that the responses are
approximately linear. Patterns or significant
deviations could indicate another significant
factor affecting the response. The plot that
follows of the residual responses from all values
shows no significant indication of non-normal
behavior, and we fail to reject (on this level)
the null hypothesis of the residuals conforming
to a normal distribution around the model
predicted value.
233ANOVA
234DOE Overview
- Focus is on univariate statistics (1 dependent variable) vs. multivariate statistics (more than one dependent variable)
- Focus is on basic designs, and does not include all possible types of designs (e.g. Latin squares, incomplete blocks, nested, etc.)
235DOE Overview
- One key item to keep in clear focus while
performing a designed experiment - Why are we doing it?
- According to Taguchi (among others) it is to
refine our process to yield one of the following
quality outcomes
236DOE Overview
- Bigger is better (yields, income, some tensile
parameters, etc.) - Smaller is better (costs, defects, taxes,
contaminants, etc.) - Nominal is best (Most dimensions, and associated
parameters, etc.) - Remember also that whatever is selected as the
standard for comparison, it must be measured!
237ISE-180
- The End !
- Have a nice vacation!