Title: DMAIC: Improve
1DMAIC Improve
2Objective
- Develop, test, and implement solutions that
improve the process by reducing variation in
the critical output variables caused by the vital
few input variables.
3Small note
- In many cases, it is difficult to completely
separate the activities in Measure, Analyze, and
Improve.
4Design of Experiment (DOE)
- DOE is a collection of statistical methods for
studying the effects of independent variables
(also called factors, input variables, or process
variables) and their interactions on a dependent
variable (or CTQ).
5Design of Experiment (DOE)
[Figure: example experiment data with factors, levels, and replications labeled; sample responses 23.5 and 24.6]
6Design of Experiment (DOE)
- Full factorial
  - Tests all possible combinations of factor levels
  - Suitable when there is no prior knowledge about the subject
  - $2^k$ runs: k factors, each with 2 levels
  - $2^2 = 4$ runs: 2 factors, each with 2 levels
- Fractional factorial
  - Excludes some combinations
  - Preferred when experiments are costly
  - $2^{k-1}$ runs: a half fraction of the $2^k$ design (still k factors, each with 2 levels)
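The difference between the two designs can be sketched by enumerating runs. This is a minimal illustration, not a method from the slides: the coded levels -1/+1, the function names, and the choice of defining relation (keep runs whose levels multiply to +1) are all assumptions.

```python
from itertools import product

def full_factorial(k):
    """All 2**k runs for k two-level factors, coded -1 (low) and +1 (high)."""
    return list(product((-1, 1), repeat=k))

def half_fraction(k):
    """A 2**(k-1) half fraction of the full design.

    Assumption: keep the runs whose coded levels multiply to +1
    (defining relation I = product of all factors). Other choices exist.
    """
    runs = []
    for run in full_factorial(k):
        sign = 1
        for level in run:
            sign *= level
        if sign == 1:
            runs.append(run)
    return runs

full = full_factorial(3)   # 3 factors, 2 levels each
half = half_fraction(3)    # same 3 factors, half as many runs
print(len(full), len(half))
for run in half:
    print(run)
```

With three factors the full design has 8 runs and the half fraction has 4, each still specifying a level for all three factors.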
7Design of Experiment (DOE)
- ANOVA One Factor
- ANOVA Two Factor
- Remember Gage R&R with ANOVA?
8Correlation Coefficient
- The sample correlation coefficient (r) measures
the degree of linearity in the relationship
between X and Y
- $-1 \le r \le 1$
9Correlation Analysis
10Notes on Correlation Coefficient
- Correlation is a measure of linear association
and not necessarily causation
- Just because two variables are highly correlated,
it does not mean that one variable is the cause
of the other, and vice versa.
11Notes on Correlation Coefficients
How about this one? Do you think there is no
correlation between X and Y? Remember that $r_{xy}$
only measures linear correlation.
Obviously, the above shows no correlation
between X and Y.
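To make the point about linear correlation concrete, here is a small sketch using hypothetical data (not from the slides) in which Y is fully determined by X through Y = X², yet the sample correlation is essentially zero. The helper name `pearson_r` is an assumption for illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: a perfect but nonlinear (quadratic) relationship.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
print(r)
```

A value of r near zero here says only that there is no linear trend, not that X and Y are unrelated.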
12Example
- A golfer is interested in investigating the
relationship, if any, between driving distance
and 18-hole score
Average Driving Distance (yds.)   Average 18-Hole Score
277.6                             69
259.5                             71
269.1                             70
267.0                             70
255.6                             71
272.9                             69
13Example (contd)
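x = average driving distance (yds.), y = average 18-hole score

    x       y     x - x̄     y - ȳ    (x - x̄)(y - ȳ)
  277.6    69     10.65     -1.0        -10.65
  259.5    71     -7.45      1.0         -7.45
  269.1    70      2.15      0            0
  267.0    70      0.05      0            0
  255.6    71    -11.35      1.0        -11.35
  272.9    69      5.95     -1.0         -5.95

Average:    x̄ = 267.0,    ȳ = 70.0
Std. Dev.:  s_x = 8.2192,  s_y = 0.8944
Total of (x - x̄)(y - ȳ): -35.40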
14Example
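The correlation for the golf data can be checked with a short script. This is a sketch that assumes the usual sample formula, r = Σ(x - x̄)(y - ȳ) / ((n - 1) s_x s_y); the variable names are illustrative.

```python
import math

# Golf data from the example: driving distance (x) and 18-hole score (y).
distance = [277.6, 259.5, 269.1, 267.0, 255.6, 272.9]
score = [69, 71, 70, 70, 71, 69]

n = len(distance)
x_bar = sum(distance) / n
y_bar = sum(score) / n

# Sum of cross-products of deviations from the means.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(distance, score))

# Sample standard deviations (divide by n - 1).
s_x = math.sqrt(sum((x - x_bar) ** 2 for x in distance) / (n - 1))
s_y = math.sqrt(sum((y - y_bar) ** 2 for y in score) / (n - 1))

r = sxy / ((n - 1) * s_x * s_y)
print(round(sxy, 2), round(s_x, 4), round(s_y, 4), round(r, 4))
```

The result, r of about -0.96, indicates a strong negative linear association: longer average drives go with lower average scores in this sample.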
15Regression Analysis
- Simple Regression Analysis
- One predictor and one response.
- Multiple Regression Analysis
- Two or more predictors and one response.
16Simple Linear Regression
- Analyzes the relationship between two variables
- It specifies one dependent (response) variable
and one independent (predictor) variable
17Simple Linear Regression
18Regression Model and Parameters
- Unknown parameters are
  - $\beta_0$: intercept
  - $\beta_1$: slope
- The assumed model for a linear relationship is
  - $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ for all observations
($i = 1, 2, \ldots, n$)
19Estimations
- The fitted model used to predict the expected
value of Y for a given value of X is
  - $\hat{y}_i = b_0 + b_1 x_i$
- The fitted coefficients are
  - $b_0$: the estimated intercept
  - $b_1$: the estimated slope
20Formulas
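For reference, assuming the fit is by ordinary least squares (the standard choice for simple linear regression), the estimated slope and intercept are usually written as:

```latex
b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
b_0 = \bar{y} - b_1 \bar{x}
```

Here $\bar{x}$ and $\bar{y}$ are the sample means of the predictor and the response.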
21Example
- Reed Auto periodically has a special week-long
sale. As part of the advertising campaign Reed
runs one or more television commercials during
the weekend preceding the sale. Data from a
sample of 5 previous sales are shown below.
Number of TV Ads   Number of Cars Sold
1                  14
3                  24
2                  18
1                  17
3                  27
22Example
- Slope: $b_1 = 5$
- Intercept: $b_0 = 10$
- Estimated regression equation: $\hat{y} = 10 + 5x$
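The slope and intercept for the Reed Auto data can be reproduced with a few lines. This is a sketch that assumes an ordinary least-squares fit; the variable names are illustrative.

```python
# Reed Auto data: number of TV ads (x) and number of cars sold (y).
ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]

n = len(ads)
x_bar = sum(ads) / n
y_bar = sum(cars) / n

# Least-squares slope: cross-product sum over squared-deviation sum.
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ads, cars))
sxx = sum((x - x_bar) ** 2 for x in ads)
b1 = sxy / sxx

# Least-squares intercept: the fitted line passes through (x_bar, y_bar).
b0 = y_bar - b1 * x_bar

print(f"y_hat = {b0:g} + {b1:g}x")
```

For example, the fitted line predicts `b0 + b1 * 2` cars sold when two ads are run.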
23Assessing the Fit
- Relationship among SST, SSR, and SSE
  - SST = SSR + SSE
  - where SST = total sum of squares, SSR = sum of
squares due to regression, and SSE = sum of
squares due to error
24R2 or Coefficient of Determination
- $R^2$ is a measure of relative fit based on a
comparison of SSR and SST.
- $0 \le R^2 \le 1$
- $R^2 = 1$ means that the regression fits perfectly
(x explains 100% of the variation in y).
25R2 or Coefficient of Determination
$R^2 = SSR/SST$
where SSR = sum of squares due to
regression and SST = total sum of squares
Note that in a simple regression, $R^2 = r^2$
26Example
- In the Reed Auto example, the coefficient of
determination, $R^2$, is
$R^2 = SSR/SST = 100/114 = 0.8772$
The regression relationship is very strong: about 88%
of the variability in the number of cars sold can
be explained by the linear relationship between
the number of TV ads and the number of cars sold.
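The sums of squares behind this result can be checked directly. The sketch below assumes an ordinary least-squares fit to the Reed Auto data; the variable names are illustrative.

```python
# Reed Auto data: number of TV ads (x) and number of cars sold (y).
ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]

n = len(ads)
x_bar = sum(ads) / n
y_bar = sum(cars) / n

# Least-squares fit.
sxx = sum((x - x_bar) ** 2 for x in ads)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ads, cars))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in ads]

# Decomposition of the total variation in y.
sst = sum((y - y_bar) ** 2 for y in cars)                 # total
ssr = sum((f - y_bar) ** 2 for f in fitted)               # regression
sse = sum((y - f) ** 2 for y, f in zip(cars, fitted))     # error

r_squared = ssr / sst
print(sst, ssr, sse, round(r_squared, 4))
```

The three sums satisfy SST = SSR + SSE, and the ratio SSR/SST gives the coefficient of determination.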
27Hypothesis Testing
- We need to determine whether x has a statistically
significant effect on y
- To test for significance, we must conduct a
hypothesis test to determine whether the value of
$\beta_1$ is different from zero or not.
28Regression Using Excel (Reed Auto previous TV
ads example)
>> Tools >> Data Analysis >> Regression
29Interpreting the result
- The regression equation is
  - $\hat{y} = 10 + 5x$
- The above means that when x = 2, the model
predicts y (that is, $\hat{y}$) to be 20.
- $R^2 = 0.8772$ means that X explains 87.72% of
the variation in Y.
30Interpreting the result
- Is the slope ($b_1$) statistically significant?
  - $H_0: \beta_1 = 0$
  - $H_a: \beta_1 \ne 0$
- The p-value for $b_1$ is 0.01898. Using $\alpha = 0.05$, we
reject $H_0$ (since $\alpha$ > p-value). Therefore we
conclude that the slope is not equal to zero. It
means that X has a statistically significant effect on Y.
- The above question can be rewritten as
  - Is the slope ($b_1$) statistically different from
zero?
- We know that the estimated slope is 5. But our interest is
to check whether this value, 5, is statistically
different from zero or not.
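The p-value for the slope can be reproduced without a statistics package. The sketch below assumes the usual t test for a regression slope with n - 2 degrees of freedom. Because the Reed Auto sample has n = 5, that is 3 degrees of freedom, and the helper `t_cdf_df3` (a hypothetical name) uses the closed-form Student t distribution function that holds only for 3 degrees of freedom. For other sample sizes a statistics library would be needed.

```python
import math

# Reed Auto data: number of TV ads (x) and number of cars sold (y).
ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]

n = len(ads)
x_bar = sum(ads) / n
y_bar = sum(cars) / n
sxx = sum((x - x_bar) ** 2 for x in ads)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ads, cars))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Standard error of the slope: s / sqrt(Sxx), with s^2 = SSE / (n - 2).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ads, cars))
s = math.sqrt(sse / (n - 2))
se_b1 = s / math.sqrt(sxx)
t_stat = b1 / se_b1

def t_cdf_df3(t):
    """Student t CDF, valid ONLY for 3 degrees of freedom (closed form)."""
    u = t / math.sqrt(3)
    return 0.5 + (u / (1 + u * u) + math.atan(u)) / math.pi

# Two-sided p-value for H0: slope = 0.
p_value = 2 * (1 - t_cdf_df3(abs(t_stat)))
alpha = 0.05
print(round(t_stat, 4), round(p_value, 5), p_value < alpha)
```

The computed p-value is about 0.019, in line with the 0.01898 reported above, so the null hypothesis of a zero slope is rejected at the 0.05 level.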
31Reading ANOVA table
- Note that in this case k = 1 (one predictor)
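A regression ANOVA table is typically built from the sums of squares and their degrees of freedom. The sketch below works through the Reed Auto data, assuming the usual layout (regression df = k, error df = n - k - 1, F = MSR/MSE); the variable names are illustrative.

```python
# Reed Auto data: number of TV ads (x) and number of cars sold (y).
ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]

n = len(ads)
k = 1  # number of predictors
x_bar = sum(ads) / n
y_bar = sum(cars) / n
sxx = sum((x - x_bar) ** 2 for x in ads)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ads, cars))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in ads]

ssr = sum((f - y_bar) ** 2 for f in fitted)
sse = sum((y - f) ** 2 for y, f in zip(cars, fitted))

df_regression = k
df_error = n - k - 1
msr = ssr / df_regression   # mean square due to regression
mse = sse / df_error        # mean square error
f_stat = msr / mse

print(df_regression, df_error, msr, round(mse, 4), round(f_stat, 4))
```

With a single predictor, the F statistic equals the square of the t statistic for the slope, so the F test and the t test lead to the same conclusion.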
32Multiple Regression
- Multiple regression is simply an extension of
bivariate regression.
- Multiple regression includes more than one
independent variable.
- Same concepts as in bivariate analysis.
33Multiple Regression
- Y is the response variable and is assumed to be
related to the k predictors ($X_1, X_2, \ldots, X_k$)
- Regression Model
- Estimated Regression Equation
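Extending the simple linear model to k predictors, and assuming the same notation (Greek letters for unknown parameters, $b$ for their estimates), the two equations are usually written as:

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon

\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k
```

Each $b_j$ estimates the change in the expected response for a one-unit change in $x_j$ with the other predictors held fixed.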
34Example (Y is Price)
35Example (contd)
- Is SqFt significantly affecting Price?
The p-value for the SqFt coefficient is 1.42561E-14
(that is, $1.426 \times 10^{-14}$, or 0.0000 to four
decimal places). Using $\alpha = 0.05$, we reject $H_0$
(since $\alpha$ > p-value). Therefore we conclude that the
slope is not equal to zero. It means that SqFt has a
statistically significant effect on Price.
36Example (contd)
- Is LotSize significantly affecting Price?
The p-value for the LotSize coefficient is 0.00011462.
Using $\alpha = 0.05$, we reject $H_0$ (since $\alpha$ > p-value).
Therefore we conclude that the slope is not equal to
zero. It means that LotSize has a statistically
significant effect on Price.
37Reading ANOVA table