Title: Linear Regression
Chapter 12
Introduction
- Regression analysis and analysis of variance are the two most widely used statistical procedures.
- Regression analysis is used for:
  - Description
  - Prediction
  - Estimation
12.1 Simple Linear Regression
(12.1)
(12.2)
Table 12.1 Quality Improvement Data
Month       Time Devoted to Quality Improvement (X)   Non-conforming (Y)
January     56                                        20
February    58                                        19
March       55                                        20
April       62                                        16
May         63                                        15
June        68                                        14
July        66                                        15
August      68                                        13
September   70                                        10
October     67                                        13
November    72                                         9
December    74                                         8
Figure 12.1 Scatter Plot
Figure 12.1a Scatter Plot
The regression equation is
Y = 55.9 - 0.641 X

Predictor   Coef       SE Coef   T        P
Constant    55.923     2.824     19.80    0.000
X           -0.64067   0.04332   -14.79   0.000

S = 0.888854   R-Sq = 95.6%   R-Sq(adj) = 95.2%

Analysis of Variance
Source           DF   SS       MS       F        P
Regression        1   172.77   172.77   218.67   0.000
Residual Error   10     7.90     0.79
Total            11   180.67
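The figures in the output above can be reproduced from the data with the standard least-squares formulas. The following is a minimal sketch in plain Python, not the software that produced the slide; the function name `fit_simple_regression` is illustrative, and X for observation 12 is taken as 74, as listed in the observation-level table of Section 12.2.

```python
import math

# Table 12.1 data. X for observation 12 is 74, as listed in the
# observation-level table of Section 12.2.
x = [56, 58, 55, 62, 63, 68, 66, 68, 70, 67, 72, 74]
y = [20, 19, 20, 16, 15, 14, 15, 13, 10, 13, 9, 8]


def fit_simple_regression(x, y):
    """Least-squares fit of y = b0 + b1*x with the usual ANOVA quantities."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

    b1 = sxy / sxx            # slope
    b0 = ybar - b1 * xbar     # intercept
    ss_reg = b1 * sxy         # regression sum of squares
    ss_res = syy - ss_reg     # residual (error) sum of squares
    mse = ss_res / (n - 2)    # residual mean square, n - 2 degrees of freedom
    return {
        "b0": b0,
        "b1": b1,
        "ss_reg": ss_reg,
        "ss_res": ss_res,
        "ss_total": syy,
        "s": math.sqrt(mse),
        "r_sq": ss_reg / syy,
        "f": ss_reg / mse,
    }


fit = fit_simple_regression(x, y)
for name, value in fit.items():
    print(f"{name:8s} {value:10.4f}")
```

The printed values should agree with the Coef, S, R-Sq, and Analysis of Variance entries to the precision shown in the output.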
12.2 Worth of the Prediction Equation
Obs   X      Y        Fit      SE Fit   Residual   St Resid
  1   56.0   20.000   20.046   0.464    -0.046     -0.06
  2   58.0   19.000   18.765   0.395     0.235      0.30
  3   55.0   20.000   20.687   0.500    -0.687     -0.93
  4   62.0   16.000   16.202   0.286    -0.202     -0.24
  5   63.0   15.000   15.561   0.270    -0.561     -0.66
  6   68.0   14.000   12.358   0.289     1.642      1.95
  7   66.0   15.000   13.639   0.261     1.361      1.60
  8   68.0   13.000   12.358   0.289     0.642      0.76
  9   70.0   10.000   11.077   0.338    -1.077     -1.31
 10   67.0   13.000   12.999   0.272     0.001      0.00
 11   72.0    9.000    9.795   0.400    -0.795     -1.00
 12   74.0    8.000    8.514   0.470    -0.514     -0.68
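The columns of this table follow from the fitted line. As a sketch under the usual simple-regression formulas (an assumption, since the slides do not show them): Fit = b0 + b1*X, SE Fit = s*sqrt(h), and St Resid = Residual / (s*sqrt(1 - h)), where h = 1/n + (X - Xbar)^2 / Sxx is the leverage of the observation.

```python
import math

# X and Y as listed in the table above.
x = [56, 58, 55, 62, 63, 68, 66, 68, 70, 67, 72, 74]
y = [20, 19, 20, 16, 15, 14, 15, 13, 10, 13, 9, 8]

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = ybar - b1 * xbar

fits = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fits)]
# Residual standard deviation with n - 2 degrees of freedom.
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

rows = []
for xi, yi, fi, ei in zip(x, y, fits, residuals):
    h = 1 / n + (xi - xbar) ** 2 / sxx  # leverage of this observation
    rows.append({
        "x": xi,
        "y": yi,
        "fit": fi,
        "se_fit": s * math.sqrt(h),
        "residual": ei,
        "st_resid": ei / (s * math.sqrt(1 - h)),
    })

print("Obs     X      Y      Fit  SE Fit  Residual  St Resid")
for i, r in enumerate(rows, start=1):
    print(f"{i:3d} {r['x']:5.1f} {r['y']:6.3f} {r['fit']:8.3f} "
          f"{r['se_fit']:7.3f} {r['residual']:9.3f} {r['st_resid']:9.2f}")
```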
(12.4)
12.3 Assumptions
(12.1)
12.4 Checking Assumptions through Residual Plots
12.5 Confidence Intervals
12.5 Hypothesis Test
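For the quality improvement data, a confidence interval and a t test for the slope can be sketched as follows. This assumes the usual formulas SE(b1) = s / sqrt(Sxx) and t = b1 / SE(b1) with n - 2 degrees of freedom. The constant `T_CRIT` is the two-sided 95% critical value of the t distribution with 10 degrees of freedom, taken from standard t tables rather than computed.

```python
import math

# X and Y for the quality improvement data (observation 12 has X = 74,
# as listed in the observation-level table of Section 12.2).
x = [56, 58, 55, 62, 63, 68, 66, 68, 70, 67, 72, 74]
y = [20, 19, 20, 16, 15, 14, 15, 13, 10, 13, 9, 8]

T_CRIT = 2.228  # t(0.025, 10) from standard t tables (assumed 95% level)

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = ybar - b1 * xbar
s = math.sqrt(sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2))

se_b1 = s / math.sqrt(sxx)   # standard error of the slope
t_stat = b1 / se_b1          # test statistic for H0: slope = 0
ci_low = b1 - T_CRIT * se_b1
ci_high = b1 + T_CRIT * se_b1

print(f"slope = {b1:.5f}, SE = {se_b1:.5f}, t = {t_stat:.2f}")
print(f"95% CI for slope: ({ci_low:.4f}, {ci_high:.4f})")
print("reject H0: slope = 0" if abs(t_stat) > T_CRIT else "do not reject H0")
```

The standard error and t statistic should match the SE Coef and T entries for X in the regression output of Section 12.1.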
12.6 Prediction Interval for Y
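A prediction interval for a single new Y at a chosen X0 is wider than the confidence interval for the mean of Y at X0, because it also accounts for the variability of an individual observation. The sketch below assumes the usual form, Fit +/- t * s * sqrt(1 + 1/n + (X0 - Xbar)^2 / Sxx). The value X0 = 65 is purely illustrative, and `T_CRIT` is the tabled 95% critical value for 10 degrees of freedom.

```python
import math

# X and Y for the quality improvement data (observation 12 has X = 74,
# as listed in the observation-level table of Section 12.2).
x = [56, 58, 55, 62, 63, 68, 66, 68, 70, 67, 72, 74]
y = [20, 19, 20, 16, 15, 14, 15, 13, 10, 13, 9, 8]

T_CRIT = 2.228  # t(0.025, 10) from standard t tables (assumed 95% level)

n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = ybar - b1 * xbar
s = math.sqrt(sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2))


def interval_half_widths(x0):
    """Return (fit, CI half-width for the mean, PI half-width for a new Y)."""
    fit = b0 + b1 * x0
    h = 1 / n + (x0 - xbar) ** 2 / sxx
    ci_half = T_CRIT * s * math.sqrt(h)       # mean response at x0
    pi_half = T_CRIT * s * math.sqrt(1 + h)   # single new observation at x0
    return fit, ci_half, pi_half


x0 = 65  # illustrative value, not from the source
fit0, ci_half, pi_half = interval_half_widths(x0)
print(f"fit at X0={x0}: {fit0:.3f}")
print(f"95% CI for mean Y:  ({fit0 - ci_half:.3f}, {fit0 + ci_half:.3f})")
print(f"95% PI for new Y:   ({fit0 - pi_half:.3f}, {fit0 + pi_half:.3f})")
```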
12.7 Regression Control Chart
(12.5)
(12.6)
12.8 Cause-Selecting Control Chart
- The general idea is to distinguish quality problems that occur at one stage in a process from problems that occur at a previous processing step.
- Let Y be the output from the second step and let X denote the output from the first step. The relationship between X and Y would be modeled.
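One way to act on this idea is to fit Y on X from in-control baseline data and then chart the standardized residual of each new (X, Y) pair: a large residual suggests a problem at the second step, while an unusual X with an ordinary residual points back to the first step. The sketch below is an illustration only. The baseline and new observations are hypothetical, and the 3-sigma limit and the simple scaling by s are assumptions rather than details taken from the source.

```python
import math

# Hypothetical in-control baseline data (not from the source):
# x = output of step 1, y = output of step 2.
base_x = [10, 11, 12, 13, 14, 15, 16, 17]
base_y = [20.1, 21.9, 24.2, 25.8, 28.1, 30.0, 31.9, 34.1]

n = len(base_x)
xbar = sum(base_x) / n
ybar = sum(base_y) / n
sxx = sum((xi - xbar) ** 2 for xi in base_x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(base_x, base_y))
b1 = sxy / sxx
b0 = ybar - b1 * xbar
s = math.sqrt(
    sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(base_x, base_y)) / (n - 2)
)

LIMIT = 3.0  # assumed 3-sigma control limit on the standardized residual


def cause_selecting_value(x_new, y_new):
    """Standardized residual of y_new given x_new (Y adjusted for X)."""
    return (y_new - (b0 + b1 * x_new)) / s


# Hypothetical new observations:
#   (12, 24.0): ordinary X, ordinary Y
#   (18, 40.0): Y far above what X predicts, suggesting a step-2 problem
#   (20, 40.1): unusual X, but Y is consistent with it (step-1 issue)
new_points = [(12, 24.0), (18, 40.0), (20, 40.1)]
values = [cause_selecting_value(xn, yn) for xn, yn in new_points]
flags = [abs(z) > LIMIT for z in values]

for (xn, yn), z, flagged in zip(new_points, values, flags):
    status = "signal at step 2" if flagged else "no step-2 signal"
    print(f"x={xn:5.1f} y={yn:5.1f} z={z:7.2f}  {status}")
```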
12.9 Linear, Nonlinear, and Nonparametric Profiles
- "Profile" refers to the quality of a process or product being characterized by a (linear, nonlinear, or nonparametric) relationship between a response variable and one or more explanatory variables.
- One possible approach is to monitor each parameter in the model with a Shewhart chart.
- The independent variables must be fixed
- Control chart for R²
12.10 Inverse Regression
- An important application of simple linear regression for quality improvement is in the area of calibration.
- Assume two measuring tools are available: one is quite accurate but expensive to use, and the other is less expensive but also less accurate. If the measurements obtained from the two devices are highly correlated, then the measurement that would have been made using the expensive measuring device could be predicted fairly well from the measurement using the less expensive device.
- Let Y = measurement from the less expensive device
- X = measurement from the accurate device
12.10 Inverse Regression: Example
Y     X
2.3   2.4
2.5   2.6
2.4   2.5
2.8   2.9
2.9   3.0
2.6   2.7
2.4   2.5
2.2   2.3
2.1   2.2
2.7   2.7
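With these ten pairs, two common ways of predicting the accurate measurement X from a new inexpensive reading Y can be compared. The sketch below reads the example values as consecutive (Y, X) pairs, which is an assumption about the flattened table. It contrasts the classical calibration estimate (fit Y on X, then solve for X) with the inverse regression estimate (fit X on Y directly). The helper `least_squares` and the new reading of 2.5 are illustrative.

```python
# Example data read as (Y, X) pairs: Y from the less expensive device,
# X from the accurate device.
y = [2.3, 2.5, 2.4, 2.8, 2.9, 2.6, 2.4, 2.2, 2.1, 2.7]
x = [2.4, 2.6, 2.5, 2.9, 3.0, 2.7, 2.5, 2.3, 2.2, 2.7]


def least_squares(u, v):
    """Return (intercept, slope) for the least-squares regression of v on u."""
    n = len(u)
    ubar = sum(u) / n
    vbar = sum(v) / n
    suu = sum((ui - ubar) ** 2 for ui in u)
    suv = sum((ui - ubar) * (vi - vbar) for ui, vi in zip(u, v))
    slope = suv / suu
    return vbar - slope * ubar, slope


# Classical calibration: regress Y on X, then invert the fitted line.
b0, b1 = least_squares(x, y)

# Inverse regression: regress X on Y and predict X directly.
c0, c1 = least_squares(y, x)


def classical_estimate(y_new):
    return (y_new - b0) / b1


def inverse_estimate(y_new):
    return c0 + c1 * y_new


y_new = 2.5  # hypothetical new reading from the less expensive device
x_classical = classical_estimate(y_new)
x_inverse = inverse_estimate(y_new)
print(f"Y on X: Y = {b0:.4f} + {b1:.4f} X")
print(f"X on Y: X = {c0:.4f} + {c1:.4f} Y")
print(f"classical estimate of X: {x_classical:.4f}")
print(f"inverse estimate of X:   {x_inverse:.4f}")
```

Because the two measurements are very highly correlated in this example, the two estimates are nearly identical; they diverge more as the correlation weakens.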
12.11 Multiple Linear Regression
12.12 Issues in Multiple Regression
12.12.1 Variable Selection
12.12.3 Multicollinear Data
- Problems occur when at least two of the regressors are related in some manner.
- Solutions
  - Discard one or more variables causing the multicollinearity
  - Use ridge regression
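To make the two remedies concrete, the sketch below measures the relationship between two regressors with a variance inflation factor (for two regressors, VIF = 1 / (1 - r12^2)) and then computes ridge coefficients in standardized (correlation) form by solving (R + kI) b = r. The data are hypothetical, the ridge constant k = 0.1 is an arbitrary illustrative choice, and the function names are not from the source.

```python
import math


def correlation(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    ubar = sum(u) / n
    vbar = sum(v) / n
    suv = sum((ui - ubar) * (vi - vbar) for ui, vi in zip(u, v))
    suu = sum((ui - ubar) ** 2 for ui in u)
    svv = sum((vi - vbar) ** 2 for vi in v)
    return suv / math.sqrt(suu * svv)


def ridge_standardized(x1, x2, y, k):
    """Standardized ridge coefficients for two regressors.

    Solves [[1 + k, r12], [r12, 1 + k]] b = [r1y, r2y]; k = 0 gives
    ordinary least squares in correlation form.
    """
    r12 = correlation(x1, x2)
    r1y = correlation(x1, y)
    r2y = correlation(x2, y)
    det = (1 + k) ** 2 - r12 ** 2
    b1 = ((1 + k) * r1y - r12 * r2y) / det
    b2 = ((1 + k) * r2y - r12 * r1y) / det
    return b1, b2


# Hypothetical data with two nearly collinear regressors.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]

r12 = correlation(x1, x2)
vif = 1 / (1 - r12 ** 2)
ols = ridge_standardized(x1, x2, y, k=0.0)
ridge = ridge_standardized(x1, x2, y, k=0.1)
norm_ols = math.hypot(*ols)
norm_ridge = math.hypot(*ridge)

print(f"r12 = {r12:.4f}, VIF = {vif:.1f}")
print(f"OLS   (k=0.0): b1 = {ols[0]:.3f}, b2 = {ols[1]:.3f}")
print(f"ridge (k=0.1): b1 = {ridge[0]:.3f}, b2 = {ridge[1]:.3f}")
```

Increasing k shrinks the length of the coefficient vector, which trades a small bias for less erratic estimates when the regressors are strongly related.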
12.12.4 Residual Plots
12.12.6 Transformations