Title: Linear Regression
1Linear Regression
- Mechanical Engineering Majors
- Authors Autar Kaw, Luke Snyder
- http://numericalmethods.eng.usf.edu
- Transforming Numerical Methods Education for STEM Undergraduates
3What is Regression?
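What is regression? Given n data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, best fit $y = f(x)$ to the data. The best fit is generally based on minimizing the sum of the squares of the residuals, $S_r$.
Residual at a point is
$\epsilon_i = y_i - f(x_i)$
Sum of the squares of the residuals
$S_r = \sum_{i=1}^{n} \left( y_i - f(x_i) \right)^2$
Figure. Basic model for regression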
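The two definitions above, the residual at a point and the sum of the squares of the residuals, can be sketched in a few lines of Python. The data and the model below are made up purely for illustration and do not come from the slides; the function names are likewise hypothetical.

```python
def residuals(xs, ys, f):
    """Return the residual y_i - f(x_i) at every data point."""
    return [y - f(x) for x, y in zip(xs, ys)]


def sum_squared_residuals(xs, ys, f):
    """Return the sum of the squares of the residuals for model f."""
    return sum(e ** 2 for e in residuals(xs, ys, f))


# Illustrative data and model only (assumed, not from the slides).
xs = [0.0, 1.0, 2.0]
ys = [1.0, 3.0, 4.0]


def model(x):
    return 1.0 + 1.5 * x


print(residuals(xs, ys, model))              # [0.0, 0.5, 0.0]
print(sum_squared_residuals(xs, ys, model))  # 0.25
```

Passing the model as a function keeps the sketch independent of whether the fit is a straight line or some other curve.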
4Linear Regression-Criterion1
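Given n data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, best fit $y = a_0 + a_1 x$ to the data.
Figure. Linear regression of y vs. x data showing residuals at a typical point, $x_i$.
Does minimizing $\sum_{i=1}^{n} \epsilon_i$ work as a criterion, where $\epsilon_i = y_i - (a_0 + a_1 x_i)$?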
5Example for Criterion1
Example: Given the data points (2,4), (3,6), (2,6) and (3,8), best fit the data to a straight line using Criterion 1.
Table. Data Points
x     y
2.0   4.0
3.0   6.0
2.0   6.0
3.0   8.0
Figure. Data points for y vs. x data.
6Linear Regression-Criterion1
Using y = 4x - 4 as the regression curve
Table. Residuals at each point for regression model y = 4x - 4.
x     y     y_predicted   ε = y - y_predicted
2.0   4.0   4.0           0.0
3.0   6.0   8.0           -2.0
2.0   6.0   4.0           2.0
3.0   8.0   8.0           0.0
Figure. Regression curve for y = 4x - 4, y vs. x data
7Linear Regression-Criterion1
Using y = 6 as a regression curve
Table. Residuals at each point for y = 6
x     y     y_predicted   ε = y - y_predicted
2.0   4.0   6.0           -2.0
3.0   6.0   6.0           0.0
2.0   6.0   6.0           0.0
3.0   8.0   6.0           2.0
Figure. Regression curve for y = 6, y vs. x data
8Linear Regression Criterion 1
$\sum_{i=1}^{4} \epsilon_i = 0$ for both regression models, y = 4x - 4 and y = 6.
The sum of the residuals has been driven to zero in both cases, because positive and negative residuals cancel, but the regression model is not unique. Hence the above criterion of minimizing the sum of the residuals is a bad criterion.
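The cancellation can be checked directly. The sketch below uses the four data points and the two candidate models from the example; the function and variable names are hypothetical.

```python
# Data points from the example: (2,4), (3,6), (2,6), (3,8).
xs = [2.0, 3.0, 2.0, 3.0]
ys = [4.0, 6.0, 6.0, 8.0]


def sum_of_residuals(f):
    """Criterion 1: the plain (signed) sum of the residuals y_i - f(x_i)."""
    return sum(y - f(x) for x, y in zip(xs, ys))


def line(x):
    return 4.0 * x - 4.0   # y = 4x - 4


def constant(x):
    return 6.0             # y = 6


print(sum_of_residuals(line), sum_of_residuals(constant))  # 0.0 0.0
```

Two visibly different models score identically, which is why the signed sum cannot single out one line.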
9Linear Regression-Criterion2
Will minimizing $\sum_{i=1}^{n} |\epsilon_i|$ work any better?
Figure. Linear regression of y vs. x data showing residuals at a typical point, $x_i$.
10Linear Regression-Criterion2
Using y = 4x - 4 as the regression curve
Table. The absolute residuals employing the y = 4x - 4 regression model
x     y     y_predicted   |ε| = |y - y_predicted|
2.0   4.0   4.0           0.0
3.0   6.0   8.0           2.0
2.0   6.0   4.0           2.0
3.0   8.0   8.0           0.0
Figure. Regression curve for y = 4x - 4, y vs. x data
11Linear Regression-Criterion2
Using y = 6 as a regression curve
Table. Absolute residuals employing the y = 6 model
x     y     y_predicted   |ε| = |y - y_predicted|
2.0   4.0   6.0           2.0
3.0   6.0   6.0           0.0
2.0   6.0   6.0           0.0
3.0   8.0   6.0           2.0
Figure. Regression curve for y = 6, y vs. x data
12Linear Regression-Criterion2
$\sum_{i=1}^{4} |\epsilon_i| = 4$ for both regression models, y = 4x - 4 and y = 6.
The sum of the absolute values of the residuals has been made as small as possible, that is 4, but the regression model is not unique. Hence the above criterion of minimizing the sum of the absolute value of the residuals is also a bad criterion.
Can you find a regression line for which $\sum_{i=1}^{4} |\epsilon_i| < 4$ and which has unique regression coefficients?
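A quick numerical check of Criterion 2 on the same four points is sketched below. The first two lines are the ones from the slides; the third, y = 2x + 1, is an extra candidate added here only to show that further lines reach the same total. All names are hypothetical.

```python
# Data points from the example: (2,4), (3,6), (2,6), (3,8).
xs = [2.0, 3.0, 2.0, 3.0]
ys = [4.0, 6.0, 6.0, 8.0]


def sum_abs_residuals(a0, a1):
    """Criterion 2: sum of |y_i - (a0 + a1*x_i)| over the data."""
    return sum(abs(y - (a0 + a1 * x)) for x, y in zip(xs, ys))


candidates = {
    "y = 4x - 4": (-4.0, 4.0),
    "y = 6": (6.0, 0.0),
    "y = 2x + 1": (1.0, 2.0),  # extra candidate, not from the slides
}

totals = {name: sum_abs_residuals(a0, a1)
          for name, (a0, a1) in candidates.items()}
print(totals)  # every candidate gives 4.0
```

For this data any line that passes between the two y values at x = 2 and between the two y values at x = 3 gives the same total of 4, so the criterion cannot pick one line.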
13Least Squares Criterion
The least squares criterion minimizes the sum of the squares of the residuals in the model, and also produces a unique line.
$S_r = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$
Figure. Linear regression of y vs. x data showing residuals at a typical point, $x_i$.
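To see how the least squares criterion separates the candidates that the earlier criteria could not, the sketch below evaluates the sum of the squares of the residuals on the four example points. The line y = 2x + 1 is the standard least-squares fit for this data; the slightly shifted line is a hypothetical perturbation added only for comparison.

```python
# Data points from the earlier example: (2,4), (3,6), (2,6), (3,8).
xs = [2.0, 3.0, 2.0, 3.0]
ys = [4.0, 6.0, 6.0, 8.0]


def sum_squared_residuals(a0, a1):
    """Sum of (y_i - a0 - a1*x_i)^2 over the data."""
    return sum((y - a0 - a1 * x) ** 2 for x, y in zip(xs, ys))


sr_line = sum_squared_residuals(-4.0, 4.0)     # y = 4x - 4
sr_constant = sum_squared_residuals(6.0, 0.0)  # y = 6
sr_best = sum_squared_residuals(1.0, 2.0)      # y = 2x + 1 (least-squares fit)
sr_shifted = sum_squared_residuals(1.1, 2.0)   # hypothetical nearby line

print(sr_line, sr_constant, sr_best)  # 8.0 8.0 4.0
print(sr_shifted > sr_best)           # True
```

Unlike the two earlier criteria, moving away from the fitted line in any direction increases the total, which is what makes the minimizer unique.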
14Finding Constants of Linear Model
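Minimize the sum of the squares of the residuals
$S_r = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i)^2$
To find $a_0$ and $a_1$, we minimize $S_r$ with respect to $a_0$ and $a_1$:
$\frac{\partial S_r}{\partial a_0} = -2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i) = 0$
$\frac{\partial S_r}{\partial a_1} = -2 \sum_{i=1}^{n} (y_i - a_0 - a_1 x_i) x_i = 0$
giving
$n a_0 + a_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$
$a_0 \sum_{i=1}^{n} x_i + a_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i$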
15Finding Constants of Linear Model
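Solving for $a_1$ and $a_0$ directly yields,
$a_1 = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}$
and
$a_0 = \frac{\sum_{i=1}^{n} x_i^2 \sum_{i=1}^{n} y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} x_i y_i}{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2} = \bar{y} - a_1 \bar{x}$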
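A minimal sketch of the standard closed-form solution for a straight-line least-squares fit, y = a0 + a1*x, is given below. The function name is hypothetical, and the data used to exercise it are the four points from the earlier criterion example.

```python
def linear_least_squares(xs, ys):
    """Return (a0, a1) for the least-squares line y = a0 + a1*x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)

    denominator = n * sum_xx - sum_x ** 2
    if denominator == 0:
        # All x values are identical, so the slope is undefined.
        raise ValueError("x values must not all be equal")

    a1 = (n * sum_xy - sum_x * sum_y) / denominator
    a0 = sum_y / n - a1 * sum_x / n  # a0 = mean(y) - a1 * mean(x)
    return a0, a1


# The four points used in the criterion examples.
a0, a1 = linear_least_squares([2.0, 3.0, 2.0, 3.0], [4.0, 6.0, 6.0, 8.0])
print(a0, a1)  # 1.0 2.0
```

The guard on the denominator covers the one case in which the formulas break down: data with no spread in x.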
16Example 1
The coefficient of thermal expansion of steel is given at discrete values of temperature, as shown in the table. If the data is regressed to a first order polynomial,
$\alpha = a_0 + a_1 T$
find the constants of the model.
Table. Data points for thermal expansion vs. temperature
Temperature, T   Coefficient of Thermal Expansion, α
80
60
40
20
0
-20
-40
-60
-80
-100
-120
-140
-160
-180
-200
-220
-240
-260
-280
-300
-320
-340
17Example 1 cont.
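The necessary summations over the n = 22 data points are $\sum T_i$, $\sum \alpha_i$, $\sum T_i \alpha_i$ and $\sum T_i^2$.
18Example 1 cont.
We can now calculate the value of $a_1$ using
$a_1 = \frac{n \sum T_i \alpha_i - \sum T_i \sum \alpha_i}{n \sum T_i^2 - \left( \sum T_i \right)^2}$
19Example 1 cont.
The value for $a_0$ can be calculated using
$a_0 = \bar{\alpha} - a_1 \bar{T}$
where
$\bar{\alpha} = \frac{1}{n} \sum \alpha_i$ and $\bar{T} = \frac{1}{n} \sum T_i$
The regression model is now given by
$\alpha = a_0 + a_1 T$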
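The temperature-only summations for this example can be reproduced from the temperature column alone, as sketched below. The coefficient-of-thermal-expansion column is not available in this text, so the sums that involve it are not computed; the variable names are hypothetical.

```python
# Temperatures from the table: 80, 60, ..., -340 in steps of -20.
temperatures = list(range(80, -341, -20))

n = len(temperatures)                     # number of data points
sum_T = sum(temperatures)                 # sum of T_i
sum_T2 = sum(T * T for T in temperatures) # sum of T_i^2

print(n, sum_T, sum_T2)  # 22 -2860 726000
```

With the expansion-coefficient values in hand, the remaining two sums (over the coefficients, and over temperature times coefficient) would be formed the same way.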
20Example 1 cont.
Question: Can you find the decrease in the diameter of a solid cylinder of radius 12 if the cylinder is cooled from a room temperature of 80°F to a dry-ice/alcohol bath at a temperature of -108°F? What would be the error if you used the thermal expansion coefficient at room temperature to find the answer?
Figure. Linear regression of coefficient of thermal expansion vs. temperature data
21Example 2
To find the longitudinal modulus of a composite, the following data is collected. Find the longitudinal modulus, $E$, using the regression model
$\sigma = E \varepsilon$
where $\sigma$ is the stress and $\varepsilon$ is the strain, and find the sum of the squares of the residuals.
Table. Stress vs. Strain data
Strain (%)   Stress (MPa)
0            0
0.183        306
0.36         612
0.5324       917
0.702        1223
0.867        1529
1.0244       1835
1.1774       2140
1.329        2446
1.479        2752
1.5          2767
1.56         2896
Figure. Data points for Stress vs. Strain data
22Example 2 cont.
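Residual at each point is given by
$\sigma_i - E \varepsilon_i$
The sum of the squares of the residuals then is
$S_r = \sum_{i=1}^{n} \left( \sigma_i - E \varepsilon_i \right)^2$
Differentiate with respect to $E$
$\frac{dS_r}{dE} = \sum_{i=1}^{n} 2 \left( \sigma_i - E \varepsilon_i \right) \left( -\varepsilon_i \right) = 0$
Therefore
$E = \frac{\sum_{i=1}^{n} \sigma_i \varepsilon_i}{\sum_{i=1}^{n} \varepsilon_i^2}$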
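For a model that is forced through the origin, y = c*x, setting the derivative of the sum of squared residuals to zero gives c = sum(x*y) / sum(x*x). A minimal sketch of that one-parameter fit follows; the function name and the tiny data set are hypothetical and only exercise the formula.

```python
def fit_through_origin(xs, ys):
    """Return the least-squares slope c for the model y = c*x."""
    sum_xx = sum(x * x for x in xs)
    if sum_xx == 0:
        # Every x is zero, so the slope cannot be determined.
        raise ValueError("at least one x value must be non-zero")
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return sum_xy / sum_xx


# Illustrative check on points that lie exactly on y = 2x.
print(fit_through_origin([1.0, 2.0], [2.0, 4.0]))  # 2.0
```

In the stress-strain example, x plays the role of the strain, y the stress, and c the longitudinal modulus.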
23Example 2 cont.
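Table. Summation data for regression model
i     ε (strain)      σ (Pa)         ε^2             εσ
1     0.0000          0.0000         0.0000          0.0000
2     1.8300×10^-3    3.0600×10^8    3.3489×10^-6    5.5998×10^5
3     3.6000×10^-3    6.1200×10^8    1.2960×10^-5    2.2032×10^6
4     5.3240×10^-3    9.1700×10^8    2.8345×10^-5    4.8821×10^6
5     7.0200×10^-3    1.2230×10^9    4.9280×10^-5    8.5855×10^6
6     8.6700×10^-3    1.5290×10^9    7.5169×10^-5    1.3256×10^7
7     1.0244×10^-2    1.8350×10^9    1.0494×10^-4    1.8798×10^7
8     1.1774×10^-2    2.1400×10^9    1.3863×10^-4    2.5196×10^7
9     1.3290×10^-2    2.4460×10^9    1.7662×10^-4    3.2507×10^7
10    1.4790×10^-2    2.7520×10^9    2.1874×10^-4    4.0702×10^7
11    1.5000×10^-2    2.7670×10^9    2.2500×10^-4    4.1505×10^7
12    1.5600×10^-2    2.8960×10^9    2.4336×10^-4    4.5178×10^7
Sum                                  1.2764×10^-3    2.3337×10^8
With
$\sum_{i=1}^{12} \varepsilon_i^2 = 1.2764 \times 10^{-3}$
and
$\sum_{i=1}^{12} \varepsilon_i \sigma_i = 2.3337 \times 10^{8}$
Using
$E = \frac{\sum_{i=1}^{12} \sigma_i \varepsilon_i}{\sum_{i=1}^{12} \varepsilon_i^2} = \frac{2.3337 \times 10^{8}}{1.2764 \times 10^{-3}} \approx 1.828 \times 10^{11}$ Pa $\approx 182.8$ GPa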
24Example 2 Results
The equation $\sigma = 182.8\,\varepsilon$ GPa describes the data.
Figure. Linear regression for Stress vs. Strain data
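The whole of Example 2 can be reproduced from the stress-strain table with a short script, sketched below. It assumes the strain column is in percent, which is inferred from the summation table (0.183 becomes 1.8300×10^-3) rather than stated explicitly, and it converts stress from MPa to Pa. Variable names are hypothetical.

```python
# Stress vs. strain data from the table (strain assumed to be in percent).
strain_percent = [0, 0.183, 0.36, 0.5324, 0.702, 0.867,
                  1.0244, 1.1774, 1.329, 1.479, 1.5, 1.56]
stress_mpa = [0, 306, 612, 917, 1223, 1529,
              1835, 2140, 2446, 2752, 2767, 2896]

strain = [e / 100.0 for e in strain_percent]  # percent -> dimensionless
stress = [s * 1.0e6 for s in stress_mpa]      # MPa -> Pa

sum_strain_sq = sum(e * e for e in strain)
sum_strain_stress = sum(e * s for e, s in zip(strain, stress))

# Least-squares modulus for the through-origin model: stress = E * strain.
E = sum_strain_stress / sum_strain_sq

# Sum of the squares of the residuals for the fitted model.
S_r = sum((s - E * e) ** 2 for e, s in zip(strain, stress))

print(f"sum of strain^2      = {sum_strain_sq:.4e}")
print(f"sum of strain*stress = {sum_strain_stress:.4e} Pa")
print(f"E                    = {E / 1.0e9:.1f} GPa")
print(f"S_r                  = {S_r:.4e} Pa^2")
```

The two sums should match the totals in the summation table, and the last line gives the sum of the squares of the residuals that the problem statement also asks for.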
25Additional Resources
- For all resources on this topic, such as digital audiovisual lectures, primers, textbook chapters, multiple-choice tests, worksheets in MATLAB, MATHEMATICA, MathCad and MAPLE, blogs, and related physical problems, please visit http://numericalmethods.eng.usf.edu/topics/linear_regression.html