Title: Linear Least Squares Approximation
2 Linear Least Squares Approximation
- By Kristen Bauer, Renee Metzger, Holly Soper, and Amanda Unklesbay
3 Linear Least Squares
- Is the line of best fit for a group of points
- It minimizes the sum, over all data points, of the squared differences between the function value and the data value.
- It is the earliest form of linear regression.
4 Gauss and Legendre
- The method of least squares was first published by Legendre in 1805 and by Gauss in 1809.
- Although Legendre's work was published earlier, Gauss claimed he had used the method since 1795.
- Both mathematicians applied the method to determine the orbits of bodies about the sun.
- Gauss went on to publish further developments of the method in 1821.
5 Example
- Consider the points (1, 2.1), (2, 2.9), (5, 6.1), and (7, 8.3) with the best fit line f(x) = 0.9x + 1.4
- The squared errors are
  - x1 = 1, f(1) = 2.3, y1 = 2.1, e1 = (2.3 − 2.1)² = 0.04
  - x2 = 2, f(2) = 3.2, y2 = 2.9, e2 = (3.2 − 2.9)² = 0.09
  - x3 = 5, f(5) = 5.9, y3 = 6.1, e3 = (5.9 − 6.1)² = 0.04
  - x4 = 7, f(7) = 7.7, y4 = 8.3, e4 = (7.7 − 8.3)² = 0.36
- So the total squared error is 0.04 + 0.09 + 0.04 + 0.36 = 0.53
- By finding better coefficients for the best fit line, we can make this error smaller. A quick numerical check of this arithmetic appears below.
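A minimal Python sketch (not part of the original slides) that reproduces this arithmetic:

    points = [(1, 2.1), (2, 2.9), (5, 6.1), (7, 8.3)]

    def f(x):
        # Candidate best fit line from the slide
        return 0.9 * x + 1.4

    def total_squared_error(points, f):
        # Sum of (f(xi) - yi)^2 over all data points
        return sum((f(x) - y) ** 2 for x, y in points)

    print(total_squared_error(points, f))  # approximately 0.53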
6 We want to minimize the vertical distance between the point and the line.
- E = (d1)² + (d2)² + (d3)² + ... + (dn)² for n data points
- E = [f(x1) − y1]² + [f(x2) − y2]² + ... + [f(xn) − yn]²
- E = (mx1 + b − y1)² + (mx2 + b − y2)² + ... + (mxn + b − yn)²
- E = Σ(mxi + b − yi)²
7 E must be MINIMIZED!
- How do we do this?
- E = Σ(mxi + b − yi)²
- Treat the xi and yi as constants, since we are trying to find m and b.
- So... PARTIALS!
- Set ∂E/∂m = 0 and ∂E/∂b = 0
- But how do we know if this will yield maximums, minimums, or saddle points?
8 [Figures: a minimum point, a maximum point, and a saddle point]
9 Minimum!
- Since the expression E is a sum of squares and is therefore nonnegative (it looks like an upward-opening paraboloid), we expect the critical point to be a minimum.
- We can prove this by using the 2nd Partials Derivative Test.
10 2nd Partials Test
Suppose the gradient of f at (x0, y0) is 0. (An instance of this is ∂E/∂m = ∂E/∂b = 0.) We set
- A = fxx(x0, y0), B = fxy(x0, y0), C = fyy(x0, y0)
and form the discriminant D = AC − B².
- If D < 0, then (x0, y0) is a saddle point.
- If D > 0, then f takes on
  - A local minimum at (x0, y0) if A > 0
  - A local maximum at (x0, y0) if A < 0
11 Calculating the Discriminant
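The worked computation on the original slide is not transcribed; reconstructing it from E = Σ(mxi + b − yi)² gives the second partials and discriminant:

    A = ∂²E/∂m² = 2 Σxi²
    B = ∂²E/∂m∂b = 2 Σxi
    C = ∂²E/∂b² = 2n
    D = AC − B² = 4n Σxi² − 4(Σxi)² = 4[ n Σxi² − (Σxi)² ]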
12
- If D < 0, then (x0, y0) is a saddle point.
- If D > 0, then f takes on
  - A local minimum at (x0, y0) if A > 0
  - A local maximum at (x0, y0) if A < 0
- Now D > 0 by an inductive proof showing that n Σxi² > (Σxi)² whenever the xi do not all have the same value. Those details are not covered in this presentation.
- We know A > 0 since A = 2 Σxi² is always positive (as long as the xi are not all zero).
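For completeness (not shown on the original slides), the inequality n Σxi² > (Σxi)² follows from the identity

    n Σxi² − (Σxi)² = Σ over pairs i < j of (xi − xj)²

which is strictly positive as soon as two x-values differ, so D > 0.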
13 Therefore
Setting ∂E/∂m and ∂E/∂b equal to zero yields the two equations that minimize E, the sum of the squares of the errors.
Thus, the linear least squares algorithm (as presented) is valid and we can continue.
14
- E = Σ(mxi + b − yi)² is minimized (as just shown) when the partial derivative with respect to each of the variables is zero, i.e. ∂E/∂m = 0 and ∂E/∂b = 0.

    ∂E/∂b = Σ 2(mxi + b − yi) = 0      (set equal to 0)
        m Σxi + Σb = Σyi
        m Sx + b n = Sy

    ∂E/∂m = Σ 2xi(mxi + b − yi) = 2 Σ(mxi² + bxi − xiyi) = 0
        m Σxi² + b Σxi = Σxiyi
        m Sxx + b Sx = Sxy

NOTE: Σxi = Sx, Σyi = Sy, Σxi² = Sxx, Σxiyi = Sxy
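Equivalently (added here for reference, not on the original slide), the two conditions form a 2×2 linear system, the normal equations:

    [ Sxx  Sx ] [ m ]   =   [ Sxy ]
    [ Sx   n  ] [ b ]       [ Sy  ]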
15 Next we will solve the system of equations for the unknowns m and b.

Solving for m:
    n m Sxx + b n Sx = n Sxy          (multiply the first equation by n)
    m Sx Sx + b n Sx = Sy Sx          (multiply the second equation by Sx)
    n m Sxx − m Sx Sx = n Sxy − Sy Sx     (subtract)
    m (n Sxx − Sx Sx) = n Sxy − Sy Sx     (factor out m)
    So  m = (n Sxy − Sx Sy) / (n Sxx − Sx Sx)
16 Next we will solve the system of equations for the unknowns m and b.

Solving for b:
    m Sx Sxx + b Sx Sx = Sx Sxy       (multiply the first equation by Sx)
    m Sx Sxx + b n Sxx = Sy Sxx       (multiply the second equation by Sxx)
    b Sx Sx − b n Sxx = Sxy Sx − Sy Sxx   (subtract)
    b (Sx Sx − n Sxx) = Sxy Sx − Sy Sxx   (solve for b)
    So  b = (Sxy Sx − Sy Sxx) / (Sx Sx − n Sxx) = (Sy Sxx − Sx Sxy) / (n Sxx − Sx Sx)
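A minimal Python sketch of these closed-form formulas (the function name least_squares_fit is chosen here for illustration; it is not from the slides):

    def least_squares_fit(points):
        # Closed-form slope m and intercept b from the sums derived above:
        #   m = (n*Sxy - Sx*Sy) / (n*Sxx - Sx*Sx)
        #   b = (Sy*Sxx - Sx*Sxy) / (n*Sxx - Sx*Sx)
        n = len(points)
        Sx = sum(x for x, _ in points)
        Sy = sum(y for _, y in points)
        Sxx = sum(x * x for x, _ in points)
        Sxy = sum(x * y for x, y in points)
        denom = n * Sxx - Sx * Sx  # zero when all x-values are equal
        m = (n * Sxy - Sx * Sy) / denom
        b = (Sy * Sxx - Sx * Sxy) / denom
        return m, b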
17 Example: Find the linear least squares approximation to the data (1,1), (2,4), (3,8).

Use the formulas derived above:
    Sx = 1 + 2 + 3 = 6
    Sxx = 1² + 2² + 3² = 14
    Sy = 1 + 4 + 8 = 13
    Sxy = 1(1) + 2(4) + 3(8) = 33
    n = number of points = 3

    m = (n Sxy − Sx Sy) / (n Sxx − Sx Sx) = (3·33 − 6·13) / (3·14 − 36) = 21/6 = 3.5
    b = (Sy Sxx − Sx Sxy) / (n Sxx − Sx Sx) = (13·14 − 6·33) / (3·14 − 36) = −16/6 ≈ −2.667

The line of best fit is y = 3.5x − 2.667
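Running the least_squares_fit sketch from slide 16 above on this data reproduces the result:

    m, b = least_squares_fit([(1, 1), (2, 4), (3, 8)])
    print(m, b)  # 3.5, approximately -2.667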
18 Line of best fit: y = 3.5x − 2.667  [plot of the data points and fitted line]
19 THE ALGORITHM in Mathematica
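The Mathematica code from the original slides is not transcribed here. As a stand-in, the same fit can be checked with NumPy's polyfit (a sketch, not the original code):

    import numpy as np

    # A degree-1 polynomial fit is exactly linear least squares;
    # polyfit returns the coefficients [slope, intercept].
    x = np.array([1.0, 2.0, 3.0])
    y = np.array([1.0, 4.0, 8.0])
    m, b = np.polyfit(x, y, 1)
    print(m, b)  # approximately 3.5 and -2.667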
22 Activity
- For this activity we are going to use the linear least squares approximation in a real-life situation.
- You will be given a box score from either a baseball or softball game.
- From the box score, write out the points, with the x-coordinate being the number of at-bats that player had in the game and the y-coordinate being the number of hits that player had in the game.
- Then use the linear least squares approximation to find the best fitting line.
- The slope of the best fitting line you find will approximate the team's batting average for that game.
23 In Conclusion
- E = Σ(mxi + b − yi)² is the sum of the squared errors between the set of data points (x1,y1), ..., (xi,yi), ..., (xn,yn) and the line approximating the data, f(x) = mx + b.
- By minimizing this error with calculus methods, we get equations for m and b that yield the least squared error, as summarized below.
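The resulting formulas, restated from the derivation on slides 15 and 16:

    m = (n Sxy − Sx Sy) / (n Sxx − Sx Sx)
    b = (Sy Sxx − Sx Sxy) / (n Sxx − Sx Sx)
    where Sx = Σxi, Sy = Σyi, Sxx = Σxi², Sxy = Σxiyi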
24 Advantages
- Many common methods of approximating data seek to minimize some measure of difference between the approximating function and the given data points.
- Advantages of using the squares of the differences at each point, rather than the raw differences, absolute values of differences, or other measures of error, include:
  - Positive differences do not cancel negative differences
  - Differentiation is not difficult
  - Small differences become smaller and large differences become larger
25 Disadvantages
- The algorithm fails if the data points fall on a vertical line (all xi equal), since the denominator n Sxx − Sx Sx is then zero.
- Linear least squares will not be a good fit for data that is not linear.
26 The End