Title: Linear regression in matrix terms
2. Clean-up from last class
3. Least squares estimates in the simple linear
regression setting
soap   suds   soap*suds   soap^2
 4.0     33       132.0    16.00
 4.5     42       189.0    20.25
 5.0     45       225.0    25.00
 5.5     51       280.5    30.25
 6.0     53       318.0    36.00
 6.5     61       396.5    42.25
 7.0     62       434.0    49.00
----   ----   ---------   ------
38.5    347      1975.0   218.75
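The column totals in the table are the ingredients of the least squares estimates. As a minimal sketch (assuming soap is the predictor and suds is the response, and using NumPy, which the slides do not mention), the slope and intercept can be recovered from those sums:

```python
import numpy as np

# Soap suds data from the table (soap = predictor, suds = response;
# these roles are an assumption of this sketch).
soap = np.array([4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0])
suds = np.array([33.0, 42.0, 45.0, 51.0, 53.0, 61.0, 62.0])
n = len(soap)

# Column totals, matching the last row of the table.
sum_x = soap.sum()
sum_y = suds.sum()
sum_xy = (soap * suds).sum()
sum_x2 = (soap ** 2).sum()

# Textbook formulas for the simple linear regression estimates.
b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
b0 = sum_y / n - b1 * sum_x / n

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
```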
4. The inverse of (X'X)
It's very messy (and not very informative) to
determine inverses by hand. Just about everyone
lets a computer do the dirty work.
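As one way of letting the computer do that work, here is a small sketch using NumPy's `np.linalg.inv` on the soap suds data. The choice of NumPy and the variable names are assumptions of this example, not something the slides prescribe.

```python
import numpy as np

soap = np.array([4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0])
suds = np.array([33.0, 42.0, 45.0, 51.0, 53.0, 61.0, 62.0])

# Design matrix: a column of ones (intercept) and the predictor.
X = np.column_stack([np.ones_like(soap), soap])

XtX = X.T @ X                  # 2 x 2 matrix [[n, sum x], [sum x, sum x^2]]
XtX_inv = np.linalg.inv(XtX)   # the computer does the dirty work

# Least squares estimates b = (X'X)^{-1} X'Y.
b = XtX_inv @ X.T @ suds

print(XtX)
print(XtX_inv)
print(b)
```

For larger problems, `np.linalg.solve` or `np.linalg.lstsq` is usually preferred over forming the inverse explicitly; the explicit inverse is shown here only because it mirrors the formula.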
5. Linear dependence (and rank) is not always
The rank of this matrix is 2, not 3 as I claimed
in last class.
The (column) rank of a matrix is the maximum
number of linearly independent columns in the
matrix. The (row) rank of a matrix is the
maximum number of linearly independent rows in
the matrix. And rank = column rank = row rank.
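A quick numerical check of this definition can be sketched with `np.linalg.matrix_rank`. The matrix below is hypothetical (it is not the one from the slides): its third column is the sum of the first two, so only two columns are linearly independent.

```python
import numpy as np

# Hypothetical 3 x 3 matrix, chosen only for illustration:
# the third column equals the first column plus the second column.
A = np.array([
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 9.0],
    [7.0, 8.0, 15.0],
])

col_rank = np.linalg.matrix_rank(A)     # rank of A
row_rank = np.linalg.matrix_rank(A.T)   # rank of the transpose, i.e. row rank of A

print(col_rank, row_rank)
```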
7. What is the rank of this matrix (5.8b)?
8. The main point
- If the columns of the X matrix (that is, if two
or more of your predictor variables) are linearly
dependent (or nearly so), you will run into
trouble when trying to estimate the regression
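To see the kind of trouble meant here, consider a sketch with a hypothetical design matrix in which one predictor is exactly twice another. The data are invented for illustration; the point is only that X'X loses rank, so it has no ordinary inverse.

```python
import numpy as np

# Hypothetical design matrix: intercept, x1, and x2 = 2 * x1,
# so the third column is an exact multiple of the second.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x1), x1, 2.0 * x1])

XtX = X.T @ X
rank = np.linalg.matrix_rank(XtX)   # less than 3, although X'X is 3 x 3
cond = np.linalg.cond(XtX)          # enormous (or inf) for a singular matrix

print(rank, cond)
```

With nearly (rather than exactly) dependent columns, the rank may be reported as full, but a very large condition number still warns that the estimates will be numerically unstable.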
9. Sum of squares
10. Sum of squares
In general, if you pre-multiply a vector by its
transpose, you get a sum of squares.
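Written out for a generic n × 1 vector a (the symbol a is used here only for illustration), the claim reads:

```latex
\[
\mathbf{a}'\mathbf{a}
= \begin{bmatrix} a_1 & a_2 & \cdots & a_n \end{bmatrix}
  \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{bmatrix}
= a_1^2 + a_2^2 + \cdots + a_n^2
= \sum_{i=1}^{n} a_i^2
\]
```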
11. Error sum of squares
12. Error sum of squares
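For reference, a sketch of the usual matrix form of the error sum of squares. The notation is an assumption following the common textbook convention: b is the vector of least squares estimates and e = Y − Xb is the vector of residuals. The last step uses the normal equations X'Xb = X'Y.

```latex
\[
SSE = \mathbf{e}'\mathbf{e}
    = (\mathbf{Y} - \mathbf{X}\mathbf{b})'(\mathbf{Y} - \mathbf{X}\mathbf{b})
    = \mathbf{Y}'\mathbf{Y} - \mathbf{b}'\mathbf{X}'\mathbf{Y}
\]
```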
13. Total sum of squares
Previously, we'd write
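A sketch of the standard scalar form and its matrix counterpart, where J denotes the n × n matrix whose entries are all 1 (the symbol J is the usual textbook choice and is assumed here):

```latex
\[
SSTO = \sum_{i=1}^{n} (Y_i - \bar{Y})^2
     = \sum_{i=1}^{n} Y_i^2 - \frac{\left(\sum_{i=1}^{n} Y_i\right)^2}{n}
\]
\[
SSTO = \mathbf{Y}'\mathbf{Y} - \frac{1}{n}\,\mathbf{Y}'\mathbf{J}\mathbf{Y}
\]
```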
14. Example: Total sum of squares
If n = 2
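For n = 2 the matrix expression can be expanded by hand. The following is a sketch under the usual convention that J is the 2 × 2 matrix of ones:

```latex
\[
\mathbf{Y}'\mathbf{Y}
= \begin{bmatrix} Y_1 & Y_2 \end{bmatrix}
  \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}
= Y_1^2 + Y_2^2
\]
\[
\mathbf{Y}'\mathbf{J}\mathbf{Y}
= \begin{bmatrix} Y_1 & Y_2 \end{bmatrix}
  \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
  \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix}
= (Y_1 + Y_2)^2
\]
\[
SSTO = \mathbf{Y}'\mathbf{Y} - \frac{1}{2}\,\mathbf{Y}'\mathbf{J}\mathbf{Y}
     = Y_1^2 + Y_2^2 - \frac{(Y_1 + Y_2)^2}{2}
     = \frac{(Y_1 - Y_2)^2}{2}
\]
```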
15. ANOVA Table in matrix terms
Source       DF    SS   MS   F
Regression   p-1
Error        n-p
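The table entries can be computed directly from matrix expressions. The sketch below uses the soap suds data and the standard textbook formulas SSR = b'X'Y − (1/n)Y'JY and SSE = Y'Y − b'X'Y. These formulas, the added Total row, and the variable names are assumptions of this example.

```python
import numpy as np

soap = np.array([4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0])
suds = np.array([33.0, 42.0, 45.0, 51.0, 53.0, 61.0, 62.0])

X = np.column_stack([np.ones_like(soap), soap])
Y = suds
n, p = X.shape                           # n observations, p parameters

b = np.linalg.solve(X.T @ X, X.T @ Y)    # least squares estimates
J = np.ones((n, n))                      # n x n matrix of ones

ssto = Y @ Y - (Y @ J @ Y) / n           # total sum of squares
sse = Y @ Y - b @ X.T @ Y                # error sum of squares
ssr = b @ X.T @ Y - (Y @ J @ Y) / n      # regression sum of squares

msr = ssr / (p - 1)
mse = sse / (n - p)
f_stat = msr / mse

print(f"{'Source':<12}{'DF':>4}{'SS':>12}{'MS':>12}{'F':>10}")
print(f"{'Regression':<12}{p - 1:>4}{ssr:>12.3f}{msr:>12.3f}{f_stat:>10.2f}")
print(f"{'Error':<12}{n - p:>4}{sse:>12.3f}{mse:>12.3f}")
print(f"{'Total':<12}{n - 1:>4}{ssto:>12.3f}")
```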
16. Distributional assumptions
17. Error term assumptions
- We used to say that the error terms εi, for i = 1, ..., n, are
  - independent
  - normally distributed
  - with mean E(εi) = 0
  - with variance σ²(εi) = σ².
- Now, how can we say the same thing using matrices
and vectors?
18. Error terms as a random vector
19. The mean (expectation) of the random error term
20. The variance of the random error term vector
That is, the diagonal elements are just the
variances of the error terms, while the
off-diagonal elements are the covariances between
the error terms.
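In symbols, a sketch of the standard definitions for the random error vector, its mean, and its variance-covariance matrix (the notation follows the usual textbook convention and is assumed here):

```latex
\[
\boldsymbol{\varepsilon} =
\begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{bmatrix},
\qquad
E(\boldsymbol{\varepsilon}) =
\begin{bmatrix} E(\varepsilon_1) \\ E(\varepsilon_2) \\ \vdots \\ E(\varepsilon_n) \end{bmatrix}
\]
\[
\sigma^2(\boldsymbol{\varepsilon}) =
\begin{bmatrix}
\sigma^2(\varepsilon_1) & \sigma(\varepsilon_1, \varepsilon_2) & \cdots & \sigma(\varepsilon_1, \varepsilon_n) \\
\sigma(\varepsilon_2, \varepsilon_1) & \sigma^2(\varepsilon_2) & \cdots & \sigma(\varepsilon_2, \varepsilon_n) \\
\vdots & \vdots & \ddots & \vdots \\
\sigma(\varepsilon_n, \varepsilon_1) & \sigma(\varepsilon_n, \varepsilon_2) & \cdots & \sigma^2(\varepsilon_n)
\end{bmatrix}
\]
```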
21. The ASSUMED variance of the random error term
BUT, we assume the variances of the error terms
are constant (σ²) and we assume the error terms
are independent (which, for normally distributed
error terms, is equivalent to assuming the
covariances are 0). That is
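Under these two assumptions, every diagonal entry of the variance-covariance matrix equals σ² and every off-diagonal entry equals 0. A sketch in standard notation:

```latex
\[
\sigma^2(\boldsymbol{\varepsilon}) =
\begin{bmatrix}
\sigma^2 & 0 & \cdots & 0 \\
0 & \sigma^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \sigma^2
\end{bmatrix}
\]
```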
22. An aside: When you multiply a matrix by a scalar
You just multiply each element of the matrix by
the scalar.
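A tiny illustration in NumPy, with a hypothetical scalar and matrix picked only for this example:

```python
import numpy as np

# Hypothetical scalar and matrix, chosen only for illustration.
k = 3.0
A = np.array([[1.0, 2.0],
              [4.0, 5.0]])

kA = k * A   # every element of A is multiplied by k
print(kA)
```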
23. An alternative way of expressing the ASSUMED
variance of the random error term vector
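Factoring the common scalar σ² out of the assumed variance-covariance matrix gives the compact form usually seen in textbooks, where I is the n × n identity matrix (a sketch in assumed standard notation):

```latex
\[
\sigma^2(\boldsymbol{\varepsilon})
= \sigma^2
\begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
= \sigma^2 \mathbf{I}
\]
```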
24. The general linear regression model
Y = Xβ + ε
- where
  - Y is an (n × 1) vector of response values
  - β is a (p × 1) vector of unknown parameters
  - X is an (n × p) matrix of known constants (predictor values)
  - ε is an (n × 1) vector of independent, normal
    error terms with mean E(ε) = 0 and variance σ²(ε) = σ²I
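To make the pieces of the model concrete, here is a simulation sketch. The sizes, parameter values, error standard deviation, and seed are all hypothetical choices made for this example; only the structure (known X, unknown β, independent normal errors with constant variance) comes from the model statement.

```python
import numpy as np

rng = np.random.default_rng(0)        # fixed seed so the sketch is repeatable

n, p = 50, 3                          # hypothetical sizes
beta = np.array([1.0, 2.0, -0.5])     # hypothetical "unknown" parameters
sigma = 1.5                           # hypothetical error standard deviation

# X: n x p matrix of known constants, first column all ones (intercept).
X = np.column_stack([np.ones(n), rng.uniform(0, 10, size=(n, p - 1))])

# eps: length-n vector of independent N(0, sigma^2) errors,
# i.e. mean vector 0 and variance-covariance matrix sigma^2 * I.
eps = rng.normal(loc=0.0, scale=sigma, size=n)

Y = X @ beta + eps                    # simulated responses under the model

# Least squares estimate of beta from the simulated data.
b, *_ = np.linalg.lstsq(X, Y, rcond=None)

print(X.shape, beta.shape, Y.shape)
print(b)
```

The estimate `b` will differ from `beta` because of the random errors; rerunning with a larger `n` or smaller `sigma` should bring the two closer.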