Regression - PowerPoint PPT Presentation

About This Presentation
Title:

Regression

Description:

To determine how much of the variation (uncertainty) in Y ... R2 Adj 0.867. Root Mean Square Error 1.334. Mean of Response 6.611. Observations (or Sum Wgts) 36 ... – PowerPoint PPT presentation

Provided by: hunterl4

Transcript and Presenter's Notes

Title: Regression


1
Regression
  • Review of Simple linear regression
  • Introduction to multiple linear
    regression

ESM 206A 10 March 2009
2
3 major purposes of linear regression
  • To describe the linear relationship between X and Y
  • To determine how much of the variation
    (uncertainty) in Y can be explained by the linear
    relationship with X, and how much of this
    variation remains unexplained
  • To predict new values of Y from new values of X
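
The three purposes above can be sketched in a few lines of plain Python. This is a minimal illustration, not the JMP analysis from the lecture: the data are made up, and the function names are hypothetical helpers introduced only for this example.

```python
from statistics import mean

def fit_simple_regression(x, y):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1*x (purpose 1: describe)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def r_squared(x, y, b0, b1):
    """Proportion of the variation in y explained by the line (purpose 2)."""
    y_bar = mean(y)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_resid = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_resid / ss_total

def predict(x_new, b0, b1):
    """Predicted y for a new x (purpose 3)."""
    return b0 + b1 * x_new

# Made-up illustrative data, not taken from the lecture.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple_regression(x, y)
r2 = r_squared(x, y, b0, b1)
print(f"intercept={b0:.3f} slope={b1:.3f} R2={r2:.4f}")
print(f"prediction at x=6: {predict(6.0, b0, b1):.3f}")
```

The unexplained share of the variation is simply 1 minus the R2 value.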

3
(Figure; axes X and Y)
4
(Figure; axes Y and X)
5
Results from JMP
6
Results from JMP
  • H0: the regression slope equals zero

7
Results from JMP
8
Assumptions of simple linear regression
Model: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
  • Normality: the population of Y-values and the error
    terms (εi) are normally distributed for each
    level of the predictor variable Xi
  • (Test for normality, and transform the data if necessary)

9
(No Transcript)
10
Assumptions of simple linear regression
Model: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
  • Normality: the population of Y-values and the error
    terms (εi) are normally distributed for each
    level of the predictor variable Xi
  • (Test for normality, and transform the data if necessary)
  • Homogeneity of variance: the population of Y-values
    and the error terms (εi) have the same variance
    for each Xi

$\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \dots = \sigma_\varepsilon^2$ (for i = 1 to n)
(Test for homogeneity of variance with a graph of
residuals vs. predicted values)
11
Do you have outliers that are influencing your
results?
If so, should you remove them?
12
Assumptions of simple linear regression
Model: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
  • Normality: the population of Y-values and the error
    terms (εi) are normally distributed for each
    level of the predictor variable Xi
  • (Test for normality, e.g., using box plots, and
    transform the data if necessary)
  • Homogeneity of variance: the population of Y-values
    and the error terms (εi) have the same variance
    for each Xi

$\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \dots = \sigma_\varepsilon^2$ (for i = 1 to n)
(Test for homogeneity of variance with a graph of
residuals vs. predicted values)
  • Independence: the population of Y-values and the
    error terms (εi) are independent of each other,
    i.e., the Y-values for any Xi do not influence
    the Y-values of any other Xi.

(Test independence with a graph of residuals vs.
predicted values)
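
The residuals-vs-predicted check mentioned above can be sketched as follows. This is a rough illustration on made-up data: it computes the (predicted, residual) pairs one would plot, and adds a crude numeric comparison of residual spread in the lower and upper halves of the predicted values. That comparison is an assumption of this sketch, not a test named in the slides.

```python
from statistics import mean, pvariance

def fitted_and_residuals(x, y):
    """Fit y = b0 + b1*x by least squares; return predicted values and residuals."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    return fitted, resid

def spread_by_half(fitted, resid):
    """Residual variance for the lower and upper halves of the predicted values.

    Very different values hint at heterogeneity of variance (a crude check only).
    """
    pairs = sorted(zip(fitted, resid))
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return pvariance(low), pvariance(high)

# Made-up illustrative data, not taken from the lecture.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.2, 2.1, 2.8, 4.3, 4.9, 6.4, 6.6, 8.3]

fitted, resid = fitted_and_residuals(x, y)
for f, r in zip(fitted, resid):
    print(f"predicted={f:6.3f}  residual={r:+.3f}")  # these pairs are what gets plotted

var_low, var_high = spread_by_half(fitted, resid)
print(f"residual variance: low half={var_low:.4f}, high half={var_high:.4f}")
```

A plot of these pairs with no funnel shape and no trend is what the assumptions lead one to expect.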
13
(No Transcript)
14
Multiple regression
  • Linear model with multiple predictor variables
  • When all the predictor variables are continuous:
    multiple regression model
  • When all are categorical: ?

15
(No Transcript)
16
Example
  • Question: which aspects of habitat and human
    activity affect the biodiversity and abundance of
    organisms? This is an important aim of modern
    conservation biology.

Loyn (1987)
  • What characteristics of forest habitat were
    related to the abundance of birds?
  • 56 forest patches in southern Australia
  • Measured bird abundance
  • 6 predictor variables

17
  • Patch area (ha)
  • No. of years since isolated by clearing (yrs)
  • Distance from nearest patch (km)
  • Distance to nearest larger patch (km)
  • Index of cow grazing intensity (1-5)
  • Mean altitude (m)

18
Correlation matrix
               Log10 dist  Log10 L dist  Log10 area  Grazing  Altitude  Years
Log10 dist        1.000
Log10 L dist      0.604       1.000
Log10 area        0.302       0.382        1.000
Grazing          -0.143      -0.034       -0.599      1.000
Altitude         -0.219      -0.274        0.275     -0.407    1.000
Years            -0.020       0.161       -0.278      0.636   -0.233    1.000
19
Assumptions
  • No outlier data points or influential values
  • Response variable not skewed
  • Some heterogeneity of spread of residuals

20
(No Transcript)
21
Results
Model: $(\text{bird abundance})_i = \beta_0 + \beta_1(\log_{10}\text{area})_i + \beta_2(\log_{10}\text{dist})_i + \beta_3(\log_{10}\text{Ldist})_i + \beta_4(\text{grazing})_i + \beta_5(\text{altitude})_i + \beta_6(\text{years})_i + \varepsilon_i$
  • Additive model that does not have interactions
    (multiplicative effects) between predictor
    variables, although such interactions are possible
    (even likely, so hold on and see below)

22
Results from JMP
  • R² = 0.685; what does this mean?
  • Adjusted R² = 0.609
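
For reference, the usual adjustment of R² for the number of predictors can be written as a one-line function. The sketch below plugs in the R² from this slide with n = 56 patches and p = 6 predictors from the example. Under those inputs the formula gives about 0.646 rather than the 0.609 shown here; the JMP output itself is not in the transcript, so the source of the difference cannot be checked.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    r2: ordinary R^2, n: number of observations, p: number of predictors.
    """
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# Inputs taken from the slides: R^2 = 0.685, 56 forest patches, 6 predictors.
adj = adjusted_r_squared(0.685, n=56, p=6)
print(f"adjusted R2 = {adj:.3f}")
```

The adjustment penalises R² for each added predictor, so it is always at most the unadjusted value.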

23
Results from JMP
  • H0: all partial regression slopes equal zero
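
The overall test of this null hypothesis is an F-ratio, which can be recovered from R² alone. The sketch below assumes the values from the example (R² = 0.685, n = 56, p = 6) and only computes the statistic and its degrees of freedom; the p-value would come from an F table or from software such as JMP.

```python
def overall_f(r2, n, p):
    """F statistic for H0: all partial regression slopes equal zero.

    F = (R^2 / p) / ((1 - R^2) / (n - p - 1)), with p and n - p - 1 degrees of freedom.
    """
    df_model = p
    df_error = n - p - 1
    f_stat = (r2 / df_model) / ((1 - r2) / df_error)
    return f_stat, df_model, df_error

# Inputs taken from the slides: R^2 = 0.685, 56 forest patches, 6 predictors.
f_stat, df_model, df_error = overall_f(0.685, n=56, p=6)
print(f"F({df_model}, {df_error}) = {f_stat:.2f}")
```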

24
Next step
  • Now fit a second model to investigate
    interactions between predictor variables
  • A model with 6 predictor variables is unwieldy so
    we simplify the model first by omitting those
    predictors that contributed little to the
    original model

25
Log10 dist, Log10 Ldist, and altitude: lose 'em!
26
Model and Results
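  • Model: $(\text{bird abundance})_i = \beta_0 + \beta_1(\log_{10}\text{area})_i + \beta_2(\text{grazing})_i + \beta_3(\text{years})_i + \beta_4(\log_{10}\text{area} \times \text{grazing})_i + \beta_5(\log_{10}\text{area} \times \text{years})_i + \beta_6(\text{grazing} \times \text{years})_i + \beta_7(\log_{10}\text{area} \times \text{grazing} \times \text{years})_i + \varepsilon_i$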

27
Interpreting interaction terms
  • The log10 area x grazing term indicates how
    much the effect of grazing on bird density
    depends on log10 area.
  • This interaction is significant, so let's look at
    the effects of grazing on bird density for
    different values of log10 area.
  • We choose mean log10 area (0.932) ± one standard
    deviation (0.120, 1.744). Because the three-way
    interaction was not significant, we simply set
    years since isolation to its mean (33.25). The
    simple slopes of bird abundance against grazing
    are then computed for these log10 area values at
    mean years since isolation.
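
A simple slope here is the change in predicted bird abundance per unit of grazing, with the other predictors held at chosen values. For the interaction model it is β(grazing) + β(area x grazing)·log10 area + β(grazing x years)·years + β(area x grazing x years)·log10 area·years. The sketch below evaluates that expression at the three log10 area values and the mean years from the slide. The coefficient values are HYPOTHETICAL placeholders (the fitted estimates are not in the transcript), and the sketch assumes the second two-way term involving grazing is grazing x years.

```python
def simple_slope_grazing(coefs, log_area, years):
    """Slope of abundance against grazing at fixed log10 area and years.

    Derivative of the interaction model with respect to grazing.
    Assumes the model contains a grazing x years term.
    """
    return (
        coefs["grazing"]
        + coefs["area_x_grazing"] * log_area
        + coefs["grazing_x_years"] * years
        + coefs["area_x_grazing_x_years"] * log_area * years
    )

# HYPOTHETICAL coefficients for illustration only; not estimates from the lecture.
coefs = {
    "grazing": -4.0,
    "area_x_grazing": 2.0,
    "grazing_x_years": 0.01,
    "area_x_grazing_x_years": 0.0,
}

mean_years = 33.25                   # mean years since isolation (from the slide)
log_areas = [0.120, 0.932, 1.744]    # mean log10 area minus, at, and plus one SD (from the slide)

slopes = {a: simple_slope_grazing(coefs, a, mean_years) for a in log_areas}
for a, s in slopes.items():
    print(f"log10 area = {a:.3f}: slope of abundance on grazing = {s:.4f}")
```

With a positive area x grazing coefficient, as in these placeholder values, the grazing slope becomes less negative as log10 area increases.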

28
Interpreting interaction terms
  • Which means: the negative effect of cow grazing
    on bird density is stronger in small fragments,
    and there is no relationship between bird
    abundance and grazing in large fragments.

29
Assumptions of multiple regression
  • Same as simple regression, plus:
  • Predictor variables must not be strongly correlated
    with each other. Multicollinearity is critical!
  • Number of observations must exceed the number of
    predictor variables

30
Correlation matrix
               Log10 dist  Log10 L dist  Log10 area  Grazing  Altitude  Years
Log10 dist        1.000
Log10 L dist      0.604       1.000
Log10 area        0.302       0.382        1.000
Grazing          -0.143      -0.034       -0.599      1.000
Altitude         -0.219      -0.274        0.275     -0.407    1.000
Years            -0.020       0.161       -0.278      0.636   -0.233    1.000
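
One common way to summarise multicollinearity from a correlation matrix of the predictors is the variance inflation factor (VIF), which is the corresponding diagonal element of the inverse correlation matrix. The slides do not mention VIF, so this is an added illustration rather than the lecture's method. The sketch uses the correlations listed above, and the placement of 0.275 in the altitude row (under log10 area) is an assumption made when reading the flattened matrix.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

names = ["log10 dist", "log10 Ldist", "log10 area", "grazing", "altitude", "years"]

# Lower triangle as read from the slide; 0.275 in the altitude row is an inferred placement.
lower = [
    [1.000],
    [0.604, 1.000],
    [0.302, 0.382, 1.000],
    [-0.143, -0.034, -0.599, 1.000],
    [-0.219, -0.274, 0.275, -0.407, 1.000],
    [-0.020, 0.161, -0.278, 0.636, -0.233, 1.000],
]
size = len(lower)
corr = [[lower[max(i, j)][min(i, j)] for j in range(size)] for i in range(size)]

inverse = invert(corr)
vif = [inverse[i][i] for i in range(size)]
for name, value in zip(names, vif):
    print(f"{name:12s} VIF = {value:.2f}  tolerance = {1 / value:.2f}")
```

A VIF of 1 means a predictor is uncorrelated with the others; larger values indicate that its slope estimate is less precise because of overlap with the other predictors.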
31
(No Transcript)