Regression - PowerPoint PPT Presentation

Provided by: hunterl4
1
Regression
1. Simplest case: least squares regression
Fit a line to data, to model that data
2. Test hypotheses about the parameters of the fitted model
3. Understand assumptions of the model
4. Describe diagnostic tests to evaluate the fit of the data to the model
5. Explain how to use the model to make predictions
  • ESM 206A
  • 25 February 2009

2
Regression
6. Learn other models: logistic regression, probit regression, multiple regression,
non-linear regression, robust regression, quantile regression
7. Model selection: how to choose an appropriate subset of predictor variables
8. How to compare the relative fit of different models to the same data set

3
Basic idea of regression
4
Linear regression
  • State a hypothesis about cause and effect: the value of the X variable
    causes, either directly or indirectly, the value of the Y variable
  • In some cases the cause and effect is straightforward: the area of rocky
    reef influences the number of lobsters, but the number of lobsters does
    not influence the area of rocky reef
  • Other cases are not so straightforward: do predators control the
    abundance of the prey, or does the number of prey control the number
    of predators?

5
Linear regression
  • Once a decision is made about the direction of the cause and effect,
    the next step is to describe the relationship as a mathematical function:
    $Y = f(X)$
  • We apply the function f to each value of variable X (the input) to
    generate the corresponding value of Y (the output)
  • Many interesting and complex functions can describe the relationship
    between 2 variables, but the simplest one is that Y is a linear
    function of X:
    $Y = \beta_0 + \beta_1 X$
  • This equation describes the graph of a …?

6
Most basic form
[Figure: plot labeled $Y = \beta_0 + \beta_1 X$; y-axis: Number of lobster per trap]
7
$Y = \beta_0 + \beta_1 X$
  • Has 2 parameters, $\beta_0$ and $\beta_1$, which are the …?

$\beta_0$: the predicted value from the equation when X …?
8
$Y = \beta_0 + \beta_1 X$
9
$Y = \beta_0 + \beta_1 X$
  • $\beta_1$, the slope, measures the change in the Y variable for each
    unit change in the X variable
  • The slope is therefore a rate, measured in units of $\Delta Y / \Delta X$
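
The slope's role as a rate can be checked numerically. The sketch below is only an illustration: the parameter values and the helper name `predict` are hypothetical and do not come from the lecture data.

```python
def predict(x, beta0, beta1):
    """Evaluate the linear model Y = beta0 + beta1 * X."""
    return beta0 + beta1 * x


# Hypothetical parameter values, chosen only for illustration.
beta0 = 2.0   # intercept
beta1 = 0.02  # slope: change in Y per unit change in X

# Increase X by one unit and measure how much the prediction changes.
y_at_100 = predict(100.0, beta0, beta1)
y_at_101 = predict(101.0, beta0, beta1)
slope_from_points = (y_at_101 - y_at_100) / (101.0 - 100.0)

print(slope_from_points)  # approximately equal to beta1
```

Whatever pair of X values is used, the ratio of the change in Y to the change in X recovers the same slope, which is what makes the model linear.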

10
  • Nothing says that nature has to obey a linear equation
  • Many economic, ecological, and social relationships are inherently
    non-linear
  • Examples?
  • A linear model is the simplest starting place for fitting functions
    to data
  • Even complex, non-linear functions may be approximately linear over
    a limited range of the X variable. If we restrict our conclusions to
    that range of X, a linear model may be a valid approximation of the
    function.

11
  • Interpolation within the limits of our data may be acceptably accurate,
    even though the linear model (green line) does not describe the true
    functional relationship between Y and X (the black curve)
  • Extrapolation will become increasingly inaccurate as the forecasts
    move farther away from the range of the collected data
  • A very important assumption is that the relationship between
    X and Y (or transformations of these variables) is linear.
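
The contrast between interpolation and extrapolation can be sketched with a made-up example. The saturating curve below is an assumption standing in for the "true" relationship; it is not the curve from the slide, and the observed range of 100 to 300 is likewise hypothetical.

```python
import numpy as np


def true_curve(x):
    # Hypothetical saturating relationship, used only for illustration.
    return 15.0 * x / (200.0 + x)


# Pretend we only collected data for X between 100 and 300.
x_obs = np.linspace(100.0, 300.0, 21)
y_obs = true_curve(x_obs)

# Fit a straight line to the observed range (np.polyfit returns the
# highest-degree coefficient first, so slope comes before intercept).
slope, intercept = np.polyfit(x_obs, y_obs, deg=1)


def linear_error(x):
    """Absolute gap between the fitted line and the true curve at x."""
    return abs((intercept + slope * x) - true_curve(x))


err_inside = linear_error(200.0)    # interpolation, inside the data range
err_far = linear_error(600.0)       # extrapolation
err_farther = linear_error(1000.0)  # extrapolation, even farther out

print(err_inside, err_far, err_farther)
```

Under these assumptions the line tracks the curve closely inside the sampled range, and the error grows as the prediction point moves away from it.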

12
[Figure: y-axis: Number of lobster per trap]
13
Fitting data to a linear model
  • The data for a regression analysis consist of a series of paired
    observations
  • Each observation includes an X value (Xi) and a corresponding Y value
    (Yi), both measured for the same replicate.

14
Fitting data to a linear model
  • But most data sets exhibit more variation than this: a single variable
    rarely will account for most of the variation in the data, so the data
    points will fall within a fuzzy band rather than on a sharp line.
  • The bigger the $\sigma^2$, the more noise, or error, there will be
    around the regression line
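
A small simulation can illustrate how the error variance controls the width of the band around the line. This is a sketch under stated assumptions: the errors are drawn from a normal distribution with mean 0 and standard deviation sigma, and the parameter values, the X range, and the helper name `residual_variance` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Hypothetical "true" line and X values, for illustration only.
beta0, beta1 = 2.0, 0.02
x = np.linspace(0.0, 600.0, 200)


def residual_variance(sigma):
    """Simulate Y = beta0 + beta1*X + error, fit a line, and return the
    variance of the residuals around the fitted line."""
    y = beta0 + beta1 * x + rng.normal(0.0, sigma, size=x.size)
    slope, intercept = np.polyfit(x, y, deg=1)
    residuals = y - (intercept + slope * x)
    # Divide by n - 2 because two parameters were estimated.
    return residuals.var(ddof=2)


var_small = residual_variance(0.5)  # sigma^2 = 0.25
var_large = residual_variance(3.0)  # sigma^2 = 9.0

print(var_small, var_large)
```

The residual variance from each fit lands near the sigma squared used to generate the data, so the second data set forms a much wider band around its line.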

15
Adding some data to the story
Species-area relationship
Relationship between the number of species and
the area of an island (or a sample)
See data: Data_6_Galapagos.xls
  • Number of species seems to follow a power relationship
  • Island areas range over 3 orders of magnitude (1 - 7500 km²)
  • Species richness spans two orders of magnitude (7 - 325)
  • So the data follow a power function, $S = cA^z$

16
Adding some data to the story
Species-area relationship
17
Transforming data
$S = cA^z$
$\log(S) = \log(cA^z)$
$\log(S) = \log(c) + z\log(A)$
$S' = c' + zA'$, where $S' = \log(S)$, $c' = \log(c)$, and $A' = \log(A)$
So we plot the logarithms of the data.
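
The effect of the log transform can be sketched on synthetic numbers. The values of c and z and the list of areas below are hypothetical and noise-free; they are not the Galapagos data referenced in the slides.

```python
import numpy as np

# Hypothetical power-law parameters, for illustration only.
c_true, z_true = 10.0, 0.3

# Synthetic island areas and the species counts the power law implies.
area = np.array([1.0, 10.0, 100.0, 1000.0, 7500.0])
species = c_true * area ** z_true

# On log-log axes the power law becomes a straight line:
# log10(S) = log10(c) + z * log10(A)
z_hat, log_c_hat = np.polyfit(np.log10(area), np.log10(species), deg=1)
c_hat = 10.0 ** log_c_hat

print(z_hat, c_hat)
```

The slope of the fitted line estimates z and the intercept estimates log10(c), so back-transforming the intercept recovers c.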
18
Adding some data to the story
Species-area relationship
[Figure: log10(Number of species) vs. log10(Island area)]
But how do we define the best fit for the line?
19
Adding some data to the story
Species-area relationship
[Figure: log10(Number of species) vs. log10(Island area)]
20
Adding some data to the story
Species-area relationship
[Figure: log10(Number of species) vs. log10(Island area)]
21
Adding some data to the story
Species-area relationship
[Figure: log10(Number of species) vs. log10(Island area)]
22
Adding some data to the story
Species-area relationship
[Figure: log10(Number of species) vs. log10(Island area)]
For any Yi, we could pass the regression line through
the point, so that di = 0
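
Forcing a line through individual points does not by itself give the best overall fit. The sketch below takes di to be the vertical deviation of Yi from the line and uses the least squares criterion named at the start of the lecture, the sum of squared deviations, to compare two candidate lines. The five data points are made up for illustration.

```python
import numpy as np

# Made-up data points, for illustration only.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.8, 1.1, 1.9, 1.8, 2.6])


def deviations(intercept, slope):
    """Vertical deviations d_i = Y_i - (intercept + slope * X_i)."""
    return y - (intercept + slope * x)


def sum_of_squares(intercept, slope):
    return float(np.sum(deviations(intercept, slope) ** 2))


# Least squares estimates from the standard closed-form expressions.
slope_ls = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept_ls = y.mean() - slope_ls * x.mean()

# Alternative: a line forced through the first and last points,
# so that d_i = 0 at those two points.
slope_alt = (y[-1] - y[0]) / (x[-1] - x[0])
intercept_alt = y[0] - slope_alt * x[0]

ss_ls = sum_of_squares(intercept_ls, slope_ls)
ss_alt = sum_of_squares(intercept_alt, slope_alt)

print(ss_ls, ss_alt)
```

The alternative line has zero deviation at two of the points but a larger total, which is why the criterion looks at all deviations together instead of any single point.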
23
Adding some data to the story
Species-area relationship
[Figure: log10(Number of species) vs. log10(Island area)]
24
Adding some data to the story
Species-area relationship
[Figure: log10(Number of species) vs. log10(Island area)]