Multiple regression

Slides: 29
Provided by: janl4

Transcript and Presenter's Notes

1
  • Multiple regression,
  • ANCOVA,
  • General Linear Models

2
Multiple regression
3
I have more than one predictor
  • In a manipulative experiment: amount of water and
    dose of nutrients as independent variables for
    the biomass of the plants grown
  • In an observational study: species richness is
    explained by latitude, altitude and annual
    rainfall.

4
In the ideal case, predictors should not be
correlated with each other
  • This can be ensured in an experiment
  • But hardly in an observational study (e.g., it
    would be difficult to find locations in such a
    way that latitude and precipitation would be
    independent)

5
Model
The same assumptions as in simple linear
regression, i.e. random variability is additive
and independent of the expected value (i.e.
homogeneity of variances), and the relation is
linear. Moreover, the effects of the individual
independent variables are additive.
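The model equation itself did not survive in the transcript. A plausible reconstruction, assuming k predictors (the symbol k is not in the original) and the notation used on the following slides, is:

```latex
Y = \alpha + \sum_{i=1}^{k} \beta_i X_i + \varepsilon
```

Here the sum expresses the additivity of the predictor effects, and the error term is added independently of the expected value.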
6
For two predictors, the model is represented by a
plane in three-dimensional space
[Figure: axes ozone, Temperature, Wind velocity]
7
A number of procedures are analogous to simple
regression
  • The coefficients $\alpha$ and $\beta_i$ (one for each
    predictor) are population values, which are
    unknown; we estimate them from a sample as the
    coefficients $a$ and $b_i$.
  • $\beta_i$ (for the population) or $b_i$ (for the
    sample) is a slope (dependent on the units used)
  • Least-squares criterion: minimize the residual
    sum of squares.
  • Tests: either ANOVA of the whole model, or
    (using t-tests) tests of individual regression
    coefficients
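
The least-squares estimation described above can be sketched with NumPy. The data are made up for illustration (they are not from the presentation), and the variable names are assumptions.

```python
import numpy as np

# Made-up data: 6 plants, two predictors (water, nutrients), one response.
water = np.array([1.0, 2.0, 3.0, 1.0, 2.0, 3.0])
nutrients = np.array([0.0, 0.0, 0.0, 10.0, 10.0, 10.0])
noise = np.array([0.1, -0.2, 0.1, -0.1, 0.2, -0.1])  # fixed, for reproducibility
biomass = 5.0 + 2.0 * water + 0.3 * nutrients + noise

# Design matrix: a column of ones for the intercept a, then one column per predictor.
X = np.column_stack([np.ones_like(water), water, nutrients])

# Least squares: coefficients that minimize the residual sum of squares.
coef, *_ = np.linalg.lstsq(X, biomass, rcond=None)
a, b_water, b_nutrients = coef

residuals = biomass - X @ coef
residual_ss = float(residuals @ residuals)
print(a, b_water, b_nutrients, residual_ss)
```

At the least-squares solution the residuals are orthogonal to every column of the design matrix, which is a convenient check that the fit is correct.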

8
In contrast to simple regression, the meaning of
the tests differs
  • ANOVA of the whole model: H0: the response is
    independent of all the predictors, i.e.
    $\beta_i = 0$ for all $i$
  • Separate null hypotheses for individual
    predictors: $\beta_i = 0$, each relating to an
    individual variable.

9
The range of predictor values can differ
considerably, and slope values depend on the
units used.
[Figure: labels Water, Nutrients, P.High]
10
ANOVA of the whole model
Decomposition of the sum of squares:
$SS_{TOT} = SS_{Regress} + SS_{Resid}$
$DF_{TOT} = n - 1$, $DF_{Regress}$ = number of variables,
$DF_{Resid} = n - 1 -$ number of variables
Classically, $MS = SS/DF$ is an estimate of the
population variance if H0 is true; all this
leads to the classic F-distribution.
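As a rough sketch of this decomposition, the following NumPy snippet computes the sums of squares, degrees of freedom and the F statistic for a whole model. The data are invented for illustration; obtaining a p-value would additionally require an F-distribution function, which is not shown here.

```python
import numpy as np

# Made-up data: n = 8 observations, p = 2 predictors.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0])
y = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9, 15.2, 15.8])

n, p = len(y), 2
X = np.column_stack([np.ones(n), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coef

# Sums of squares: total = regression + residual.
ss_tot = np.sum((y - y.mean()) ** 2)
ss_regress = np.sum((fitted - y.mean()) ** 2)
ss_resid = np.sum((y - fitted) ** 2)

# Degrees of freedom and mean squares.
df_tot, df_regress, df_resid = n - 1, p, n - 1 - p
f_stat = (ss_regress / df_regress) / (ss_resid / df_resid)
print(ss_tot, ss_regress, ss_resid, f_stat)
```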
11
$R^2$: coefficient of determination
Proportion of variability explained by the model.
$R^2_{adj}$ (adjusted): various corrections exist. With
many independent variables and relatively few
observations, $R^2$ is higher in our sample than in
the population. The number of observations should
be considerably higher than the number of
predictors. When number of observations = number
of predictors + 1, the model fits all points
perfectly (but the predictive ability of the
model is nil).
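The saturated case can be demonstrated with a short sketch: pure random noise is fitted perfectly once the number of observations equals the number of predictors plus one. The adjustment formula used below is one common correction, chosen here as an assumption; the slides do not say which one they mean.

```python
import numpy as np

def r_squared(X_pred, y):
    """Return (R2, adjusted R2) for a least-squares fit with intercept."""
    n, p = X_pred.shape
    X = np.column_stack([np.ones(n), X_pred])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    ss_resid = np.sum((y - X @ coef) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - ss_resid / ss_tot
    # One common adjustment; undefined when no residual degrees of freedom remain.
    if n - p - 1 > 0:
        r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    else:
        r2_adj = float("nan")
    return r2, r2_adj

rng = np.random.default_rng(0)

# Response and predictors are unrelated random numbers in both cases.
r2_saturated, _ = r_squared(rng.normal(size=(4, 3)), rng.normal(size=4))     # n = p + 1
r2_30, r2_adj_30 = r_squared(rng.normal(size=(30, 3)), rng.normal(size=30))  # n >> p
print(r2_saturated, r2_30, r2_adj_30)
```

The adjusted value is always below the raw one, and the gap grows as the number of predictors approaches the number of observations.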
12
Partial regression coefficients
How much a given variable explains in addition to
all the other variables in the model ("in addition"
is especially important to stress if the
predictors are correlated)
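A small constructed example may make the "in addition" point concrete. In the made-up data below, the response depends only on x1, but x2 is correlated with x1, so x2 looks influential on its own and contributes nothing once x1 is in the model.

```python
import numpy as np

x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([0.0, 1.0, 1.0, 2.0, 3.0, 3.0])  # correlated with x1
y = 1.0 + 2.0 * x1                              # depends on x1 only (no noise)

def ols(columns, y):
    """Least-squares coefficients (intercept first) for the given predictor columns."""
    X = np.column_stack([np.ones(len(y))] + list(columns))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Marginal effect: x2 as the only predictor.
marginal_b2 = ols([x2], y)[1]

# Partial effects: both predictors in the model.
_, partial_b1, partial_b2 = ols([x1, x2], y)
print(marginal_b2, partial_b1, partial_b2)
```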
13
Tests of partial regression coefficients
Beta in the Statistica program is something
different from our $\beta$ (which, in principle,
cannot be computed from a finite sample). It is
the standardized partial regression coefficient,
computed after Z-transformation of all the
variables (both predictors and response). The
regression plane then passes through the origin.
14
Tests of partial regression coefficients
Beta (i.e. the standardized regression coefficient)
indicates the relative size of a predictor's effect
(with regard to the range of predictor values
used); it is independent of the units used.
B (the $b$ in our model) is used to construct the
function $Y = a + \sum_i b_i X_i$ and thus depends on
the units of measurement. It translates a change in
the predictor into a change in the response.
15
Tests of partial regression coefficients
Beta: how much the (standardized) response changes
when the predictor changes by a proportional part
of its variability (one standard deviation after
Z-transformation).
B: how much the response changes, in its own units,
when the predictor changes by one of its units.
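The relation between B and Beta can be checked numerically. The sketch below uses invented data and assumes Z-transformation with the sample standard deviation; it shows that Beta is unchanged when a predictor is expressed in other units, while B is not.

```python
import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([10.0, 30.0, 20.0, 50.0, 40.0, 60.0])
y = np.array([2.0, 4.1, 4.9, 8.2, 8.8, 12.1])

def fit(X_pred, y):
    """Least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), X_pred])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def zscore(v):
    """Z-transformation: subtract the mean, divide by the sample standard deviation."""
    return (v - v.mean(axis=0)) / v.std(axis=0, ddof=1)

preds = np.column_stack([x1, x2])
B = fit(preds, y)                      # raw coefficients: a, b1, b2
Beta = fit(zscore(preds), zscore(y))   # standardized coefficients

# Equivalent shortcut: Beta_i = b_i * sd(x_i) / sd(y).
beta_from_b = B[1:] * preds.std(axis=0, ddof=1) / y.std(ddof=1)

# Expressing x2 in different units (here divided by 1000) changes B but not Beta.
preds_rescaled = np.column_stack([x1, x2 / 1000.0])
B_rescaled = fit(preds_rescaled, y)
Beta_rescaled = fit(zscore(preds_rescaled), zscore(y))
print(B, Beta, B_rescaled, Beta_rescaled)
```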
16
Tests of partial regression coefficients
For testing we use
$t = B / \mathrm{s.e.}(B) = \mathrm{Beta} / \mathrm{s.e.}(\mathrm{Beta})$
The standard error depends considerably on the
correlation among predictors! The test for the
intercept is, again, usually uninteresting.
Attention: the results of the ANOVA and of the
partial coefficient tests do not have to
correspond to each other!
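The effect of predictor correlation on the standard error can be illustrated with simulated data. This is a sketch under the usual least-squares formula for coefficient standard errors; the simulation settings (sample size, seed, coefficients) are arbitrary assumptions.

```python
import numpy as np

def coef_and_se(X_pred, y):
    """Least-squares coefficients and their standard errors (intercept first)."""
    n = len(y)
    X = np.column_stack([np.ones(n), X_pred])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    df_resid = n - X.shape[1]
    sigma2 = resid @ resid / df_resid                  # residual mean square
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return coef, se

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
noise = rng.normal(size=n)
x2_indep = rng.normal(size=n)              # unrelated to x1
x2_corr = x1 + 0.1 * rng.normal(size=n)    # almost a copy of x1

coef_i, se_i = coef_and_se(np.column_stack([x1, x2_indep]), 1 + 2 * x1 + 3 * x2_indep + noise)
coef_c, se_c = coef_and_se(np.column_stack([x1, x2_corr]), 1 + 2 * x1 + 3 * x2_corr + noise)

# t statistics for each coefficient: estimate divided by its standard error.
t_indep = coef_i / se_i
t_corr = coef_c / se_c
print(se_i[1], se_c[1], t_indep[1], t_corr[1])
```

With the same true slope and the same noise, the standard error of the x1 coefficient is several times larger when x2 is nearly collinear with x1, so its t statistic is correspondingly smaller.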
17
Marginal and partial effects
18
It is not always an advantage to have many
predictors
There are several methods for simplifying the
model (usually used in observational studies). It
is better to use your head first and not to put
everything into the program just because it came
from an automatic analyzer.
Stepwise selection of predictors: forward,
backward, etc.
Criteria that weigh explanatory power against a
penalty for complexity (AIC).
Jackknife and similar methods.
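A minimal sketch of forward selection driven by AIC is given below. It assumes a Gaussian linear model, for which AIC can be written (up to an additive constant) as n·ln(RSS/n) + 2k; the function names and the simulated data are hypothetical, and statistical packages implement more careful variants.

```python
import numpy as np

def aic_gaussian(X_pred, y):
    """AIC (up to an additive constant) of a least-squares model with intercept."""
    n = len(y)
    X = np.column_stack([np.ones(n), X_pred])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ coef) ** 2)
    k = X.shape[1] + 1              # regression coefficients plus the error variance
    return n * np.log(rss / n) + 2 * k

def forward_select(predictors, y):
    """Greedy forward selection: add the predictor that lowers AIC most, until none does."""
    chosen = []
    best = aic_gaussian(np.empty((len(y), 0)), y)   # intercept-only model
    improved = True
    while improved:
        improved = False
        for name in predictors:
            if name in chosen:
                continue
            cols = np.column_stack([predictors[c] for c in chosen + [name]])
            score = aic_gaussian(cols, y)
            if score < best:
                best, candidate, improved = score, name, True
        if improved:
            chosen.append(candidate)
    return chosen, best

rng = np.random.default_rng(2)
n = 100
predictors = {"x1": rng.normal(size=n), "x2": rng.normal(size=n), "junk": rng.normal(size=n)}
y = 1.0 + 2.0 * predictors["x1"] - 1.5 * predictors["x2"] + rng.normal(size=n)

chosen, best_aic = forward_select(predictors, y)
print(chosen, best_aic)
```

Note that an unrelated predictor can still be selected by chance, which is one reason to think about the candidate predictors before running any automatic procedure.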
19
Mind variables on a circular scale used as
predictors
We can hardly expect a linear response to
1. Orientation of a slope (or of anything), measured
e.g. in degrees or radians
2. Julian day
3. Hour of the day
Various solutions exist (e.g. northness and
eastness for orientation)
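One common way to define northness and eastness, assumed here because the slides do not give a formula, is the cosine and sine of the aspect measured clockwise from north. The same idea applies to day of year or hour of day. The helper name `cyclic_encode` is hypothetical.

```python
import math

def cyclic_encode(value, period):
    """Map a circular variable onto two linear ones: (cosine part, sine part)."""
    angle = 2.0 * math.pi * value / period
    return math.cos(angle), math.sin(angle)

# Aspect in degrees clockwise from north: cosine is northness, sine is eastness.
northness_n, eastness_n = cyclic_encode(0.0, 360.0)      # slope facing north
northness_e, eastness_e = cyclic_encode(90.0, 360.0)     # slope facing east

# 359 degrees and 1 degree are far apart as raw numbers but nearly identical here.
north_359, east_359 = cyclic_encode(359.0, 360.0)
north_1, east_1 = cyclic_encode(1.0, 360.0)

# The same encoding for an hour of the day (period 24).
hour_cos, hour_sin = cyclic_encode(18.0, 24.0)
print(northness_n, eastness_e, north_359, north_1, hour_cos, hour_sin)
```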
20
  • General Linear Models

21
We have had
ANOVA model: $X_{ij} = \mu + \alpha_i + \varepsilon_{ij}$
Possibly with more categorical variables. We can
compute the mean as $\sum X / n$, but it can also be
obtained by the method of least residual sum of
squares.
Regression
Generally: $Y$ = deterministic part of the model + $\varepsilon$
When the deterministic part is a combination of
categorical and quantitative predictors whose
individual effects are additive, the model is a
General Linear Model (mind the abbreviation GLM,
which is also used for generalized linear models)
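That group means are themselves least-squares estimates can be verified directly. The sketch below, on made-up data, codes group membership as indicator columns and compares the least-squares solution with ordinary arithmetic means.

```python
import numpy as np

# Made-up data: 8 observations in 3 groups.
group = np.array([0, 0, 0, 1, 1, 1, 2, 2])
x = np.array([4.0, 5.0, 6.0, 9.0, 10.0, 11.0, 1.0, 3.0])

# One indicator (dummy) column per group, no separate intercept.
D = (group[:, None] == np.arange(3)).astype(float)

# Least squares on the indicator columns ...
ls_means, *_ = np.linalg.lstsq(D, x, rcond=None)

# ... gives the same numbers as the arithmetic mean of each group.
plain_means = np.array([x[group == g].mean() for g in range(3)])
print(ls_means, plain_means)
```

Because categorical predictors enter the model as such indicator columns, they can sit in the same design matrix as quantitative predictors, which is what makes the general linear model possible.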
22
Examples
  • Number of species in a community: rock
    (categorical), type of land management
    (categorical), altitude (quantitative)
  • Level of cholesterol: sex (categorical), age
    (quantitative), amount of flitch (bacon) consumed
    (quantitative)
  • Level of heterozygosity: ploidy (categorical,
    probably), population size (quantitative)

23
Various formulations of the model make it possible
to test whether
  • two regression lines are the same
  • they are not the same, but have the same slope
  • they even have different slopes (then the
    interaction of the quantitative variable and the
    factor, i.e. the categorical variable, is
    significant)
  • and a lot of similar questions
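
The three questions above correspond to three nested design matrices. The sketch below, on invented data for two groups, fits each and compares them with F statistics for the change in residual sum of squares; p-values are omitted because they would need an F-distribution function.

```python
import numpy as np

def rss_and_df(X, y):
    """Residual sum of squares and residual degrees of freedom of a least-squares fit."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid), len(y) - X.shape[1]

def f_change(rss_small, df_small, rss_big, df_big):
    """F statistic for the improvement of the bigger model over the nested smaller one."""
    return ((rss_small - rss_big) / (df_small - df_big)) / (rss_big / df_big)

# Made-up data: two groups (g = 0 or 1) measured along x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0, 4.0, 5.0])
g = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1, 4.0, 6.1, 7.9, 10.2, 11.8])
one = np.ones_like(x)

rss_same, df_same = rss_and_df(np.column_stack([one, x]), y)                   # one common line
rss_parallel, df_parallel = rss_and_df(np.column_stack([one, x, g]), y)        # shifted, same slope
rss_separate, df_separate = rss_and_df(np.column_stack([one, x, g, x * g]), y) # different slopes

f_shift = f_change(rss_same, df_same, rss_parallel, df_parallel)
f_interaction = f_change(rss_parallel, df_parallel, rss_separate, df_separate)
print(f_shift, f_interaction)
```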

24
ANCOVA (analysis of covariance)
  • Probably the most common of the general linear
    models
  • We assume that the lines are parallel to each
    other
  • Most often we want to filter out some
    disturbing effect, which should lead to lower
    error variability

25
Example
  • I compare the weight of members of a sports
    club and of a beer club. As weight depends on
    body height (which is trivial), I will have quite
    large variability in both groups
  • I will use height as a covariate
  • In principle, I test whether the lines of
    dependence of weight on height are the same or
    shifted, and I assume they have the same slope
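
The club example can be sketched numerically. All heights and weights below are invented; the point is only to show how including height as a covariate reduces the error mean square compared with a plain comparison of the two groups.

```python
import numpy as np

def fit(X, y):
    """Least-squares coefficients and error mean square (residual SS / residual DF)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(resid @ resid) / (len(y) - X.shape[1])

# Made-up data: 5 sports-club members (beer = 0) and 5 beer-club members (beer = 1).
height = np.array([165.0, 172.0, 180.0, 188.0, 195.0, 166.0, 171.0, 181.0, 187.0, 196.0])  # cm
beer = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0])
weight = np.array([60.0, 66.0, 73.0, 80.0, 87.0, 69.0, 73.0, 82.0, 87.0, 96.0])            # kg
one = np.ones_like(height)

# Without the covariate: a plain comparison of two group means.
_, ms_error_anova = fit(np.column_stack([one, beer]), weight)

# ANCOVA: parallel lines of weight on height, shifted between the clubs.
coef_ancova, ms_error_ancova = fit(np.column_stack([one, height, beer]), weight)
common_slope, shift = coef_ancova[1], coef_ancova[2]
print(ms_error_anova, ms_error_ancova, common_slope, shift)
```

The estimated shift is the difference between the clubs at equal height, tested against a much smaller error variability than in the comparison that ignores height.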

26
Example
  • An experiment with rats: I suspect that the
    result will depend on their weight, but it is
    impossible to have all rats with the same weight
  • I use rat weight at the beginning of the
    experiment as a covariate
  • At the same time, I will try my best to have
    rats of similar weight in all groups (so that
    the predictors, rat weight and experimental
    group, are independent)

27
How can I decide when to use a variable as
quantitative and when as categorical?
  • The fewer degrees of freedom the model takes,
    the more powerful the test
  • The more degrees of freedom the model takes,
    the better the fit
  • And what now...

28
Fertilization: 0, 70 and 140 kg N/ha, effect on
crop yield
Two possible models:
Regression: Yield = a + b · dose of fertilizer +
error. It assumes a linear increase of yield with
the dose and takes one degree of freedom.
ANOVA: Yield = grand mean + specific effect of the
dose + error. It does not presume a linear
relation and uses two degrees of freedom.
If the assumption of linearity is true, the
regression test will be more powerful (but both
are all right); if it is false, regression will be
quite absurd.
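The two models can be compared on a small invented data set in which yield levels off at the highest dose. The yields are made up; the comparison of residual sums of squares (a lack-of-fit F statistic) is one conventional way to check whether the linear model is adequate and is offered here as a sketch.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of a least-squares fit."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

# Made-up yields (t/ha), three plots per dose; the response saturates at 140 kg N/ha.
dose = np.array([0.0, 0.0, 0.0, 70.0, 70.0, 70.0, 140.0, 140.0, 140.0])
yield_t = np.array([3.0, 3.4, 3.2, 5.1, 4.9, 5.3, 5.4, 5.6, 5.2])
n = len(yield_t)

# Regression: intercept + slope, the dose takes 1 degree of freedom.
X_reg = np.column_stack([np.ones(n), dose])
rss_reg, df_resid_reg = rss(X_reg, yield_t), n - 2

# ANOVA: one mean per dose level, the dose takes 2 degrees of freedom.
X_anova = (dose[:, None] == np.array([0.0, 70.0, 140.0])).astype(float)
rss_anova, df_resid_anova = rss(X_anova, yield_t), n - 3

# Lack of fit: how much worse is the straight line than the free group means?
f_lack_of_fit = ((rss_reg - rss_anova) / (df_resid_reg - df_resid_anova)) / (rss_anova / df_resid_anova)
print(rss_reg, rss_anova, f_lack_of_fit)
```

A large lack-of-fit statistic, as with these saturating yields, suggests that the straight-line model is inappropriate and that treating the dose as categorical is safer.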