Indicator Variables - PowerPoint PPT Presentation

Provided by: napierun
Transcript and Presenter's Notes

Title: Indicator Variables


1
Indicator Variables
  • sex (male or female)
  • time (day, evening and night)
  • season (spring, summer, autumn, winter)

2
Dummy variables
3
Note
  • The number of indicator variables needed is always
    1 less than the number of levels.
  • There is usually more than one way to set up the
    variables.
  • The values 0 and 1 are always used.
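
As an illustration of the first point, the Python sketch below codes a qualitative variable with four levels as three 0/1 columns. The helper name `make_indicators`, the sample observations and the choice of "spring" as the level left out are all assumptions made for this example; none of them comes from the slides.

```python
def make_indicators(values, levels, reference):
    """Return one 0/1 column per non-reference level.

    `make_indicators` is a hypothetical helper written for this sketch.
    """
    if reference not in levels:
        raise ValueError("reference must be one of the levels")
    kept = [lev for lev in levels if lev != reference]
    # Each column is 1 where the observation equals that level, 0 otherwise.
    return {lev: [1 if v == lev else 0 for v in values] for lev in kept}


seasons = ["spring", "summer", "autumn", "winter"]
observed = ["spring", "winter", "summer", "summer", "autumn"]  # made-up data

columns = make_indicators(observed, seasons, reference="spring")
for name, col in columns.items():
    print(name, col)
```

A spring observation is the one with 0 in every column, which is why a fourth column would carry no extra information.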

4
Greenfinch data
  • Weight of the greenfinch
  • Wingspan of the bird
  • Age = 0 for birds less than 1, 1 for birds over 1
  • Sex = 0 for males, 1 for females

5
Fitting the equations
  • Once indicator variables are created they are
    entered into the equation in the same way as
    continuous variables.

6
Minitab output
  • The regression equation is
  • weight = - 1.27 + 0.335 wing - 0.289 age + 0.998 sex
  •  
  • Predictor      Coef   SE Coef       T       P
  • Constant     -1.267     8.442   -0.15   0.881
  • wing        0.33452   0.09525    3.51   0.001
  • age         -0.2891    0.3934   -0.73   0.464
  • sex          0.9981    0.4683    2.13   0.035
  •  
  • S = 1.875   R-Sq = 10.5%   R-Sq(adj) = 8.0%
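
The T column in this output is each coefficient divided by its standard error. The short Python check below recomputes it from the Coef and SE Coef values listed above; the dictionary layout is just a convenience for this sketch.

```python
# Coefficients and standard errors copied from the Minitab output above.
fit = {
    "Constant": (-1.267, 8.442),
    "wing": (0.33452, 0.09525),
    "age": (-0.2891, 0.3934),
    "sex": (0.9981, 0.4683),
}

# T = Coef / SE Coef for each predictor.
t_values = {name: coef / se for name, (coef, se) in fit.items()}
for name, t in t_values.items():
    print(f"{name:8s} T = {t:6.2f}")
```

Rounded to two decimals, these agree with the T values Minitab reports.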

7
Note
  • R-Sq is very low (10.5%).
  • This means the equation may not be very good for
    making predictions.
  • BUT the model still gives us information about
    the effect of each of the variables on weight:
  • Weight increases with wing length (P = 0.001)
  • Female birds are, on average, heavier (P = 0.035)
  • Age has no significant effect (P = 0.464)

8
Omitting age
  • The regression equation is
  • weight = - 0.54 + 0.325 wing + 0.968 sex
  •  
  • Predictor      Coef   SE Coef       T       P
  • Constant     -0.545     8.366   -0.07   0.948
  • wing        0.32541   0.09424    3.45   0.001
  • sex          0.9682    0.4655    2.08   0.040
  •  
  • S = 1.871   R-Sq = 10.0%   R-Sq(adj) = 8.4%

9
Equations
  • For males, sex = 0
  • weight = -0.54 + 0.325 wing
  • For females, sex = 1
  • weight = -0.54 + 0.325 wing + 0.968
  • weight = 0.428 + 0.325 wing
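
To see the two equations side by side, the Python sketch below evaluates the fitted model for a male and a female with the same wingspan. The coefficients are the rounded values from the regression of weight on wing and sex; the wingspan of 90 is a hypothetical value chosen only for illustration.

```python
# Rounded coefficients from the regression of weight on wing and sex.
B0, B_WING, B_SEX = -0.54, 0.325, 0.968


def predicted_weight(wing, sex):
    """Fitted weight for a given wingspan; sex is 0 for males, 1 for females."""
    return B0 + B_WING * wing + B_SEX * sex


wing = 90.0  # hypothetical wingspan, not a value taken from the data
male = predicted_weight(wing, sex=0)
female = predicted_weight(wing, sex=1)
print(f"male       {male:.3f}")
print(f"female     {female:.3f}")
print(f"difference {female - male:.3f}")
```

Whatever wingspan is used, the difference equals the sex coefficient, because the two lines have the same gradient.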

10
Plot of data
Females are, on average, 0.968 grams heavier than
males of the same wingspan
11
Interaction terms
  • Weight = b0 + b1 wingspan + b2 sex + b3
    wingspan × sex
  • Previous equation fitted two parallel lines, one
    for each sex.
  • Including an interaction term fits two completely
    separate lines, different intercepts and
    different gradients.
  • The interaction term is not significant for this
    data set.
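
Why the interaction term gives two separate lines can be sketched in Python: setting sex to 0 or 1 in the interaction model leaves a straight line in wingspan with its own intercept and gradient. The function name `lines_by_sex` and the coefficient values are hypothetical; they are not estimates from the greenfinch data.

```python
def lines_by_sex(b0, b1, b2, b3):
    """Intercept and gradient of the fitted line for each sex.

    Model: weight = b0 + b1*wingspan + b2*sex + b3*wingspan*sex,
    with sex = 0 for males and 1 for females.
    """
    male = (b0, b1)              # sex = 0 removes the b2 and b3 terms
    female = (b0 + b2, b1 + b3)  # sex = 1 shifts both intercept and gradient
    return {"male": male, "female": female}


# Hypothetical coefficients, chosen only to make the arithmetic easy to follow.
lines = lines_by_sex(b0=1.0, b1=0.25, b2=-2.0, b3=0.125)
for sex, (intercept, gradient) in lines.items():
    print(f"{sex}: weight = {intercept} + {gradient} * wingspan")
```

With b3 = 0 the two gradients coincide and the model reduces to the parallel-lines case.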

12
To include interaction term
  • Use Calc > Calculator to create a column containing
    wingspan multiplied by sex.
  • Enter the new column into the regression model as
    before.
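
Outside Minitab, the same column is a row-by-row product. A minimal Python sketch, using made-up wingspan and sex values:

```python
# Hypothetical wingspan and sex values, for illustration only.
wingspan = [88.0, 91.5, 86.0, 90.0]
sex = [0, 1, 1, 0]  # 0 = male, 1 = female

# Row-by-row product: zero for males, equal to wingspan for females.
wing_by_sex = [w * s for w, s in zip(wingspan, sex)]
print(wing_by_sex)
```

The new column is then treated as one more predictor alongside wingspan and sex.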

13
Power station example
  • Response variable: log abundance of a species of
    worm
  • Predictor variables: year and time of year
  • Time fitted as 1-6
  • Quarter (season) fitted using 3 indicator
    variables.

14
Time series plot
15
Calc > Make Indicator Variables
  • Quarter is 1, 2, 3 or 4
  • Indicator variables will be
  • q1 = 1 for quarter 1, 0 otherwise
  • q2 = 1 for quarter 2, 0 otherwise
  • Etc.

16
Fitting regression equation
  • Only 3 of the 4 indicator variables are needed.
  • Any 3 can be used. The one omitted can be thought
    of as being used as the reference group.
  • E.g. in the following output quarter 1 is the
    reference group.
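
The role of the reference group can be sketched in Python for the simplest case, a model containing only the quarter indicators. The quarterly means below are hypothetical. With one quarter omitted, the constant is that quarter's mean and each remaining coefficient is a difference from it, so the coefficients depend on the choice of reference but the fitted values do not.

```python
# Hypothetical mean log abundances per quarter, for illustration only.
means = {1: 2.0, 2: 1.5, 3: 1.0, 4: 2.5}


def coefficients(reference):
    """Constant and indicator coefficients when `reference` is omitted."""
    constant = means[reference]
    effects = {q: m - constant for q, m in means.items() if q != reference}
    return constant, effects


def fitted(quarter, reference):
    """Fitted value for a quarter under the given reference group."""
    constant, effects = coefficients(reference)
    return constant + effects.get(quarter, 0.0)


for ref in means:
    print(ref, coefficients(ref), [fitted(q, ref) for q in means])
```

Each row prints different coefficients followed by the same four fitted values.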

17
Output
  • The regression equation is
  • a = 2.13 + 0.00050 t - 0.123 q2 - 1.01 q3 + 0.443
    q4
  •  
  • Predictor       Coef    SE Coef        T       P
  • Constant     2.12900    0.07261    29.32   0.000
  • t           0.000500   0.006393     0.08   0.939
  • q2          -0.12300    0.08112    -1.52   0.158
  • q3          -1.01350    0.08187   -12.38   0.000
  • q4           0.44350    0.08311     5.34   0.000
  •  
  • S = 0.1144   R-Sq = 96.9%   R-Sq(adj) = 95.8%

18
Note
  • t is not significant: there is no evidence of a
    trend.
  • t can be omitted from the equation.
  • q2 is not significant, but q3 and q4 are.
  • Log abundance in quarter 3 is, on average, 1.01
    units lower than the log abundance for quarter 1.
  • For quarter 4 it is 0.443 units higher than for
    quarter 1.
  • It is advisable to keep q2 in the equation
    otherwise the interpretation of the coefficients
    changes.
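
The comparisons with quarter 1 can be reproduced from the fitted equation. The Python sketch below uses the coefficients from the Minitab output; the function name and the choice of t = 1 are assumptions for this example.

```python
# Coefficients copied from the Minitab output (quarter 1 is the reference).
CONSTANT = 2.129
B_T = 0.0005
B_QUARTER = {1: 0.0, 2: -0.123, 3: -1.0135, 4: 0.4435}


def predicted_log_abundance(t, quarter):
    """Fitted log abundance at time t in the given quarter (1 to 4)."""
    return CONSTANT + B_T * t + B_QUARTER[quarter]


t = 1  # any fixed time works; the quarter differences do not depend on t
baseline = predicted_log_abundance(t, 1)
for q in (1, 2, 3, 4):
    value = predicted_log_abundance(t, q)
    print(f"quarter {q}: fitted {value:.4f}, vs quarter 1 {value - baseline:+.4f}")
```

Each difference from quarter 1 is simply the coefficient of that quarter's indicator, which is the interpretation given in the bullets above.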

19
Analysis of covariance
  • An alternative approach to modelling data with a
    combination of qualitative and quantitative
    variables.
  • Uses an analysis of variance model with
    covariates.
  • Conclusions should be the same.

20
Summary
  • All variables qualitative: analysis of variance
  • All variables quantitative: regression analysis
  • Some variables qualitative, some quantitative:
    regression analysis or analysis of covariance