Chapter 8 Indicator Variable

Slides: 41
Provided by: statN8

1
Chapter 8 Indicator Variable
  • Ray-Bing Chen
  • Institute of Statistics
  • National University of Kaohsiung

2
8.1 The General Concept of Indicator Variables
  • Variables in regression analysis:
  • Quantitative variables: have a well-defined scale of
    measurement. For example: temperature, distance,
    income.
  • Qualitative variables (categorical variables): for
    example operators, employment status (employed
    or unemployed), shifts (day, evening, or night),
    and sex (male or female). These usually have no
    natural scale of measurement.

3
  • Assign a set of levels to a qualitative variable
    to account for the effect that variable may have on
    the response (an indicator variable or dummy
    variable).
  • For example: the effective life of a cutting tool
    (y) vs. the lathe speed (x1) and the type of
    cutting tool (x2).
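
As a minimal sketch of how such a model can be fitted, the Python snippet below codes the tool type as a 0/1 indicator x2 and estimates y = b0 + b1*x1 + b2*x2 by least squares. The data are hypothetical and noise-free (generated from assumed coefficients 40, -0.03, and 15; they are not the Tool Life Data), and `ols` is an illustrative helper, not a library function.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Hypothetical, noise-free data: x2 = 0 for tool type A, 1 for tool type B.
speeds = [500, 600, 700, 800, 900, 1000]
X, y = [], []
for x2 in (0, 1):
    for x1 in speeds:
        X.append([1.0, float(x1), float(x2)])
        y.append(40.0 - 0.03 * x1 + 15.0 * x2)

b0, b1, b2 = ols(X, y)
# b2 is the vertical shift between two parallel lines:
#   type A: y = b0 + b1*x1
#   type B: y = (b0 + b2) + b1*x1
print(round(b0, 4), round(b1, 4), round(b2, 4))
```

Because the indicator only shifts the intercept, this form assumes both tool types share the same slope with respect to lathe speed.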

7
  • Example 8.1 Tool Life Data
  • The scatter diagram is in Figure 8.2.
  • Two different regression lines.

12
  • Two separate straight-line models vs. a single
    model with an indicator variable.
  • Prefer the single-model approach (a simpler
    practical result).
  • Since we assume the same slope, it makes sense to
    combine the data from both tool types to produce
    a single estimate of this common parameter.
  • The single model also gives one estimate of the
    common error variance σ² and more residual degrees
    of freedom.
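
To make the degrees-of-freedom point concrete, the short sketch below compares the residual degrees of freedom under the two approaches. The sample sizes (10 observations per tool type) are assumed for illustration only.

```python
def residual_df(n_obs, n_params):
    """Residual degrees of freedom: observations minus estimated coefficients."""
    return n_obs - n_params

n_a, n_b = 10, 10  # assumed sample sizes for the two tool types

# Two separate straight lines: each fit estimates an intercept and a slope,
# and each yields its own error-variance estimate.
df_a = residual_df(n_a, 2)
df_b = residual_df(n_b, 2)

# One model with an indicator: intercept, common slope, indicator coefficient.
df_single = residual_df(n_a + n_b, 3)

print(df_a, df_b, df_single)  # 8 8 17
```

Under these assumed sizes, the single model estimates σ² with 17 degrees of freedom, compared with 8 for each separate fit.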

13
  • Regression lines that differ in both intercept and slope
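
One standard way to let the two tool types differ in both intercept and slope is to add a cross-product term between the speed x1 and the indicator x2. The coefficient labels below follow common textbook notation and are an assumption here, since the slide's own equation is not in the transcript:

```latex
\begin{aligned}
y &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon \\
x_2 = 0:\quad y &= \beta_0 + \beta_1 x_1 + \varepsilon \\
x_2 = 1:\quad y &= (\beta_0 + \beta_2) + (\beta_1 + \beta_3)\, x_1 + \varepsilon
\end{aligned}
```

In this form β2 is the change in intercept and β3 the change in slope when moving from x2 = 0 to x2 = 1.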

15
  • Example 8.2 The Tool Life Data

18
  • Example 8.3 An Indicator Variable with More Than
    Two Levels
  • Total electricity consumption (y) vs. the size
    of the house (x1) and the type of air conditioning
    system.
  • Four types of air conditioning systems

19
  • β3 − β4: the relative efficiency of a heat pump
    compared to central air conditioning.
  • Assume the error variance does not depend on the
    type of air conditioning system.
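
A qualitative factor with four levels needs three indicator variables. A minimal sketch of that coding follows. The level names, their order, and the choice of the first level as the baseline are assumptions made for illustration, and `indicators` is a hypothetical helper.

```python
LEVELS = ["none", "window", "heat_pump", "central"]  # assumed names and order

def indicators(level, levels=LEVELS):
    """Return a-1 zero/one indicators; the first level is the baseline."""
    if level not in levels:
        raise ValueError(f"unknown level: {level!r}")
    return [1 if level == other else 0 for other in levels[1:]]

for level in LEVELS:
    print(level, indicators(level))
# none [0, 0, 0]
# window [1, 0, 0]
# heat_pump [0, 1, 0]
# central [0, 0, 1]
```

With the three indicators coded this way, each coefficient measures a level's shift relative to the baseline, and the difference between two coefficients compares two non-baseline levels directly.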

21
  • Example 8.4 More Than One Indicator Variable
  • Add a second qualitative factor, the type of cutting
    oil used, to the model in Example 8.1.

27
8.2 Comments on the Use of Indicator Variables
  • 8.2.1 Indicator Variables versus Regression on
    Allocated Codes
  • Another approach is to measure the levels of a
    qualitative variable by an allocated code.
  • In Example 8.3,

29
  • The allocated codes impose a particular metric on
    the levels of the qualitative factor.
  • Indicator variables are more informative because
    they do not force any particular metric on the
    levels of the qualitative factor.
  • Searle and Udell (1970): regression using
    indicator variables always leads to a larger R²
    than does regression on allocated codes.
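
The R² comparison can be illustrated with a small sketch. The data below are hypothetical, the codes 1 to 4 are an assumed allocation, and the model is simplified to contain only the qualitative factor (no other regressors). Because the allocated-code column is a linear combination of the intercept and the indicator columns, the indicator fit can never have a smaller R².

```python
def r_squared(y, fitted):
    """R^2 = 1 - SSE/SST."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst

# Hypothetical responses for four levels, coded 1..4 (allocated codes).
codes = [1, 1, 2, 2, 3, 3, 4, 4]
y = [10, 12, 11, 13, 30, 32, 20, 22]

# Indicator-variable fit: with an intercept, three indicators, and no other
# regressors, the fitted value for each observation is its level mean.
level_mean = {c: sum(yi for ci, yi in zip(codes, y) if ci == c) / codes.count(c)
              for c in set(codes)}
fit_ind = [level_mean[c] for c in codes]

# Allocated-code fit: simple linear regression of y on the code.
n = len(y)
xbar, ybar = sum(codes) / n, sum(y) / n
slope = (sum((c - xbar) * (yi - ybar) for c, yi in zip(codes, y))
         / sum((c - xbar) ** 2 for c in codes))
fit_code = [ybar + slope * (c - xbar) for c in codes]

r2_ind = r_squared(y, fit_ind)
r2_code = r_squared(y, fit_code)
print(round(r2_ind, 3), round(r2_code, 3))  # 0.985 0.453
```

The gap is large here because the assumed level means are not evenly spaced in the order of the codes, which is exactly the metric the allocated codes impose.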

30
  • 8.2.2 Indicator Variables as a Substitute for a
    Quantitative Regressor
  • A quantitative regressor can also be represented by
    indicator variables.
  • In Example 8.3, for the income factor:
  • Use four indicator variables to represent the
    factor income.

31
  • Disadvantages:
  • More parameters are required to represent the
    information content of the quantitative factor
    (a − 1 vs. 1), which increases the complexity
    of the model.
  • The extra parameters reduce the degrees of freedom
    for error.
  • Advantage: it does not require the analyst to
    make any prior assumptions about the functional
    form of the relationship between the response and
    the regressor variable.
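
A minimal sketch of replacing a quantitative regressor by indicator variables: income is grouped into five classes and represented by four 0/1 indicators. The cutoff values are hypothetical, as is the helper name `income_indicators`. The lowest class serves as the baseline.

```python
from bisect import bisect_right

# Hypothetical income cutoffs (in dollars) separating five classes.
CUTOFFS = [20000, 40000, 60000, 80000]

def income_indicators(income, cutoffs=CUTOFFS):
    """Map an income to a-1 indicators; the lowest class is the baseline."""
    cls = bisect_right(cutoffs, income)  # class index 0..len(cutoffs)
    return [1 if cls == k else 0 for k in range(1, len(cutoffs) + 1)]

print(income_indicators(15000))  # [0, 0, 0, 0]
print(income_indicators(45000))  # [0, 1, 0, 0]
print(income_indicators(95000))  # [0, 0, 0, 1]
```

Each class then gets its own coefficient, so no functional form (linear, quadratic, and so on) is imposed on the income effect, at the cost of four parameters instead of one.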

32
8.3 Regression Approach to Analysis of Variance
  • The analysis of variance (ANOVA) is a technique
    frequently used to analyze data from planned or
    designed experiments.
  • Any ANOVA problem can be treated as a linear
    regression problem.
  • Ordinarily we do not recommend that regression
    methods be used for ANOVA because the specialized
    computing techniques for ANOVA are usually quite
    efficient.

33
  • However, there are some ANOVA situations, particularly
    those involving unbalanced designs, where the
    regression approach is helpful.
  • Essentially, any ANOVA problem can be treated as
    a regression problem in which all of the
    regressors are indicator variables.

34
  • Define the treatment effects in the balanced case
    (an equal number of observations per treatment)
    so that τ1 + τ2 + ... + τk = 0.
  • μi = μ + τi is the mean of the ith treatment.
  • Test H0: τ1 = τ2 = ... = τk = 0 vs. H1: τi ≠ 0
    for at least one i.

36
  • Example: 3 treatments
  • Model: yij = μ + τi + εij, i = 1, 2, 3, j = 1,
    2, ..., n
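
A minimal sketch of the regression approach for three treatments: the F statistic from the usual one-way ANOVA sums of squares is compared with the F statistic from a regression on two indicator variables. The data are hypothetical, the 0/1 coding with treatment 3 as the baseline is one possible choice (assumed here; it is not necessarily the coding used on the slides), and `ols` is an illustrative helper.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Hypothetical balanced data: 3 treatments, n = 4 observations each.
groups = [[10, 12, 11, 13], [14, 15, 13, 16], [20, 19, 21, 22]]
k, n = len(groups), len(groups[0])
y = [v for g in groups for v in g]
grand = sum(y) / len(y)

# Classical one-way ANOVA sums of squares.
means = [sum(g) / n for g in groups]
ss_treat = sum(n * (m - grand) ** 2 for m in means)
ss_error = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
f_anova = (ss_treat / (k - 1)) / (ss_error / (k * n - k))

# Regression on two indicators: x1 = 1 for treatment 1, x2 = 1 for treatment 2.
X = [[1.0, float(i == 0), float(i == 1)]
     for i, g in enumerate(groups) for _ in g]
b = ols(X, y)
fitted = [sum(bj * xj for bj, xj in zip(b, row)) for row in X]
sse_reg = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
ssr_reg = sum((fi - grand) ** 2 for fi in fitted)
f_reg = (ssr_reg / (k - 1)) / (sse_reg / (k * n - k))

print(round(f_anova, 3), round(f_reg, 3))  # 50.4 50.4
```

The two statistics agree because the fitted values from the indicator regression are the treatment means, so the regression and error sums of squares coincide with the treatment and error sums of squares of the ANOVA.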
