Transcript and Presenter's Notes

Title: Simple Regression


1
Chapter 3
  • Simple Regression

2
What is in this Chapter?
  • This chapter starts with a linear regression model with one explanatory variable, and states the assumptions of this basic model.
  • It then discusses two methods of estimation: the method of moments (MM) and the method of least squares (LS).
  • The method of maximum likelihood (ML) is discussed in the appendix.

3
3.1 Introduction
  • Example 1: Simple Regression
  • y = sales
  • x = advertising expenditures
  • Here we try to determine the relationship between
    sales and advertising expenditures.

4
3.1 Introduction
  • Example 2: Multiple Regression
  • y = consumption expenditures of a family
  • x1 = family income
  • x2 = financial assets of the family
  • x3 = family size

5
3.1 Introduction
  • There are several objectives in studying these
    relationships.
  • They can be used to:
  • 1. Analyze the effects of policies that involve changing the individual x's. In Example 1 this involves analyzing the effect of changing advertising expenditures on sales.
  • 2. Forecast the value of y for a given set of
    x's.
  • 3. Examine whether any of the x's have a
    significant effect on y.

6
3.1 Introduction
  • Given the way we have set up the problem until now, the variable y and the x variables are not on the same footing.
  • Implicitly we have assumed that the x's are variables that influence y, or that we can control or change, and that y is the effect variable.
  • There are several alternative terms used in the
    literature for y and x1, x2,..., xk.
  • These are shown in Table 3.1.

7
3.1 Introduction
  • Table 3.1 Classification of variables in regression analysis

        y                       x1, x2, ..., xk
        -------------------     -----------------------
        Predictand              Predictors
        Regressand              Regressors
        Explained variable      Explanatory variables
        Dependent variable      Independent variables
        Effect variable         Causal variables
        Endogenous variable     Exogenous variables
        Target variable         Control variables

8
3.2 Specification of the Relationships
  • As mentioned in Section 3.1, we will discuss the
    case of one explained (dependent) variable, which
    we denote by y, and one explanatory (independent)
    variable, which we denote by x.
  • The relationship between y and x is denoted by
        y = f(x)                                        (3.1)
  • where f(x) is a function of x.

9
3.2 Specification of the Relationships
  • Going back to equation (3.1), we will assume that the function f(x) is linear in x, that is,
        f(x) = α + βx
  • and we will assume that this relationship is a stochastic relationship, that is,
        y = α + βx + u                                  (3.2)
  • where u, which is called an error or disturbance, has a known probability distribution (i.e., u is a random variable).

10
3.2 Specification of the Relationships
11
3.2 Specification of the Relationships
  • In equation (3.2), α + βx is the deterministic component of y and u is the stochastic or random component.
  • α and β are called regression coefficients or regression parameters that we estimate from the data on y and x.

12
3.2 Specification of the Relationships
  • Why should we add an error term u?
  • What are the sources of the error term u in equation (3.2)?
  • There are three main sources:

13
3.2 Specification of the Relationships
  • 1. Unpredictable element of randomness in human responses.
  • Example: if y = consumption expenditure of a household and x = disposable income of the household, there is an unpredictable element of randomness in each household's consumption.
  • The household does not behave like a
    machine. In one month the people in the
    household are on a spending spree. In another
    month they are tightfisted.

14
3.2 Specification of the Relationships
  • 2. Effect of a large number of omitted variables.
  • Again in our example x is not the only variable
    influencing y. The family size, tastes of the
    family, spending habits, and so on, affect the
    variable y.
  • The error u is a catchall for the effects of all
    these variables, some of which may not even be
    quantifiable, and some of which may not even be
    identifiable.
  • To a certain extent some of these variables are
    those that we refer to in source 1.

15
3.2 Specification of the Relationships
  • 3. Measurement error in y.
  • In our example this refers to measurement error
    in the household consumption. That is, we cannot
    measure it accurately.
  • This argument for u is somewhat difficult to
    justify, particularly if we say that there is no
    measurement error in x (household disposable
    income). The case where both y and x are measured
    with error is discussed in Chapter 11.
  • Since we have to go step by step and not introduce all the complications initially, we will accept this argument; that is, there is measurement error in y but not in x.

16
3.2 Specification of the Relationships
  • If we have n observations on y and x, we can write equation (3.2) as
        y_i = α + βx_i + u_i        i = 1, 2, ..., n    (3.3)
  • Our objective is to get estimates of the unknown parameters α and β in equation (3.3) given the n observations on y and x.

17
3.2 Specification of the Relationships
  • To do this we have to make some assumptions about the error terms u_i. The assumptions we make are:
  • 1. Zero mean: E(u_i) = 0 for all i.
  • 2. Common variance: var(u_i) = σ² for all i.
  • 3. Independence: u_i and u_j are independent for all i ≠ j.

18
3.2 Specification of the Relationships
  • 4. Independence of u_i and x_j: u_i and x_j are independent for all i and j. This assumption automatically follows if the x_i are considered nonrandom variables. With reference to Figure 3.1, what this says is that the distribution of u does not depend on the value of x.
  • 5. Normality: the u_i are normally distributed for all i. In conjunction with assumptions 1, 2, and 3, this implies that the u_i are independently and normally distributed with mean zero and a common variance σ². We write this as u_i ~ IN(0, σ²).
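  • As a concrete illustration of model (3.3) under assumptions 1-5, the following minimal sketch simulates such data. It assumes Python with numpy; the parameter values and sample size are arbitrary illustrative choices, not from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative "true" parameter values (arbitrary, not from the text)
alpha, beta, sigma = 2.0, 0.5, 1.0
n = 100

x = rng.uniform(0.0, 10.0, n)   # x treated as fixed (nonrandom) in repeated samples
u = rng.normal(0.0, sigma, n)   # u_i ~ IN(0, sigma^2): zero mean, common variance
y = alpha + beta * x + u        # equation (3.3): y_i = alpha + beta * x_i + u_i
```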

19
3.2 Specification of the Relationships
  • These are the assumptions with which we start. We
    will, however, relax some of these assumptions in
    later chapters.
  • Assumption 2 is relaxed in Chapter 5.
  • Assumption 3 is relaxed in Chapter 6.
  • Assumption 4 is relaxed in Chapter 9.

20
3.2 Specification of the Relationships
  • We will discuss three methods for estimating the parameters α and β:
  • 1. The method of moments (MM).
  • 2. The method of least squares (LS).
  • 3. The method of maximum likelihood (ML).

21
3.3 The Method of Moments
  • The assumptions we have made about the error term u imply that
        E(u) = 0  and  E(xu) = 0
  • In the method of moments, we replace these conditions by their sample counterparts.
  • Let α̂ and β̂ be the estimators for α and β, respectively. The sample counterpart of u_i is the estimated error (which is also called the residual), defined as
        û_i = y_i − α̂ − β̂x_i

22
3.3 The Method of Moments
  • The two equations to determine α̂ and β̂ are obtained by replacing the population assumptions by their sample counterparts:
        (1/n) Σ û_i = 0  and  (1/n) Σ x_i û_i = 0

23
3.3 The Method of Moments
  • In these and the following equations, Σ denotes summation over i = 1 to n. Thus we get the two equations
        Σ û_i = 0  and  Σ x_i û_i = 0
  • These equations can be written as (noting that û_i = y_i − α̂ − β̂x_i)
        Σ (y_i − α̂ − β̂x_i) = 0  and  Σ x_i (y_i − α̂ − β̂x_i) = 0
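  • A minimal sketch of the method of moments in code (Python with numpy assumed; x and y are data arrays such as those simulated earlier). The two sample moment conditions form a 2x2 linear system in α̂ and β̂, which the code solves directly.

```python
import numpy as np

def mm_estimates(x, y):
    """Method of moments: solve the two sample moment conditions
    (1/n) sum(u_hat_i) = 0 and (1/n) sum(x_i * u_hat_i) = 0,
    rewritten as a 2x2 linear system in (alpha_hat, beta_hat)."""
    n = len(x)
    A = np.array([[n,       x.sum()],
                  [x.sum(), (x ** 2).sum()]])
    b = np.array([y.sum(), (x * y).sum()])
    alpha_hat, beta_hat = np.linalg.solve(A, b)
    return alpha_hat, beta_hat
```

  • With data simulated from model (3.3), mm_estimates(x, y) should return values close to the true α and β.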

24
3.4 The Method of Least Squares
  • The method of least squares requires that we choose α̂ and β̂ as estimates of α and β, respectively, so that
        Q = Σ (y_i − α̂ − β̂x_i)²
  • is a minimum.
  • Q is also the sum of squares of the (within-sample) prediction errors when we predict y_i given x_i and the estimated regression equation.
  • We will show in the appendix to this chapter that
    the least squares estimators have desirable
    optimal properties.

25
3.4 The Method of Least Squares
  • Differentiating Q with respect to α̂ and β̂ and equating the derivatives to zero, we get
        Σ (y_i − α̂ − β̂x_i) = 0
  • or
        ȳ = α̂ + β̂x̄                                    (3.6)
  • and
        Σ x_i (y_i − α̂ − β̂x_i) = 0                     (3.7)

26
3.4 The Method of Least Squares
  • Let us define
        S_xx = Σ (x_i − x̄)²,  S_xy = Σ (x_i − x̄)(y_i − ȳ)
  • and
        S_yy = Σ (y_i − ȳ)²
  • Solving equations (3.6) and (3.7) then gives the least squares estimators
        β̂ = S_xy / S_xx  and  α̂ = ȳ − β̂x̄
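  • A sketch of these least squares formulas in code (same numpy assumptions as before). Note that it reproduces the method-of-moments solution exactly: for this model the two estimation methods coincide.

```python
import numpy as np

def ls_estimates(x, y):
    """Least squares estimates computed via S_xx and S_xy."""
    x_bar, y_bar = x.mean(), y.mean()
    S_xx = ((x - x_bar) ** 2).sum()
    S_xy = ((x - x_bar) * (y - y_bar)).sum()
    beta_hat = S_xy / S_xx                 # slope: beta_hat = S_xy / S_xx
    alpha_hat = y_bar - beta_hat * x_bar   # intercept from (3.6): y_bar = alpha_hat + beta_hat * x_bar
    return alpha_hat, beta_hat
```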

27
3.4 The Method of Least Squares
  • The residual sum of squares (to be denoted by RSS) is given by
        RSS = Σ û_i² = Σ (y_i − α̂ − β̂x_i)² = S_yy − S_xy² / S_xx

28
3.4 The Method of Least Squares
  • But S_xy² / S_xx = β̂S_xy. Hence we have
        S_yy = β̂S_xy + RSS
  • S_yy is usually denoted by TSS (total sum of squares) and β̂S_xy is usually denoted by ESS (explained sum of squares).
  • Thus TSS = ESS + RSS
        (total)   (explained)   (residual)

29
3.4 The Method of Least Squares
  • The proportion of the total sum of squares explained is denoted by r², where r is called the correlation coefficient.
  • Thus r² = ESS/TSS and 1 − r² = RSS/TSS. If r² is high (close to 1), then x is a good explanatory variable for y.
  • The term r² is called the coefficient of determination and must fall between zero and 1 for any given regression.

30
3.4 The Method of Least Squares
  • If r² is close to zero, the variable x explains very little of the variation in y. If r² is close to 1, the variable x explains most of the variation in y.
  • The coefficient of determination is given by
        r² = S_xy² / (S_xx · S_yy)
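  • A minimal sketch computing this decomposition and r² from data (numpy arrays x and y assumed, as before):

```python
import numpy as np

def r_squared(x, y):
    """Decompose TSS into ESS + RSS for the fitted line and return r^2 = ESS / TSS."""
    x_bar, y_bar = x.mean(), y.mean()
    beta_hat = ((x - x_bar) * (y - y_bar)).sum() / ((x - x_bar) ** 2).sum()
    alpha_hat = y_bar - beta_hat * x_bar
    u_hat = y - alpha_hat - beta_hat * x   # residuals
    TSS = ((y - y_bar) ** 2).sum()         # total sum of squares (S_yy)
    RSS = (u_hat ** 2).sum()               # residual sum of squares
    ESS = TSS - RSS                        # explained sum of squares
    return ESS / TSS                       # falls between 0 and 1
```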

31
Appendix to Chapter 3
  • Prove the OLS estimators are BLUE.
  • The method of maximum likelihood.

32
3.9 Alternative Functional Forms for Regression
Equations
  • For instance, for the data points depicted in Figure 3.7(a), where y is increasing more slowly than x, a possible functional form is y = α + β log x.
  • This is called a semi-log form, since it involves the logarithm of only one of the two variables x and y.

33
3.9 Alternative Functional Forms for Regression
Equations
  • In this case, if we redefine a variable X = log x, the equation becomes y = α + βX.
  • Thus we have a linear regression model with the explained variable y and the explanatory variable X = log x.
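  • A minimal sketch of this transformation in code (numpy assumed; the data values are hypothetical). After setting X = log x, an ordinary linear fit recovers α and β.

```python
import numpy as np

# Hypothetical data; x must be positive for the log transform
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
y = np.array([2.1, 2.9, 3.8, 4.7, 5.6])

X = np.log(x)                              # redefine X = log x
beta_hat, alpha_hat = np.polyfit(X, y, 1)  # linear fit of y = alpha + beta * X
```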

34
3.9 Alternative Functional Forms for Regression
Equations
  • For the data points depicted in Figure 3.7(b), where y is increasing faster than x, a possible functional form is y = e^(α + βx). In this case we take logs of both sides and get another kind of semi-log specification:
        log y = α + βx
  • If we define Y = log y, we have
        Y = α + βx
  • which is in the form of a linear regression equation.

35
3.9 Alternative Functional Forms for Regression
Equations
36
3.9 Alternative Functional Forms for Regression
Equations
  • An alternative model one can use is
        y = Ax^β
  • In this case, taking logs of both sides, we get
        log y = log A + β log x
  • In this case β can be interpreted as an elasticity. Hence this form is popular in econometric work. This is called a double-log specification, since it involves logarithms of both x and y. Now define Y = log y, X = log x, and α = log A. We have
        Y = α + βX
  • which is in the form of a linear regression equation. An illustrative example is given at the end of this section.
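  • A sketch of the double-log fit under the same numpy assumptions; the data values are hypothetical. The fitted slope is the elasticity estimate β̂.

```python
import numpy as np

# Hypothetical data; both x and y must be positive for the log transforms
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0])
y = np.array([3.0, 5.1, 8.7, 14.8, 25.2])

Y, X = np.log(y), np.log(x)            # Y = log y, X = log x
beta_hat, a_hat = np.polyfit(X, Y, 1)  # slope = elasticity beta; intercept = log A
A_hat = np.exp(a_hat)                  # recover A from alpha = log A
```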

37
3.9 Alternative Functional Forms for Regression
Equations
  • Some other functional forms that are useful when the data points are as shown in Figure 3.8 are
        y = α + β(1/x)
  • or
        log y = α + β(1/x)
  • In the first case we define X = 1/x, and in the second case we define Y = log y and X = 1/x. In both cases the equation is linear in the variables after the transformation.

38
3.9 Alternative Functional Forms for Regression
Equations
  • Some other nonlinearities can be handled by what are known as search procedures.
  • For instance, suppose that we have the regression equation
        y = α + β/(x + γ) + u
  • The estimates of α, β, and γ are obtained by minimizing
        Q = Σ (y_i − α − β/(x_i + γ))²

39
3.9 Alternative Functional Forms for Regression
Equations
  • We can reduce this problem to one of simple least squares as follows: for each value of γ, we define the variable Z = 1/(x + γ) and estimate α and β by minimizing Σ (y_i − α − βZ_i)².
  • We look at the residual sum of squares in each case and then choose that value of γ for which the residual sum of squares is a minimum.
  • The corresponding estimates of α and β are the least squares estimates of these parameters. Here we are searching over different values of γ.
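  • A minimal sketch of this search procedure (numpy assumed, and the functional form reconstructed above). The grid for γ is a hypothetical choice, reflecting whatever prior notion we have of the parameter's range, as the next slide notes.

```python
import numpy as np

def grid_search_gamma(x, y, gamma_grid):
    """For each trial gamma, set Z = 1/(x + gamma), fit y = alpha + beta * Z
    by ordinary least squares, and keep the gamma giving the smallest RSS."""
    best = None
    for gamma in gamma_grid:
        Z = 1.0 / (x + gamma)
        beta_hat, alpha_hat = np.polyfit(Z, y, 1)
        rss = ((y - alpha_hat - beta_hat * Z) ** 2).sum()
        if best is None or rss < best[0]:
            best = (rss, gamma, alpha_hat, beta_hat)
    return best  # (minimum RSS, gamma_hat, alpha_hat, beta_hat)
```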

40
3.9 Alternative Functional Forms for Regression
Equations
  • This search would be a convenient one only if we
    had some prior notion of the range of this
    parameter.
  • In any case there are convenient nonlinear
    regression programs available nowadays.
  • Our purpose here is to show how some problems
    that do not appear to fall in the framework of
    simple regression at first sight can be
    transformed into that framework by a suitable
    redefinition of the variables.

41
  • Nonlinear Model by Gauss
  • Exercise 20(g) at the end of Chapter 3 (page 112)