Lecture 5 The Problem of Statistical Inference Chapters 5 and 8

1
Lecture 5: The Problem of Statistical
Inference (Chapters 5 and 8)
  • Hypothesis Testing in the Two-Variable Regression
    Model -- Continued
  • Testing Hypotheses about a Regression Coefficient
  • Test of Significance Approach
  • Analysis of Variance Approach

2
Hypothesis Testing in the Multiple Regression
Model
  • Introduction
  • Testing Joint Hypotheses
  • Testing Significance of a Group of Coefficients
  • Testing Significance of the Overall Model
  • Testing for Causality
  • Testing Linear Restrictions on Coefficients
  • Testing Equality of Two Regression Coefficients
  • Testing Structural Stability of Regression Models

3
Quick Review
  • Last time we saw that we can use the CNLR model
    to test a null hypothesis such as H0: β2 = β2*
    against, say, a two-sided alternative hypothesis
    such as H1: β2 ≠ β2*
  • We said one way to do this is to use the t-test
    of significance, where
  • t = (β̂2 - β2*)/SE(β̂2) ~ t(n-2)

4
Quick Review
  • So, once we estimate the regression equation, we
    compute the above t ratio.
  • Next, we choose a level of significance, α, and
    use it to look up the critical t value from the t
    table.
  • Finally, we use an appropriate decision rule to
    decide whether or not we should reject the null
    in favor of the alternative.

5
Choosing the Level of Significance
  • How should we choose the level of significance?
  • There is no general rule to follow.
  • It is customary to use 1%, 5%, or 10%.
  • Sometimes the choice can be made based on the
    cost of committing type I error relative to that
    of committing a type II error.
  • You should choose a high level of significance
    if you suspect the test has a low power.

6

The P-Value
  • Instead of using an arbitrary level of
    significance, nowadays we use the p-value, which
    is also known as the exact level of significance
    or the marginal significance level
  • This is the lowest level of significance at which
    a given null hypothesis can be rejected
  • Note that for a given sample size, as t
    increases, the p-value decreases
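The p-value calculation can be sketched with SciPy's t distribution; the sample size and t ratios below are illustrative, not figures from the lecture.

```python
# Two-sided p-value for a computed t ratio: the lowest level of
# significance at which the null can be rejected.
# The numbers below are illustrative, not from the lecture.
from scipy import stats

n = 32                 # hypothetical sample size
df = n - 2             # degrees of freedom in the two-variable model
t_stat = 2.45          # hypothetical computed t ratio

p_value = 2 * stats.t.sf(abs(t_stat), df)   # sf(x) = 1 - cdf(x), upper tail

# For a given sample size, a larger |t| gives a smaller p-value.
p_larger_t = 2 * stats.t.sf(3.50, df)
```

Here p_value comes out at roughly 0.02, so this hypothetical null would be rejected at the 5% level but not at the 1% level.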

7

P-Value: Two Examples

    Variable   Coefficient   Std. Error   t-Statistic   Prob.
    C           0.01738      0.00287       6.052519     0.000
    X           0.21637      0.18839       1.148471     0.258

    Variable   Coefficient   Std. Error   t-Statistic   Prob.
    C          -0.00020      7.16E-0      -2.866233     0.006
    X           0.49379      0.00906       9.49734      0.000

8

Testing Hypotheses in the Two-Variable
Model: Analysis of Variance Approach
  • As we said earlier, there are three alternative
    approaches for testing a null hypothesis
  • confidence interval approach
  • test of significance approach
  • analysis of variance approach
  • Having studied the test of significance approach,
    we now turn to the analysis of variance approach.

9

Analysis of Variance Approach
  • Analysis of variance (ANOVA) means examining the
    various sums of squares in the relation TSS =
    ESS + RSS in the context of regression analysis.
  • In this approach, the first step is to determine
    the degrees of freedom of the above sums of
    squares.
  • In the two-variable model these are as follows
  • TSS has n - 1 degrees of freedom
  • RSS has n - 2 degrees of freedom
  • ESS has 1 degree of freedom

10
Analysis of Variance Approach
  • Next, we define the mean sum of squares
    associated with a given sum of squares as the
    ratio of that sum of squares to its degrees of
    freedom
  • Mean total sum of squares = TSS/(n-1)
  • Mean residual sum of squares = RSS/(n-2) = σ̂²
  • Mean explained sum of squares = ESS/1 = ESS
  • A table containing this information is called an
    ANOVA table.

11
Analysis of Variance Approach
  • We use the information in an ANOVA table to
    construct the following statistic, which is used
    for testing H0: β2 = 0 in the two-variable model
  • F = ESS / [RSS/(n-2)] = ESS/σ̂²
  • In the two-variable CNLR model this statistic has
    an F distribution with 1 degree of freedom in the
    numerator and n-2 degrees in the denominator.
  • It can be used to test the statistical
    significance of the only slope coefficient in the
    bivariate model.

12
Analysis of Variance Approach
  • Large values of F (i.e., large ESS relative to
    σ̂²) lead to the rejection of H0, while small
    values of F would be consistent with H0.
  • Of course, the question remains: how large is
    large, and how small is small?
  • As with the t test, the answer is, relative to
    the critical value of the test (here F)
    statistic.
  • In fact, to apply this test, which is known as
    the F test, we follow the same procedure as
    with the t test.

13
Analysis of Variance Approach
  • First, using sample data, we compute the F ratio.
  • Next, we choose a level of significance, and use
    the F table to find the critical F value with 1
    and n-2 degrees of freedom.
  • Finally, we use the usual decision rule for
    rejecting or not rejecting the null hypothesis,
    i.e., we reject the null if the calculated F
    exceeds the critical F; otherwise we don't reject
    the null.

14
Analysis of Variance Approach An Example
  • Let's use the U.S. consumption function we
    estimated earlier, where β̂2 = 0.76, ESS =
    4,598,500.9, and RSS = 6,107.3, to test H0: β2 = 0
    against H1: β2 ≠ 0 at the 5% level.
  • Noting that this is a bivariate model (i.e., k =
    2), we determine that ESS has k-1 = 1 degree of
    freedom, and RSS has 32 - 2 = 30 degrees of
    freedom, so that the F ratio is
  • F = 4,598,500.9/(6,107.3/30) = 22,588.55


15
Analysis of Variance Approach An Example
  • At the 5% level and with 1 and 30 degrees of
    freedom, the critical F value is 4.17.
  • Since the computed F is greater than the critical
    F, we reject the null in favor of the
    alternative.
  • Thus we conclude that at the 5% level our point
    estimate of β2, i.e., 0.76, is statistically
    significantly different from zero.
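The arithmetic in this example can be checked with a short SciPy snippet; the ESS and RSS figures are the ones given above.

```python
# F test of H0: beta2 = 0 using the ANOVA figures from the consumption
# function example: ESS = 4,598,500.9 (1 df), RSS = 6,107.3 (30 df).
from scipy import stats

ESS, RSS = 4_598_500.9, 6_107.3
df_num, df_den = 1, 30

F = ESS / (RSS / df_den)                    # computed F ratio
F_crit = stats.f.ppf(0.95, df_num, df_den)  # 5% critical value, about 4.17

reject_H0 = F > F_crit                      # the slope is significant
```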


16
Analysis of Variance Approach Some Remarks
  • In the two-variable model, this F test is only
    applicable to a zero null hypothesis.
  • But, as we will see later on, in multiple
    regression, variants of the F statistic can be
    used for testing a large variety of null
    hypotheses involving several regression
    coefficients.
  • The F test is a two-tailed test.


17
Analysis of Variance Approach Some Remarks
  • In the two-variable model, regardless of
    whether we use the t or F test, the final
    decision (outcome) is the same.
  • This is because F(1, n-2) = t²(n-2)
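This identity is easy to verify numerically; a quick SciPy check for one illustrative degrees-of-freedom value:

```python
# Verify F(1, n-2) = t(n-2)^2 at the 5% level for df = 30.
from scipy import stats

df = 30
t_crit = stats.t.ppf(0.975, df)     # two-tailed 5% critical t
F_crit = stats.f.ppf(0.95, 1, df)   # 5% critical F with (1, df)

agree = abs(F_crit - t_crit**2) < 1e-6
```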


18
Analysis of Variance Approach Some Remarks
  • It can be shown that
  • F = (n-2)R²/(1-R²)
  • From this, it follows that F → 0 as R² → 0,
  • and F → ∞ as R² → 1.
  • You see, R² and F move together.
  • Thus we can use the F statistic to test the
    statistical significance of R², that is, test
    H0: R² = 0 against H1: R² ≠ 0
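A quick numeric check of this relation (the sample size is illustrative):

```python
# F = (n-2) * R2 / (1 - R2) in the two-variable model; F rises with R2.
# The sample size is illustrative.
n = 32
Fs = [(n - 2) * R2 / (1 - R2) for R2 in (0.1, 0.5, 0.9)]
```

With R² = 0.5 this gives F = 30, and the sequence increases monotonically in R².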


19
Introduction
  • In multiple regression we are sometimes concerned
    with the joint effect of explanatory variables,
    in addition to their partial or individual
    effects.
  • This means that in multiple regression, we can
    test not only hypotheses that involve a single
    regression coefficient, but also hypotheses that
    include several regression coefficients.
  • We begin with hypotheses that involve a single
    regression coefficient.

20
Testing Hypotheses Involving a Single Partial
Regression Coefficient
  • As in the two-variable regression model, we can
    use either the t test or the F test.
  • However, the F test for testing hypotheses on a
    single regression coefficient is somewhat
    different in the multiple regression model
    relative to the two-variable model.
  • In particular, F = ESS/σ̂², which we used in the
    two-variable model to test the statistical
    significance of the only slope coefficient, β2,
    can no longer be used in the multiple regression
    to test the same hypothesis.

21
Testing Hypotheses Involving a Single Partial
Regression Coefficient
  • In multiple regression, the procedure for
    performing an F test of statistical significance
    of a single regression coefficient is a special
    case of the general F testing procedure used to
    test a host of different hypotheses.
  • Let's see how this is so by studying the general
    F testing procedure.

22
The ANOVA Approach in the Multiple Regression
Model
  • In the multiple regression model, the ANOVA
    approach, known as the Wald test, involves the
    same set of steps regardless of what form the
    null hypothesis takes.
  • The idea is to assume once that the null
    hypothesis is true and once that the alternative
    is true, and then determine which model (the one
    corresponding to the null or the one
    corresponding to the alternative) fits the data
    better.

23
Steps in Wald Test
  • 1. Assume the null hypothesis is true, and find
    out what the model would look like in this case.
    Call this the restricted model.
  • 2. Estimate the restricted model and save the
    RSS. Denote this RSSr.
  • 3. This time assume the alternative hypothesis is
    true, in which case the original model, which we
    call the full or unrestricted model, applies.
    Estimate this, obtain the RSS, and call it RSSu.

24
Steps in Wald Test
  • 4. Construct the following statistic
  •        (RSSr - RSSu)/m
  •    F = ---------------
  •          RSSu/(n-k)
  • Here k is the number of parameters in the
    original (full or unrestricted) model including
    the intercept, and m is the difference in the
    number of coefficients in the full and
    restricted models.
  • Note that because RSSr ≥ RSSu, the above F
    ratio is a nonnegative number.
  • In the multiple CNLR model the above ratio has
    an F distribution with m and n-k degrees of
    freedom.

25
Steps in Wald Test
  • 5. Compute the above Wald F statistic and
    compare it with the critical F value at the
    chosen level of significance.
  • The decision rule is as usual.
  • We can express the above F in terms of R² from
    the unrestricted and restricted models:
  •        (R²u - R²r)/m
  •    F = ----------------
  •        (1-R²u)/(n - k)
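The five steps can be sketched end to end with simulated data. Everything below (the data, the coefficients, the particular H0) is hypothetical, and OLS is done with numpy's least-squares routine rather than a regression package.

```python
# Wald F test by comparing restricted and unrestricted RSS.
# Simulated data; the model and H0 (beta2 = beta3 = 0) are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 100
X2, X3, X4 = rng.normal(size=(3, n))
Y = 1.0 + 2.0 * X4 + rng.normal(size=n)   # H0 is true by construction

def rss(y, X):
    """Residual sum of squares from an OLS fit of y on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

const = np.ones(n)
RSS_u = rss(Y, np.column_stack([const, X2, X3, X4]))  # unrestricted (k = 4)
RSS_r = rss(Y, np.column_stack([const, X4]))          # restricted under H0

m, k = 2, 4   # m restrictions; k parameters in the unrestricted model
F = ((RSS_r - RSS_u) / m) / (RSS_u / (n - k))
p = stats.f.sf(F, m, n - k)
```

Because the restricted model is nested in the unrestricted one, RSSr can never fall below RSSu, so F is nonnegative as the slide notes.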

26
Applications of Wald Test
  • In using the Wald test, the main task is to find
    the restricted model.
  • Below I present the restricted model for testing
    a number of useful hypotheses in the context of
    the following quad-variate model,
  • Yt = β1 + β2X2t + β3X3t + β4X4t + ut
  • Note that this will be the unrestricted model
    regardless of the null hypothesis considered.

27
Testing Statistical Significance of an Individual
Regression Coefficient
  • H0: β2 = 0 vs. H1: β2 ≠ 0
  • In this case the restricted model is as follows,
  • Yt = β1 + β3X3t + β4X4t + ut

28
Testing a Non-Zero Joint Hypothesis
  • H0: β2 = β2* and β3 = β3* vs. H1: β2 ≠ β2* or
    β3 ≠ β3*
  • Here β2* and β3* are hypothesized (known) values
    of β2 and β3, respectively, e.g., 0 and 1.
  • In this case the restricted model is,
  • Yt = β1 + β2*X2t + β3*X3t + β4X4t + ut
  • or Yt - β2*X2t - β3*X3t = β1 + β4X4t + ut

29
Testing Joint Significance of a Group of
Coefficients
  • H0: β2 = β3 = 0 vs. H1: β2 ≠ 0 or β3 ≠ 0
  • This is a special case of the previous test,
    where β2* and β3* are zero.
  • The restricted model is,
  • Yt = β1 + β4X4t + ut

30
Granger Non-Causality Test
  • This is a useful application of the above test of
    significance of a group of coefficients.
  • I ask you to rely on your own notes and the text
    for this topic.

Warning: You are expected (polite for required)
to study Section 17.14, "Causality in Economics:
The Granger Test," pp. 620-23 of Gujarati.
31
Testing the Overall Significance of the Model
  • H0: β2 = β3 = β4 = 0 vs. H1: at least one of
    β2, β3, β4 is nonzero
  • This amounts to testing H0: R² = 0 vs. H1: R² ≠
    0
  • In this case, the restricted model is
  • Yt = β1 + ut
  • If you estimate such a model, you'd find β̂1 = Ȳ
  • In practice, we don't estimate the above
    restricted model to test the overall significance
    of the model.

32
Applications of Wald Test Testing the Overall
Significance of the Model
  • Instead, we use the F statistic we used for the
    same purpose in the two-variable model, namely,
  •        ESS/(k-1)
  •    F = ---------
  •        RSS/(n-k)
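With illustrative ESS and RSS figures, this F statistic agrees with the equivalent form written in terms of R²; a short check:

```python
# Overall-significance F from the ANOVA decomposition, plus the
# equivalent form in terms of R2. All numbers are illustrative.
from scipy import stats

n, k = 40, 4
ESS, RSS = 500.0, 300.0

F = (ESS / (k - 1)) / (RSS / (n - k))
p = stats.f.sf(F, k - 1, n - k)

R2 = ESS / (ESS + RSS)                  # since TSS = ESS + RSS
F_from_R2 = (R2 / (k - 1)) / ((1 - R2) / (n - k))
```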

33
Testing Linear Restrictions
  • H0: β2 + β3 = c versus H1: β2 + β3 ≠ c
  • where c is a known constant, e.g., 0, 1, 1/2, etc.
  • Find the restricted model by solving the null
    hypothesis for one of the parameters as a
    function of the other, e.g., β2 = c - β3
  • Substitute this in the original model,
  • Yt = β1 + (c - β3)X2t + β3X3t + β4X4t + ut
  • or Yt - cX2t = β1 + β3(X3t - X2t) + β4X4t + ut

34
Testing Linear Restrictions
  • Thus, in order to find the RSS associated with
    the restricted model, you should generate two
    variables, Yt - cX2t and X3t - X2t, and regress
    the former on the latter, a constant, and X4.
  • The above procedure is known as Restricted Least
    Squares (RLS).
  • Note that the restriction under H0 is linear and
    holds as an equality.
  • An example of a restriction that cannot be
    handled by the F test is the inequality
    β2 + β3 < c.

35
Testing Equality of two Regression Coefficients
  • H0: β2 = β3 vs. H1: β2 ≠ β3
  • The restricted model is
  • Yt = β1 + β2X2t + β2X3t + β4X4t + ut
  •    = β1 + β2(X2t + X3t) + β4X4t + ut

36
Testing Stability of the Model
  • When we estimate a regression model, we assume
    implicitly that the regression coefficients are
    constant over time, that is, the model is stable.
  • However, regime changes can cause structural
    changes in the model.
  • Thus, it is important to test the assumption
    of constancy or stability of the parameters of
    the regression model.

37
Testing Stability of the Model
  • Let the model representing the period before the
    event in question (the first n1 observations) be
  • Yt = λ1 + λ2X2t + λ3X3t + u1t,  t = 1, 2, ..., n1
  • Let the model representing the period following
    the change (the remaining n2 observations) be
  • Yt = γ1 + γ2X2t + γ3X3t + u2t,  t = 1, 2, ..., n2
  • The null hypothesis is NO structural change,
    i.e., the models representing the two sub-periods
    are one and the same: H0: λ1 = γ1, λ2 = γ2,
    λ3 = γ3

38
Applications of Wald Test
  • If H0 turns out to be true (i.e., if it is not
    rejected), we can estimate a single regression
    over the entire period by pooling the two
    sub-samples (using the full sample of n n1 n2
    observations).
  • The null hypothesis is tested as follows
  • 1. Estimate the model using the first
    sub-sample of n1 observations, and save the
    RSS. Call this RSS1.
  • 2. Estimate the model over the second
    sub-sample using n2 observations, find the
    RSS, and call it RSS2.

39
Applications of Wald Test
  • 3. The unrestricted RSS, which assumes H1 is
    true (i.e., assumes there is a break in the
    regression line), equals RSSu = RSS1 + RSS2.
  • 4. Estimate the model using all of the available
    observations, that is, the full sample of n =
    n1 + n2 observations. Obtain the RSS and denote
    it RSSr. This is the restricted RSS, because
    estimating the model over the entire sample
    period is valid only if H0 is true, that is, if
    there is no break in the model.
40
Applications of Wald Test
  • 5. Construct the following ratio
  •        (RSSr - RSSu)/k
  •    F = ---------------
  •         RSSu/(n - 2k)
  • This has an F distribution with k and n-2k
    degrees of freedom.
  • The decision rule is as usual.
  • The above test is known as the Chow Breakpoint
    Test and is available in EViews.
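The Chow procedure above can be sketched by hand with simulated data; the two sub-samples below share the same coefficients, so the no-break null is true by construction. All names and numbers are illustrative.

```python
# Chow breakpoint test on two simulated sub-samples that share the same
# coefficients, so H0 (no structural break) is true by construction.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n1, n2, k = 60, 40, 3              # k parameters: constant, X2, X3

def simulate(n):
    X2, X3 = rng.normal(size=(2, n))
    Y = 1.0 + 0.5 * X2 - 0.3 * X3 + rng.normal(size=n)
    return Y, np.column_stack([np.ones(n), X2, X3])

def rss(y, X):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

Y1, X1 = simulate(n1)
Y2, X2mat = simulate(n2)

RSS_u = rss(Y1, X1) + rss(Y2, X2mat)   # unrestricted: two separate fits
RSS_r = rss(np.concatenate([Y1, Y2]),  # restricted: one pooled fit
            np.vstack([X1, X2mat]))

n = n1 + n2
F = ((RSS_r - RSS_u) / k) / (RSS_u / (n - 2 * k))
p = stats.f.sf(F, k, n - 2 * k)
```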

41

Other Applications of the t Test
  • As simple as it is, the t test has many
    applications, and when used properly has high
    power.
  • So far, we have studied its use for testing zero
    and non-zero hypotheses on regression
    coefficients.
  • We will see how it can be used for testing
    hypotheses involving more than one regression
    coefficient, which are typically tested using
    the F test.
  • We will also see how the t test can be used to
    test hypotheses on the simple correlation
    coefficient.

42
Testing Linear Restrictions using the t Test
  • Consider the following trivariate model,
  • Yt = β1 + β2X2t + β3X3t + ut
  • Suppose you want to test H0: β2 + β3 = c versus
    H1: β2 + β3 ≠ c, where c is a known constant.
  • Rewrite the null hypothesis as β2 + β3 - c = 0.

43
Testing Linear Restrictions using the t Test
  • Construct the following t ratio,
  • t = (β̂2 + β̂3 - c)/√(Var(β̂2) + Var(β̂3) + 2Cov(β̂2, β̂3))
  • This has a t distribution with n-3 degrees of
    freedom.
  • The decision rule is as usual.
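This t ratio can be assembled from the estimated covariance matrix of the OLS coefficients; a sketch with simulated data, where all names and numbers are illustrative:

```python
# t test of the linear restriction H0: beta2 + beta3 = c, built from the
# estimated covariance matrix of the OLS coefficients.
# Simulated trivariate model; all numbers are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, c = 80, 1.0
X2, X3 = rng.normal(size=(2, n))
Y = 0.5 + 0.7 * X2 + 0.3 * X3 + rng.normal(size=n)   # 0.7 + 0.3 = c

X = np.column_stack([np.ones(n), X2, X3])
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ beta
k = X.shape[1]
sigma2 = resid @ resid / (n - k)          # sigma-hat squared
cov = sigma2 * np.linalg.inv(X.T @ X)     # Var-Cov matrix of the estimates

se = np.sqrt(cov[1, 1] + cov[2, 2] + 2 * cov[1, 2])
t = (beta[1] + beta[2] - c) / se
p = 2 * stats.t.sf(abs(t), n - k)         # two-tailed, n-3 df here
```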

44
Testing Equality of two Regression Coefficients
using the t Test
  • Consider the following trivariate model,
  • Yt = β1 + β2X2t + β3X3t + ut
  • Suppose you want to test H0: β2 = β3 versus
    H1: β2 ≠ β3.
  • Write the null hypothesis as β2 - β3 = 0.

45
Testing Equality of two Regression Coefficients
using the t Test
  • Construct the following t ratio,
  • t = (β̂2 - β̂3)/√(Var(β̂2) + Var(β̂3) - 2Cov(β̂2, β̂3))
  • This has a t distribution with n-3 degrees of
    freedom.
  • The decision rule is as usual.

46

Testing Hypotheses on the Correlation
Coefficient using the t Test
  • Recall that the simple correlation coefficient
    between any two random variables is given by,
  • r12 = S12/√(S11·S22)
  • In the CNLR model,
  • t = r12/SE(r12) ~ t(n-2)
  • follows the t distribution with df = n-2.
  • Here, SE(r12) = √((1 - r²)/(n - 2))

47
Testing Hypothesis on the Correlation
Coefficient using the t Test
  • The above t statistic can be used to test a
    number of hypotheses about the correlation
    coefficient.
  • Some hypotheses of interest are
  • H0: r = 0 versus H1: r < 0 (one-tailed)
  • H0: r = 0 versus H1: r > 0 (one-tailed)
  • H0: r = 0 versus H1: r ≠ 0 (two-tailed)
  • The decision rule is as with any t test, both
    one-tailed and two-tailed.
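The two-tailed version can be cross-checked against SciPy's built-in Pearson correlation test; the simulated data below are illustrative.

```python
# t test of H0: r = 0 for a simple correlation, cross-checked against
# scipy.stats.pearsonr. Simulated data; sample size is illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 50
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)

r = np.corrcoef(x, y)[0, 1]
se = np.sqrt((1 - r**2) / (n - 2))   # SE(r) from the slide
t = r / se
p = 2 * stats.t.sf(abs(t), n - 2)    # two-tailed p-value

r_check, p_check = stats.pearsonr(x, y)
```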

48
Practical Aspects of Hypothesis Testing
  • Please study Section 5.8, pp. 129-134 of Gujarati

49
Reporting Results of Regression Analysis
  • If there is only one equation, report it as
    follows
  • Ŷi = 91.1* + 20.5**Xi
  •      (1.75)  (2.67)
  • * Significant at the 10% level (two-tail)
  • ** Significant at the 1% level (one-tail)
  • Indicate whether the numbers in parentheses are
    estimated standard errors, t ratios, or p values.
  • In the first two cases, the asterisks (* and **)
    would be needed, but not if you choose to report
    the p values, as long as you make it clear.

50
Reporting Results of Regression Analysis
  • If the data are time-series, report the
    estimation period and frequency of data, e.g.,
    1969-1988 for annual data, 1969.1-1988.4 for
    quarterly data, or 1969.01-1988.12 if the data
    are monthly.
  • It is also desirable to report the sample mean
    value of the dependent variable (and perhaps
    those of the independent variables).

51
Reporting Results of Regression Analysis
  • If there are several estimated equations,
    construct a table with the estimated parameters
    in rows or columns.
  • Define all the variables of the model.
  • Report data sources.
  • See the example below.

52
  • Table 1
  • Ordinary Least Squares Estimates of
  • Output Per Labor Hour in Selected Sectors of the
    U.S. Economy
  • 1955.1-1995.4
  • (t-values in parentheses)

               Mining      Farming     Services
    Constant   0.12657     0.25672     1.11298
               (2.09)      (2.58)      (1.09)
    L          0.11659     0.40048     0.99801
               (1.99)      (2.31)      (0.98)
    K          0.16667     0.33437     1.28359
               (2.39)      (1.88)      (1.11)
    R̄²        0.54667     0.35347     0.58179
    F          12.38       18.45       11.98
    SEE        0.0096      0.0210      0.0061

  • * Significant at the 10% level.
  • ** Significant at the 5% level.

53
  • Table 1 -- continued
  • Glossary
  • L = Natural log of hours of work of all persons
  • K = Natural log of capital stock in the private
    non-farm business sector (1992 dollars).
  • Source of Data
  • The original source of all data is the U.S.
    Department of Labor, Bureau of Labor Statistics.
  • The data used in this study are taken from the
    DRI Basic Economics data tape, Chapter 7
    (Capacity and Productivity), Section 2
    (Productivity and Unit Costs), pages 7-3.