SLIDES PREPARED - PowerPoint PPT Presentation

1
STATISTICS for the Utterly Confused, 2nd ed.
  • SLIDES PREPARED
  • By
  • Lloyd R. Jaisingh Ph.D.
  • Morehead State University
  • Morehead KY

2
Chapter 15
  • One-Way Analysis of Variance

3
Outline
  • Do I Need to Read This Chapter? You should read
    the Chapter if you would like to learn about
  • 15-1 Comparing Population Means
    Graphically.
  • 15-2 Some Terminology Associated with
    Analysis of Variance (ANOVA)
  • 15-3 The F Distribution.

4
Outline
  • Do I Need to Read This Chapter? You should read
    the Chapter if you would like to learn about
  • 15-4 One-way or Single Factor ANOVA F
    Tests
  • 15-5 Technology integration for one-way
    ANOVA

5
Objectives
  • To graphically compare more than two population
    means.
  • To introduce some terminology associated with
    Analysis of Variance (ANOVA).

6
Objectives
  • To introduce one-way or single factor ANOVA F
    tests.
  • To introduce technology integration for one-way
    ANOVA.

7
15-1 Comparing Population Means Graphically
  • The objective in comparing several population
    means is to determine whether there is a
    statistically significant difference between them.
  • So when random samples are obtained from these
    populations, the respective sample means can be
    computed to help determine whether there is a
    significant difference between the population
    means.

8
15-1 Comparing Population Means Graphically
  • If the sample means are very different, then it
    is likely that the true or population means will
    be different.
  • We need to determine whether the differences are
    due to random variation in the sample data or if
    there really are differences between the
    population means.

9
15-1 Comparing Population Means Graphically
  • One simple way of looking at differences of
    population means is to display the data through
    box plots.
  • Example A random sample of students on a
    college campus was asked to count the number of
    pennies, nickels, dimes and quarters they had on
    their person. The summary information is shown
    on the next slide.

10
15-1 Comparing Population Means Graphically
Note: We can consider each of the four data sets
as samples from their respective populations.
11
15-1 Comparing Population Means Graphically
  • Example (continued): Compute the sample means and
    display the data using box plots.
  • Solution: The sample means for the pennies,
    nickels, dimes and quarters are respectively
    10.36, 4.444, 3.714, and 3.25.
  • Observe that the average number of pennies seems
    to be an outlying value relative to the values of
    the other means.

12
15-1 Comparing Population Means Graphically
  • Solution (continued) The box plots can give some
    insight as to whether these differences are
    significant. Observe that the box in the box
    plot for the number of pennies does not overlap
    with the boxes in the plots for the number of
    nickels, dimes and quarters.
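The compute-the-means-and-box-plot step above can be sketched in Python. The actual coin counts are not reproduced in this transcript, so the lists below are made up; the nickel, dime, and quarter samples are chosen to match the slide's stated means (4.444, 3.714, 3.25), while the penny sample is only roughly comparable.

```python
# Illustrative sketch: sample means plus side-by-side box plots.
# All four samples are hypothetical; only the sample sizes (10, 9, 7, 8)
# and, for three groups, the means match the slide's example.
import matplotlib
matplotlib.use("Agg")                      # render off-screen
import matplotlib.pyplot as plt

pennies  = [3, 5, 8, 10, 12, 14, 15, 9, 18, 10]   # hypothetical, n = 10
nickels  = [2, 4, 5, 3, 6, 4, 7, 5, 4]            # hypothetical, n = 9
dimes    = [1, 3, 5, 2, 6, 4, 5]                  # hypothetical, n = 7
quarters = [2, 4, 3, 5, 1, 4, 3, 4]               # hypothetical, n = 8
groups = {"pennies": pennies, "nickels": nickels,
          "dimes": dimes, "quarters": quarters}

# Sample means for each denomination.
means = {name: sum(data) / len(data) for name, data in groups.items()}
print(means)

# Side-by-side box plots make a non-overlapping group easy to spot.
plt.boxplot(list(groups.values()))
plt.xticks(range(1, 5), list(groups.keys()))
plt.ylabel("Number of coins")
plt.savefig("coin_boxplots.png")
```

As on the slide, a box that does not overlap the others flags a group whose mean may differ significantly.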

13
15-1 Comparing Population Means Graphically
14
15-1 Comparing Population Means Graphically
  • Solution (continued): Another way to observe this
    difference is to compute the one-sample
    confidence intervals for the means.
  • The confidence intervals for the number of coins
    are: pennies (7.5091, 13.2182); nickels
    (2.95042, 5.93847); dimes (1.40437, 6.02420);
    quarters (1.85464, 4.64536).
  • From the confidence intervals, we see that the
    interval for the number of pennies does not
    overlap with the others.

15
15-1 Comparing Population Means Graphically
From the confidence intervals, we see that the
interval for the number of pennies does not
overlap with the others.
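A one-sample t confidence interval of the kind listed above can be sketched as follows. The raw coin counts are not reproduced in this transcript, so the sample below is illustrative; the slide's intervals (e.g. nickels: 2.95042 to 5.93847) came from the actual class data.

```python
# Minimal sketch of a one-sample t confidence interval for a mean.
import numpy as np
from scipy import stats

def t_confidence_interval(sample, confidence=0.95):
    """CI for a population mean: xbar +/- t(alpha/2, n-1) * s/sqrt(n)."""
    sample = np.asarray(sample, dtype=float)
    n = len(sample)
    return stats.t.interval(confidence, df=n - 1,
                            loc=sample.mean(), scale=stats.sem(sample))

nickels = [2, 4, 5, 3, 6, 4, 7, 5, 4]      # hypothetical counts, n = 9
low, high = t_confidence_interval(nickels)
print(f"95% CI for the mean number of nickels: ({low:.3f}, {high:.3f})")
```

Computing one interval per group and checking for overlap reproduces the slide's informal comparison.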
16
15-1 Comparing Population Means Graphically
  • Solution (continued): Both of these observations
    would indicate that the population average for
    the number of pennies carried by the students is
    significantly different from those for the other
    denominations of coins.
  • Thus, it is unlikely that the difference in the
    sample averages is due to sample variation.
  • That is, we can say that the variability between
    the sample averages is large when compared to the
    variability within the samples.

17
15-1 Comparing Population Means Graphically
  • Solution (continued) Based on the above
    discussions, we can safely say that there is no
    significant difference between the averages for
    the populations of the number of nickels, dimes
    and quarters.
  • This can be further reinforced by observing that
    both the box plots and the confidence intervals
    overlap for these variables.

18
15-1 Comparing Population Means Graphically
  • Example The figure on the next slide shows
    random samples obtained from three different
    normal distributions. Discuss whether you think
    that the means of these populations are
    significantly different based on the samples.

19
15-1 Comparing Population Means Graphically
Samples from normal populations with almost
equal means
20
15-1 Comparing Population Means Graphically
  • Solution Based on the display, we would expect
    the sample means to be nearly equal.
  • We would also expect the variation among the
    sample means (between sample) to be small,
    relative to the variation found around the
    individual sample means (within sample).
  • Thus one may infer that there would not be a
    significant difference between the population
    means.

21
15-1 Comparing Population Means Graphically
  • Example The figure on the next slide shows
    random samples obtained from three different
    normal distributions. Discuss whether you think
    that the means of these populations are
    significantly different based on the samples.

22
15-1 Comparing Population Means Graphically
Samples from normal populations with
significantly different means.
23
15-1 Comparing Population Means Graphically
  • Solution Based on the display, we would expect
    the sample means to be significantly different.
  • We would also expect the variation among the
    sample means (between sample) to be large,
    relative to the variation found around the
    individual sample means (within sample).
  • Thus one may infer that there would be a
    significant difference between the population
    means.

24
15-1 Comparing Population Means Graphically
  • These three examples provide us with a sense of
    whether or not there is a significant difference
    among the population means.
  • However, they cannot help us evaluate how likely
    it is that any observed difference is due to
    sampling variation or variations in the sample
    data.

25
15-1 Comparing Population Means Graphically
  • In this chapter, we will present procedures which
    will help us determine how likely it is that the
    observed differences among the sample means are
    due to sampling error. Such procedures are
    called ANalysis Of VAriance (ANOVA).

26
15-2 Some Terminology Associated with ANOVA
  • Explanation of the term ANOVA: ANOVA is a
    statistical method for determining the
    existence of differences among several
    population means.
  • Explanation of the term experiment: The term
    experiment in ANOVA is a statement of the problem
    to be solved.

27
15-2 Some Terminology Associated with ANOVA
  • Example A researcher would like to determine
    whether there is a difference in the average
    mileages for three different brands of gasoline.
    What is the experiment in this case?
  • Solution: The problem to be solved in this
    example is to determine whether there is a
    difference in the average mileage for the three
    different brands of gasoline. Hence, this is the
    experiment.

28
15-2 Some Terminology Associated with ANOVA
  • Explanation of the term experimental units:
    Individuals or objects on which the experiment is
    performed are called experimental units.
  • Explanation of the term response variable: A
    response variable in an experiment is a
    characteristic of an experimental unit on which
    information is to be obtained.

29
15-2 Some Terminology Associated with ANOVA
  • Example Suppose a researcher is interested in
    determining the effectiveness of four teaching
    methods for a given course. In such an
    experiment, the researcher would be interested in
    the final averages for each student in the course
    for the different methods of teaching.
  • Here we refer to the students as the experimental
    units and the final averages as the values for
    the response variable.

30
15-2 Some Terminology Associated with ANOVA
  • Note: The response variable may be qualitative,
    such as whether or not you suffer from migraine
    headaches, or quantitative, such as the time it
    takes for your migraine to subside from a certain
    pain level.
  • Experimental variables that we can control
    are called independent variables or factors.
    Values of the factor are called levels of the
    factor.
  • Note: Factors may be qualitative or quantitative.

31
15-2 Some Terminology Associated with ANOVA
  • Explanation of the term qualitative factor: A
    qualitative factor is a factor whose levels
    vary by category rather than by numerical
    values.
  • Explanation of the term quantitative factor: A
    quantitative factor is a factor whose levels
    are counts or measurements.

32
15-2 Some Terminology Associated with ANOVA
  • Note: Sometimes the word treatment is used
    interchangeably with the term level, or the two
    may be combined as treatment level.

33
15-2 Some Terminology Associated with ANOVA
  • Explanation of the term treatment (level) of a
    factor: An experimental condition which is
    applied to the experimental units is called a
    treatment (level) of the factor.

34
15-2 Some Terminology Associated with ANOVA
  • Note The term treatment can also refer to the
    populations which are being analyzed. For
    example, if we are comparing the average income
    for four different counties in a particular
    state, we may refer to the four populations
    (counties) as four treatments.

35
15-2 Some Terminology Associated with ANOVA
  • Example A farmer would like to determine
    whether there is a difference in the average
    yield per acre for his corn crop for equal
    amounts of five different fertilizers. In this
    experiment, assume that there were equal amounts
    of corn plants per acre. Identify the factor,
    treatment levels, experimental units, and
    response variable of the experiment.

36
15-2 Some Terminology Associated with ANOVA
  • Solution: Factor: fertilizer.
  • Treatment levels: the five different
    fertilizers.
  • Experimental units: corn plants.
  • Response variable: yield per acre.
  • Note: So far, all the examples deal with a single
    factor. In this text, we will restrict our
    discussions to one-factor analysis. That
    is, we will only discuss one-factor or one-way
    ANOVA.

37
15-2 Some Terminology Associated with ANOVA
  • Explanation of the term one-factor or one-way
    ANOVA A one-factor or one-way ANOVA deals with
    experiments which involve a single factor with
    different levels. These levels could be
    quantitative or qualitative.

38
15-3 The Hypothesis Test of One-Way ANOVA
  • Suppose we have a single factor experiment in
    which there are r levels.
  • Thus we will be sampling from r populations or
    treatments.
  • We will select an independent random sample from
    each of these r populations.
  • Let the size of the sample from population i, for
    i = 1, 2, 3, ..., r, be ni, and let the total
    sample size be n = n1 + n2 + n3 + ... + nr.

39
15-3 The Hypothesis Test of One-Way ANOVA
  • The figure on the next slide shows the r
    populations from which the independent samples
    are selected.
  • Observe that each population has its own mean and
    the respective sample means are computed for the
    samples.
  • Also, indicated in the figure are the respective
    sample sizes.

40
15-3 The Hypothesis Test of One-Way ANOVA
41
15-3 The Hypotheses Associated with the
One-Way ANOVA
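The hypotheses on this slide appeared as an image in the original deck; in standard notation (consistent with the null hypothesis written out on the decision-rule slide later) they are:

```latex
H_0:\; \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_r
\qquad \text{vs.} \qquad
H_1:\; \text{not all of the } \mu_i \text{ are equal}
```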
42
QUICK TIPS
  • When using the ANOVA technique to test for
    equality of population means, we usually would
    want r > 2.
  • If r = 2, we can use the simpler two-sample t
    tests.
  • The null hypothesis is called a joint hypothesis
    about the equality of several population means
    (parameters).

43
QUICK TIPS
  • It would not be efficient to compare two
    population means at a time to achieve what the
    ANOVA test will achieve.
  • If we test two population means at a time to
    achieve what ANOVA will achieve, we will not be
    sure of the combined probability of a Type I
    error for all the tests.
  • By using the ANOVA technique to compare several
    population means at the same time, we will have
    control of the probability of a Type I error.

44
Assumptions for the One-Way ANOVA
  • The required assumptions of a one-way ANOVA are:
  • The random samples from the r populations are
    independent.
  • The r random samples are assumed to be selected
    from normal populations whose means may or may
    not be equal, but the populations have equal
    variances σ².

45
Assumptions for the One-Way ANOVA
Three Normally Distributed populations with
Different Means but with Equal Variance
46
Validating the Assumptions for the One-Way
ANOVA
  • The assumptions for ANOVA should be validated
    before any inference is made on the population
    means.
  • If these assumptions are not met then the
    inference on the population means may not be
    reliable.
  • The assumptions are necessary in order for the
    test statistic used in the analysis to follow a
    certain probability distribution (discussed in
    the next section).

47
Validating the Assumptions for the One-Way
ANOVA
  • If the populations are not exactly normally
    distributed but are approximately normally
    distributed, then the ANOVA procedure will still
    produce reliable results.
  • If the distributions are highly skewed or very
    different from a normal distribution, or the
    population variances are not equal or
    approximately equal, then ANOVA will not produce
    reliable results.
  • In such cases, other tests, such as equivalent
    nonparametric tests, should be employed.

48
Assumptions for the One-Way ANOVA
  • Two simple graphical techniques can be used to
    establish the assumptions for a one-way ANOVA.
  • We can use the histogram with summary statistics
    to help establish the normality assumption, and
    we can use box plots to help establish the equal
    variance assumption.

49
EXAMPLE
  • Example: Equal dosages of three drugs were used
    to ease a certain level of headache. Drug 1 was
    administered to ten patients, and Drugs 2 and 3
    were administered to nine patients each. The
    times, in minutes, to complete relief of the
    headache for the drugs are given on the next
    slide.

50
EXAMPLE (continued)
Before we use ANOVA to determine whether the
average relief times for the three drugs are the
same, the validity of the ANOVA assumptions
should be checked. This must be done so that the
inference made on the population means will be
reliable.
51
EXAMPLE (continued)
  • Before we use ANOVA to determine whether the
    average relief times for the three drugs are the
    same, the validity of the ANOVA assumptions
    should be checked.
  • This must be done so that the inference made on
    the population means will be reliable.

52
EXAMPLE (continued)
  • Graphical displays will be used to help check the
    validity of the ANOVA assumptions.
  • From the histograms on the next slide, one can
    observe that the histograms for Drug 1, Drug 2,
    and Drug 3 can all be approximated by a normal
    distribution.
  • Thus the assumption of normal populations has not
    been violated or severely violated.

53
EXAMPLE (continued)
54
EXAMPLE (continued)
  • Next, let us determine whether the equal variance
    assumption has been violated.
  • By looking at the box plots on the next slide,
    one can observe that the spreads for the data
    sets are not the same.
  • However, the spreads are similar enough for one
    to infer that it is likely that the observed
    difference in spread is due to sample variation.

55
EXAMPLE (continued)
  • Thus, one may assume that the equal variance
    assumption has not been violated.
  • Here, "similar enough" means that the range of
    values is approximately the same for the data
    sets. Also, the ranges for the middle fifty
    percent (lengths of the boxes) for the different
    data sets are approximately the same.

56
EXAMPLE (continued)
57
EXAMPLE (continued)
  • Since both the normality and the constant
    variance assumptions have not been violated or
    severely violated, one can now proceed to test
    for equality of the population means using the
    analysis of variance procedure.

58
15-4 The Test Statistic and the F
Distribution
  • The F-distribution will enable us to
    statistically compare different (at least three)
    population means through the ANOVA procedure.
  • The F distribution is obtained by taking the
    ratio of two chi-square distributions and thus
    has a numerator as well as a denominator degrees
    of freedom associated with it.

59
15-4 The Test Statistic and the F
Distribution
  • The numerator degrees of freedom is r - 1 and the
    denominator degrees of freedom is n - r, where r
    is the number of populations or treatments and n
    is the combined sample size from these r
    populations (total data values).

60
EXAMPLE (Data on Slide 10)
  • Example: What are the numerator and denominator
    degrees of freedom if a one-way analysis of
    variance was run on the data?
  • Solution: From the information given on Slide
    10, the number of populations is r = 4 and the
    combined sample size is n = 10 + 9 + 7 + 8 = 34.
    Thus, the numerator degrees of freedom is r - 1 =
    4 - 1 = 3, and the denominator degrees of freedom
    is n - r = 34 - 4 = 30.

61
A NOTE
  • The formulas associated with the computations are
    complex, and it is time consuming to carry out
    the calculations by hand.
  • Thus, computational technology is indispensable
    in most situations involving ANOVA.
  • Extensive use of technology will be integrated
    into the computations in these notes. We will
    assume that the appropriate technology is
    available to compute the F test statistic value
    for us.

62
15-4 The Test Statistic and the F
Distribution
  • The one-way ANOVA test statistic is given by
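The formula on this slide appeared as an image in the original deck. In common notation (the slide's own symbols are not recoverable from this transcript), the one-way ANOVA F statistic is the ratio of the between-sample mean square to the within-sample mean square:

```latex
F \;=\; \frac{\mathrm{MSTR}}{\mathrm{MSE}}
  \;=\; \frac{\mathrm{SSTR}/(r-1)}{\mathrm{SSE}/(n-r)}
```

with r - 1 numerator and n - r denominator degrees of freedom, as described on the preceding slides.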

63
15-4 The Test Statistic and the
F-Distribution
  • We would have to compare this F test statistic
    value with a critical F value from a table with
    r - 1 and n - r degrees of freedom and a given
    level of significance α.

64
15-4 The Test Statistic and the F
Distribution
  • The general decision rule to reject the null
    hypothesis H0: μ1 = μ2 = μ3 = ... = μr for a
    given significance level α is given by
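The decision rule on this slide appeared as an image in the original deck; in the notation of the surrounding slides it is:

```latex
\text{Reject } H_0 \text{ if } F > F_{\alpha,\; r-1,\; n-r}
```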

65
15-4 The Test Statistic and the F
Distribution
  • A general critical or rejection region for the F
    test is shown below.

Note: If we use the P-value approach to
hypothesis testing for the one-way ANOVA, we will
reject H0 if the P-value < α.
66
EXAMPLE
  • Example For the data given on Slide 10,
    suppose an F test was conducted at the 5 percent
    significance level to determine whether there was
    a significant difference between the average
    number of pennies, nickels, dimes, and quarters.
  • What will be the F critical value for the test?

67
EXAMPLE (continued)
  • Solution: From the information given, r = 4,
    n = 34, and α = 0.05. Since the numerator
    degrees of freedom is r - 1, this value will be
    4 - 1 = 3. Also, since the denominator degrees
    of freedom is n - r, this value will be
    34 - 4 = 30.
  • From the F table in the Appendix of the text, we
    have F3,30,0.05 = 2.92.
  • Thus, the F critical value for the test will be
    2.92.

68
EXAMPLE (continued)
  • At this juncture we may also implement an
    appropriate form of technology to help with the
    solution.
  • We will apply the MINITAB software to help in
    finding the F critical value.
  • We use the Inverse Cumulative Distribution
    Function feature for the F distribution in
    MINITAB to determine the F critical value.
  • The result is shown on the next slide.
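The same inverse-CDF lookup MINITAB performs can be sketched with SciPy's percent-point function (the inverse of the cumulative distribution function):

```python
# F critical value for the coin example via the inverse CDF (ppf).
from scipy import stats

alpha = 0.05
df_num, df_den = 4 - 1, 34 - 4        # r - 1 = 3, n - r = 30
f_crit = stats.f.ppf(1 - alpha, df_num, df_den)
print(round(f_crit, 2))               # 2.92, matching the F table
```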

69
EXAMPLE (continued)
F critical value for the test using MINITAB
70
EXAMPLE
  • Example For the data given on Slide 50,
    suppose an F test was conducted at the 1 percent
    significance level to determine whether there was
    a significant difference between the average time
    for the headache to subside for the different
    drugs.
  • What will be the F critical value for the test?

71
EXAMPLE (continued)
  • Solution: From the information given, r = 3,
    n = 28, and α = 0.01. Since the numerator
    degrees of freedom is r - 1, this value will be
    3 - 1 = 2. Also, since the denominator degrees
    of freedom is n - r, this value will be
    28 - 3 = 25.
  • From the F table in the Appendix of the text, we
    have F2,25,0.01 = 5.45.
  • Thus, the F critical value for the test will be
    5.45.

72
EXAMPLE (continued)
F critical value for the test using MINITAB
73
15-5 One-Way or Single Factor ANOVA Tests
  • So far, in all the previous examples, we had a
    single factor with different levels of the
    treatments.
  • In this section, we will present the F test for
    these single-factor experiments.
  • We sometimes refer to this single-factor F test
    as the one-way ANOVA F test.

74
Summary of the One-way ANOVA Hypothesis F Test
Using the Classical Approach
75
EXAMPLE
  • Example Perform a one-way ANOVA F test for the
    information given on Slide 10. That is, test
    whether there is a significant difference in the
    population averages for the number of pennies,
    nickels, dimes, and quarters for the student
    population at that particular campus. Test at a
    significance level of 0.05 and use the classical
    approach to hypothesis testing.

76
EXAMPLE (Continued)
  • Solution As mentioned earlier, because of the
    complexity of the formulas for the F test,
    appropriate technology will be integrated into
    the solution of these problems.
  • The MINITAB statistical software was used for the
    computations and the output is shown on the next
    slide.

77
EXAMPLE (Continued)
78
EXAMPLE (Continued)
  • Note From the MINITAB output
  • The numerator degrees of freedom is the Factor
    degrees of freedom (DF).
  • The denominator degrees of freedom is the Error
    degrees of freedom.

79
EXAMPLE (Continued)
  • Solution: Observe that the F test statistic
    value from the output is 12.04.
  • The numerator degrees of freedom is r - 1 =
    4 - 1 = 3, and the denominator degrees of freedom
    is n - r = 34 - 4 = 30.
  • Thus, the F critical value obtained from the F
    table is F3,30,0.05 = 2.92.
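The classical-approach comparison above can be sketched with SciPy's built-in one-way ANOVA routine. The slide's raw coin counts are not reproduced in this transcript, so the samples below are made up; with the real class data MINITAB reported F = 12.04.

```python
# One-way ANOVA F test on hypothetical coin counts (sizes 10, 9, 7, 8
# match the slide's example; the values themselves are illustrative).
from scipy import stats

pennies  = [3, 5, 8, 10, 12, 14, 15, 9, 18, 10]
nickels  = [2, 4, 5, 3, 6, 4, 7, 5, 4]
dimes    = [1, 3, 5, 2, 6, 4, 5]
quarters = [2, 4, 3, 5, 1, 4, 3, 4]

result = stats.f_oneway(pennies, nickels, dimes, quarters)
f_crit = stats.f.ppf(0.95, 3, 30)      # classical-approach cutoff

print(f"F = {result.statistic:.2f}, P-value = {result.pvalue:.5f}")
if result.statistic > f_crit:          # equivalently: result.pvalue < 0.05
    print("Reject H0: at least one population mean differs")
```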

80
EXAMPLE (Continued)
81
EXAMPLE (Continued) Critical Region
82
From the MINITAB OUTPUT
  • The source due to Factor is the contribution from
    between the samples.
  • That is, the contribution when comparing the
    variability of the four sample means.
  • This is associated with the numerator in the F
    test statistic value.

83
From the MINITAB OUTPUT
  • The source due to Error is the contribution from
    within the samples.
  • That is, the contribution when comparing the
    variability for all the sample data combined.
  • This is associated with the denominator in the F
    test statistic value.

84
From the MINITAB OUTPUT
  • The first part of the MINITAB output (i.e.
    excluding the confidence intervals) is usually
    referred to as the One-Way ANOVA Table.
  • This involves information on the factor (between
    information) and the error (within information).

85
MULTIPLE COMPARISONS
  • Since the null hypothesis was rejected and we
    concluded that there is a significant difference
    between the population averages, the question is:
    which of the means are different from which
    ones?
  • We can use multiple comparisons to answer this
    question.
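One formal multiple-comparison procedure (the slides instead compare overlapping confidence intervals) is Tukey's honestly-significant-difference test, available in recent SciPy versions as `scipy.stats.tukey_hsd`. The coin counts below are made up for illustration.

```python
# Tukey HSD pairwise comparisons on hypothetical coin counts.
from scipy import stats

pennies  = [3, 5, 8, 10, 12, 14, 15, 9, 18, 10]
nickels  = [2, 4, 5, 3, 6, 4, 7, 5, 4]
dimes    = [1, 3, 5, 2, 6, 4, 5]
quarters = [2, 4, 3, 5, 1, 4, 3, 4]

res = stats.tukey_hsd(pennies, nickels, dimes, quarters)
# res.pvalue[i, j] is the adjusted P-value for comparing group i with
# group j; small values flag pairs whose means differ significantly.
print(res.pvalue.round(4))
```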

86
MULTIPLE COMPARISONS
  • One way to determine which population means are
    significantly different from which ones is to
    compute the confidence intervals using the sample
    information.
  • The MINITAB output on Slide 77 shows plots of
    the 95 percent confidence intervals.

87
MULTIPLE COMPARISONS
  • Observe that the confidence intervals for the
    average number of nickels, dimes and quarters all
    overlap.
  • This would indicate that there is not a
    significant difference between these averages.

88
MULTIPLE COMPARISONS
  • On the other hand, the confidence interval for
    the average number of pennies does not overlap
    with any of the other confidence intervals.
  • This would indicate that the average number of
    pennies is significantly different from the
    average number of nickels, dimes and quarters.

89
MULTIPLE COMPARISONS
  • In particular, since the confidence interval for
    the average number of pennies is to the right of
    the other intervals, one can conclude that this
    population average is significantly greater than
    the other population means.

90
Generally, when performing a one-way ANOVA, you
should follow this procedure.
91
Using the P-value Approach to a One-way ANOVA
Hypothesis Test
Refer to Slide 77 for the P-value.
92
TI-83 SOLUTION
  • Use the TI-83 to help with the computations for
    the coin example.
  • Input the values for pennies, nickels, dimes, and
    quarters in lists L1, L2, L3, and L4,
    respectively.
  • Select the STAT button and choose TESTS. Scroll
    down to F:ANOVA( and press ENTER.

93
TI-83 SOLUTION
  • Input the lists L1, L2, L3, L4 and press ENTER.
  • The One-way ANOVA computations will be
    displayed.
  • You will need to scroll down to view all of the
    output. The output is shown on two screens on
    the next slide.

94
TI-83 SOLUTION
Observe that the F test statistic value (12.04,
to two decimal places) is the same as that
produced in the MINITAB output. Also, the
P-value produced by the TI-83 is given as P =
2.4089811E-5 ≈ 0.00002 ≈ 0, just as in the MINITAB
output.
95
Validating the Assumptions for a One-way ANOVA
(Revisited)
  • When validating the one-way ANOVA assumptions
    previously, we used box plots to help check the
    constant variance assumption and histograms to
    help check the normality assumption.
  • With computers and readily available statistical
    software, it is easy to check these assumptions.
  • Following are two MINITAB outputs which we can
    analyze to help establish these assumptions.

96
Normality Assumption
  • We can use MINITAB (other technologies as well)
    to present a normality plot for the data and
    observe the P-value for the normality test. The
    normality test is used to test:
  • H0: The distribution from which the sample was
    drawn is normally distributed.
  • H1: The distribution from which the sample was
    drawn is not normally distributed.
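This check can be sketched in SciPy. MINITAB's normality plot reports an Anderson-Darling P-value; `scipy.stats.anderson` returns critical values rather than a P-value, so the Shapiro-Wilk test is used here as a comparable P-value-based test of the same hypotheses. The sample is made up for illustration.

```python
# Shapiro-Wilk normality test on a hypothetical sample.
from scipy import stats

nickels = [2, 4, 5, 3, 6, 4, 7, 5, 4]      # hypothetical counts, n = 9
stat, p_value = stats.shapiro(nickels)
print(f"Shapiro-Wilk P-value = {p_value:.3f}")
if p_value > 0.05:
    print("Do not reject H0: the normality assumption is not violated")
```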

97
Normality Assumption
98
Normality Assumption
  • Observe from the probability plots that all the
    P-values are large (P-value > 0.05).
  • Hence the null hypothesis of normality for the
    sampled populations will not be rejected.
  • Hence the normality assumption has not been
    violated.

99
Constant Variance Assumption
  • Recall that in the ANOVA MINITAB display, 95
    percent confidence intervals for the means were
    displayed.
  • We can similarly use MINITAB (or other
    appropriate technologies) to construct confidence
    intervals for the standard deviations.

100
Constant Variance Assumption
101
Constant Variance Assumption
  • Observe that the intervals for the standard
    deviations overlap and hence one can assume that
    the constant variance assumption has not been
    violated.
  • P-values (0.026 and 0.055) for two separate tests
    for equal variance are displayed on the output.

102
Constant Variance Assumption
  • This small P-value of 0.026 is due to the fact
    that the point estimate of the standard deviation
    for the number of pennies falls just outside the
    upper limit of the interval for the number of
    quarters.
  • Levene's test is not as sensitive as Bartlett's
    test; Levene's gives a P-value of 0.055
    (> 0.05).
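Both equal-variance tests on the MINITAB output are available in SciPy. The coin counts below are made up for illustration, so the P-values will not match the slide's 0.026 and 0.055.

```python
# Bartlett's and Levene's tests for equal variances across the groups.
from scipy import stats

pennies  = [3, 5, 8, 10, 12, 14, 15, 9, 18, 10]
nickels  = [2, 4, 5, 3, 6, 4, 7, 5, 4]
dimes    = [1, 3, 5, 2, 6, 4, 5]
quarters = [2, 4, 3, 5, 1, 4, 3, 4]

b_stat, b_p = stats.bartlett(pennies, nickels, dimes, quarters)
l_stat, l_p = stats.levene(pennies, nickels, dimes, quarters)
print(f"Bartlett P-value = {b_p:.3f}, Levene P-value = {l_p:.3f}")
```

Bartlett's test is more sensitive to departures from normality, which is why the two tests can disagree near the 0.05 cutoff, as on the slide.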

103
Constant Variance Assumption
  • So, from a statistical standpoint, using the
    P-value for Bartlett's test, one would
    conclude that the constant variance assumption
    has been violated.
  • However, since the intervals overlap and the
    P-value for Levene's test is greater than
    0.05, one may conclude, from a practical
    standpoint, that the assumption has not been
    severely violated.

104
Assumption of Independence
  • The test for independence is beyond the scope of
    this text (see section on Beyond the Text), so we
    will assume that the data was collected in an
    independent manner.

105
FINAL COMMENT
In a more advanced approach to one-way ANOVA, a
model is usually presented and the assumption
tests are done on the errors for the model.
106
Beyond the Text
  • The mathematical model for a one-way ANOVA
    experiment may be written as follows:
  • Yij = μ + τj + εij
  • where Yij represents the response value for the
    ith row and the jth column; μ is an overall
    mean value; τj is the treatment effect
    contribution to the response value; and εij is
    the error in the observed response value.

107
Beyond the Text
  • In the mathematical model, it is assumed that the
    errors (εij) are independent and normally
    distributed with a variance which equals the
    constant variance for the populations.

108
Beyond the Text
  • We will use the MINITAB technology to test for
    these assumptions.
  • Other software may be used as well (e.g. SPSS,
    SAS, etc.)

109
EXAMPLE
  • Example For the data given on slide 50 for the
    time to complete relief of the headache for the
    three different drugs, perform a one-way ANOVA
    for the data and validate the assumptions for the
    model.

110
EXAMPLE (continued)
The data may be entered into a MINITAB
worksheet as shown on the left. The procedure in
MINITAB is Stat > ANOVA > One-Way. The
resulting dialog box, with appropriate entries,
is shown on the next slide.
111
EXAMPLE (continued)
When the OK button is selected on the dialog
box, the following results will be displayed
in the session window as shown on the next slide.
112
EXAMPLE (continued)
113
EXAMPLE (continued)
  • Observe that the P-value for the test is 0.045.
  • If we test for equality of the means for the time
    for the headache to subside for the three
    different drugs at the 5% significance level, we
    will reject the null hypothesis, since 0.045 <
    0.05.

114
EXAMPLE (continued)-Hypothesis Test
  • H0: μdrug1 = μdrug2 = μdrug3
  • H1: Not all the means are the same.
  • T.S.: P-value = 0.045
  • D.R.: Reject the null hypothesis if the P-value
    (0.045) < the significance level (0.05).

115
EXAMPLE (continued)-Hypothesis Test
  • Conclusion: Since 0.045 < 0.05, reject the null
    hypothesis. That is, the average times for the
    headache to subside for the different drugs are
    different at the 5% significance level.

116
EXAMPLE (continued)-Multiple Comparisons
  • Since we rejected the null hypothesis and
    concluded that the means are different, the
    question is: which mean is different from which
    ones?
  • The confidence interval plots in Slide 112
    allow us to answer this question. These are
    shown on the next slide.

117
EXAMPLE (continued)-Multiple Comparisons
  • From the plots, one can observe that although the
    intervals overlap, the average for drug 2 falls
    outside the intervals for drugs 1 and 3.

118
EXAMPLE (continued)-Multiple Comparisons
  • We can infer from the plots that the average time
    for the headache to subside for drug 2 is smaller
    than that for drugs 1 and 3.
  • Also, we can infer that the averages for drugs 1
    and 3 are not significantly different, since their
    intervals overlap almost entirely.

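The individual confidence intervals MINITAB plots alongside the ANOVA table can be reproduced by hand: each group mean gets a 95% t-interval whose width uses the pooled standard deviation (the square root of the ANOVA mean square for error). A minimal sketch, again on hypothetical data:

```python
import numpy as np
from scipy import stats

# Hypothetical relief times (hours) -- illustrative stand-ins for slide 50.
groups = {
    "drug1": np.array([4.2, 4.7, 5.1, 4.9, 5.3, 4.8, 5.0, 4.6, 4.4]),
    "drug2": np.array([3.9, 4.1, 3.8, 4.3, 4.0, 3.7, 4.2, 3.9, 4.1]),
    "drug3": np.array([4.8, 5.0, 4.9, 5.2, 4.7, 5.1, 4.6, 5.0, 4.9, 4.8]),
}

# Pooled within-group variance (the ANOVA mean square for error, MSE),
# with N - k degrees of freedom.
n_total = sum(len(g) for g in groups.values())
k = len(groups)
mse = sum((len(g) - 1) * g.var(ddof=1) for g in groups.values()) / (n_total - k)
t_crit = stats.t.ppf(0.975, n_total - k)

# Individual 95% CI for each mean, based on the pooled standard deviation.
for name, g in groups.items():
    half = t_crit * np.sqrt(mse / len(g))
    lo, hi = g.mean() - half, g.mean() + half
    print(f"{name}: mean = {g.mean():.2f}, 95% CI = ({lo:.2f}, {hi:.2f})")
```

Comparing the printed intervals visually is the informal multiple-comparisons step used on these slides; formal procedures (e.g. Tukey's method) adjust for making several comparisons at once.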
119
EXAMPLE (continued)-Validating the Assumptions
for the One-Way ANOVA
  • Recall that the assumptions are:
  • The errors (or residuals) are independent of each
    other.
  • The errors (or residuals) are normally
    distributed.
  • The variances are equal (constant variance) for
    the sampling distributions.

120
EXAMPLE (continued)-Validating the Normality
Assumption
  • From the dialog box on slide 111, observe that
    the Store residuals (errors) option was checked.
    This will allow MINITAB to compute the errors and
    store them in the worksheet.

121
EXAMPLE (continued)-Validating the Normality
Assumption
  • MINITAB allows us to do an Anderson-Darling
    goodness-of-fit test for normality.
  • For this test, H0: The distribution of the
    residuals is normal, against H1: The
    distribution of the residuals is not normal.

122
EXAMPLE (continued)-Validating the Normality
Assumption
  • The MINITAB procedure for the Anderson-Darling
    goodness-of-fit test is Stat > Basic
    Statistics > Normality Test.
  • The dialog box with the appropriate entries is
    shown on the next slide.

123
EXAMPLE (continued)-Validating the Normality
Assumption
124
EXAMPLE (continued)-Validating the Normality
Assumption
125
EXAMPLE (continued)-Validating the Normality
Assumption
  • The P-value for the normality test is given as
    0.938.
  • Thus the null hypothesis of normality will
    not be rejected.
  • Thus, one may infer that the normality assumption
    has not been violated.
  • Note: The plotted points follow a straight line.

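Outside MINITAB, the same check can be sketched with `scipy.stats.anderson` on the stored residuals. Note one difference from the MINITAB output quoted above: SciPy reports the Anderson-Darling statistic with critical values rather than a P-value, so the decision compares the statistic to the 5% critical value. The residuals below are hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical residuals from a fitted one-way ANOVA model (illustrative).
residuals = np.array([0.12, -0.08, 0.31, -0.22, 0.05, -0.15, 0.18, -0.10,
                      0.02, 0.27, -0.30, 0.09, -0.04, 0.14, -0.19, 0.07])

# Anderson-Darling goodness-of-fit test for normality.
# Fail to reject H0 (normality) when the statistic is below the
# critical value at the chosen significance level.
result = stats.anderson(residuals, dist='norm')
idx_5pct = list(result.significance_level).index(5.0)
crit_5pct = result.critical_values[idx_5pct]
print(f"A-D statistic = {result.statistic:.3f}, "
      f"5% critical value = {crit_5pct:.3f}")
if result.statistic < crit_5pct:
    print("Do not reject H0: normality assumption not violated.")
```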
126
EXAMPLE (continued)-Validating the Constant
Variance Assumption
  • Again, MINITAB allows us to test for the constant
    variance assumption.
  • The procedure is Stat > ANOVA > Test for
    Equal Variances.
  • The dialog box with the appropriate entries is
    shown on the next slide.

127
EXAMPLE (continued)-Validating the Constant
Variance Assumption
128
EXAMPLE (continued)-Validating the Constant
Variance Assumption
  • Click on the OK button and the test for equal
    variances will be displayed along with confidence
    interval plots for the population standard
    deviations.
  • Observe that there are two tests: Bartlett's and
    Levene's.

129
EXAMPLE (continued)-Validating the Constant
Variance Assumption
  • Bartlett's test is used when the samples (errors)
    are selected from normal distributions.
  • Levene's test is used when the samples (errors)
    are selected from continuous distributions.

130
EXAMPLE (continued)-Validating the Constant
Variance Assumption
  • Since the errors were established to be normally
    distributed, we will use Bartlett's test.
  • H0: The sampled populations have equal variances,
    against H1: The sampled populations do not have
    equal variances.

131
EXAMPLE (continued)-Validating the Constant
Variance Assumption
  • The results are shown on the next slide.
  • The P-value for Bartlett's test is 0.693.
  • Hence the null hypothesis will not be rejected,
    and one can conclude that the constant variance
    assumption has not been violated.

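Both equal-variance tests are available in SciPy, so the MINITAB step can be sketched as follows (again on hypothetical samples). Bartlett's P-value is the one to read when the residuals have already been shown to be normal; Levene's is the fallback for non-normal continuous data.

```python
from scipy import stats

# Hypothetical samples (illustrative stand-ins for the slide-50 data).
drug1 = [4.2, 4.7, 5.1, 4.9, 5.3, 4.8, 5.0, 4.6, 4.4]
drug2 = [3.9, 4.1, 3.8, 4.3, 4.0, 3.7, 4.2, 3.9, 4.1]
drug3 = [4.8, 5.0, 4.9, 5.2, 4.7, 5.1, 4.6, 5.0, 4.9, 4.8]

# Bartlett's test assumes the samples come from normal distributions;
# Levene's test only requires continuous distributions.
b_stat, b_p = stats.bartlett(drug1, drug2, drug3)
l_stat, l_p = stats.levene(drug1, drug2, drug3)
print(f"Bartlett: P = {b_p:.3f}; Levene: P = {l_p:.3f}")

# A large P-value means the equal-variance (constant variance)
# assumption is not contradicted by the data.
```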
132
EXAMPLE (continued)-Validating the Constant
Variance Assumption
  • Observe that the confidence intervals for the
    standard deviations all overlap, which supports
    the conclusion of Bartlett's test.

133
EXAMPLE (continued)-Validating the Constant
Variance Assumption
134
EXAMPLE (continued)-Validating the Independence
Assumption
  • The independence assumption can be tested using
    the lag-1 autocorrelation function (ACF).
  • An autocorrelation coefficient, which indicates
    how the errors (residuals) are correlated with
    themselves, is often used to investigate the
    independence assumption.

135
EXAMPLE (continued)-Validating the Independence
Assumption
  • To compute this value, denoted by r1, we
    correlate the observed residuals (in time-series
    order) with the same errors shifted one position
    from the originals.
  • Thus, the lag-1 autocorrelation is computed for
    the pairs (e1, e2), (e2, e3), (e3, e4), …,
    (en-1, en), where the ei are the observed errors.

136
EXAMPLE (continued)-Validating the Independence
Assumption
  • When the errors in the model equation are
    normally and independently distributed, the
    sampling distribution of the lag-1
    autocorrelation coefficient associated with a
    sample of size n is approximately normal with
    mean 0 and standard deviation 1/√n.

137
EXAMPLE (continued)-Validating the Independence
Assumption
  • Thus, independence of the errors should be
    questioned when the absolute value of r1 is
    greater than (1.96)(1/√n). That is, when
    |r1| > (1.96)(1/√n).
  • MINITAB can be used to compute r1 for the errors.

138
EXAMPLE (continued)-Validating the Independence
Assumption
139
EXAMPLE (continued)-Validating the Independence
Assumption
140
EXAMPLE (continued)-Validating the Independence
Assumption
  • From the previous slide, r1 = 0.1497.
  • (1.96)(1/√n) = (1.96)(1/√28) = 0.3704.
  • Since |r1| = 0.1497 < 0.3704, one can assume
    that the assumption of independence has not been
    violated.

141
EXAMPLE (continued)-Validating the Independence
Assumption
  • H0: The errors are independent of each other
  • H1: The errors are not independent of each
    other
  • T.S.: r1 = 0.1497
  • D.R.: Reject the null hypothesis if |r1| >
    (1.96)(1/√n), i.e., if 0.1497 > 0.3704.
  • Conclusion: Do not reject H0 since 0.1497 <
    0.3704, and assume the assumption of independence
    of the errors has not been violated.
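The lag-1 autocorrelation test above is easy to compute directly: center the residuals, correlate each residual with its successor, and compare |r1| to 1.96/√n. A minimal sketch with hypothetical residuals of size n = 28, matching the threshold 0.3704 from the slides:

```python
import numpy as np

# Hypothetical residuals in time-series order (illustrative only).
e = np.array([0.12, 0.08, -0.31, 0.22, -0.05, -0.15, 0.18, -0.10, 0.02,
              0.27, -0.30, 0.09, 0.04, 0.14, -0.19, 0.07, -0.11, 0.06,
              0.20, -0.13, 0.03, 0.09, -0.16, 0.21, -0.08, 0.02, 0.10,
              -0.05])
n = len(e)

# Lag-1 autocorrelation: correlate the pairs (e1,e2), (e2,e3), ..., (en-1,en).
e_centered = e - e.mean()
r1 = np.sum(e_centered[:-1] * e_centered[1:]) / np.sum(e_centered ** 2)

# Under independence, r1 is approximately N(0, 1/n), so question
# independence when |r1| exceeds 1.96/sqrt(n).
threshold = 1.96 / np.sqrt(n)
print(f"r1 = {r1:.4f}, threshold = {threshold:.4f}")
if abs(r1) <= threshold:
    print("Do not reject H0: independence assumption not violated.")
```

With n = 28 the threshold works out to 0.3704, exactly the cutoff used in the hypothesis test above.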