Analysis of Variance - PowerPoint PPT Presentation

About This Presentation
Title:

Analysis of Variance

Description:

Example: The rating of luxury cars: Type of Car. American. German. Japanese. Sample size ... Sample mean rating. 2.98. 3.15. 3.02 ... – PowerPoint PPT presentation

Slides: 35
Provided by: XieS

Transcript and Presenter's Notes


1
Chapter 15
  • Analysis of Variance
  • (ANOVA)

2
Single-Factor ANOVA
  • In Chapter 11 we compared the means of two
    populations or treatments.
  • To decide whether the means for more than two
    populations or treatments are identical, we use a
    method called single-factor analysis of variance.
  • Example: The rating of luxury cars

Do the data support the claim that the true average
(population mean) ratings of all three types of
cars are identical, or do at least two of them
differ from one another?
3
15.1 Single-Factor ANOVA and the F-Test
  • A single-factor analysis of variance (ANOVA)
    problem involves a comparison of k population or
    treatment means.
  • H0: µ1 = µ2 = ... = µk, against
  • Ha: At least two of the µs are different.
  • The analysis is based on k independently selected
    samples, one from each population or for each
    treatment.

4
ANOVA Notation
  • k = number of populations or treatments being
    compared
  • Population or treatment: 1, 2, ..., k
  • Population or treatment mean: µ1, µ2, ..., µk
  • Population or treatment variance: σ1², σ2², ...,
    σk²
  • Sample size: n1, n2, ..., nk
  • Sample mean: x̄1, x̄2, ..., x̄k
  • Sample variance: s1², s2², ..., sk²
  • The total number of observations in the data set
    is
  • N = n1 + n2 + ... + nk
  • The sum of all N observations, the grand total, is
    denoted by
  • T = n1 x̄1 + n2 x̄2 + ... + nk x̄k
  • The grand mean is x̿ = T / N

5
Example: Strength of Cardboard
  • The following table presents the results of a
    single-factor experiment involving k = 4 types of
    cardboard boxes with respect to compression
    strength (in pounds).

6
Assumptions for ANOVA
  • Each of the k population or treatment response
    distributions is normal.
  • σ1 = σ2 = ... = σk (i.e., the k normal
    distributions have identical standard
    deviations).
  • The observations in the sample from any
    particular one of the k populations or treatments
    are independent of one another.
  • When comparing population means, k random samples
    are selected independently of one another. When
    comparing treatment means, treatments are
    assigned at random to subjects or objects (or
    subjects are assigned at random to treatments).

7
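Definition of Treatment Sum of Squares and
Error Sum of Squares
  • A measure of disparity among the sample means is
    the treatment sum of squares, denoted by SSTr:
    SSTr = n1(x̄1 - x̿)² + n2(x̄2 - x̿)² + ... + nk(x̄k - x̿)²,
    where x̄i is the mean of sample i and x̿ is the
    grand mean of all N observations.
  • A measure of variation within the k samples,
    called the error sum of squares and denoted by
    SSE, is
    SSE = (n1 - 1)s1² + (n2 - 1)s2² + ... + (nk - 1)sk²,
    where si² is the variance of sample i.
  • Each sum of squares has an associated number of
    degrees of freedom:
  • treatment df = k - 1
  • error df = N - k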

8
Mean Squares
  • A mean square is a sum of squares divided by its
    number of degrees of freedom.
  • mean square for treatments: MSTr = SSTr / (k - 1)
  • mean square for error: MSE = SSE / (N - k)
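
The sums of squares and mean squares can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function name one_way_anova and the three small samples are hypothetical, chosen only so the arithmetic is easy to check by hand.

```python
from statistics import mean, variance

def one_way_anova(samples):
    """Return (SSTr, SSE, MSTr, MSE, F) for a list of k samples."""
    k = len(samples)
    sizes = [len(s) for s in samples]
    N = sum(sizes)                                   # total number of observations
    means = [mean(s) for s in samples]
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / N   # T / N
    # SSTr: disparity among the sample means
    sstr = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    # SSE: variation within the samples, using each sample variance
    sse = sum((n - 1) * variance(s) for n, s in zip(sizes, samples))
    mstr = sstr / (k - 1)                            # treatment df = k - 1
    mse = sse / (N - k)                              # error df = N - k
    return sstr, sse, mstr, mse, mstr / mse

# Hypothetical data (not from the slides): k = 3 samples of size 3.
samples = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
sstr, sse, mstr, mse, f_stat = one_way_anova(samples)
print(sstr, sse, mstr, mse, f_stat)
```

For these made-up samples the sample means are 2, 3 and 6, so SSTr is about 26 on 2 df and SSE is 6 on 6 df, giving F of about 13.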

9
Example: Cardboard Strength
10
Example: Cardboard Strength Calculations
The degrees of freedom are: treatment df =
k - 1 = 3 and error df = N - k = (6 + 6 + 6 + 6) - 4
= 20.
11
Relationship between MSTr and MSE
  • When H0 is true (µ1 = µ2 = ... = µk),
  • µMSTr = µMSE
  • When H0 is false,
  • µMSTr > µMSE
  • The greater the differences among the µs,
    the larger µMSTr will be relative to µMSE.

12
The Single-Factor ANOVA F Test
  • Null hypothesis H0: µ1 = µ2 = ... = µk
  • Test statistic: F = MSTr / MSE
  • An F distribution always arises from a ratio with
    one sum of squares (divided by its df) in the
    numerator and another sum of squares (divided by
    its df) in the denominator.
  • When H0 is true and the ANOVA assumptions are
    reasonable, F has an F distribution with df1 = k
    - 1 and df2 = N - k.
  • All F tests in this book are upper-tailed. The
    P-value is the area captured in the upper tail of
    the corresponding F curve.
  • Reject H0 when P-value < α.

13
F Curve and Upper-Tailed Test
Exercise: Find the P-value if the calculated F value
is 9.20 with df1 = 4 and df2 = 6.
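
An upper-tail F area like the one in this exercise can also be approximated numerically instead of being read from a table. The sketch below is an illustration under stated assumptions, not the slides' method: the helper name f_upper_tail is hypothetical, and it uses only the standard library, relying on the identity P(F > f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1 f), where I_x is the regularized incomplete beta function. In practice a statistics library (for example scipy.stats.f.sf) would normally be used.

```python
import math

def f_upper_tail(f_value, df1, df2, steps=20000):
    """Approximate the upper-tail area P(F > f_value) for an F(df1, df2) curve.

    The incomplete beta integral is evaluated with a simple midpoint rule.
    Assumes df2 >= 2 so that the integrand stays bounded near 0.
    """
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f_value)
    h = x / steps
    area = 0.0
    for i in range(steps):
        t = (i + 0.5) * h                      # midpoint of the i-th subinterval
        area += t ** (a - 1) * (1 - t) ** (b - 1)
    area *= h
    # Normalize by the beta function B(a, b), computed via log-gamma.
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return area / math.exp(log_beta)

p_value = f_upper_tail(9.20, 4, 6)
print(round(p_value, 4))   # about 0.0099
```

The result is just below 0.01, so an upper-tail F table would bracket this P-value between .001 and .01.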
14
Single-Factor ANOVA Table
15
Exercise: Cardboard Strength Revisited
(Conclusion)
  • H0: µ1 = µ2 = µ3 = µ4
  • Ha: At least two among µ1, µ2, µ3, and µ4 are
    different.
  • In Example 15.2, MSTr = 42,455.86 and MSE =
    1691.92.
  • Conduct an F test: F = MSTr / MSE =
    42,455.86 / 1691.92 = 25.09.
  • df1 = k - 1 = 3 and df2 = N - k = 24 - 4 = 20.
  • The Table of Upper-Tail F Curve Areas shows that
    8.10 captures a tail area of .001. Because
    25.09 > 8.10, it follows that P-value < .001.
  • Reject H0. True average compression strength
    appears to differ among the four types of box.
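
The arithmetic behind this conclusion is short enough to check directly. The snippet below is only a sketch: MSTr, MSE and the tabled value 8.10 are taken from the slide, while the variable names are illustrative.

```python
# Values quoted on the slide (Example 15.2).
mstr = 42455.86          # mean square for treatments
mse = 1691.92            # mean square for error
k, N = 4, 24             # number of box types and total observations

df1, df2 = k - 1, N - k  # 3 and 20
f_stat = mstr / mse      # test statistic F = MSTr / MSE

# Tabled F value for df1 = 3, df2 = 20 that the slide associates with
# an upper-tail area of .001.
critical_value = 8.10
reject_h0 = f_stat > critical_value

print(df1, df2, round(f_stat, 2), reject_h0)
```

Because the computed F exceeds the tabled value, the P-value is below .001 and H0 is rejected at any of the usual significance levels.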

16
  • Example: Hormones and Body Fat
  • To investigate the effect of four treatments on
    various body characteristics, each of 57 female
    subjects who were over age 65 was assigned at
    random to one of the following four treatments:
    (1) placebo growth hormone and placebo
    steroid (P + P), (2) placebo growth hormone
    and the steroid estradiol (P + S), (3) growth
    hormone and placebo steroid (G + P), and (4)
    growth hormone and the steroid estradiol (G + S).
  • The table on the next slide lists data on change
    in body fat mass over the 26-week period
    following the treatments.
  • Carry out an F test to see whether true mean
    change in body fat mass differs for the four
    treatments. Use a significance level α = 0.01.

17
Let µ1, µ2, µ3, and µ4 denote the true mean
change in body fat for the treatments P + P, P +
S, G + P, and G + S, respectively.
H0: µ1 = µ2 = µ3 = µ4
Ha: At least two among µ1, µ2, µ3, and µ4
are different.
  • Use Excel to create the ANOVA table and calculate
    the test statistic F (see next slides):
  • Enter the data.
  • Click Data.
  • Choose Data Analysis.
  • Choose Anova: Single Factor.
  • Select the Input and Output Range, and click OK.

18
Choose Anova: Single Factor in Data Analysis
19
Excel generates an ANOVA table after you
enter the input range.
  • Conclusion: Since P-value = 0.0000162 < α (0.01),
    we reject H0. We conclude that the mean change in
    body fat mass is not the same for all four
    treatments.

20
15.2 Multiple Comparison
  • When H0: µ1 = µ2 = ... = µk is rejected by the
    F test, we believe that there are differences
    among the k population or treatment means. A
    natural question is: Which means differ?
  • A multiple comparisons procedure is a method for
    identifying differences among the µs once the
    hypothesis of overall equality has been rejected.
  • One such method is the Tukey-Kramer (T-K)
    multiple comparison procedure.
  • The T-K procedure is based on computing
    confidence intervals for the difference between
    each possible pair of µs. For example, if k = 3,
    there are three differences to consider:
  • µ1 - µ2, µ1 - µ3, and µ2 - µ3

21
The Tukey-Kramer (T-K) Procedure
  • Construct a confidence interval for the
    difference between each pair of µs. For
    µi - µj, the interval is
    (x̄i - x̄j) ± q √[(MSE / 2)(1/ni + 1/nj)]
  • q is the critical value of the Studentized
    range distribution, available in Appendix Table 7,
    based on the error df (N - k), the number of means
    being compared, and the confidence level.
  • Recall that we may use Excel to find MSE and
    error df.
  • When there are k populations or treatments
    being compared, you need to compute k(k - 1)/2
    confidence intervals.
  • Two means are judged to be significantly
    different if the corresponding interval does not
    include 0.
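
The T-K procedure can be sketched in Python as follows. This is a hedged illustration rather than the slides' own material: it assumes the usual Tukey-Kramer interval (x̄i - x̄j) ± q √[(MSE / 2)(1/ni + 1/nj)], the function name tukey_kramer_intervals is hypothetical, and the sample means, sample sizes, MSE and q below are made-up values, not data or table entries from the slides.

```python
import math
from itertools import combinations

def tukey_kramer_intervals(means, sizes, mse, q):
    """Return {(i, j): (lower, upper)} for every pair of groups (1-based labels)."""
    intervals = {}
    for i, j in combinations(range(len(means)), 2):
        half_width = q * math.sqrt((mse / 2.0) * (1.0 / sizes[i] + 1.0 / sizes[j]))
        diff = means[i] - means[j]
        intervals[(i + 1, j + 1)] = (diff - half_width, diff + half_width)
    return intervals

# Hypothetical inputs: k = 3 groups, so k(k - 1)/2 = 3 intervals.
means = [10.0, 12.0, 15.0]
sizes = [8, 8, 8]
mse = 4.0
q = 3.5   # illustrative only; a real q comes from a Studentized range table

intervals = tukey_kramer_intervals(means, sizes, mse, q)
for pair, (lower, upper) in intervals.items():
    differs = not (lower <= 0.0 <= upper)   # significant if 0 is outside the interval
    print(pair, round(lower, 2), round(upper, 2), "differ" if differs else "no difference")
```

With these made-up numbers, only the interval for groups 1 and 2 contains 0, so group 3 would be judged different from both of the others.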

22
Example: Hormones and Body Fat Revisited
  • Recall that the data on the right list the change
    in body fat mass resulting from a double-blind
    experiment designed to compare the four
    treatments P + P, P + S, G + P, and G + S.
  • Does the true mean change in body fat mass differ
    for the four treatments?

23
Recall: The single-factor ANOVA F test concluded
that the mean change in body fat mass is not the
same for all four treatments, since P-value ≈ 0.
  • Use the ANOVA table to find MSE = 1.92 and error
    df = 53.
  • Appendix Table 7 gives the 95% Studentized range
    critical value q = 3.74 (using error df = 60,
    the closest tabled value to the actual error df
    N - k = 53).

24
The T-K interval for µ1 - µ2 includes 0. We
conclude that µ1 is not significantly different
from µ2.
The T-K interval for µ1 - µ3 does not include 0.
We conclude that µ1 is significantly different
from µ3.
25
Exercise: Find the remaining T-K intervals.
26
  • Summarizing the Results of the Tukey-Kramer
    Procedure
  • List the sample means in increasing order.
  • Use the T-K intervals to determine the group of
    means that do not differ significantly from the
    first in the list. Draw a horizontal line
    extending from the smallest mean to the last mean
    in the group.
  • Repeat the above procedure starting from the
    second smallest mean, and continue through the
    remaining means in the ordered list.
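
The underscoring steps above can be sketched in Python. This is one possible reading of the procedure, under stated assumptions: the function name underscore_groups, the group labels and the set of significantly different pairs are all hypothetical, and lines that would cover a single mean or sit entirely inside an earlier line are simply not drawn.

```python
def underscore_groups(means, significant_pairs):
    """Return (ordered labels, list of underscored groups).

    means: dict mapping a label to its sample mean.
    significant_pairs: set of frozenset({a, b}) for pairs whose T-K
    interval does not include 0.
    """
    ordered = sorted(means, key=means.get)          # increasing sample means
    groups = []
    for start, first in enumerate(ordered):
        end = start
        # Extend the line while the next mean does not differ from `first`.
        while (end + 1 < len(ordered)
               and frozenset((first, ordered[end + 1])) not in significant_pairs):
            end += 1
        group = ordered[start:end + 1]
        nested = any(set(group) <= set(g) for g in groups)
        if len(group) > 1 and not nested:
            groups.append(group)
    return ordered, groups

# Hypothetical example: C differs significantly from both A and B.
means = {"A": 10.0, "B": 12.0, "C": 15.0}
significant_pairs = {frozenset(("A", "C")), frozenset(("B", "C"))}

ordered, groups = underscore_groups(means, significant_pairs)
print(ordered)   # ['A', 'B', 'C']
print(groups)    # [['A', 'B']]
```

Here a single line would be drawn under A and B, and C, with no line, is judged different from both.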

27
Exercise: Sleep Time
  • A biologist wished to study the effect of
    ethanol on sleep time. A sample of 20 rats,
    matched for age and other characteristics, was
    selected, and each rat was given an oral
    injection having a particular concentration of
    ethanol per body weight. The rapid eye movement
    (REM) sleep time for each rat was then recorded
    for a 24-hr period, with the results shown on the
    right.

Does true average REM sleep time depend on the
treatment used?
28
  • Answers to Exercise: Sleep Time
  • Single-Factor ANOVA F Test
  • H0: µ1 = µ2 = µ3 = µ4
  • Ha: At least two among µ1, µ2, µ3, and µ4 are
    different.
  • Based on the ANOVA table, we reject H0 since
    P-value ≈ 0. The true mean REM sleep time is not
    the same for all four treatments.
  • The T-K intervals are listed in the table on the
    right.
  • The corresponding underscore pattern is

29
Appendix: Using SPSS for the ANOVA F Test and T-K
Procedure
  • Enter data in two columns
  • Column 1: groups
  • group 1: P + P
  • group 2: P + S
  • group 3: G + P
  • group 4: G + S
  • Column 2: change
  • (in body fat mass)

30
  • Click Analyze, click General Linear Model, then
    click Univariate. (See the figure on the right.)
  • You now see the Univariate dialog box below.
    Click group, and move it to the Fixed Factor(s)
    box.
  • Also in the Univariate dialog, click change, and
    move it to the Dependent Variable box. (See the
    figure on the right below.)

31
  • Click Options. The Univariate Options dialog
    box appears.
  • Click group and then click the arrow button to
    move group to the Display Means For box.
  • Click Descriptive statistics in the Display box.
  • Click Continue.

32
  • Click Post Hoc. The Post Hoc Multiple Comparisons
    for Observed Means dialog box appears.
  • In the Factor(s) box, click group, and click
    the arrow button to make it appear in the Post
    Hoc Tests for box.
  • In the Equal Variances Assumed box, click
    Tukey.
  • Click Continue.
  • Click OK.

33
  • The first half of the output is shown in the
    figure on the right.
  • The Descriptive Statistics table provides the
    mean, standard deviation, and sample size for
    each group.
  • The Tests of Between-Subjects Effects table shows
    the ANOVA table.
  • The second half of the output continues on the
    next slide.

34
  • The group table lists the 95% confidence
    interval for each group.
  • The T-K intervals are listed in the Multiple
    Comparisons table. Notice that the T-K interval
    for µ1 - µ2 is (-1.0395, 1.7395), while the T-K
    interval for µ2 - µ1 is (-1.7395, 1.0395), etc.