COMPARISON OF MEANS - PowerPoint PPT Presentation

Provided by: eafr

1
COMPARISON OF MEANS
  • AED 616
  • SPRING 2007

2
Introduction
  • We will consider a frequently encountered
    problem: how to compare the means of two samples
    for statistical significance.
  • Let's consider two examples.

3
Example 1
  • A researcher wants to determine whether there are
    differences between men and women voters in their
    attitudes toward welfare. Samples of men and
    women are drawn at random and administered an
    attitude scale to obtain a score for each
    subject. Means for the two samples were computed.
    Women had a mean of 40.00 (on a scale of 0 to
    50, where 50 is the most favorable). Men had a
    mean of 35.00. The researcher wants to determine
    whether there is a significant difference between
    men and women. What accounts for the 5-point
    difference? One possible explanation is the null
    hypothesis, which states that there is no true
    difference between men and women and that the
    observed difference is due to sampling errors
    created by random sampling.

4
  • Example 1 illustrates that two means may be
    obtained from a survey.

5
Example 2
  • A random sample of kittens is fed a vitamin
    supplement from birth to see if the supplement
    increases their visual acuity. Another random
    sample is fed a placebo that looks like the
    supplement but contains no vitamins. At the end of
    the study, both samples are tested for visual
    acuity and an average acuity score is calculated
    for each sample. The kittens that took the
    supplement scored 4 points higher on average
    than did the control group. What accounts for the
    4-point difference? One possible explanation is
    the null hypothesis, which states that there is
    no true difference between the two samples of
    kittens and that the observed difference is due to
    sampling errors created by random sampling.

6
  • Example 2 illustrates that two means may be
    obtained from an experiment, that is, a study in
    which treatments are given in order to observe
    their effects.
  • Surveys and experiments are very frequently
    conducted, and they often yield two means each,
    so you can see how important it is to be able to
    test the null hypothesis for the difference
    between the two sample means.
  • A statistician named William Gosset developed the
    t-test for exactly the situations we are
    considering.
  • As a test of the null hypothesis, it yields the
    probability of obtaining a difference at least as
    large as the one observed if the null hypothesis
    were correct.
  • When that probability is low (say, .05, or 5%,
    or less), we usually reject the null hypothesis.
  • What leads the t test to give us a probability
    low enough to reject the null hypothesis? Here
    are three basic factors:

7
Factor 1
  • The larger the samples, the less likely the
    difference between two means was created by
    sampling errors. Larger samples have less
    sampling error than smaller ones. Thus, when
    large samples are used, the t test is more likely
    to yield a probability low enough to allow us to
    reject the null hypothesis than when small
    samples are used.

8
Factor 2
  • The larger the difference between the two means,
    the less likely the difference was created by
    sampling errors. Random sampling tends to create
    many small differences and few large ones. Thus,
    when large differences between means are
    obtained, the t test is more likely to yield a
    probability low enough to allow us to reject the
    null hypothesis than when small differences are
    obtained.

9
Factor 3
  • The smaller the variance among the subjects, the
    less likely that the difference between the two
    means was created by sampling errors. To
    understand this, consider a population in which
    everyone is identical: they all look alike, think
    alike, and speak and act in unison. How many do
    you have to sample to get a good sample? Only
    one, because they are all the same. Thus, when
    there is no variation among subjects, it is not
    possible to have sampling errors. As the
    variation increases, sampling errors are more and
    more likely to occur.
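
The three factors can be illustrated with a short sketch. It assumes the common pooled-variance form of the t test for independent samples, which may not be the exact formula used in class. The function name `pooled_t` is hypothetical, and the standard deviations and sample sizes are invented for illustration; only the means 40 and 35 come from Example 1.

```python
import math


def pooled_t(m1, m2, sd1, sd2, n1, n2):
    """t for two independent samples, computed from summary statistics.

    Assumes the pooled-variance form of the test (an assumption,
    not taken from the slides).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se_diff


# Baseline: means from Example 1; SDs and sample sizes are hypothetical.
baseline = pooled_t(40.0, 35.0, 10.0, 10.0, 20, 20)

# Factor 1: larger samples, everything else unchanged.
larger_samples = pooled_t(40.0, 35.0, 10.0, 10.0, 80, 80)

# Factor 2: larger difference between the means.
larger_difference = pooled_t(45.0, 35.0, 10.0, 10.0, 20, 20)

# Factor 3: smaller variance among the subjects.
smaller_variance = pooled_t(40.0, 35.0, 5.0, 5.0, 20, 20)

for label, t in [
    ("baseline", baseline),
    ("larger samples", larger_samples),
    ("larger difference", larger_difference),
    ("smaller variance", smaller_variance),
]:
    print(f"{label:18s} t = {t:.3f}")
```

In each of the three variations, t comes out larger than in the baseline, which is what makes a low probability (and rejection of the null hypothesis) more likely.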

10
Types of t tests
  • There are two types of t tests:
  • One is for independent data (sometimes called
    uncorrelated data).
  • The other is for dependent data (sometimes called
    correlated data).
  • Examples 1 and 2 have independent data.
  • Example 3 (to follow) describes a study with
    dependent data.

11
Example 3
  • In a study of visual acuity, same-sex siblings
    (two brothers or two sisters) were identified for
    a study. For each pair of siblings, a coin was
    tossed to determine which one received a vitamin
    supplement and which one received a placebo.
    Thus, in the control group, there is a subject
    who is a same-sex sibling of each subject in the
    experimental group.

12
  • The means that result from the study in Example
    3 are subject to less error than the means from
    Example 2.
  • Remember that in Example 2, there was no matching
    or pairing of subjects before assignment to
    conditions.
  • In Example 3, the matching of subjects assures us
    that the two groups are more similar than if just
    two independent samples were used.
  • To the extent that genetics and gender are
    associated with visual acuity, the two groups in
    Example 3 will be more similar at the onset of
    the experiment than the two groups in Example 2.
  • The t test for dependent data takes the possible
    reduction in error into account. Thus, it is
    important to select the right t test.
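
To see why the choice of test matters, here is a minimal sketch that runs both tests on the same made-up scores. The data are hypothetical (six sibling pairs, invented for illustration), the function names `independent_t` and `dependent_t` are assumptions, and the independent version assumes the pooled-variance form.

```python
import math
import statistics


def independent_t(a, b):
    """t for independent samples (pooled-variance form, an assumption)."""
    n1, n2 = len(a), len(b)
    pooled_var = (
        (n1 - 1) * statistics.variance(a) + (n2 - 1) * statistics.variance(b)
    ) / (n1 + n2 - 2)
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(a) - statistics.mean(b)) / se_diff


def dependent_t(a, b):
    """t for dependent (paired) samples, based on within-pair differences."""
    diffs = [x - y for x, y in zip(a, b)]
    se_mean_diff = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se_mean_diff


# Hypothetical acuity scores; position i in each list is one sibling pair.
supplement = [24, 30, 18, 27, 21, 33]
placebo = [22, 27, 17, 24, 20, 30]

t_ind = independent_t(supplement, placebo)
t_dep = dependent_t(supplement, placebo)
print(f"treated as independent: t = {t_ind:.2f}")
print(f"treated as dependent:   t = {t_dep:.2f}")
```

Because siblings in these made-up data score similarly to each other, the within-pair differences vary much less than the raw scores do, so the dependent test produces a considerably larger t from the same numbers.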

13
  • Independent data are obtained when there is no
    matching or pairing of subjects across groups.
  • Here is how to compute t for independent data and
    how to interpret it using the t table.
  • The formula for t is simple:

    $$ t = \frac{m_1 - m_2}{S_{D_m}} $$

  • Where
  • $m_1$ is the mean of the group with the
    higher mean
  • $m_2$ is the mean of the group with the
    lower mean
  • $S_{D_m}$ (written SDm in these slides) is the
    standard error of the difference between means

14
  • The numerator is the difference between the two
    means. The larger the difference, the larger the
    value of t.
  • The denominator starts with the symbol S (for
    standard deviation). The subscripts (D for
    difference and m for means) tell us that it is
    the standard deviation of the difference between
    means; this standard deviation is called the
    standard error of the difference between means.
  • We want to know whether the difference between
    two means is an unlikely event. If it is unlikely
    (for example, likely to occur less than 5 times
    in 100 due to chance alone), then we will declare
    the difference to be statistically significant,
    that is, unlikely to be the result of random
    errors.

15
  • It is impractical to directly obtain the SDm for
    a given t test. What we do is estimate it from
    what we know about the sample size and the
    variance of the samples (the variance, whose
    symbol is s², is simply the square of the
    standard deviation), using the following formula
    (check out the board).
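
The board formula is not part of this transcript. One standard estimate, consistent with the degrees of freedom n1 + n2 − 2 used for this type of problem, is the pooled-variance form below. Whether this is exactly the formula shown in class is an assumption. Here $s_1^2$ and $s_2^2$ are the two sample variances and $n_1$ and $n_2$ are the two sample sizes.

```latex
S_{D_m} = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}
          \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}
```

Larger samples shrink the second factor and smaller variances shrink the first, so both make the standard error smaller and t larger.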

16
  • We call the result the observed value of t.
  • In this example, we observed a value of 3.300. To
    evaluate its meaning, we need to take account of
    the number of cases that underlie it, using this
    formula for the degrees of freedom (df):
  • $df = n_1 + n_2 - 2$
  • Where
  • $n_1$ is the number of cases in Group 1
  • $n_2$ is the number of cases in Group 2
  • 2 is a constant for this type of problem

17
  • For our example:
  • $df = 12 + 11 - 2 = 23 - 2 = 21$
  • To evaluate, we use the appropriate critical
    value of t found in the t table in the back of
    statistics books.
  • Examining Table 4, we look up the degrees of
    freedom (21 for our example).
  • Look up 21 in the first column, then look
    across to the .05 column. There you find the
    critical value of 2.080.
  • We have found that for 21 degrees of freedom,
    only values at least as extreme as 2.080 are
    unlikely events at the .05 level.
  • Our observed value is 3.300. Is this an unlikely
    event?
  • Yes: the observed value of 3.300 exceeds even
    the critical value for the .01 level for 21
    degrees of freedom, which is 2.831.
  • Thus we can reject the null hypothesis and
    declare the result to be statistically
    significant at the .01 level.
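
The lookup procedure above can be sketched in a few lines. The two critical values are the ones quoted on the slide for 21 degrees of freedom; the function names `degrees_of_freedom` and `strictest_level_reached` are hypothetical helpers, not part of any library.

```python
# Critical values of t for df = 21, as quoted on the slide.
CRITICAL_T_DF21 = {0.05: 2.080, 0.01: 2.831}


def degrees_of_freedom(n1, n2):
    """Degrees of freedom for the t test with independent data."""
    return n1 + n2 - 2


def strictest_level_reached(observed_t, critical_values):
    """Return the smallest level whose critical value is reached, or None."""
    reached = [
        level
        for level, critical in critical_values.items()
        if abs(observed_t) >= critical
    ]
    return min(reached) if reached else None


df = degrees_of_freedom(12, 11)
level = strictest_level_reached(3.300, CRITICAL_T_DF21)

if level is None:
    print(f"df = {df}: do not reject the null hypothesis")
else:
    print(f"df = {df}: reject the null hypothesis at the {level} level")
```

A full table would have one such row of critical values for each number of degrees of freedom; only the df = 21 row is included here because it is the only one the slides give.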

18
Reporting the Results of t Tests
  • We are considering the use of the t test to
    test the difference between two sample means
    for significance.
  • Obviously, you should report the values of the
    means before reporting the results of the test on
    them.
  • In addition, you should report the values of the
    standard deviations and number of cases in each
    group.
  • This may be done within the context of a sentence
    or in a table.

19
  • The way it may be reported in the research
    literature is:
  • The difference between the means is statistically
    significant (t = 3.22, df = 10, p ...
    test)