Goodness of Fit Tests - PowerPoint PPT Presentation

About This Presentation
Title:

Goodness of Fit Tests

Description:

QSCI 381 Lecture 40 (Larson and Farber, Sect 10.1). Multinomial Experiments: A multinomial experiment is a probability experiment consisting of ... – PowerPoint PPT presentation

Slides: 16
Provided by: PaulBo160
Category:
Tags: fit | goodness | mammal | tests


Transcript and Presenter's Notes

Title: Goodness of Fit Tests


1
Goodness of Fit Tests
  • QSCI 381 Lecture 40
  • (Larson and Farber, Sect 10.1)

2
Multinomial Experiments
  • A multinomial experiment is a probability experiment
    consisting of a fixed number of independent trials in
    which there are more than two possible outcomes for
    each trial. The probability for each outcome is fixed
    and each outcome is classified into categories.
  • Examples of multinomial experiments include:
  • You sample 100 animals from a population. The
    categories could be age, length, or maturity state.
  • You sample 1000 poppies in a field. The
    categories could be colour.
  • You sample 20 animals and record the frequency
    of each genetic haplotype.

3
Goodness-of-fit Tests
  • A chi-square goodness-of-fit test is used to test
    whether an observed frequency distribution fits an
    expected distribution.
  • We need to specify a null and an alternative
    hypothesis. Generally the null hypothesis is that
    the observed frequency distribution (the data)
    fits the expected distribution. The alternative
    hypothesis is that this is not the case.

4
Example-I
  • We expect that a healthy marine mammal
    population should consist of an equal number of
    males and females, and that 60% of the population
    should be mature. We sample 150 animals and
    count the number in each of four categories:

Mature Female    Mature Male      Immature Female  Immature Male
30               40               32               48
5
Observed and Expected Frequencies
  • The observed frequency O of a category is the
    frequency for the category observed in the data.
  • The expected frequency E of a category is the
    calculated frequency for the category. Expected
    frequencies are obtained by assuming the specified
    (or hypothesized) distribution is correct. The
    expected frequency for the i-th category is
    $E_i = n p_i$
  • where $n$ is the number of trials, and $p_i$ is the
    assumed probability for the i-th category.

6
Observed and Expected Frequencies (Example)
                      Mature Female    Mature Male      Immature Female  Immature Male
Observed frequency    30               40               32               48
Assumed probability   0.3              0.3              0.2              0.2
Expected frequency    45 (150 x 0.3)   45 (150 x 0.3)   30 (150 x 0.2)   30 (150 x 0.2)
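
The expected frequencies in this table follow directly from E_i = n p_i. A minimal sketch in Python, using only the numbers given on the slide (the variable names are illustrative and not part of the lecture):

```python
# Expected frequencies E_i = n * p_i for the marine mammal example.
categories = ["Mature Female", "Mature Male", "Immature Female", "Immature Male"]
observed = [30, 40, 32, 48]
assumed_p = [0.3, 0.3, 0.2, 0.2]  # 60% mature, equal numbers of each sex

n = sum(observed)  # number of trials: 150 animals sampled
expected = [n * p for p in assumed_p]

for name, o, e in zip(categories, observed, expected):
    print(f"{name:16s} observed={o:3d} expected={e:5.1f}")
```

The assumed probabilities must sum to 1 so that the expected frequencies sum to the sample size.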
7
The Chi-square goodness-of-fit Test-I
  • IF
  • the observed frequencies are obtained from a
    random sample, and
  • each expected frequency is greater than or
    equal to 5 (pool categories if this is not the
    case),
  • THEN the sampling distribution for the
    goodness-of-fit test is approximately a chi-square
    distribution with k-1 degrees of freedom, where k is
    the number of categories. The test statistic is
    $\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$

8
The Chi-square goodness-of-fit Test-II
  • Identify the claim and state the null and
    alternative hypotheses.
  • Specify the level of significance, α.
  • Determine the degrees of freedom, d.f. = k-1.
  • Find the critical value of the chi-square
    distribution and hence define the rejection
    region for the test.
  • Calculate the test statistic.
  • Check whether or not the value of the test
    statistic is in the rejection region.
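
The steps above can be sketched as a small Python function. This is an illustration under stated assumptions rather than the lecture's own code: the function name `chi_square_gof` is hypothetical, the critical value is supplied by the caller (for example from a chi-square table), and the rejection region is taken to be the right tail, i.e. values of the statistic above the critical value.

```python
def chi_square_gof(observed, probs, critical_value):
    """Chi-square goodness-of-fit test.

    Returns (statistic, degrees_of_freedom, reject_h0).
    `critical_value` must correspond to the chosen alpha and k-1 d.f.
    """
    n = sum(observed)
    expected = [n * p for p in probs]          # E_i = n * p_i
    if min(expected) < 5:
        # The chi-square approximation needs every E_i >= 5.
        raise ValueError("an expected frequency is below 5; pool categories first")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1                     # d.f. = k - 1
    return statistic, df, statistic > critical_value


# Usage with the marine mammal data; 11.34 is the tabulated critical
# value for alpha = 0.01 and 3 degrees of freedom.
stat, df, reject = chi_square_gof([30, 40, 32, 48], [0.3, 0.3, 0.2, 0.2], 11.34)
print(f"chi-square = {stat:.2f}, d.f. = {df}, reject H0: {reject}")
```

Raising an error when an expected frequency is below 5 is one possible design choice; pooling categories automatically would hide a modelling decision that is better made by the analyst.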

9
Example (Test using α = 0.01)
  • H0: the distribution of animals between sex and
    maturity classes equals that expected for a
    healthy population.
  • The degrees of freedom = k-1 = 3.
  • The critical value of the chi-square distribution
    is 11.34 (CHIINV(0.01,3))
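
CHIINV is a spreadsheet function. As a rough cross-check that needs no spreadsheet, the critical value can be found numerically. The sketch below uses only the Python standard library: it evaluates the chi-square CDF with the standard power series for the regularized lower incomplete gamma function and then bisects for the point with right-tail area α. The function names are hypothetical, and the series is adequate only for moderate values of x such as those in this lecture.

```python
import math


def chi2_cdf(x, df):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, h = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:       # terms are positive and eventually shrink
        term *= h / (a + n)
        total += term
        n += 1
    return total * math.exp(-h + a * math.log(h) - math.lgamma(a))


def chi2_critical_value(alpha, df):
    """Value whose right-tail area is alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while 1.0 - chi2_cdf(hi, df) > alpha:   # expand until the tail is small enough
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


critical = chi2_critical_value(0.01, 3)
print(f"critical value for alpha = 0.01, d.f. = 3: {critical:.2f}")
```

The result should agree with the 11.34 quoted on the slide to two decimal places.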

10
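Example (Test using α = 0.01)
                      Mature Female    Mature Male      Immature Female  Immature Male
Observed frequency    30               40               32               48
Expected frequency    45               45               30               30
(O - E)^2 / E         5.00             0.56             0.13             10.80
  • The test statistic is the sum of the last row,
    16.49, which exceeds the critical value of 11.34.
  • We reject the null hypothesis at the
    1% level of significance.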

11
Example-A-1 (α = 0.05)
  • The probability of a particular bird species
    utilizing each of five habitats is known. We
    collect data for a different species (n = 137) and
    wish to assess whether the two species differ in
    their habitat requirements.

Habitat type    1       2       3       4       5
Expected p      0.2     0.1     0.05    0.5     0.15
Observed        30      17      0       72      18
12
Example-A-2 (α = 0.05)
Habitat type          1       2       3       4       5
Observed frequency    30      17      0       72      18
Expected frequency    27.4    13.7    6.85    68.5    20.55
(O - E)^2 / E         0.25    0.79    6.85    0.18    0.32
The test statistic is 8.39 and the critical value is 9.49
(4 degrees of freedom), so we fail to reject the null hypothesis.
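
The same conclusion can be viewed through a p-value, which the slides do not use. The sketch below recomputes the statistic for the habitat data and then applies the closed-form right-tail probability of the chi-square distribution that holds for even degrees of freedom (here d.f. = 4). The function name `chi2_sf_even_df` is hypothetical.

```python
import math

observed = [30, 17, 0, 72, 18]
expected_p = [0.2, 0.1, 0.05, 0.5, 0.15]

n = sum(observed)                               # 137 birds
expected = [n * p for p in expected_p]          # E_i = n * p_i
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
statistic = sum(contributions)
df = len(observed) - 1


def chi2_sf_even_df(x, df):
    """Right-tail probability of a chi-square variable; even d.f. only."""
    if df % 2 != 0:
        raise ValueError("closed form applies to even degrees of freedom only")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


p_value = chi2_sf_even_df(statistic, df)
print(f"chi-square = {statistic:.2f}, d.f. = {df}, p-value = {p_value:.3f}")
```

A p-value above 0.05 corresponds to a statistic below the 5% critical value, so both views lead to the same decision. Most of the statistic comes from habitat 3, where no birds were observed.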
13
Testing for Normality
  • We can use the chi-square test in some cases to
    assess whether a variable is normally
    distributed.
  • The null and alternative hypotheses are:
  • H0: The variable has a normal distribution.
  • Ha: The variable does not have a normal distribution.

14
Example
Class boundaries    Frequency
5-15                6
15-25               23
25-35               53
35-45               45
45-55               22
Can we assume that these data are normal (assume
α = 0.05)?
15
Calculating the Test Statistic
Class        Observed      Cumulative normal    Expected p     Expected       (O - E)^2 / E
boundaries   frequency O   Lower      Upper     (difference)   frequency E
5-15         6             0.0030     0.0368    0.0338         5.0            0.1822
15-25        23            0.0368     0.2037    0.1669         24.9           0.1407
25-35        53            0.2037     0.5526    0.3488         52.0           0.0202
35-45        45            0.5526     0.8627    0.3102         46.2           0.0318
45-55        22            0.8627     0.9800    0.1172         17.5           1.1746
Σ            149                                0.977          145.57         1.5497
$E_i = p_i \times 149$
$x_i$ is the mid-point of each class
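
The table can be approximately reproduced with the Python sketch below. The transcript does not say how the normal distribution was parameterized. The sketch assumes that the mean and the sample standard deviation (divisor n-1) are estimated from the class mid-points weighted by frequency, which gives cumulative values close to those tabulated. Like the table, it ignores the small probability outside the range 5 to 55.

```python
import math

bounds = [5, 15, 25, 35, 45, 55]
observed = [6, 23, 53, 45, 22]

n = sum(observed)                                        # 149 observations
midpoints = [(lo + hi) / 2 for lo, hi in zip(bounds[:-1], bounds[1:])]

# Assumption: parameters estimated from the grouped data via mid-points.
mean = sum(f * x for f, x in zip(observed, midpoints)) / n
var = sum(f * (x - mean) ** 2 for f, x in zip(observed, midpoints)) / (n - 1)
sd = math.sqrt(var)


def normal_cdf(x):
    """Cumulative normal probability for the fitted mean and sd."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


cumulative = [normal_cdf(b) for b in bounds]
expected_p = [hi - lo for lo, hi in zip(cumulative[:-1], cumulative[1:])]
expected = [n * p for p in expected_p]                   # E_i = p_i * 149
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(f"mean = {mean:.2f}, sd = {sd:.2f}, chi-square = {statistic:.2f}")
```

The transcript does not state the degrees of freedom or the conclusion for this example. The statistic of about 1.55 is well below the 5% critical value for 4 degrees of freedom (9.49), and also below the value for 2 degrees of freedom (5.99) that applies if two degrees of freedom are deducted for the estimated mean and standard deviation, so under either convention normality would not be rejected.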