SLIDES PREPARED - PowerPoint PPT Presentation

Slides: 61
Provided by: lloydj8
Transcript and Presenter's Notes

Title: SLIDES PREPARED


1
STATISTICS for the Utterly Confused, 2nd ed.
  • SLIDES PREPARED
  • By
  • Lloyd R. Jaisingh Ph.D.
  • Morehead State University
  • Morehead KY

2
Chapter 14
  • Chi-Square Procedures

3
Outline
  • Do I Need to Read This Chapter? You should read
    this chapter if you would like to learn about:
  • 14-1 Properties of the chi-square
    distribution.
  • 14-2 The chi-square test for
    goodness-of-fit.
  • 14-3 The chi-square test for
    independence.
  • 14-4 Benford's Law.

4
Objectives
  • To introduce you to the chi-square
    distribution.
  • To use the chi-square distribution to perform
    tests for goodness-of-fit and independence.

5
Objectives
  • To introduce you to Benford's Law.
  • To introduce technology integration for
    chi-square tests.

6
14-1 The Chi-Square (χ²) Distribution
- Properties
  • It is a continuous distribution.
  • It is not symmetric.
  • It is skewed to the right.
  • The distribution depends on the degrees of
    freedom, df = n − 1, where n is the sample size.

7
14-1 The Chi-Square (χ²) Distribution
- Properties
  • The value of a χ² random variable is always
    nonnegative.
  • There are infinitely many χ² distributions, since
    each is uniquely defined by its degrees of
    freedom.

8
14-1 The Chi-Square (χ²) Distribution
- Properties
  • For small sample sizes, the χ² distribution is
    strongly skewed to the right.
  • As n increases, the χ² distribution becomes more
    and more symmetrical.
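
The two properties above can be checked numerically. The sketch below relies on the standard result that a χ² distribution with df degrees of freedom has skewness √(8/df). That formula is not given on the slides and is brought in here only as an assumption of the illustration; the function name is hypothetical.

```python
import math


def chi_square_skewness(df):
    """Skewness of a chi-square distribution with df degrees of freedom.

    Uses the standard result skewness = sqrt(8 / df) (an assumption of
    this sketch; the slides do not state it).
    """
    if df <= 0:
        raise ValueError("df must be positive")
    return math.sqrt(8 / df)


# The skewness shrinks toward 0 (perfect symmetry) as df = n - 1 grows.
for df in (1, 5, 10, 30, 100):
    print(f"df = {df:3d}  skewness = {chi_square_skewness(df):.3f}")
```

The skewness stays positive for every df, which matches the statement that the distribution is never exactly symmetric.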

9
14-1 The Chi-Square (χ²) Distribution
- Properties
Family of χ² distributions.
10
14-1 The Chi-Square (χ²) Distribution
- Properties
  • Since we will be using the χ² distribution for
    the tests in this chapter, we will need to be
    able to find critical values associated with the
    distribution.

11
Quick Tip
  • Extensive tables of critical values are
    available for constructing confidence intervals
    and solving hypothesis-testing problems that
    are associated with the χ² distribution.

12
14-1 The Chi-Square (χ²) Distribution
- Properties
  • Notation: χ²_{α, n−1}
  • Explanation of the notation: χ²_{α, n−1} is a χ²
    value with n − 1 degrees of freedom such that an
    area of α lies to the right of the corresponding
    χ² value.

13
14-1 The Chi-Square (χ²) Distribution
- Properties
Diagram explaining the notation χ²_{α, n−1}
14
14-1 The Chi-Square (χ²) Distribution
- Properties
  • Values for the random variable with the
    appropriate degrees of freedom can be obtained
    from the tables in the appendix of the text
    (Table 4).
  • Example: What is the value of χ²_{0.05, 10}?

15
14-1 The Chi-Square (χ²) Distribution
- Properties
  • Solution: From Table 4 in the appendix,
    χ²_{0.05, 10} = 18.307. (Verify.)
  • Example: What is the value of χ²_{0.95, 20}?
  • Solution: From Table 4 in the appendix,
    χ²_{0.95, 20} = 10.851. (Verify.)
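
One way to carry out the "Verify" step without the printed table is sketched below. It is a minimal illustration rather than the text's method: it evaluates the χ² cumulative distribution through the series for the regularized lower incomplete gamma function and then finds the critical value by bisection. The function names `chi2_cdf` and `chi2_critical` are made up for this sketch; where SciPy is installed, `scipy.stats.chi2.ppf(1 - alpha, df)` is the usual shortcut.

```python
import math


def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable with df degrees of freedom.

    Computed from the series expansion of the regularized lower
    incomplete gamma function P(df/2, x/2).
    """
    if x <= 0:
        return 0.0
    a, t = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    # Add series terms until they no longer change the sum.
    while term > 1e-16 * total:
        n += 1
        term *= t / (a + n)
        total += term
    return total * math.exp(a * math.log(t) - t - math.lgamma(a))


def chi2_critical(alpha, df):
    """Chi-square value with an area of alpha to its right."""
    target = 1.0 - alpha
    lo, hi = 0.0, 1.0
    # Grow the upper bound until it brackets the target probability.
    while chi2_cdf(hi, df) < target:
        hi *= 2.0
    # Bisection: halve the bracket until it is negligibly small.
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


print(round(chi2_critical(0.05, 10), 3))  # 18.307
print(round(chi2_critical(0.95, 20), 3))  # 10.851
```

The first argument follows the slide notation: it is the area to the right of the critical value, not the cumulative probability.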

16
14-2 The Chi-Square test for Goodness of Fit
  • Have you ever wondered whether a sample of
    observed data (frequency distribution or
    proportions) fits some pattern or distribution?
  • We should not expect the pattern to exactly fit a
    given distribution, so we can look for
    differences and draw conclusions about the
    goodness-of-fit of the data.

17
14-2 The Chi-Square test for Goodness of Fit
  • From the figure on the next slide, one can
    clearly see that the pattern of the sample data
    does not follow the distribution of the
    population.
  • As a matter of fact, the sample data deviate
    quite severely from the population distribution.

18
14-2 The Chi-Square test for Goodness of Fit
19
14-2 The Chi-Square test for Goodness of Fit
  • Hence one may intuitively conclude in this case
    that the sample data did not come from the
    population to which it is compared because of the
    large deviations of the sample distribution from
    the population distribution.

20
14-2 The Chi-Square test for Goodness of Fit
  • From the figure on the next slide, one can
    observe that the sample distribution follows
    the population distribution quite closely.
  • In this case, one may intuitively conclude that
    the sample data did come from the population to
    which it is compared because of the very small
    deviation of the sample distribution from the
    population distribution.

21
14-2 The Chi-Square test for Goodness of Fit
22
14-2 The Chi-Square test for Goodness of Fit
  • Generally, we can assume that a good fit exists.
  • That is, we can propose a hypothesis that a
    specified theoretical distribution is appropriate
    to model the pattern.
  • Below is a summary of the tests for
    goodness-of-fit.

23
14-2 The Chi-Square test for Goodness of Fit
24
Quick Tip
  • The chi-square goodness of fit test is always a
    right-tailed test.

25
Quick Tip
  • For the chi-square goodness-of-fit test, the
    expected frequencies should be at least 5.
  • When the expected frequency of a class or
    category is less than 5, this class or category
    can be combined with another class or category so
    that the expected frequency is at least 5.
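
The combining rule in this tip can be sketched as a small helper. This is only one possible convention, assumed for illustration: a category whose expected count is below 5 is merged with the category listed next to it. The function name and the counts in the usage lines are hypothetical and do not come from the text.

```python
def combine_small_categories(labels, observed, expected, minimum=5):
    """Merge categories until every expected count is at least `minimum`.

    A group that is still too small absorbs the next category in the
    list; a too-small final group is folded into the group before it.
    Returns new (labels, observed, expected) lists.
    """
    merged = []  # each entry is [label, observed count, expected count]
    for label, obs, exp in zip(labels, observed, expected):
        if merged and merged[-1][2] < minimum:
            merged[-1][0] += "+" + label
            merged[-1][1] += obs
            merged[-1][2] += exp
        else:
            merged.append([label, obs, exp])
    if len(merged) > 1 and merged[-1][2] < minimum:
        label, obs, exp = merged.pop()
        merged[-1][0] += "+" + label
        merged[-1][1] += obs
        merged[-1][2] += exp
    return ([m[0] for m in merged],
            [m[1] for m in merged],
            [m[2] for m in merged])


# Hypothetical counts, for illustration only.
labels, obs, exp = combine_small_categories(
    ["A", "B", "C", "D"], [20, 14, 3, 3], [18.0, 15.0, 4.0, 3.0])
print(labels, obs, exp)  # ['A', 'B', 'C+D'] [20, 14, 6] [18.0, 15.0, 7.0]
```

After combining, the degrees of freedom are based on the number of categories that remain (here 3 − 1 = 2, not 4 − 1 = 3).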

26
EXAMPLE
  • Example: There are four TV sets located in
    the student center of a large university. At a
    particular time each day, four different soap
    operas (1, 2, 3, and 4) are viewed on these TV
    sets. The percentages of the audience captured
    by these shows during one semester were 25
    percent, 30 percent, 25 percent, and 20 percent,
    respectively. During the first week of the
    following semester, 300 students are surveyed.

27
EXAMPLE (Continued)
  • (a) If the viewing pattern has not changed, what
    number of students is expected to watch each soap
    opera?
  • Solution: Based on the information, the expected
    values will be 0.25 × 300 = 75, 0.30 × 300 = 90,
    0.25 × 300 = 75, and 0.20 × 300 = 60.
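
The arithmetic in part (a) can be reproduced with a few lines of Python. The function name is hypothetical; the percentages and the sample size of 300 are the ones given in the example.

```python
def expected_counts(percentages, sample_size):
    """Expected frequency for each category under the old viewing pattern."""
    return [p * sample_size / 100 for p in percentages]


audience_shares = [25, 30, 25, 20]  # percent, soap operas 1 to 4
print(expected_counts(audience_shares, 300))  # [75.0, 90.0, 75.0, 60.0]
```

The expected counts add up to the sample size, which is a quick check that the percentages cover every category.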

28
EXAMPLE (Continued)
  • (b) Suppose that the actual observed numbers of
    students viewing the soap operas are given in the
    following table. Test whether these numbers
    indicate a change at the 1 percent level of
    significance.

29
EXAMPLE (Continued)
  • Solution: Given α = 0.01, n = 4 (the number of
    categories), df = 4 − 1 = 3, and
    χ²_{0.01, 3} = 11.345. The observed and
    expected frequencies are given below.

30
EXAMPLE (Continued)
  • Solution (continued): The χ² test statistic is
    computed below.
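
The goodness-of-fit statistic is the sum of (observed − expected)² / expected over all categories. A minimal sketch follows. The expected counts and the critical value 11.345 come from the example, but the observed counts are placeholders invented for illustration, because the table of observed values is not reproduced in this transcript. The result therefore says nothing about the conclusion reached in the text.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


expected = [75, 90, 75, 60]   # from part (a) of the example
observed = [80, 88, 79, 53]   # HYPOTHETICAL counts, not the slide's data
critical_value = 11.345       # chi-square, alpha = 0.01, df = 3 (from the example)

statistic = chi_square_statistic(observed, expected)
# Right-tailed test: reject the null hypothesis only for large statistics.
reject_null = statistic > critical_value
print(f"chi-square = {statistic:.3f}, reject H0: {reject_null}")
```

Because the test is right-tailed, only a statistic larger than the critical value counts as evidence that the viewing pattern has changed.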

31
EXAMPLE (Continued)
  • Solution (continued)

32
EXAMPLE (Continued)
  • Solution (continued)

Diagram showing the rejection region.
33
14-3 The Chi-Square test for Independence
  • The chi-square independence test can be used to
    test whether two variables are independent.

34
EXAMPLE
  • Example: A survey was done by a car manufacturer
    concerning a particular make and model. A group
    of 500 potential customers were asked whether
    they purchased their current car because of its
    appearance, its performance rating, or its fixed
    price (no negotiating). The results, broken down
    by gender, are given on the next slide.

35
EXAMPLE (Continued)
Question: Do females feel differently from males
about the three different criteria used in
choosing a car, or do they feel basically the
same?
36
EXAMPLE (Continued)
  • One way of answering this question is to
    determine whether the criterion used in buying a
    car is independent of gender.

37
EXAMPLE (Continued)
  • That is, we can do a test for independence.
  • Thus the null hypothesis will be that the
    criterion used is independent of gender, while
    the alternative hypothesis will be that the
    criterion used is dependent on gender.

38
Quick Tips
  • When data are arranged in tabular form for the
    chi-square independence test, the table is called
    a contingency table.
  • Here the table on slide 35 has 2 rows and 3
    columns, so we say we have a 2 by 3 (2 × 3)
    contingency table.

39
Quick Tips
  • The degrees of freedom for any contingency table
    are given by (number of rows − 1) × (number of
    columns − 1). In this example,
  • df = (2 − 1) × (3 − 1) = 2.

40
EXAMPLE (Continued)
  • In order to test for independence using the
    chi-square independence test, we must compute
    expected values under the assumption that the
    null hypothesis is true.
  • To find these expected values, we need to compute
    the row totals and the column totals.

41
EXAMPLE (Continued)
  • The table on the next slide shows the observed
    frequencies with the row and column totals.
  • These row and column totals are called marginal totals.

42
EXAMPLE (Continued)
43
EXAMPLE (Continued)
  • Computation of the expected values (example):
  • The total for the first row (male) is 185, and
    the total for the first column (appearance) is
    180. The expected value for the cell in the
    table where the first row (male) and first column
    (appearance) intersect will be (185 × 180)/500 =
    66.6.
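
The rule used here is: expected cell value = (row total × column total) / grand total. A short sketch with a hypothetical function name is shown below. The female row total of 315 is not quoted on the slide; it is derived as 500 − 185.

```python
def expected_cell(row_total, column_total, grand_total):
    """Expected frequency of one cell when rows and columns are independent."""
    return row_total * column_total / grand_total


# Male row (185) and appearance column (180), out of 500 respondents.
print(expected_cell(185, 180, 500))        # 66.6

# Female row total derived as 500 - 185 = 315.
print(expected_cell(500 - 185, 180, 500))  # 113.4
```

The two expected values in the appearance column add up to the column total of 180, as they must.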

44
EXAMPLE (Continued)
  • The table on the next slide shows the expected
    frequencies with the marginal totals.

45
EXAMPLE (Continued)
Let us use α = 0.01. So df = (2 − 1)(3 − 1) = 2
and χ²_{0.01, 2} = 9.210.
46
EXAMPLE (Continued)
  • Solution (continued): The χ² test statistic is
    computed in the same manner as was done for the
    goodness-of-fit test.
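
The whole independence calculation can be sketched in one function: marginal totals, expected values, the χ² statistic, and the degrees of freedom. The observed table below is made up for illustration. It is chosen only so that its margins agree with the figures quoted in the example (male total 185, appearance total 180, 500 respondents); the real survey table is not reproduced in this transcript, so the printed statistic is not the one from the text.

```python
def chi_square_independence(table):
    """Return (statistic, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# HYPOTHETICAL counts: columns are appearance, performance, fixed price.
observed = [
    [70, 65, 50],     # male   (row total 185)
    [110, 105, 100],  # female (row total 315)
]
statistic, df = chi_square_independence(observed)
critical_value = 9.210  # chi-square, alpha = 0.01, df = 2 (from the example)
print(f"chi-square = {statistic:.3f}, df = {df}, "
      f"reject H0: {statistic > critical_value}")
```

As with the goodness-of-fit test, the null hypothesis of independence is rejected only when the statistic exceeds the critical value.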

47
EXAMPLE (Continued)
  • Solution (continued)

48
EXAMPLE (Continued)
  • Solution (continued) Diagram showing the
    rejection region.

49
14-4 Benford's Law
  • Frank Benford, in the 1930s, noticed that
    logarithm tables (these were used by scientists
    long before the common use of computers and
    calculators) tended to be worn out on the early
    pages where the numbers started with the digit 1.

50
14-4 Benford's Law
  • Based on this observation and many others, he
    discovered that more numbers in the real world
    start with the digit 1 than with 2, that more
    start with the digit 2 than with 3, and so on.
  • He later published a formula which describes the
    proportion of times a number will begin with the
    digit 1, 2, 3, etc.

51
14-4 Benford's Law
  • This formula is now called Benford's Law.
  • The table on the next slide shows the
    distribution of the proportions, to three decimal
    places, for the leading digits of numbers based
    on Benford's Law.
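
The slides do not write the formula out. In its commonly stated form, Benford's Law gives the proportion of numbers whose leading digit is d as log10(1 + 1/d) for d = 1, 2, ..., 9. The sketch below uses that form, as an assumption about what the missing table contains, to print the proportions to three decimal places. The function name is hypothetical.

```python
import math


def benford_proportion(digit):
    """Proportion of numbers with the given leading digit (1 to 9)."""
    if digit not in range(1, 10):
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / digit)


for d in range(1, 10):
    print(d, f"{benford_proportion(d):.3f}")
```

The proportion for the digit 2 comes out as 0.176, which agrees with the value used later in the FAFSA example.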

52
14-4 Benford's Law
The next slide shows a graphical depiction of
Benford's Law.
53
14-4 Benford's Law
54
14-4 Benford's Law
  • Example: Students who attend college and apply
    for student loans must submit a FAFSA (Free
    Application for Federal Student Aid) form. Part
    of the information that is required is the annual
    income of the parent or parents. A sample of
    3,633 forms was drawn from a college's records,
    and the proportions, to three decimal places, of
    the leading digits of the parents' total annual
    income were recorded. This information
    is presented on the next slide.

55
14-4 Benford's Law
  • Test at the 5 percent significance level whether
    the distribution of the first digits of the
    parents' reported total salaries follows
    Benford's Law.

56
14-4 Benford's Law
  • Solution: Plots of the proportions of the leading
    digits for both Benford's Law and the parents'
    salaries are shown below.

57
14-4 Benford's Law
  • Solution (continued): The table on the next slide
    shows the computations needed to compute the χ²
    test statistic.
  • The value of the test statistic is equal to
    507.527.
  • To obtain the expected frequencies based on
    Benford's Law, one should multiply the total of
    3,633 by Benford's proportions.
  • For example, from the table, the expected
    frequency value of 639.408 is obtained from
    3,633 × 0.176 = 639.408, etc.
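
The expected frequencies and the test decision can be sketched as follows. The sample size (3,633), the proportion 0.176, and the test statistic 507.527 come from the example. The other eight proportions are the three-decimal values of log10(1 + 1/d), and the critical value 15.507 for α = 0.05 with df = 9 − 1 = 8 is the standard table value; neither appears in this transcript, so both are assumptions of the sketch.

```python
# Benford proportions for leading digits 1 to 9, rounded to three decimals
# (assumed to match the table on the slide, which is not reproduced here).
benford = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]
n_forms = 3633  # number of FAFSA forms in the sample

expected = [round(n_forms * p, 3) for p in benford]
print(expected[1])  # 639.408, the expected count for leading digit 2

df = len(benford) - 1       # 9 categories, so df = 8
test_statistic = 507.527    # value reported in the example
critical_value = 15.507     # standard table value for alpha = 0.05, df = 8

# Right-tailed test: a statistic above the critical value rejects Benford's Law.
reject_null = test_statistic > critical_value
print(f"df = {df}, reject H0: {reject_null}")
```

With these figures the statistic is far beyond the critical value, so the hypothesis that the leading digits follow Benford's Law is rejected at the 5 percent level.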

58
14-4 Benford's Law
59
14-4 Benford's Law
60
EXAMPLE (Continued)
  • Solution (continued) Diagram showing the
    rejection region.