Using Statistics To Make Inferences 8

Provided by: micha558

1
Using Statistics To Make Inferences 8
  • Summary
  • Contingency tables.
  • Goodness of fit test.

2
Goals
  • To assess contingency tables for independence.
  • To perform and interpret a goodness of fit test.
  • Practical
  • Construct and analyse contingency tables.

3
Recall
  • To compare a population and sample variance we
    employed?

χ²
4
Today
  • The distribution from last week is employed to
    tell if observed data conforms to the pattern
    expected under a given model.

5
Categorical Data - Example
  • Assessed intelligence of athletic and
    non-athletic schoolboys.

K. Pearson Biometrika, 1906, 5, 105-146, data on
page 144.
6
Procedure
  • Formulate a null hypothesis. Typically the null
    hypothesis is that there is no association
    between the factors.
  • Calculate expected frequencies for the cells in
    the table on the assumption that the null
    hypothesis is true.
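  • Calculate the chi-squared statistic. This is
    $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} (O_{ij} - E_{ij})^2 / E_{ij}$
    for an r x c table with observed entries $O_{ij}$
    and expected entries $E_{ij}$ in row i and
    column j.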

7
Procedure
  • Compare the calculated statistic with tabulated
    values of the chi-squared distribution with ν
    degrees of freedom.
  • ν = (rows - 1)(columns - 1) = (r - 1)(c - 1)
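
The procedure above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name and the 2 x 2 toy table are invented for the example and do not come from the slides.

```python
def chi_squared_independence(observed):
    """Return (statistic, degrees_of_freedom, expected) for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected count under independence: row total x column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Sum (O - E)^2 / E over every cell of the table.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# Hypothetical 2 x 2 table, invented purely for illustration.
toy_table = [[10, 20], [20, 10]]
statistic, dof, expected = chi_squared_independence(toy_table)
print(round(statistic, 3), dof, expected)
```

The statistic returned here would then be compared with the tabulated chi-squared value for `dof` degrees of freedom.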

8
Example
  • Assessed intelligence of athletic and non
    athletic schoolboys.
  • Observed

9
Probabilities
The probability a random boy is athletic is
1148 / 1708.
The probability a random boy is bright is
790 / 1708.
Assuming independence, the probability a random
boy is both athletic and bright is
(1148 / 1708) x (790 / 1708).
For 1708 respondents the expected number of
athletic bright boys is
1708 x (1148 / 1708) x (790 / 1708) = 530.98
10
Expected
The expected number of athletic bright boys is 530.98
12
Expected
The expected number of athletic stupid boys is
1148 - 530.98 = 617.02
13
Expected
The expected number of lazy bright boys is
790 - 530.98 = 259.02
15
Expected
The expected number of stupid lazy boys is
918 - 617.02 = 300.98
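
The expected frequencies can be checked from the margins alone. The sketch below uses only the totals quoted in the slides (1708 respondents, 1148 athletic, 918 stupid); the variable names are my own, and "lazy" is the slides' label for non-athletic.

```python
grand_total = 1708     # respondents (from the slides)
athletic_total = 1148  # row total for athletic boys (from the slides)
stupid_total = 918     # column total for stupid boys (from the slides)

# The remaining margins follow by subtraction.
lazy_total = grand_total - athletic_total   # non-athletic boys
bright_total = grand_total - stupid_total

# Independence: P(athletic and bright) = P(athletic) x P(bright).
p_athletic = athletic_total / grand_total
p_bright = bright_total / grand_total
athletic_bright = grand_total * p_athletic * p_bright

# In a 2 x 2 table one cell plus the margins fixes the other three.
athletic_stupid = athletic_total - athletic_bright
lazy_bright = bright_total - athletic_bright
lazy_stupid = stupid_total - athletic_stupid

for label, value in [
    ("athletic bright", athletic_bright),
    ("athletic stupid", athletic_stupid),
    ("lazy bright", lazy_bright),
    ("lazy stupid", lazy_stupid),
]:
    print(f"{label}: {value:.2f}")
```

Only one cell needs the product rule, which is why a 2 x 2 table has a single degree of freedom.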
16
Expected
17
χ²
Observed
Expected
18
χ²
As a general rule, to employ this statistic all
expected frequencies should exceed 5. If this is
not the case, categories are pooled (merged) to
achieve this goal. See the Prussian data later.
19
Conclusion
The result is significant (26.73 > 3.84) at the
5% level, so we reject the hypothesis of
independence between athletic prowess and
intelligence.
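
The calculation behind this conclusion might look as follows. Note the assumption: the observed counts are not reproduced in this transcript, so the four cell counts below are assumed values that agree with the quoted margins (1148, 918, 1708) and should be checked against Pearson's table before being relied on. The p-value line uses the fact that, for 1 degree of freedom, P(χ² > x) = erfc(sqrt(x / 2)).

```python
import math

# ASSUMPTION: cell counts are not given in the transcript. These values
# match the stated margins but must be verified against the original table.
observed = [
    [581, 567],  # athletic: bright, stupid
    [209, 351],  # non-athletic: bright, stupid
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

statistic = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total  # expected count
        statistic += (o - e) ** 2 / e                    # cell contribution

critical_5_percent = 3.84  # tabulated value for 1 degree of freedom
p_value = math.erfc(math.sqrt(statistic / 2))  # exact for 1 degree of freedom

print(f"chi-squared = {statistic:.2f}, p = {p_value:.2g}")
print("reject independence" if statistic > critical_5_percent else "do not reject")
```

Summing unrounded contributions can differ from the slide value in the last decimal place, since hand calculations usually round each cell first.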
20
SPSS
Raw data
21
SPSS
Data > Weight Cases
22
SPSS
Analyze > Descriptive Statistics > Crosstabs
23
SPSS
24
SPSS
25
SPSS
Expected cell frequencies
26
SPSS
Pearson Chi Square is the required statistic
27
Aside
Two dials were compared. A subject was asked to
read each dial many times, and the experimenter
recorded his errors. Altogether 7 subjects were
tested. The data shows how many errors each
subject produced. Do the two conditions differ at
the 0.05 significance level (give the appropriate
p value)?

Observed data
Subject   1   2   3   4   5   6   7
Dial 1   36  31  31  29  32  25  26
Dial 2   29  35  34  35  34  35  30

What key word describes this data?
28
Aside
  • What tests are available for paired data?

One sample t test
Sign test
Wilcoxon Signed Ranks Test
29
Aside
  • What tests are available for paired data? What
    assumptions are made?

One sample t test
Assumes normality
Sign test
No assumption of normality
Wilcoxon Signed Ranks Test
Resembles the sign test in scope, but it is much
more sensitive. In fact, for large numbers it is
almost as sensitive as the Student t-test
30
Aside
  • What tests are available for paired data?

One sample t test
Wilcoxon Signed Ranks Test
Sign test
The sign test answers the question "How often?",
whereas other tests answer the question "How much?"
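
As a rough illustration of the "How often?" idea, here is a minimal exact sign test. It is a sketch under assumptions: the function name is my own, and the two lists are one reading of the flattened "Observed data" line in the dial example (seven values per dial), so the pairing should be checked against the original slide.

```python
from math import comb


def sign_test_p_value(first, second):
    """Two-sided exact sign test for paired observations (ties are dropped)."""
    differences = [a - b for a, b in zip(first, second) if a != b]
    n = len(differences)
    positives = sum(1 for d in differences if d > 0)
    k = min(positives, n - positives)
    # P(X <= k) for X ~ Binomial(n, 1/2), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# ASSUMPTION: reconstructed reading of the dial data, seven subjects each.
dial_1 = [36, 31, 31, 29, 32, 25, 26]
dial_2 = [29, 35, 34, 35, 34, 35, 30]

p_value = sign_test_p_value(dial_1, dial_2)
print(f"sign test p-value = {p_value}")
```

The test only counts how many differences are positive or negative; the sizes of the differences, which the t test and the Wilcoxon test use, are ignored.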
31
Example
  • The table is based on case-records of women
    employees in Royal Ordnance factories during
    1943-6, the same test being carried out on the
    left eye (columns) and right eye (rows).
  • Stuart, Biometrika, 1953, 40, 105-110

32
Observed
Is there any obvious structure?
34
Expected
In general, to find the expected frequency in a
particular cell the equation is
Row total x Column total / Grand total
So for highest right and left the equation becomes
1976 x 1907 / 7477 = 503.98
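
The same rule as a small Python sketch, using only the three totals quoted above. The function and variable names are illustrative, not part of the original material.

```python
def expected_frequency(row_total, column_total, grand_total):
    """Expected cell count under independence of rows and columns."""
    return row_total * column_total / grand_total


highest_right_total = 1976  # row total, highest grade right eye (from the slides)
highest_left_total = 1907   # column total, highest grade left eye (from the slides)
grand_total = 7477          # all women in the table (from the slides)

top_left = expected_frequency(highest_right_total, highest_left_total, grand_total)
print(f"{top_left:.2f}")
```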
37
Expected
The missing values are simply found by subtraction
38
Expected
1976 - 503.98 - 587.22 - 662.54 = 222.26
40
Expected
Similarly for the remaining cells
41
Expected
42
Short Cut
  • Contributions to the χ² statistic, $(O - E)^2 / E$,

for the top left cell the contribution is
43
Conclusion
The above statistic makes it very clear that
there is some relationship between the quality of
the right and left eyes.
44
Total χ²
45
Conclusion
The above statistic makes it very clear that
there is some relationship between the quality of
the right and left eyes.
46
SPSS
Raw data
47
SPSS
Expected cell frequencies
48
SPSS
Pearson Chi Square is the required statistic
49
Alternate applications
  • A similar approach may be employed to test if
    simple models are plausible.

50
χ² Goodness of Fit Test
The degrees of freedom are ν = m - n - 1, where
there are m frequencies left in the problem,
after pooling, and n parameters have been fitted
from the raw data. For example
51
Example
  • The number of Prussian army corps in which
    soldiers died from the kicks of a horse in a
    year.
  • Typical industrial injury data

52
Which distribution is appropriate?
  • Is the data discrete or continuous?

Discrete, since a simple count
53
Check list of distributions
54
Check list of distribution parameters
n, p
µ, σ²
λ
λ
Discrete, no n implies Poisson
55
Poisson Distribution
  • discrete events which are independent.
  • events occur at a fixed rate λ per unit
    continuum.

56
Poisson Distribution
$P(x) = e^{-\lambda} \lambda^{x} / x!$ is the probability of x successes
e is approximately equal to 2.718
λ is the rate per unit continuum
the mean is λ, the variance is λ
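
A short numerical check of these properties, as a sketch only. The function name is my own, and the rate 0.7 is borrowed from the horse-kick example later in the slides.

```python
import math


def poisson_pmf(x, rate):
    """P(x events) for a Poisson distribution with the given rate."""
    return math.exp(-rate) * rate ** x / math.factorial(x)


rate = 0.7  # value used later in the slides

# Truncate the infinite sums at 60 terms; the neglected tail is negligible here.
support = range(60)
total = sum(poisson_pmf(x, rate) for x in support)
mean = sum(x * poisson_pmf(x, rate) for x in support)
variance = sum((x - mean) ** 2 * poisson_pmf(x, rate) for x in support)

print(f"P(0) = {poisson_pmf(0, rate):.4f}")
print(f"total = {total:.6f}, mean = {mean:.6f}, variance = {variance:.6f}")
```

Both the mean and the variance come out equal to the rate, which is the defining feature noted above.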
57
Casio 83ES
exp or e
exp(1) = 2.7182818, exp(2) = 7.389056
58
Observed Data
We need to estimate the Poisson parameter λ,
which is the mean of the distribution.
59
Observed Data
60
Mean
61
Expected
λ = 0.7 and e is a constant on your calculator
62
Expected
63
Expected Frequency
  • Expected frequency for no deaths = 280 x 0.4966
    = 139.04

64
Expected Frequency
  • Expected frequency for remaining rows
  • 280 x probability = frequency

Note the two expected frequencies less than 5!
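
The expected frequencies can be reproduced with a few lines of Python. This sketch uses the two figures given in the slides (280 observations, λ = 0.7). The category layout, 0 to 4 deaths plus a "5+" remainder, is an assumption, since the table itself is not in the transcript.

```python
import math

total_observations = 280  # from the slides
rate = 0.7                # from the slides

# ASSUMPTION: categories are 0, 1, 2, 3, 4 deaths and "5 or more".
labels = ["0", "1", "2", "3", "4", "5+"]
probabilities = [
    math.exp(-rate) * rate ** x / math.factorial(x) for x in range(5)
]
probabilities.append(1 - sum(probabilities))  # remaining upper tail

expected = [total_observations * p for p in probabilities]
too_small = [label for label, e in zip(labels, expected) if e < 5]

for label, e in zip(labels, expected):
    print(f"{label:>2} deaths: {e:7.2f}")
print("expected frequency below 5:", too_small)
```

Under this layout the last two categories fall below 5, so they are the ones that need pooling.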
65
χ² Calculation
Pool to ensure all expected frequencies exceed 5
66
Conclusion
  • Here m (frequencies) = 4,
  • n (fitted parameters) = 1,
  • then ν = m - n - 1 = 4 - 1 - 1 = 2

The hypothesis that the data come from a
Poisson distribution would be accepted, since the
statistic is below the 5% critical value (5.991 >
1.95).
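
The whole goodness of fit calculation, including the pooling step, might be sketched as below. Note the assumption: the observed counts do not appear in this transcript, so the values used are assumed figures that total 280 and give a mean of 0.7, and they should be checked against the original slide. The critical value uses the fact that, for 2 degrees of freedom, P(χ² > x) = exp(-x / 2).

```python
import math

# ASSUMPTION: observed counts are not given in the transcript; these values
# total 280 with mean 0.7 and must be verified against the original table.
deaths = [0, 1, 2, 3, 4]
observed = [144, 91, 32, 11, 2]

total = sum(observed)
rate = sum(d * o for d, o in zip(deaths, observed)) / total  # fitted mean

probabilities = [math.exp(-rate) * rate ** d / math.factorial(d) for d in deaths]
probabilities.append(1 - sum(probabilities))  # "5 or more" category
observed = observed + [0]                     # no corps-years with 5 or more
expected = [total * p for p in probabilities]

# Pool from the upper tail until the last expected frequency exceeds 5.
while len(expected) > 1 and expected[-1] <= 5:
    last_expected = expected.pop()
    last_observed = observed.pop()
    expected[-1] += last_expected
    observed[-1] += last_observed

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
fitted_parameters = 1                               # only the rate was estimated
dof = len(expected) - fitted_parameters - 1         # m - n - 1
critical_5_percent = -2 * math.log(0.05)            # exact for 2 degrees of freedom

print(f"m = {len(expected)}, dof = {dof}")
print(f"chi-squared = {statistic:.2f}, critical value = {critical_5_percent:.3f}")
```

Pooling leaves four categories (0, 1, 2, and 3 or more deaths), which is where m = 4 comes from.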
67
Next Week
  • Bring your calculators next week

68
Read
  • Read Howitt and Cramer pages 134-152
  • Read Howitt and Cramer (e-text) pages 125-134
  • Read Russo (e-text) pages 100-119
  • Read Davis and Smith pages 434-448

69
Practical 8
  • This material is available from the module web
    page.
  • http://www.staff.ncl.ac.uk/mike.cox

Module Web Page
70
Practical 8
  • This material for the practical is available.

Instructions for the practical Practical 8
Material for the practical Practical 8
71
Assignment 2
  • You will find submission details on the module
    web site

Note the dialers lower down the page give access
to your individual assignment. It is necessary to
enter your student number exactly as it appears
on your smart card.
72
Assignment 2
  • As a general rule make sure you can perform the
    calculations manually.

It does no harm to check your calculations using
a software package.
Some software employs non-standard definitions and
should be used with caution.
73
Assignment 2
  • All submissions must be typed.