STAT131 W6La Association from Contingency Tables - PowerPoint PPT Presentation

1
STAT131 W6La Association from Contingency Tables
  • by
  • Anne Porter
  • alp_at_uow.edu.au

2
Null and Alternative hypotheses: Activity
  • Card game

3
Activity Outcomes
  • We draw cards from a pack until such time as
    there is a protest.
  • The cards have been stacked such that all red
    come first (or all black).
  • The draw is meant to be random, i.e. a mix of red
    and black.
  • At some point students reject the idea of
    fairness.
  • The proportion of reds is higher than expected by
    chance (or blacks, depending on which was drawn first).
  • Students are in fact rejecting the null
    hypothesis that the proportion of red cards is 0.5
    (or that the proportion of red and black cards
    is equal).
  • They are accepting the hypothesis that the
    proportion is not equal to 0.5.
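
A rough sketch of why the protest comes quickly. It assumes a standard 52-card pack with 26 red cards and a properly shuffled deck (the pack size is an assumption, not stated on the slide), and it borrows the 0.05 cut-off from the significance level introduced later in the lecture. The function name is illustrative only.

```python
from fractions import Fraction

def prob_first_n_red(n, reds=26, deck=52):
    """Probability that the first n cards of a well-shuffled deck are all red."""
    p = Fraction(1)
    for i in range(n):
        # After i red cards are gone, (reds - i) reds remain among (deck - i) cards
        p *= Fraction(reds - i, deck - i)
    return p

for n in range(1, 8):
    print(n, round(float(prob_first_n_red(n)), 4))

# Smallest run of reds whose probability under a fair shuffle is below 0.05
first_below = next(n for n in range(1, 27)
                   if prob_first_n_red(n) < Fraction(5, 100))
print("run length that looks suspicious:", first_below)
```

Under these assumptions four reds in a row still has probability of about 0.055, while five in a row drops to about 0.025, so a protest around the fifth card matches a 5% decision rule.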

4
Null and Alternative hypotheses
  • Null hypothesis is that the proportion of red
    cards (females) is 0.5 (or that the proportion of
    red and black cards is equal)
  • Alternative hypothesis is that the proportion of
    red cards (females) is not equal to 0.5

5
Null and Alternative hypotheses: formal
Tests of proportions
  • H0: p = 0.5 and
  • HA: p ≠ 0.5
  • The p we refer to is the population proportion
  • We do not hypothesise about a sample proportion
  • We make inference about a population parameter p

6
Lecture Outline
  • Test hypotheses about association between
    categorical variables
  • Testing Hypotheses (5 steps)
  • Null and alternative hypotheses
  • α level of significance
  • Select test and state decision rule
  • Perform experiment
  • Draw conclusions
  • test of association AND
  • model fit
  • p values

7
Contingency tables
  • For this contingency table, what is
  • P(Male) = 20/70
  • P(Support) = 40/70

8
Contingency tables
  • If event Male is independent of event Support,
    then
  • P(Male and Support) = P(Male) × P(Support)
    = 20/70 × 40/70 ≈ 0.1633

9
Contingency tables
  • Given 70 observed people, if P(Male and
    Support) = 0.1633
  • How many are expected to be male and support
    given independence?

0.1633 × 70 = 11.43 if events Male and Support
are independent
10
Contingency tables
  • Knowing the expected frequency for (male and
    support) we have no more degrees of freedom; the
    remaining values are fixed.

11.43
20 - 11.43 = 8.57
30 - 8.57 = 21.43
40 - 11.43 = 28.57
Note: We had 1 degree of freedom
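
The arithmetic on this slide can be checked with a short Python sketch. It assumes the margins used in the slide arithmetic (row totals 20 and 70 - 20 = 50, column totals 40 and 30, grand total 70); the variable names are illustrative only.

```python
grand_total = 70
row_totals = [20, grand_total - 20]   # 20 and 50
col_totals = [40, 30]

# Expected count under independence: (row total x column total) / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# With the margins fixed, one cell determines the other three by subtraction
e11 = expected[0][0]
e12 = row_totals[0] - e11      # 20 - 11.43
e21 = col_totals[0] - e11      # 40 - 11.43
e22 = col_totals[1] - e12      # 30 - 8.57

for row in expected:
    print([round(e, 2) for e in row])
```

Both routes give the same four values, which is what "1 degree of freedom" means for a 2 x 2 table.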
11
Contingency tables
  • If we observe a sample of data we may ask whether the
    variables sex and level of support are
    associated. To test this we formally test the
    hypotheses.

Expected frequencies (E): 11.43, 8.57, 28.57, 21.43
12
Hypotheses no association
  • H0: Under the model of independence, E is distributed as
    (row total × column total) / grand total
  • Ha: E is not distributed as
    (row total × column total) / grand total

Expected frequencies (E): 8.57, 11.43, 28.57, 21.43
13
2. Assign α
  • α is determined such that we have a desired
    level of confidence in our procedures (i.e. in our
    results).
  • For the chi-square test for association we will
    use α = 0.05
  • We will examine choosing alpha (α) later

14
Degrees of freedom
  • Knowing the expected frequency for (male and
    support) we have no more degrees of freedom; the
    remaining values are fixed.

11.43
20 - 11.43 = 8.57
21.43
28.57
Note: We had 1 degree of freedom
15
Degrees of freedom
  • The degrees of freedom for a table with r rows and c columns
    may be calculated as (r - 1) × (c - 1) = (2 - 1) × (2 - 1) = 1
  • r is the number of rows and c is the number of
    columns

Expected frequencies: 11.43, 8.57, 21.43, 28.57
Note: We had 1 degree of freedom
16
Hypotheses no association
  • H0: Under the model of independence, E is distributed as
    (row total × column total) / grand total
  • Ha: E is not distributed as
    (row total × column total) / grand total

Expected frequencies (E): 8.57, 11.43, 28.57, 21.43
17
3. Select a test statistic and... determine the
rejection region
  • To test for association in contingency tables
    we calculate
    $\chi^2 = \sum_i \sum_j (o_{ij} - e_{ij})^2 / e_{ij}$
  • o_ij = observed count and e_ij = expected count
    for the ith row and jth column of the table
  • And determine the region of rejection, i.e. how big
    chi-square has to be before we conclude that the
    observed counts are sufficiently different from the
    expected counts to reject the null hypothesis

18
3... determine the rejection region
  • For our contingency table
  • df = 1, α = 0.05

If the calculated χ² > 3.841, then reject H0: there is
evidence that the variables are not independent
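
The tabulated value 3.841 can be reproduced without tables. A minimal sketch, assuming only the standard fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so that its upper-tail probability is erfc(sqrt(x / 2)). The function names are illustrative, and the approach does not carry over to other degrees of freedom.

```python
import math

def chi2_tail_df1(x):
    """P(chi-square with 1 df > x), via the squared-normal identity."""
    return math.erfc(math.sqrt(x / 2.0))

def critical_value_df1(alpha, lo=0.0, hi=50.0, tol=1e-9):
    """Find x with tail probability alpha by bisection (the tail is decreasing)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_tail_df1(mid) > alpha:
            lo = mid      # tail still too large: move right
        else:
            hi = mid      # tail at or below alpha: move left
    return (lo + hi) / 2.0

print(round(critical_value_df1(0.05), 3))
```

Any calculated chi-square above this value has less than a 5% chance of arising under independence, which is the rejection region for this table.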
20
4. Calculate
Expected frequencies (E): 11.43, 8.57, 28.57, 21.43
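
A minimal sketch of this calculation in Python. The observed counts are an assumption: the observed table is not in the transcript, so 13 and 7 are taken from the fractions quoted in the decision, and 27 and 23 are what the margins (20, 50, 40, 30) then force. The function name is illustrative only.

```python
# ASSUMED observed counts, reconstructed from the margins and the decision slide
observed = [[13, 7],
            [27, 23]]

def pearson_chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected under independence
            stat += (o - e) ** 2 / e
    return stat

chi2 = pearson_chi_square(observed)
print(round(chi2, 3))
```

With these counts the statistic comes to about 0.706, which is well below the 3.841 cut-off.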
21
Decision
  • As the calculated value of 0.70 < 3.841 (tabulated
    value), there is insufficient evidence to reject the
    model that sex and level of support are
    independent. That is, there is no evidence of an
    association between sex and level of support. The
    profile of support by males is similar to the
    profile of support by females: 13/40
    (32.5%) of males support, 7/30 (23.3%) of females support.

22
SPSS data entry looks like
  • Data, weight cases by freq has been selected
  • Analyse, Descriptives, Crosstabs and options have
    been selected

23
SPSS output contingency table
24
SPSS output Pearson Chi-Square
Value of chi-square
Assumption of expected frequencies > 5 holds
25
SPSS output Pearson Chi-Square
The probability of getting a statistic as high as or
higher than 0.706 is 0.401. This is high (> 0.05),
therefore retain H0: we can get this chi-square value by
chance under independence
Value of chi-square
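
The reported significance can be checked by hand. A sketch for the 1 degree of freedom case only, using the identity P(chi-square > x) = erfc(sqrt(x / 2)), which holds because a chi-square variable with 1 df is a squared standard normal. The statistic 0.706 is taken from the SPSS output described on this slide; the names are illustrative.

```python
import math

def p_value_df1(chi2_stat):
    """Upper-tail probability of a chi-square(1) variable at chi2_stat."""
    return math.erfc(math.sqrt(chi2_stat / 2.0))

alpha = 0.05
p = p_value_df1(0.706)
decision = "reject H0" if p < alpha else "retain H0"
print(round(p, 3), decision)
```

The p value rule and the critical value rule always agree: p > 0.05 exactly when the statistic is below 3.841.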
26
Example from Utts p. 528: SPSS data
  • Yes / No = Ear infection
  • P = Placebo gum
  • X = Xylitol gum
  • L = Xylitol lozenge
  • Is there an association between ear infection and
    gum used?

27
Under Independence Expected frequency
28
Under Independence Expected
29
Under Independence Expected
Degrees of freedom = 2
30
Hypotheses
  • H0: Under the model of independence, E is distributed as
    (row total × column total) / grand total
  • Ha: E is not distributed as
    (row total × column total) / grand total
  • If
  • p1 = proportion who get an infection in a population
    given placebo
  • p2 = proportion who get an infection given xylitol
    gum
  • p3 = proportion who get an infection in a
    population given xylitol lozenges
  • H0: p1 = p2 = p3
  • Ha: p1, p2 and p3 are not all the same

31
5 step hypothesis test
  • H0: Under the model of independence, E is distributed as
    (row total × column total) / grand total
  • Ha: E is not distributed in this manner
  • α = 0.05
  • df = (3 - 1) × (2 - 1) = 2
  • Statistic and region of rejection: if the calculated
    chi-square > 5.991, reject H0: there is evidence
    that the variables are not independent

32
Conclusion using decision rule SPSS
  • Chi-square = 6.690 > 5.991, therefore there is
    evidence that the data do not fit the model of
    independence

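
The SPSS result can be reproduced in a few lines. The counts below are not in this transcript: they are an assumption, recalled from the cited textbook example, and should be checked against Utts p. 528 before being relied on. They do reproduce the reported statistic.

```python
# ASSUMED counts (infection, no infection) per group; not given in the transcript
observed = {
    "P": (49, 129),   # placebo gum
    "X": (29, 150),   # xylitol gum
    "L": (39, 137),   # xylitol lozenge
}

rows = list(observed.values())
row_totals = [sum(r) for r in rows]
col_totals = [sum(c) for c in zip(*rows)]
n = sum(row_totals)

chi2 = 0.0
min_expected = float("inf")
for i, row in enumerate(rows):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n    # expected under independence
        min_expected = min(min_expected, e)
        chi2 += (o - e) ** 2 / e

df = (len(rows) - 1) * (len(col_totals) - 1)
infection_rates = {g: yes / (yes + no) for g, (yes, no) in observed.items()}

print(round(chi2, 3), df, round(min_expected, 2))
print({g: round(p, 3) for g, p in infection_rates.items()})
```

Printing the smallest expected count alongside the statistic is a convenient way to check the "expected frequencies > 5" assumption at the same time.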
33
P values (sig)
  • For the chi-square test (one tailed) the p value is
    the probability of getting this statistic or
    greater if H0 is true

34
Conclusion using p value from SPSS
The probability of getting a chi-square as high
as this or higher is 0.035. This is a small
probability (< 0.05) if H0 were true. There is
evidence of an association between infection and
gum used.
  • Chi-square = 6.690

Assumption regarding expected frequencies > 5: OK
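
For 2 degrees of freedom the chi-square tail has a simple closed form, P(chi-square > x) = exp(-x / 2), so both the p value and the 5.991 cut-off can be checked directly. A sketch that is valid for df = 2 only; the function names are illustrative.

```python
import math

def chi2_p_value_df2(x):
    """Upper-tail probability of a chi-square(2) variable at x."""
    return math.exp(-x / 2.0)

def chi2_critical_df2(alpha):
    """Value whose upper-tail probability is alpha, from inverting exp(-x / 2)."""
    return -2.0 * math.log(alpha)

p = chi2_p_value_df2(6.690)
crit = chi2_critical_df2(0.05)
print(round(p, 3), round(crit, 3))
```

The two printed values round to 0.035 and 5.991, matching the SPSS significance and the tabulated cut-off.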
35
Significance Tests - Formal
  • 1. Null and alternative hypotheses
  • 2. Assign α
  • 3. Select a statistic and determine the rejection
    region
  • 4. Perform the experiment and calculate the
    observed value of χ² or T or Z or other statistic
  • 5. Draw conclusions in context of problem

36
Previous hypothesis testing situations
  • Model fit
  • 1. H0: Expected distributed Binomial(2, 0.5)
  • Ha: Expected not distributed Binomial(2, 0.5)
  • 2. H0: Expected distributed Poisson(0.4)
  • Ha: Expected not distributed Poisson(0.4)
  • 3. H0: Expected distributed as per the random
    stopping model
  • Ha: Expected not distributed as per the random
    stopping model
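
For the first two model-fit hypotheses, the expected counts come from multiplying the model probabilities by the sample size. A small sketch, in which the sample size of 100 and the pooling of the Poisson tail into a "3 or more" category are assumptions made purely for illustration:

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

n_obs = 100  # hypothetical sample size, for illustration only

# Binomial(2, 0.5): outcomes 0, 1, 2
binom_probs = [binomial_pmf(k, 2, 0.5) for k in range(3)]
binom_expected = [n_obs * p for p in binom_probs]

# Poisson(0.4): outcomes 0, 1, 2 and a pooled "3 or more" category
poisson_probs = [poisson_pmf(k, 0.4) for k in range(3)]
poisson_probs.append(1 - sum(poisson_probs))
poisson_expected = [n_obs * p for p in poisson_probs]

print(binom_expected)
print([round(e, 2) for e in poisson_expected])
```

These expected counts play the same role as (row total × column total) / grand total does in the test of association.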

37
Future hypothesis testing situations
Tests of proportions
  • the null hypothesis may be proportion = 0.5 and
  • alternative hypothesis proportion ≠ 0.5
Tests of means
  • the null hypothesis may be μ = 0 and
  • alternative hypothesis μ ≠ 0