Simpson's Paradox

1
Simpson's Paradox
  • Michael Kuykindall
  • Faculty Advisor: Engin Sungur

2
Outline
  • Introduction
  • Simpson's Paradox
  • Data
  • Results
  • Conclusion

3
What is Simpson's Paradox?
  • Simpson's Paradox occurs when an association
    between two variables is reversed once a third
    variable is taken into account.

4
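According to Colin R. Blyth, Simpson's Paradox can
be defined mathematically as follows (a prime
denotes the complement of an event)
  • It is possible to have $P(A \mid B) < P(A \mid B')$
  • while at the same time
  • $P(A \mid BC) \ge P(A \mid B'C)$
  • $P(A \mid BC') \ge P(A \mid B'C')$.
  • This is important because
  • $P(A \mid B)$ = an average of $P(A \mid BC)$ and $P(A \mid BC')$
  • $P(A \mid B')$ = an average of $P(A \mid B'C)$ and $P(A \mid B'C')$,
  • which is easily seen to be true when all the
    conditional events have positive probabilities
  • $P(A \mid B) = P(C \mid B)\,P(A \mid BC) + P(C' \mid B)\,P(A \mid BC')$
  • $P(A \mid B') = P(C \mid B')\,P(A \mid B'C) + P(C' \mid B')\,P(A \mid B'C')$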
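
The averaging identity on this slide is the law of total probability applied within B. A short derivation is sketched here, writing the complement with a prime (an assumed notation, since the transcript lost the original symbols):

```latex
\begin{align*}
P(A \mid B) &= P(AC \mid B) + P(AC' \mid B) \\
            &= P(C \mid B)\,P(A \mid BC) + P(C' \mid B)\,P(A \mid BC'),
\qquad P(C \mid B) + P(C' \mid B) = 1 .
\end{align*}
```

The same steps with B' in place of B give the second identity. The weights $P(C \mid B)$ and $P(C \mid B')$ need not be equal, and that is what leaves room for the reversal.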

5
Colin R. Blyth's example of Simpson's Paradox
  • A doctor was planning to try a new treatment on
    patients, mostly local (C) and a few in Chicago
    (C'). A statistician advised him to use a table
    of random numbers and, as each C patient became
    available, to assign him to the new treatment with
    probability .91 and leave him to the standard
    treatment with probability .09, and to do the same
    for each C' patient with probabilities .01 and .99
    respectively. When the doctor returned with the
    data, the statistician told him that the new
    treatment was obviously a very bad one, and
    criticized him for having continued trying it on
    so many patients.

6
  • The doctor replied that he continued because the
    new treatment was obviously a very good one,
    having nearly doubled the recovery rate in both
    cities.

7
  • In this example A = Alive, B = New Treatment, and
    C = Local Patient (so B' = Standard Treatment and
    C' = Chicago Patient). P( ) refers to the probability
    for a patient chosen at random from among those
    recorded in the table, and coincides with
    proportions in that table, now being taken as the
    total available population. In that example we
    have
  • $P(A \mid B) = .11 < P(A \mid B') = .46$
  • $P(A \mid BC) = .10 > P(A \mid B'C) = .05$,
  • $P(A \mid BC') = .95 > P(A \mid B'C') = .50$.
  • The initially surprising fact that an average of
    .10 and .95 is so much smaller than an average
    of .05 and .50 is easily explained by showing the
    numerical values in
  • $.11 = .99(.10) + .01(.95)$
  • $.46 = .10(.05) + .90(.50)$
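
The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the probabilities and weights are the rounded values quoted above, and the dictionary keys and function name are illustrative choices, not part of Blyth's paper.

```python
# Conditional survival probabilities quoted on the slide.
p_alive_new = {"local": 0.10, "chicago": 0.95}   # P(A | B C),  P(A | B C')
p_alive_std = {"local": 0.05, "chicago": 0.50}   # P(A | B'C),  P(A | B'C')

# Mixing weights quoted on the slide: share of each city within a treatment arm.
w_new = {"local": 0.99, "chicago": 0.01}         # P(C | B),  P(C' | B)
w_std = {"local": 0.10, "chicago": 0.90}         # P(C | B'), P(C' | B')


def weighted_average(probs, weights):
    """Law of total probability: sum of weight * conditional probability."""
    return sum(weights[city] * probs[city] for city in probs)


overall_new = weighted_average(p_alive_new, w_new)   # 0.1085, quoted as .11
overall_std = weighted_average(p_alive_std, w_std)   # 0.455, quoted as .46

print(f"P(A | B)  = {overall_new:.4f}")
print(f"P(A | B') = {overall_std:.4f}")

# The new treatment does better in each city yet worse overall.
better_in_each_city = all(p_alive_new[c] > p_alive_std[c] for c in p_alive_new)
worse_overall = overall_new < overall_std
print(better_in_each_city, worse_overall)
```

The new-treatment arm is 99% local patients, who do poorly under either treatment, so its overall rate sits close to .10. The standard arm is 90% Chicago patients, so its overall rate sits close to .50.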

8
[Diagram: Survival rate, Treatment, Patient Location]

9
Smokers Example
  • In England a study was conducted to examine the
    survival rates of smokers and non-smokers. The
    result implied a significant positive correlation
    between smoking and survival rates because only 24%
    of smokers died as compared to 31% of
    non-smokers. When the data were broken down by
    age group in a contingency table, it was found
    that there were more older people in the
    non-smoker group. Thus age played a very
    significant role in the outcome, but since it was
    overlooked the researchers were left with
    misleading results (Appleton & French, 1996).

10
[Diagram: Survival rate, Smoker or Non-Smoker, Age Group]

11
Death Penalty Example
  • Effects of racial characteristics on whether
    individuals convicted of homicide receive the
    death penalty.
  • The variables are death penalty verdict, having
    categories yes and no, and the race of the defendant
    and the race of the victim, each having
    categories European American or African American.
  • Data: 326 defendants were recorded as being
    indicted for homicide in 20 Florida counties
    during 1976-1977.

12
Frequencies for Death Penalty Verdict and
Defendant's Race
  • About 12% of European American defendants and
    about 10% of African American defendants received
    the death penalty. Ignoring victim's race, the
    percentage of yes death penalty verdicts was
    lower for African Americans than for European
    Americans.

13
Death Penalty Verdict by Defendant's Race and
Victim's Race
  • When the victim was European American, the death
    penalty was imposed about 5 percentage points
    more often for African American defendants than
    for European American defendants. When the
    victim was African American, the death penalty was
    imposed over 5 percentage points more often for
    African American defendants than for European
    American defendants.

14
Death Penalty Verdict by Defendant's Race and
Victim's Race
  • Controlling for victim's race, the percentage of
    yes death penalty verdicts was higher for African
    American defendants than for European American
    defendants. The direction of the association is
    reversed.
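
The reversal can be reproduced from the underlying cell counts. The frequency tables did not survive in this transcript, so the counts in this sketch are an assumption: they are the values given in the Agresti (1990) table cited under Sources, and they sum to the 326 defendants mentioned earlier. The variable names are illustrative.

```python
# (yes, no) death penalty verdict counts keyed by (defendant race, victim race).
# ASSUMPTION: counts as given in Agresti (1990); they are not in this transcript.
counts = {
    ("EA", "EA"): (19, 132),
    ("EA", "AA"): (0, 9),
    ("AA", "EA"): (11, 52),
    ("AA", "AA"): (6, 97),
}


def pct_yes(yes, no):
    """Percentage of defendants who received the death penalty."""
    return 100.0 * yes / (yes + no)


# Controlling for victim's race: one percentage per (defendant, victim) cell.
conditional = {key: pct_yes(*cell) for key, cell in counts.items()}

# Ignoring victim's race: pool the cells for each defendant race.
marginal = {}
for defendant in ("EA", "AA"):
    yes = sum(cell[0] for (d, _), cell in counts.items() if d == defendant)
    no = sum(cell[1] for (d, _), cell in counts.items() if d == defendant)
    marginal[defendant] = pct_yes(yes, no)

for defendant in ("EA", "AA"):
    print(f"defendant {defendant}: marginal {marginal[defendant]:.1f}%")
for (defendant, victim), pct in conditional.items():
    print(f"defendant {defendant}, victim {victim}: {pct:.1f}%")
```

Within each victim group the African American percentage is the higher one, while the pooled percentages point the other way.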

15
Odds Ratios for Death Penalty (P), Victim's Race
(V), and Defendant's Race (D)
  • The estimated odds of the death penalty were 1.18
    times as high for European American defendants as
    for African American defendants. But when the
    victim was European American, the estimated odds
    of the death penalty were .67 times as high for
    European American defendants as for African
    American defendants; when the victim was African
    American, the estimated odds were .79 times as
    high for European American defendants as for
    African American defendants.
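
A minimal sketch of how these odds ratios can be computed. Two things here are assumptions rather than statements from the slides: the cell counts (taken from the Agresti (1990) table cited under Sources) and the addition of 0.5 to every cell, a common adjustment that avoids division by a zero count and happens to reproduce the quoted values.

```python
def odds_ratio(yes_1, no_1, yes_2, no_2, add=0.5):
    """Odds of 'yes' in group 1 divided by odds of 'yes' in group 2.

    `add` is put into every cell first (ASSUMPTION: 0.5, to handle zero cells).
    """
    a, b, c, d = (x + add for x in (yes_1, no_1, yes_2, no_2))
    return (a * d) / (b * c)


# Group 1 = European American defendants, group 2 = African American defendants.
# ASSUMPTION: counts as given in Agresti (1990); they are not in this transcript.
marginal = odds_ratio(19, 141, 17, 149)        # ignoring victim's race
victim_ea = odds_ratio(19, 132, 11, 52)        # European American victims
victim_aa = odds_ratio(0, 9, 6, 97)            # African American victims

print(f"marginal:  {marginal:.2f}")
print(f"victim EA: {victim_ea:.2f}")
print(f"victim AA: {victim_aa:.2f}")
```

The marginal ratio is above 1 while both conditional ratios are below 1, which is the reversal expressed on the odds scale.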

16
  • The odds of having killed a European American are
    estimated to be 25.99 times as high for European
    American defendants as for African American
    defendants.
  • The odds ratios relating death penalty verdict
    and victim's race indicate the death penalty was
    more likely when the victim was European American
    than when the victim was African American.
  • So European Americans tend to kill other European
    Americans, and killing a European American is
    more likely to result in the death penalty.
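
The two associations that drive the reversal can be sketched the same way. As before, the counts (from the Agresti (1990) table cited under Sources) and the 0.5 added to each cell are assumptions, and the helper name is illustrative.

```python
def odds_ratio(a, b, c, d, add=0.5):
    """Odds ratio (a/b) / (c/d) with `add` put into every cell (ASSUMPTION)."""
    a, b, c, d = (x + add for x in (a, b, c, d))
    return (a * d) / (b * c)


# Defendant's race vs. victim's race.
# Rows: EA defendants (151 EA victims, 9 AA victims),
#       AA defendants (63 EA victims, 103 AA victims).
defendant_victim = odds_ratio(151, 9, 63, 103)

# Victim's race vs. verdict (yes, no), within each defendant race.
# Group 1 = EA victim, group 2 = AA victim.
victim_verdict_ea_def = odds_ratio(19, 132, 0, 9)
victim_verdict_aa_def = odds_ratio(11, 52, 6, 97)

print(f"defendant-victim odds ratio:      {defendant_victim:.2f}")
print(f"victim-verdict, EA defendants:    {victim_verdict_ea_def:.2f}")
print(f"victim-verdict, AA defendants:    {victim_verdict_aa_def:.2f}")
```

A strong defendant-victim association, combined with victim-verdict odds ratios above 1, is what pulls the marginal result toward European American defendants.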

17
Percent Receiving Death Penalty
  • Percent receiving death penalty by defendant's
    race, controlling for and ignoring victim's race.
  • Each observation is represented by a letter
    giving the level of the victim's race.
  • Surrounding each observation is a circle having
    area proportional to the number of observations
    at that combination of defendant's race and
    victim's race.
  • The largest circles occur when European Americans
    kill other European Americans or African Americans
    kill other African Americans.
  • These cause the marginal result whereby European
    Americans are more likely to receive the death
    penalty.

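[Figure: percent receiving death penalty (vertical axis, 0 to 20) by defendant's race (European American, African American). Points are labeled EA or AA for the victim's race; x marks the marginal effect of defendant's race, ignoring victim's race.]
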
18
[Diagram: Death Penalty, Race of Defendant, Race of Victim]

19
NLSY79 Example
  • Data: The National Longitudinal Survey Handbook
    2001
  • The NLSY79 is a nationally representative sample
    of 12,686 young men and women who were 14 to 22
    years of age when first surveyed in 1979. During
    the years since that first interview, these young
    people typically have finished their schooling,
    moved out of their parents' homes, made decisions
    on continuing education and training, entered
    the labor market, served in the military, married,
    and started their own families. Data collected
    from the NLSY79 respondents chronicle these
    changes, providing researchers with a unique
    opportunity to study in detail the life course
    experiences of a large group of adults
    representative of all men and women born in the
    late 1950s and early 1960s and living in the
    United States when the survey began.

20
Variables
  • Dependent: Highest grade completed
  • Independent: Race (Hispanic, Black, Non-Black
    Non-Hispanic), highest grade completed by mother

21
Analysis Tools
  • SAS ANOVA: Linear Model (calculate parameter
    estimates; hypothesis test type for F test: Type
    III)
  • SAS Graphs: Box Plots

22
Highest grade completed by subjects

Dependent Variable: HIGHEST GRADE COMPLETED (REV) 1998

Parameter     Estimate          Standard Error   t Value   Pr > |t|
Intercept     13.52547170 B     0.03722366        363.36
Hispanic 1    -1.13028058 B     0.07076467        -15.97
Black 2       -0.69496322 B     0.06083837        -11.42
Non B-H 3      0.00000000 B     .                 .        .

Source   DF   Type III SS    Mean Square   F Value   Pr > F
Race      2   1757.477371    878.738686     149.57


23
Highest grade completed by subjects and their
mothers

Dependent Variable: HIGHEST GRADE COMPLETED (REV) 1998

Parameter     Estimate           Standard Error   t Value   Pr > |t|
Intercept      9.804807070 B     0.10941378         89.61
Hispanic       0.168059825 B     0.07579780          2.22   0.0266
Black         -0.297132340 B     0.05879723         -5.05
Non B-H        0.000000000 B     .                  .       .
HGC By Mom     0.316727781       0.00870516         36.38

Source       DF   Type III SS    Mean Square    F Value   Pr > F
Race          2    216.301255     108.150627      21.84
HGC By Mom    1   6556.118704    6556.118704    1323.79
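
To see what the two sets of parameter estimates imply, one can plug them into the fitted equations. The sketch below copies the estimates from the two outputs above; the mother's grade of 12 is a hypothetical value chosen only for illustration, and the function names are not from the original analysis.

```python
# Estimates from the race-only model (Non B-H is the reference level).
race_only = {
    "intercept": 13.52547170,
    "Hispanic": -1.13028058,
    "Black": -0.69496322,
    "Non B-H": 0.0,
}

# Estimates from the model that also includes mother's highest grade completed.
with_mom = {
    "intercept": 9.804807070,
    "Hispanic": 0.168059825,
    "Black": -0.297132340,
    "Non B-H": 0.0,
    "mom_hgc": 0.316727781,
}


def predict_race_only(race):
    """Predicted highest grade completed using race alone."""
    return race_only["intercept"] + race_only[race]


def predict_with_mom(race, mom_hgc):
    """Predicted highest grade completed given race and mother's grade."""
    return with_mom["intercept"] + with_mom[race] + with_mom["mom_hgc"] * mom_hgc


MOM_HGC = 12  # hypothetical mother's highest grade, for illustration only

for race in ("Hispanic", "Black", "Non B-H"):
    print(f"{race:9s} race only: {predict_race_only(race):6.2f}   "
          f"mother at grade {MOM_HGC}: {predict_with_mom(race, MOM_HGC):6.2f}")
```

With race alone, the Hispanic group is predicted to finish about 1.13 grades below the reference group. Once mother's education is held fixed, the Hispanic estimate turns positive (about 0.17), while the Black estimate shrinks but stays negative.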

26
[Diagram: Race, Child's Education, Mother's Education]

27
Conclusion
  • Simpson's paradox is a rare phenomenon. Because
    it does not occur often, statisticians must be
    trained well enough, academically and ethically,
    to make sure that if it has occurred they will
    detect and correct it. This is where practice,
    critical thinking skills, and repetition come
    into play.

28
Sources
  • Agresti, Alan. Categorical Data Analysis. John
    Wiley & Sons, Inc., Canada, 1990, pp. 135-138.
  • Blyth, Colin R. Journal of the American
    Statistical Association, Vol. 67, No. 338 (Jun.
    1972), pp. 364-366.

29
Acknowledgements
  • Engin Sungur
  • Jon Anderson
  • Laura Argys
  • Josephine Myers-Kuykindall