Assessing Binary Outcomes: Logistic Regression - PowerPoint PPT Presentation

1
Statistics for Health Research
Assessing Binary Outcomes Logistic Regression
Peter T. Donnan Professor of Epidemiology and
Biostatistics
2
Objectives of Session
  • Understand what is meant by a binary outcome
  • Understand how analyses of binary outcomes are
    implemented in a logistic regression model
  • Understand when a logistic model is appropriate
  • Be able to implement a logistic model in SPSS
  • Interpret logistic model output

3
Binary Outcome
  • Extremely common in health research
  • Dead / Alive
  • Hospitalisation (Yes / No)
  • Diagnosis of diabetes (Yes / No)
  • Met target e.g. total cholesterol < 5.0 mmol/l
    (Yes / No)
  • n.b. Can use any code such as 1 / 2 but
    mathematically easier to use 0 / 1

4
How is relationship formulated?
For linear regression the simplest equation is
$y = a + bx + e$
where y is the outcome, a is the intercept, b is the
slope related to x, the explanatory variable,
and e is the error term or random noise
5
Can we fit y as a probability range 0 to 1?
Not quite! Y is continuous: any value from -∞ to +∞
Outcome is the probability of an event, π (or p), on the scale 0 to 1
Certain transformations of p can give the required scale
Probit is a normal transformation of p, but its results
are not easy to interpret
6
The logit transformation works!
$y = \mathrm{logit}(p) = \ln\left(\frac{p}{1-p}\right)$
We can now fit p as a probability in the range 0 to 1
and y in the range -∞ to +∞
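A minimal sketch of the logit transformation and its inverse, to show how a probability between 0 and 1 maps onto the whole real line and back. The function names are illustrative and do not come from the slides.

```python
import math

def logit(p: float) -> float:
    """Map a probability strictly between 0 and 1 to the log odds ln(p / (1 - p))."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))

def inverse_logit(y: float) -> float:
    """Map any real y back to a probability between 0 and 1."""
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    # For negative y use the equivalent form to avoid overflow in exp(-y)
    z = math.exp(y)
    return z / (1.0 + z)

for p in (0.01, 0.5, 0.8, 0.99):
    y = logit(p)
    print(f"p = {p:.2f} -> logit = {y:+.3f} -> back to p = {inverse_logit(y):.2f}")
```

Probabilities near 0 give large negative values and probabilities near 1 give large positive values, which is what allows a linear equation to be fitted on the logit scale.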
7
Logistic Regression Model
$\ln\left(\frac{p}{1-p}\right) = a + bx$
This has very useful properties
The term p/(1-p) is called the odds of an event
Note: this is not the same as the probability of an event, p
If x is binary, coded 0/1, then exp(b) = ODDS RATIO
for the outcome in those coded 1 relative to those coded 0,
e.g. odds of death in men (1) vs. women (0)
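A short worked derivation of why exp(b) is the odds ratio for a 0/1 predictor, assuming the single-predictor model ln(p/(1-p)) = a + bx. The subscripted p_0 and p_1 (probability of the event when x = 0 and x = 1) are notation introduced here only for illustration.

```latex
\begin{aligned}
x = 0:&\quad \frac{p_0}{1-p_0} = e^{a} \\
x = 1:&\quad \frac{p_1}{1-p_1} = e^{a+b} \\
\text{OR} &= \frac{p_1/(1-p_1)}{p_0/(1-p_0)} = \frac{e^{a+b}}{e^{a}} = e^{b}
\end{aligned}
```

The intercept cancels, so the odds ratio depends only on b.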
8
Logistic Regression Model
  • Consider the LDL data.
  • It has two binary outcomes:
    LDL target achieved and Chol target achieved
  • For example, consider gender as a predictor:
    Male = 1, Female = 2

9
For a binary x we can express results as odds
ratios (available in crosstabs)
            LDL target achieved
Gender      No      Yes     Odds of Yes
Male        140     563     563/140
Female      149     531     531/149
10
Odds ratio = 3.56 / 4.02 = 0.886, Female cf. Male
            LDL target achieved
Gender      No      Yes     Odds of Yes
Male        140     563     563/140 = 4.02
Female      149     531     531/149 = 3.56
N.b. odds is different from probability: for men,
p = 563/(140 + 563) = 0.80 or 80%
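The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch using the four counts from the LDL table; the variable and function names are illustrative.

```python
# 2 x 2 counts from the LDL table: (target not achieved, target achieved)
counts = {"Male": (140, 563), "Female": (149, 531)}

def odds_and_probability(no: int, yes: int) -> tuple:
    """Return (odds of Yes, probability of Yes) for one row of the table."""
    return yes / no, yes / (no + yes)

odds_m, prob_m = odds_and_probability(*counts["Male"])
odds_f, prob_f = odds_and_probability(*counts["Female"])
odds_ratio_f_vs_m = odds_f / odds_m

print(f"Male:   odds = {odds_m:.2f}, probability = {prob_m:.2f}")
print(f"Female: odds = {odds_f:.2f}, probability = {prob_f:.2f}")
print(f"OR (Female vs Male) = {odds_ratio_f_vs_m:.3f}")
print(f"OR (Male vs Female) = {1 / odds_ratio_f_vs_m:.2f}")
```

Swapping the reference group simply inverts the odds ratio.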
11
Odds ratio from Crosstabs
Obtain odds ratios for 2 x 2 tables from
Crosstabs by selecting the Risk option
12
Results from Crosstabs
Odds ratios for achieving LDL target in females
vs. males
n.b. OR given for Female vs. Male = 0.886
13
Fit Logistic Regression Model
Dependent is the binary outcome: LDL target met
(Yes = 1, No = 0)
Independent: Gender (1 = M, 2 = F)
Should get the same result as from crosstabs
Select Analyze / Regression / Binary Logistic
Select the option of 95% CI for exp(b)
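The slides fit this model in SPSS. As a hedged illustration of what the fitting does, the sketch below fits the same one-predictor model in plain Python by Newton-Raphson on the grouped LDL counts. It is not SPSS's implementation, and the function name `fit_logistic` is hypothetical.

```python
import math

# Grouped LDL data: (gender code, number achieving target, group size).
# Gender is coded 1 = male, 2 = female, as on the slide.
groups = [(1, 563, 140 + 563), (2, 531, 149 + 531)]

def fit_logistic(groups, iterations=25):
    """Fit ln(p/(1-p)) = a + b*x by Newton-Raphson on grouped binomial data."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        # Score vector (g0, g1) and information matrix [[i00, i01], [i01, i11]]
        g0 = g1 = i00 = i01 = i11 = 0.0
        for x, yes, n in groups:
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            resid = yes - n * p          # observed minus expected successes
            w = n * p * (1.0 - p)        # binomial variance weight
            g0 += resid
            g1 += resid * x
            i00 += w
            i01 += w * x
            i11 += w * x * x
        det = i00 * i11 - i01 * i01
        # Newton step: solve the 2 x 2 system information * delta = score
        a += (i11 * g0 - i01 * g1) / det
        b += (i00 * g1 - i01 * g0) / det
    return a, b

a, b = fit_logistic(groups)
print(f"a = {a:.4f}, b = {b:.4f}, exp(b) = {math.exp(b):.3f}")
```

Because gender is coded 1/2, a one-unit increase in x is the step from male to female, so exp(b) should reproduce the Female vs. Male odds ratio from the 2 x 2 table.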
14
Regression / Binary logistic..
15
Odds ratio from logistic model results for a
binary predictor
EXP(B) = odds ratio, F vs. M
Note that the OR for men vs. women = 1/0.886 = 1.13
16
Fit Logistic Regression Model continuous
predictor
Dependent is the binary outcome: LDL target met
Independent: continuous predictor, Adherence
B represents the change in the log odds for a 1 unit
increase in adherence, so exp(B) is the ODDS RATIO per unit
B x 10 represents the change in the log odds for a 10
unit increase in adherence, so exp(10 x B) is the
corresponding ODDS RATIO
17
Odds ratio from logistic model results for a
continuous
EXP(B) = odds ratio for a 1% increase in adherence
OR for a 10% increase is exp(10 x 0.010) = 1.105,
i.e. a 10.5% increase in odds of meeting the
LDL target for each 10% increase in adherence
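A small sketch of the rescaling on this slide, using the coefficient 0.010 quoted above. The helper name `odds_ratio` is illustrative.

```python
import math

b_per_unit = 0.010  # coefficient B for adherence, as quoted on the slide

def odds_ratio(b: float, increase: float = 1.0) -> float:
    """Odds ratio for a given increase in a continuous predictor."""
    return math.exp(b * increase)

or_1 = odds_ratio(b_per_unit)        # per 1 unit of adherence
or_10 = odds_ratio(b_per_unit, 10)   # per 10 units of adherence

print(f"OR per 1 unit   = {or_1:.3f}")
print(f"OR per 10 units = {or_10:.3f}")
# The 10-unit OR is the 1-unit OR raised to the power 10, not multiplied by 10
print(f"(OR per 1 unit) ** 10 = {or_1 ** 10:.3f}")
```

Odds ratios multiply across units because the coefficients add on the log-odds scale.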
18
Fit Logistic Regression Model categorical
predictor
Dependent is the binary outcome: LDL target met
Independent: APOE genotype (1 to 6)
Choose a reference category; in this case the worst
outcome is genotype 6, so choose 6 to give ORs > 1
exp(B) represents the OR for each category relative to
the reference category
19
Regression / Binary logistic..
Choose Categorical
20
Odds ratios from logistic model results for a
categorical predictor
EXP(B) = odds ratio for APOE (2) vs. APOE (6)
OR = 4.381 (95% CI 1.742, 11.021)
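A hedged sketch of how an interval like this relates to the coefficient. It assumes the reported interval is a Wald interval of the form exp(B ± 1.96 x SE), which is an assumption about the output rather than something stated on the slide, and it backs out the implied standard error from the three reported numbers.

```python
import math

# Values as reported on the slide for APOE (2) vs APOE (6)
or_hat, ci_low, ci_high = 4.381, 1.742, 11.021

b = math.log(or_hat)  # coefficient B on the log-odds scale
# Implied standard error, assuming the interval is exp(B -/+ 1.96 * SE)
se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

# Rebuild the interval from B and SE to check the assumption
rebuilt_low = math.exp(b - 1.96 * se)
rebuilt_high = math.exp(b + 1.96 * se)

print(f"B = {b:.3f}, implied SE = {se:.3f}")
print(f"rebuilt 95% CI = ({rebuilt_low:.3f}, {rebuilt_high:.3f})")
```

The interval is symmetric around B on the log scale, which is why it looks lopsided around the odds ratio itself.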
21
Epidemiological Designs
  • Logistic model common in epidemiological research
  • In case-control designs, case is coded 1 and
    controls as 0 and used as dependent variable
  • In cohort study outcome (e.g. death) is used as
    binary outcome in logistic model
  • Note in a cohort study exp(b) is still an odds ratio;
    it approximates the Relative Risk (RR) only when the
    outcome is rare
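As a numerical check of how close the odds ratio and the relative risk are, the sketch below computes both from the LDL table used earlier in this deck, treating the counts purely as an arithmetic illustration. It then repeats the calculation for a rare outcome, where the counts are hypothetical and chosen only to show the contrast.

```python
def risk_and_odds_ratio(no_a, yes_a, no_b, yes_b):
    """Return (relative risk, odds ratio) of Yes for group A relative to group B."""
    risk_a = yes_a / (no_a + yes_a)
    risk_b = yes_b / (no_b + yes_b)
    return risk_a / risk_b, (yes_a / no_a) / (yes_b / no_b)

# Common outcome: LDL target achieved, Female (A) vs Male (B)
rr_common, or_common = risk_and_odds_ratio(149, 531, 140, 563)

# Rare outcome: hypothetical counts, 10 events in 1000 vs 20 events in 1000
rr_rare, or_rare = risk_and_odds_ratio(990, 10, 980, 20)

print(f"Common outcome: RR = {rr_common:.3f}, OR = {or_common:.3f}")
print(f"Rare outcome:   RR = {rr_rare:.3f}, OR = {or_rare:.3f}")
```

With an outcome as common as meeting the LDL target (about 80%), the two measures differ noticeably; with the rare hypothetical outcome they nearly coincide.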

22
Definition- Clinical Prediction Rule
  • Clinical tool that quantifies the contribution of
    history, examination and diagnostic tests
  • Stratifies patients according to probability of
    having target disorder
  • Outcome can be in terms of diagnosis, prognosis,
    referral or treatment

23
Thresholds for decision making
[Figure: scale of derived probability of disease, from 0 to 100%.
Above the diagnosis / test threshold: treatment.
Between the two thresholds: further diagnostic testing.
Below the test / reassurance threshold: reassurance.]
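The two-threshold idea can be sketched as a small function. The threshold values below are hypothetical placeholders, not values from the slides; in practice they would depend on the condition and on the costs of testing and treatment.

```python
def decide(probability: float, test_threshold: float = 0.10,
           treatment_threshold: float = 0.60) -> str:
    """Map a derived probability of disease to one of three actions.

    The default thresholds are hypothetical and for illustration only.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= treatment_threshold:
        return "treatment"
    if probability >= test_threshold:
        return "further diagnostic testing"
    return "reassurance"

for p in (0.05, 0.30, 0.90):
    print(f"derived probability {p:.2f} -> {decide(p)}")
```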
24
Ottawa ankle rule
25
Risk Stratification Kaiser-Permanente Pyramid
Identify high risk through risk stratification
and Intervene through case management at highest
risk
26
Framingham Risk Algorithm
  • Prediction of risk Cardiovascular (Framingham)

55-year-old woman: 15-20% 5-year risk
27
Increasing appearance of prediction models in
literature (ISI Web of Knowledge v3)
28
Stages of development and assessment of a CPR
Step 1 Derivation: identification of factors with
predictive power (cross-sectional or cohort study)
Step 2 Validation: evidence of reproducible
accuracy. Application of a rule in similar
clinical settings and population or, better still,
multiple clinical settings and different
populations with varying prevalence and outcomes
of disease (cross-sectional or cohort study)
Step 3 Impact Analysis: evidence that rule changes
physician behaviour and improves patient outcomes
and/or reduces costs (randomized controlled trial)
29
How to derive a CPR?
  1. Toss a coin to make decision?
  2. Individual opinion and experience?
  3. Huddle of wise ones: Delphi technique to reach
    consensus?
  4. Statistical prediction models !

30
Regression Models for prediction
  • In all of these models we combine a set of
    factors
  • Usually between 2-20 predictors
  • Occam's razor suggests smaller is better
  • Fit a multiple regression model
  • Extract probabilities of outcome or diagnosis
  • Create CPR

31
Regression Models for prediction
  • Linear if outcome continuous
  • Binary outcomes: logistic regression model
  • Survival models: Cox PH, Weibull, log-logistic,
    etc.
  • Ordinal or nominal outcomes: ordinal logistic
    regression

32
The logit transformation
$y = \mathrm{logit}(p) = \ln\left(\frac{p}{1-p}\right)$
We can now fit p as a probability in the range 0 to 1
and y in the range -∞ to +∞
33
Statistical prediction Models
Logistic regression model:
$\ln\left(\frac{p}{1-p}\right) = a + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots$
p = probability of the event; the effects of the factors
(x) increase or decrease the risk of this event
34
Derivation of probability of events
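Logistic regression model:
$\ln\left(\frac{p}{1-p}\right) = a + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots$
Call the right-hand side the Linear Predictor (LP), a
linear function of the predictors x1, x2, x3, etc.:
$\mathrm{LP} = a + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots$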
35
Derivation of probability of events
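Then, writing LP for the linear predictor,
$\ln\left(\frac{p}{1-p}\right) = \mathrm{LP}$
Take exp of both sides:
$\frac{p}{1-p} = e^{\mathrm{LP}}$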
36
Derivation of probability of events
Then rearrange:
$p = \frac{e^{\mathrm{LP}}}{1 + e^{\mathrm{LP}}}$
Or
$p = \frac{1}{1 + e^{-\mathrm{LP}}}$
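A minimal sketch of turning a linear predictor into a probability, using the rearranged logistic formula p = 1 / (1 + exp(-LP)). The function name is illustrative. The example coefficients are derived here from the LDL gender table earlier in the deck (gender coded 1 = M, 2 = F) rather than taken from any model output on the slides.

```python
import math

def probability_from_lp(intercept, coefficients, values):
    """Compute p = 1 / (1 + exp(-LP)), where LP = a + b1*x1 + b2*x2 + ..."""
    lp = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-lp))

# Single-predictor example built from the LDL table counts
b_gender = math.log((531 / 149) / (563 / 140))  # log odds ratio, F vs M
a = math.log(563 / 140) - b_gender * 1          # makes LP at x = 1 the male log odds

p_male = probability_from_lp(a, [b_gender], [1])
p_female = probability_from_lp(a, [b_gender], [2])
print(f"P(target met | male)   = {p_male:.3f}")
print(f"P(target met | female) = {p_female:.3f}")
```

The derived probabilities match the observed proportions in the table (563/703 and 531/680) because this model has one parameter per group. With more predictors, the same function gives each patient a probability from their own linear predictor.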
37
Risk Stratification based on derived probabilities
Example: PEONY model to predict risk of emergency
admission to hospital over the next year
Now implemented in NHS Tayside as part of Virtual
Wards management of LTC
PEONY II model developed: watch this space!
Donnan et al., Arch Int Med 2008
38
Other binary models
The logistic model is only applicable when
the length of follow-up is the same for each
individual, e.g. 5-year follow-up of a cohort
For binary outcomes where censoring occurs, i.e.
people leave the cohort through death or migration,
the length of follow-up varies and survival models
such as the Cox Proportional Hazards model are needed
39
Summary
  • Logistic model easily fitted in SPSS
  • Clear link with ODDS RATIOS
  • Common model for case-control, cohort studies as
    well as development of clinical prediction models

40
General References
  • Campbell MJ, Machin D. Medical Statistics: A
    Commonsense Approach. 3rd ed. Wiley, New York,
    1999.
  • Hosmer DW, Lemeshow S. Applied Logistic
    Regression. John Wiley & Sons, New Jersey, 2000.
  • Altman DG. Practical Statistics for Medical
    Research. Chapman and Hall, London, 1991.
  • Armitage P, Berry G. Statistical Methods in
    Medical Research. 3rd ed. Blackwell
    Scientific, Oxford, 1994.
  • Agresti A. An Introduction to Categorical Data
    Analysis. Wiley, New York, 1996.

41
Practical: Fit Multiple Logistic Regression Model
Dependent is the binary outcome: LDL target met
(Yes = 1, No = 0)
Independent: Gender (1 = M, 2 = F); add APOE, adherence, etc.
Remember: select Analyze / Regression / Binary Logistic
Select the option of 95% CI for exp(b)
42
3) Screening for variables to eliminate
  • Consider screening procedures to eliminate a
    number of variables under consideration
  • Test each variable separately
  • If p > 0.3 then they would have to be very strong
    confounders to become significant on adjustment
    in a multiple regression, so could be discarded
  • Hosmer-Lemeshow criteria
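The screening step above can be sketched as a simple filter on univariate p-values. The 0.3 cut-off is the one quoted on the slide. The p-values and variable names are hypothetical, and the function `screen_variables` is illustrative rather than part of any package.

```python
SCREENING_THRESHOLD = 0.3  # cut-off quoted on the slide

def screen_variables(univariate_p_values: dict, threshold: float = SCREENING_THRESHOLD):
    """Split candidate predictors into (kept, discarded) by univariate p-value."""
    kept = [name for name, p in univariate_p_values.items() if p <= threshold]
    discarded = [name for name, p in univariate_p_values.items() if p > threshold]
    return kept, discarded

# Hypothetical p-values from testing each variable separately
example_p_values = {"gender": 0.35, "adherence": 0.001, "apoe": 0.02, "age": 0.28}
kept, discarded = screen_variables(example_p_values)

print("Carry forward to the multiple regression:", kept)
print("Candidates to discard:", discarded)
```

A filter like this is only a guide: a discarded variable that is clinically important can still be added back by hand.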

43
4) A mixture of automatic procedures and self
selection
  • Use automatic procedures as a guide
  • Compare stepwise and backward elimination
  • Think about what factors are important
  • Add important factors
  • Do not blindly follow statistical significance

44
Remember Occam's Razor
'Entia non sunt multiplicanda praeter necessitatem'
'Entities must not be multiplied beyond necessity'
William of Ockham, 14th-century friar and
logician (1288-1347)