Title: Basic Statistics
1Basic Statistics
Evidence-Based Medicine Noon conference series
2006-7
EBM
- Terry Shaneyfelt, MD, MPH
2Objectives
- Accurately interpret study findings
- Understand how data is summarized
- Understand the differences between p-values and
confidence intervals
- Understand the effect of chance, bias and
confounding on study findings
3There are 3 kinds of lies: lies, damned lies,
and statistics. - Mark Twain
4Case Presentation
- Symptom cluster: sweaty palms, pale, increased
heart rate, glassy-eyed stare, loss of affect
5Diagnosis
- Photonumerophobia
- The fear that one's fear of numbers will come to
light
6Clopidogrel and Aspirin vs. Aspirin Alone for the
Prevention of Atherothrombotic Events
CHARISMA Trial
- In patients at high risk for atherothrombosis, is
long-term treatment with clopidogrel plus ASA
more effective than ASA alone in reducing
cardiovascular events?
- Methods
- RCT (concealed, blinded)
- Clopidogrel + ASA vs. ASA + placebo
- Outcome: MI, stroke, or cardiovascular death
- ITT analysis, 99.5% follow-up
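7Baseline Characteristics
[Table: CHARISMA baseline characteristics, annotated with an example of a continuous variable and an example of a dichotomous variable]
Variable: a quality, characteristic, or constituent of a person
or thing that can be measured
Bhatt, D. et al. N Engl J Med 2006;354:1706-1717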
8Why do we have statistics?
- Descriptive Statistics
- identify patterns
- leads to hypothesis generation
- Inferential Statistics
- distinguish true differences from random
variation (chance)
- allows hypothesis testing
9- Continuous Data
- Use measures of central tendency and dispersion
- Dichotomous Data
- Summarized by proportions (%s)
10Descriptive Statistics: Describing Data with
Numbers
- Measures of Central Tendency (Center of Data)
- MEAN: average
- MEDIAN: middle value of ordered data
- MODE: most frequently observed value(s)
- Measures of Dispersion (Spread of Data)
- RANGE: lowest to highest values
- STANDARD DEVIATION: how closely the values
cluster around the mean of the actual
sample data
- STANDARD ERROR: how precisely the sample mean
estimates the mean of the theoretical population from
which the sample came
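As a rough illustration of the measures listed on this slide, the sketch below computes each one with Python's standard library. The blood pressure values and the `describe` helper are made up for the example; they are not from the talk.

```python
import math
import statistics

def describe(values):
    """Summarize a sample with the measures listed on the slide."""
    n = len(values)
    sd = statistics.stdev(values)               # sample SD (n - 1 denominator)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "modes": statistics.multimode(values),  # every most-frequent value
        "range": (min(values), max(values)),
        "sd": sd,
        "se": sd / math.sqrt(n),                # standard error of the mean
    }

# Hypothetical systolic blood pressures (mmHg), invented for illustration
sample = [118, 122, 122, 130, 135, 141, 150]
summary = describe(sample)
for name, value in summary.items():
    print(name, value)
```

The SD describes the spread of the individual values, while the SE shrinks as the sample grows because it divides the SD by the square root of n.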
11Extreme values affect the mean
Why not describe the mean age of the 2 cohorts?
Ages: 39, 47, 41, 43, 95
Ordered: 39, 41, 43, 47, 95
Median 43
Mean 53
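The five ages on this slide can be checked directly. The short sketch below recomputes the mean and median, then drops the 95-year-old (a step added here purely to show that one value's influence; it is not part of the slide).

```python
import statistics

ages = [39, 47, 41, 43, 95]           # the five ages from the slide

mean_age = statistics.mean(ages)      # pulled upward by the 95-year-old
median_age = statistics.median(ages)  # middle value of the ordered data

# Dropping the extreme value, done here only to show its influence
without_outlier = [a for a in ages if a != 95]
mean_without = statistics.mean(without_outlier)

print(mean_age, median_age, mean_without)
```

The median stays near the bulk of the data, which is why it is usually the better summary when a distribution is skewed.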
12At 28 months, 6.8% of patients on clopidogrel +
ASA suffered a primary outcome event compared to
7.3% on ASA alone
ARE THESE NUMBERS DIFFERENT?
13[Bar chart: primary outcome event rate, 6.8% with Clopid/ASA vs. 7.3% with ASA only]
14Statistics and EBM
- All the terms used to summarize outcomes in
studies are compared using inferential statistics
- Diagnosis: sensitivity, specificity, LR
- Therapy: RRR, ARR, NNT
- Prognosis: survival rates, median survival,
survival curves
- Harm: OR, RR
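The therapy measures named on this slide follow from two event rates by simple arithmetic. The sketch below applies the standard definitions to the 28-month rates quoted earlier in the deck (7.3% on ASA alone, 6.8% on clopidogrel + ASA). The function name `therapy_measures` is illustrative only.

```python
def therapy_measures(control_rate, treated_rate):
    """Return RR, ARR, RRR and NNT from two event rates (as proportions)."""
    rr = treated_rate / control_rate     # relative risk
    arr = control_rate - treated_rate    # absolute risk reduction
    rrr = arr / control_rate             # relative risk reduction (= 1 - RR)
    nnt = 1 / arr                        # number needed to treat (undefined if ARR is 0)
    return rr, arr, rrr, nnt

# Event rates quoted in the deck: 7.3% on ASA alone, 6.8% on clopidogrel + ASA
rr, arr, rrr, nnt = therapy_measures(0.073, 0.068)
print(f"RR  = {rr:.2f}")
print(f"ARR = {arr:.1%}")
print(f"RRR = {rrr:.1%}")
print(f"NNT = {nnt:.0f}")
```

An absolute difference of half a percentage point corresponds to an NNT of roughly 200. Whether that difference is real or due to chance is the question inferential statistics addresses.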
15Inferential Statistics
- Used to determine the likelihood that a
conclusion based on data from a sample is true
- Alternative explanations for the conclusion
- Chance / Random Variation
- Confounding
- Bias
16Statistical Tests
- Mathematical formulas that produce p-values to
assess the likelihood that chance accounts for
the results observed in the study
- Many different tests. Choice depends on several
factors
- Type of data (continuous, dichotomous, etc.)
- Distribution of data (normally distributed or
not)
- Study design (number of groups, etc.)
18The Normal Distribution
- Mean = median = mode
- 68% of values fall within ±1 SD of the mean
- 95% of values fall within ±2 SDs of the mean
[Figure: bell curve centered on the mean, median, and mode, with bands marked at 1 and 2 SDs]
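The 68% and 95% figures are rounded properties of the normal curve. A quick check using `statistics.NormalDist` from the Python standard library (the helper `fraction_within` is a name chosen for this sketch):

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def fraction_within(k):
    """Fraction of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1sd = fraction_within(1)
within_2sd = fraction_within(2)
# z value that captures exactly the central 95%
z_95 = standard_normal.inv_cdf(0.975)

print(f"{within_1sd:.3f} {within_2sd:.3f} {z_95:.2f}")
```

Two SDs capture slightly more than 95%. The exact 95% cut-off is about 1.96 SDs, which is the multiplier used for 95% confidence intervals.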
19TRUTH vs. Study Conclusion
                               TRUTH: Difference      TRUTH: No difference
Study concludes Difference     Correct                Alpha / Type I error (P-value)
Study concludes No difference  Beta / Type II error   Correct
20Beta / Type II Error
- Study doesn't find a difference when in fact one
exists
- Most important in a negative study
- Main determinant: sample size
- Convention: 10-20%
- Used to determine POWER of the study
- POWER = 1 - beta error
- Probability of finding a difference when one
exists
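To see why sample size is the main determinant of beta, the sketch below approximates the power of a two-group comparison of proportions at several sample sizes. It uses the normal approximation, the event rates are hypothetical, and `power_two_proportions` is an illustrative helper, not a method described in the talk.

```python
from math import sqrt
from statistics import NormalDist

def power_two_proportions(p_control, p_treated, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions.

    Uses the normal approximation with unpooled variances and ignores the
    negligible chance of a significant result in the wrong direction.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    se = sqrt(p_control * (1 - p_control) / n_per_group
              + p_treated * (1 - p_treated) / n_per_group)
    return z.cdf(abs(p_control - p_treated) / se - z_crit)

# Hypothetical event rates: 10% on control, 8% on treatment (a 20% RRR)
for n in (500, 2000, 5000):
    power = power_two_proportions(0.10, 0.08, n)
    print(f"n per group = {n:5d}  power = {power:.2f}  beta = {1 - power:.2f}")
```

With the effect size held fixed, power rises (and beta falls) as the groups get larger. A small negative study may therefore simply have been unable to detect a real difference.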
21TRUTH vs. Study Conclusion
                               TRUTH: Difference      TRUTH: No difference
Study concludes Difference     POWER                  Alpha / Type I error
Study concludes No difference  Beta / Type II error   Correct
22CHARISMA
- Statistical Analysis
- We estimated that 15,200 patients (7600 per
group) and 1040 primary events would be necessary
to detect a 20 percent relative risk reduction in
the primary efficacy end point, with 90 percent
power at the two-sided 0.05 significance level in
this event-driven trial...
- A type I error of 0.049 was preserved for the
final analysis.
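23TRUTH vs. Study Conclusion (CHARISMA)
                                TRUTH: Clopid better   TRUTH: No difference
CHARISMA finds Clopid better    90% (power)            4.9% (alpha)
CHARISMA finds No difference    10% (beta)             Correct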
24Two methods to assess the role of chance
- Hypothesis testing
- Confidence Intervals (Estimation)
25Statistical Approach to Compare 2 Groups
Group A
Group B
- Calculate
- Main effect
- Variance in main effect
State a null hypothesis (the main effect is 0)
Calculate the 95% confidence interval around the
main effect
Calculate the test statistic to determine the p-value
26P-value
- Probability that results at least as extreme as those seen could have
occurred by chance alone
- No p-value, however small, excludes chance
completely
- Judged against alpha, the accepted Type I error rate (false positive rate)
- < 0.05 is the usual (arbitrary) cut-off for statistical
significance
27P-value
- Cannot tell you if there is bias in the study
- Doesn't determine if effect is clinically
significant
- Small effect in a study with large sample size
can have the same p-value as a large effect in a
small study
- Depends on
- How large the effect was
- How many patients were studied
- How consistent the effect was
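The dependence on sample size can be shown with a simple two-proportion z-test. This is a simplified sketch, not the time-to-event analysis used in the trial. It takes the 6.8% and 7.3% event rates from the deck and ASSUMES 7600 patients per group, which is the planned size quoted on the CHARISMA statistical analysis slide rather than the actual enrollment. The ten-times-larger trial is purely hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_p_value(p1, p2, n1, n2):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Event rates from the deck; 7600 per group is the PLANNED size quoted on
# the CHARISMA slide, assumed here because actual enrollment is not given.
p_planned = two_proportion_p_value(0.068, 0.073, 7600, 7600)

# Same rates in a hypothetical trial ten times larger
p_larger = two_proportion_p_value(0.068, 0.073, 76000, 76000)

print(f"n = 7600 per group:  p = {p_planned:.3f}")
print(f"n = 76000 per group: p = {p_larger:.5f}")
```

The effect size is identical in both calls. Only the number of patients changes, yet the first p-value is well above 0.05 and the second is well below it.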
30CHARISMA
Since the 95% CI of the RR crosses 1.0, the
difference is not significant
RR 0.93 (95% C.I. 0.83 to 1.05)
- Upper limit 1.05: risk could be this high (increased by 5%)
- Lower limit 0.83: risk could be this low (reduced by 17%)
- 1.0: no difference
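A confidence interval for a relative risk is usually computed on the log scale. The sketch below uses that standard approach with the deck's event rates and an ASSUMED 7600 patients per group (the planned size from the statistical analysis slide, not the actual enrollment). It approximates, rather than reproduces, the trial's own analysis, and `relative_risk_ci` is an illustrative name.

```python
from math import exp, log, sqrt
from statistics import NormalDist

def relative_risk_ci(p_treated, p_control, n_treated, n_control, level=0.95):
    """Relative risk with a confidence interval computed on the log scale."""
    rr = p_treated / p_control
    # Standard error of ln(RR) for two independent proportions
    se_log_rr = sqrt((1 - p_treated) / (n_treated * p_treated)
                     + (1 - p_control) / (n_control * p_control))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper

# 7600 per group is an assumption (planned size), not actual enrollment
rr, lower, upper = relative_risk_ci(0.068, 0.073, 7600, 7600)
crosses_one = lower < 1.0 < upper
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}, crosses 1.0: {crosses_one}")
```

Under these assumptions the interval lands close to the 0.83 to 1.05 shown on the slide. Because it includes 1.0, the data are compatible with no difference between the groups.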
31Confidence Intervals
[Figure: confidence intervals plotted against the line of "No difference between groups"; labels: p = ns, P < 0.05]
33Confidence Intervals
- Range of plausible values
- 95% CI
- X ± 1.96 SE (the SD of the sampling distribution)
- 95% of such intervals will contain the true
population value
- Range of values within which we can be 95% sure
that the population value lies
- Clinical and statistical significance
- The larger the trial, the narrower the CI
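The X ± 1.96 SE formula and the effect of trial size can be sketched for a sample mean as follows. The blood pressure figures are hypothetical, and `mean_ci` is an illustrative helper based on the normal approximation.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(sample_mean, sd, n, level=0.95):
    """Normal-approximation confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for 95%
    se = sd / sqrt(n)                            # standard error of the mean
    return sample_mean - z * se, sample_mean + z * se

# Hypothetical: mean systolic BP 130 mmHg, SD 15 mmHg, at three trial sizes
for n in (25, 100, 400):
    low, high = mean_ci(130, 15, n)
    print(f"n = {n:3d}: 95% CI {low:.1f} to {high:.1f} (width {high - low:.1f})")
```

Because the SE is the SD divided by the square root of n, quadrupling the sample size halves the width of the interval.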
34Alternative Explanations for the Observed
Conclusions of a Study
- Chance / Random variation
- Bias
- Confounding
35Bias
- Systematic error in a study that results in an
incorrect association
- May mask an association or cause a spurious one
- May cause over or underestimation of the effect
size
- Minimized through
- Rigorous design considerations
- Meticulous conduct of study
- Particular study designs are most vulnerable to
certain types of bias
- Users' Guides are designed to detect biases in a study
36Alternative Explanations for the Observed
Conclusions of a Study
- Chance / Random variation
- Bias
- Confounding
37Confounding
- Distortion of the effect of exposure (clopidogrel
/ ASA) on the disease (stroke) by that of a third
factor (e.g. hyperlipidemia)
- Confounder has to be associated with BOTH the
exposure and the disease, and must not be just a link in
the causal chain
- May cause over or underestimation of the true
effect
- May even change direction of observed effect
38Confounding
[Diagram: Clopidogrel / Aspirin (exposure) and Stroke (disease), with Hyperlipidemia as the third factor linked to both]
39Confounding
- Can be controlled for in the design phase
- Randomization
- Restriction
- Matching
- Stratification
- Can be controlled in the analysis phase
- Stratified analysis
- Multivariable analysis
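A stratified analysis can be sketched with a small invented data set in which the exposure is more common among people who have the confounder. All counts below are made up so that the arithmetic is easy to follow. The pooled estimate uses the standard Mantel-Haenszel risk ratio formula.

```python
def risk_ratio(events_exp, n_exp, events_unexp, n_unexp):
    """Risk in the exposed divided by risk in the unexposed."""
    return (events_exp / n_exp) / (events_unexp / n_unexp)

# Invented counts: (events_exposed, n_exposed, events_unexposed, n_unexposed)
strata = {
    "confounder present": (16, 80, 4, 20),
    "confounder absent":  (1, 20, 4, 80),
}

# Crude analysis: ignore the confounder and pool everyone
totals = [sum(column) for column in zip(*strata.values())]
crude_rr = risk_ratio(*totals)

# Stratified analysis: one risk ratio per level of the confounder
stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

# Mantel-Haenszel pooled risk ratio across strata
numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata.values())
denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata.values())
mh_rr = numerator / denominator

print(f"crude RR = {crude_rr:.2f}")
for name, value in stratum_rr.items():
    print(f"{name}: RR = {value:.2f}")
print(f"Mantel-Haenszel RR = {mh_rr:.2f}")
```

In this invented example the crude risk ratio is above 2, yet the risk ratio within each stratum is 1.0. The apparent effect of the exposure is produced entirely by the confounder.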
40Multivariable Analysis
- Statistical techniques to control for multiple
confounders simultaneously
- Linear regression analysis
- Continuous outcome
- Logistic regression analysis
- Dichotomous outcome
- Cox proportional hazards analysis
- Time-to-event outcome
41Take Home Points
- Statistical significance is a requirement for
determining clinical significance, but is not
enough to signify a clinically important
difference
- The P-value tells us the risk that the finding
was due to chance / random variation
- Statistical tests generate p-values
- P-value only assesses statistical significance
42Take Home Points-2
- Confidence intervals help us to understand how
close the study estimate is to the "truth"
- CIs assess both statistical and clinical
significance
- Findings in a study could be due to
- Truth
- Chance / random variation
- Bias
- Confounding
43Photonumerophobia
CURED