Title: Introduction to Clinical Epidemiology Class 2
1Introduction to Clinical Epidemiology: Class 2
- Spring 1999 Elective
- UT-H HSC
- Jan Risser, PhD and Will Risser, MD PhD
2Case Reports / Case Series
- Epidemiology is involved with
- What: case definition
- When: time
- Where: place
- Who: person
- Why: causes
3Case Reports / Case Series
- Case reports describe the experience of a single patient or group of patients with a similar diagnosis (DX).
- Does halothane cause hepatitis?
- rare disease (halothane-induced hepatitis)
- single anesthetist with recurrent hepatitis
- symptoms could be elicited with halothane
- hepatitis could be documented
- single case report clarified the halothane/hepatitis association
4Case Reports / Case Series
- Case reports are susceptible to bias.
- NEVER use case reports to make treatment decisions.
- CATs (critically appraised topics - PBL)
- Avoid cluttering your mind by doing a CAT on a case series. Some are very interesting, but the point of a CAT is to help in patient care, and more rigorous study designs are necessary.
5Case Series
- Over a period of 18 months, 65 individuals were seen with fever, hypotension, diffuse rash, desquamation, and impairment of multiple organ systems.
- 59 female, 6 male
- 8 deaths
- age range 8 to 52
6Development of Case Definition
- Case definition
- fever, hypotension, rash, and desquamation
- involvement of 3 organ systems
- absence of evidence of other etiologies
7CDC investigation of Toxic Shock Syndrome
- 1978 - new shock illness described, associated with Staphylococcus aureus infections (Todd et al.)
- 1979 - 3 new cases reported to the Wisconsin State Health Department, all women
- Surveillance began; by January 1980, 12 cases had been identified, all women
- 11/12 menstruating at onset of illness
- By May 1980, 55 cases reported / 7 deaths
8Epidemiologic Approach
- What: cases meeting TSS case definition
- When: since Todd's 1978 report
- Where: anywhere in the U.S. (a bit vague)
- Who: women (age, menstruating)
- Why: proximate - Staphylococcus aureus
- distal - menstruation?
9How to evaluate case series
- Is this rare?
- How would we determine this?
- What would we use as denominator?
- Is the association with menstruation real?
- What is the case-fatality rate?
- Is this rate biased?
10Definitions in Epidemiology
- Bias
- Confounding
- Frequency Measures
- Prevalence
- Incidence
- Measures of Association
- Causal Inference
11Frequency measures
- Ratio: value obtained by dividing one quantity by another.
- The ratio of male to female births in the U.S. in 1979: 1,791,000 / 1,703,000 = 1.052
- Proportion: a ratio where the numerator is always part of the denominator.
- The proportion of males among all births in 1979: 1,791,000 / 3,494,000 = 51.3%
- Rate: a change in one quantity per unit change in another.
- The rate of developing lung cancer during a 5-year study: 7 / (50 persons x 5 years), or 7 / 250 person-years = 0.028 cases of lung cancer per person-year
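The three frequency measures can be checked with a few lines of Python. This is a minimal sketch using only the numbers quoted on the slide; the variable names are illustrative.

```python
# Frequency measures from the slide: ratio, proportion, rate.
male_births = 1_791_000
female_births = 1_703_000

# Ratio: numerator is NOT part of the denominator.
sex_ratio = male_births / female_births

# Proportion: numerator IS part of the denominator.
proportion_male = male_births / (male_births + female_births)

# Rate: events per unit of person-time (7 cases, 50 persons x 5 years).
cases = 7
person_years = 50 * 5
lung_cancer_rate = cases / person_years

print(f"ratio      = {sex_ratio:.3f}")
print(f"proportion = {proportion_male:.1%}")
print(f"rate       = {lung_cancer_rate:.3f} cases per person-year")
```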
12Prevalence
- Prevalence (a proportion)
- the proportion of the population at a given time that has the factor of interest
- Prevalence of an exposure
- what proportion of this class has a BMI > 25?
- Prevalence of an outcome
- what proportion of this class has hypertension?
- Point prevalence - existing cases at a point in time
- Period prevalence - existing cases plus those developing over a specified period of time
13Prevalence
- Numerator
- all those with the attribute at a particular time
- Denominator
- the population at risk of having the attribute
during that same time period
14Prevalence
- Choice of denominator may be difficult.
- In 1997 there were 1854 cases of syphilis in Harris County - what should be used for the denominator?
- 55 cases of a new disease reported in three states - what should be used for the denominator?
15Incidence
- Incidence density: the rate at which new cases of the disease (outcome) develop during a specific period of time, using total person-time as the denominator. One subject followed for one year contributes one person-year (PY).
16Incidence
- Cumulative incidence: the probability (risk) of an individual developing the disease (outcome) during a specific period of time.
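The difference between the two incidence measures is only the denominator: persons at risk versus person-time. The sketch below uses a small hypothetical cohort (the follow-up records are invented for illustration and do not come from the lecture) and ignores complications such as loss to follow-up.

```python
# Hypothetical follow-up records: (years followed, developed disease?).
# These six subjects are invented for illustration only.
cohort = [
    (5.0, False),
    (2.0, True),
    (5.0, False),
    (3.5, True),
    (1.0, False),
    (5.0, False),
]

def cumulative_incidence(records):
    """New cases divided by the number of persons at risk at the start."""
    new_cases = sum(1 for _, diseased in records if diseased)
    return new_cases / len(records)

def incidence_density(records):
    """New cases divided by total person-years of follow-up."""
    new_cases = sum(1 for _, diseased in records if diseased)
    person_years = sum(years for years, _ in records)
    return new_cases / person_years

ci = cumulative_incidence(cohort)
density = incidence_density(cohort)
print(f"cumulative incidence = {ci:.3f}")
print(f"incidence density    = {density:.3f} cases per person-year")
```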
17Incidence, Prevalence
[Figure: timeline of disease onset for subjects A-F, 1986-1994]
- What was the prevalence of disease in 1992? 1 case (A) / 4 subjects = 25%
- What is the risk of developing disease within 2 years?
18Incidence, Prevalence
- Incidence within 2 years: 1/6 = 17%
19Measure of Disease Association
- Ratios
- rate ratio, risk ratio, or relative risk (all abbreviated RR)
- odds ratio (OR), and
- prevalence ratios.
- Difference measures of disease frequency include attributable risk.
20Case Reports and Medical Advancement
- These all started with case reports - what study design next?
- Lyme Disease (1975)
- Legionellosis (1976)
- AIDS (1981)
- Hantavirus (1993)
- DES exposure (1970)
- EMS l-tryptophan (1989)
- TSS (1980)
21What next?
- CDC outbreak investigation guidelines
- create case definition
- active case finding
- descriptive epidemiology
- characterize the cases: person, place, time
- formulate hypotheses
- test hypotheses with case-control studies
22The case-control study
- Retrospective study design
- identifies cases
- finds controls
- asks about history of exposure
- Measure of association
- odds ratio
23Retrospective vs. prospective study designs
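                                   Disease
Risk Factor              Present (Cases)   Absent (Controls)
Present (Exposed)               a                  b
Absent (Not Exposed)            c                  d

Retrospective (Case-control): subjects are sampled by disease status (columns).
Prospective (Cohort): subjects are sampled by risk factor status (rows).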
24Case control studies
                        Disease Status
Exposure Status      Yes        No       Total
Yes                   a          b       a + b
No                    c          d       c + d
Total               a + c      b + d       N
25Case-control studies
- Selection of cases
- Case definition is very important
- All cases have an equal probability of selection, to reduce selection bias
- Selection of controls
- Identical in every respect except the disease of interest
26Case control studies
- Strengths
- Good for unusual or rare diseases
- Smaller in size, quick, easy, cost-effective
- Can use secondary data on disease
- More easily replicated
- Can test hypotheses
- Weaknesses
- Uncertainty in exposure-disease time relationship
- Representativeness of cases or controls
- Memory problems
- Rare exposure a problem
- Survivor problem
- Bias potential (selection)
27Case-control and the Odds Ratio
                    Disease
Exposure         Y          N       Total
Y                a          b       a + b
N                c          d       c + d
Total          a + c      b + d       N

How much risk is too much risk?

Odds of exposure if case: $\frac{a/(a+c)}{c/(a+c)} = a/c$
Odds of exposure if control: $\frac{b/(b+d)}{d/(b+d)} = b/d$
Odds ratio (odds of exposure given disease relative to no disease): $\frac{a/c}{b/d} = \frac{ad}{cb}$
28Case-control and the Odds Ratio
[2 x 2 table: Exposure (Y/N) by Disease (Y/N)]
29TSS - 3 case-control studies
(50/0) / (43/7) = 17.4
(30/1) / (71/22) = 6.4
(12/0) / (32/8) = 6.5
NOTE: A correction factor of 0.5 was added to each cell when one cell contained 0.
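The three odds ratios can be reproduced with a short function. This is a sketch, assuming the slide's notation (exposed cases / unexposed cases) / (exposed controls / unexposed controls); the dictionary keys are illustrative labels, not names from the studies. Note that the second value (6.4) is only reproduced when the 0.5 correction is applied to that table as well; without it the odds ratio is about 9.3.

```python
def odds_ratio(a, b, c, d, correction=0.0):
    """OR = (a*d)/(b*c), where
    a = exposed cases,   b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    `correction` is added to every cell (0.5 on the slide)."""
    a, b, c, d = (x + correction for x in (a, b, c, d))
    return (a * d) / (b * c)

# Cell counts as written on the slide; labels are illustrative.
tables = {
    "study 1": (50, 43, 0, 7),
    "study 2": (30, 71, 1, 22),
    "study 3": (12, 32, 0, 8),
}

corrected = {name: odds_ratio(*cells, correction=0.5)
             for name, cells in tables.items()}

for name, value in corrected.items():
    print(f"{name}: OR with 0.5 correction = {value:.1f}")

# Uncorrected OR for the one table that has no zero cell.
uncorrected_study_2 = odds_ratio(*tables["study 2"])
print(f"study 2: OR without correction = {uncorrected_study_2:.1f}")
```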
30Study Methods
- CDC-1: 52 TSS cases with age-matched acquaintance controls
- Wisconsin study: 31 cases, 93 controls from gynecologic clinics, matched only for menstruation
- Utah: 12 TSS cases, 40 neighborhood-matched controls
31Matched Pair analysis
OR = 16/1 = 16
How many cases used tampons continually? How many
cases did not use tampons continually? What about
controls?
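In a matched-pair analysis the odds ratio is usually the ratio of the two kinds of discordant pairs. The sketch below assumes that the 16 and the 1 on the slide are those discordant-pair counts (case exposed and control not, versus control exposed and case not); the original pair table is not shown, so this reading is an assumption. McNemar's chi-square with continuity correction is included as one common test for such data.

```python
def matched_pair_or(case_only, control_only):
    """Matched-pair OR: pairs where only the case is exposed
    divided by pairs where only the control is exposed."""
    return case_only / control_only

def mcnemar_chi2(case_only, control_only):
    """McNemar chi-square (1 df) with continuity correction."""
    diff = abs(case_only - control_only) - 1
    return diff ** 2 / (case_only + control_only)

# Assumed discordant-pair counts, read from "OR = 16/1".
pair_or = matched_pair_or(16, 1)
pair_chi2 = mcnemar_chi2(16, 1)
print(f"matched-pair OR = {pair_or:.0f}")
print(f"McNemar chi-square = {pair_chi2:.2f}")
```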
32How Big is Big?
- Is an OR of 16 big?
- Is an OR of 16 statistically significant?
33BRIEF INTERLUDE - STATISTICS
- Before proceeding we need to know a little about
inference and statistical association
34How Big is Big?
- Is an OR of 16 big?
- Is an OR of 16 statistically significant?
35
- The whole purpose of doing research is to learn something new.
- The result of a research project is the goal - this is the important information that the researchers want the informed public to remember.
- As we read the literature, we should ask ourselves:
- What is the major result?
- What does this result mean?
36Statistical Issues in Epidemiology
- We have to remember that epidemiologic studies
draw inferences about the experiences of an
entire population based on an evaluation of only
a sample.
37Statistical Issues in Epidemiology
- When studying a sample of the population, the observed associations can be due to chance, bias, or confounding
- Or they can be due to a true association
38Statistical Issues in Epidemiology
- What do we mean by chance, and how does this relate to determining a true association?
39Statistical Issues in Epidemiology
- Association does not mean cause and effect
- Assessing causality involves judgement based on
the totality of evidence
- Making judgements about causality involves a
chain of logic that addresses two major areas
1. Whether the observed association is valid
2. Whether the totality of evidence supports a
judgement of causality
40Statistical Issues in Epidemiology
- The evaluation of the role of chance is done in 2
steps
1. Estimate the magnitude of the association
- We do this with OR, RR, correlations, AR
2. Hypothesis testing
- Calculate a test statistic, obtain a p value or
confidence interval
41Statistical Issues in Epidemiology
- p-value: the probability of obtaining a sample showing an association of the observed size or larger by chance alone, under the hypothesis that no association exists.
- Confidence interval: a range of values that one can say, with a specific degree of confidence, contains the true population value.
- Sample statistic: a number that describes some aspect of a sample that represents a population.
42Statistical Issues in Epidemiology
- This can be done by calculating a test statistic of the general format: (Observed - Expected under the null) / (Estimated variability in the sample)
- The selection of the particular test used depends on the specific hypothesis being tested and characteristics of the collected data.
43Statistical Issues in Epidemiology
- If we were to toss a coin 30 times while trying to determine if it was a fair coin, and we got 24 heads, how would we determine if 24 was different from the expected number?
- (Observed - Expected under the null) / (Estimated variability in the sample)
- We observed 24 - how many did we expect?
- How would we estimate variability?
44Statistical Issues in Epidemiology
- (Observed - Expected under the null) / (Estimated variability in the sample)
- (24/30 - 15/30) / (variability?)
- Variability = $\sqrt{p(1-p)/n} = \sqrt{(24/30 \times 6/30)/30} = 0.07$
- $(24/30 - 15/30) / 0.07 = 4.3$
- $p < 0.001$
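The coin-toss calculation can be reproduced directly. This sketch follows the slide's approach (variability estimated from the observed proportion) and, as a cross-check, also computes the exact binomial probability of 24 or more heads from a fair coin. Without rounding the variability to 0.07, the test statistic comes out near 4.1 rather than 4.3; the conclusion is unchanged.

```python
from math import comb, sqrt

n_tosses = 30
heads = 24

p_observed = heads / n_tosses   # 24/30
p_expected = 0.5                # fair coin under the null (15/30)

# Variability as on the slide: sqrt(p(1-p)/n) using the observed p.
variability = sqrt(p_observed * (1 - p_observed) / n_tosses)

# Test statistic: (observed - expected) / variability.
z = (p_observed - p_expected) / variability

# Exact one-sided probability of >= 24 heads in 30 fair tosses.
exact_tail = sum(comb(n_tosses, k) for k in range(heads, n_tosses + 1)) / 2 ** n_tosses

print(f"variability = {variability:.3f}")
print(f"z = {z:.2f}")
print(f"P(24 or more heads | fair coin) = {exact_tail:.5f}")
```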
45Statistical Issues in Epidemiology
- The p value indicates the probability that findings at least as extreme as those observed would have occurred by chance alone.
- In 1000 experiments of 30 tosses with a fair coin, we would expect only about 1 to result in 24 heads or more.
46Statistical Issues in Epidemiology
- A statistically significant finding does not mean
that the results DID NOT occur by chance - only
that it is unlikely that they occurred by chance.
- A non-significant finding does not mean that the
results DID occur by chance.
47Statistical Issues in Epidemiology
- More often in epidemiology we are examining
discrete data - the 2 x 2 table presents discrete
data. Here we are testing whether the
distribution of counts in the 4 cells is
different than expected under the null
hypothesis.
48Statistical Issues in Epidemiology
- But how do we determine the expected value for the cells of a 2 x 2 table?
- O = Observed count in a category
- E = Expected count in a category
- Σ = Sum over all categories
- df = Degrees of freedom
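One common way to answer this question is the Pearson chi-square test, where each expected count is (row total x column total) / N and the statistic is the sum of (O - E)^2 / E over the four cells, with 1 degree of freedom for a 2 x 2 table. The sketch below assumes that this is the test intended here and, purely for illustration, applies it to the counts 30, 71, 1, 22 from the second TSS case-control table.

```python
def expected_counts(table):
    """Expected cell counts under the null: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def pearson_chi2(table):
    """Sum over all cells of (O - E)^2 / E."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Rows: exposed, unexposed. Columns: cases, controls.
observed = [[30, 71],
            [1, 22]]

expected = expected_counts(observed)
chi2 = pearson_chi2(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)

print("expected counts:", expected)
print(f"chi-square = {chi2:.2f} with {df} degree of freedom")
```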
49Statistical Issues in Epidemiology
- All tests of statistical significance lead to a
- probability statement
- usually expressed as a p value
- The p-value obtained is based on the principle
that, given the distribution of interest, it is
possible to calculate the exact probability or
likelihood of obtaining a result at least as
extreme as that observed by chance alone assuming
there is truly no association.
50Statistical Issues in Epidemiology
- A probability of 0.05 is the usual (arbitrary) cut-off level for statistical significance.
- If p < 0.05, we conclude that chance is an unlikely explanation for the finding. The null hypothesis is rejected, and the statistical association is said to be significant.
- If p > 0.05, we conclude that chance cannot be excluded as an explanation for the finding; we fail to reject the null hypothesis.
51Statistical Issues in Epidemiology
- No p value, however small, completely excludes chance
- No p value, however large, completely mandates chance
- p values only evaluate the role of chance
- they say nothing about other alternative explanations or about causality
- p values reflect both the strength of the association and the study sample size
52Statistical Issues in Epidemiology
- A small difference may achieve statistical significance if the sample size is large
- A large difference may not achieve statistical significance if the sample size is too small
53Statistical Issues in Epidemiology
- We address these problems by calculating confidence intervals (CI)
- The CI indicates the range within which the true magnitude of effect lies with a certain degree of assurance. The degree of assurance is defined by the significance level you choose (for example, 0.05 for a 95% CI).
- The CI gives all the information of a p value PLUS the expected range of effect sizes.
54Statistical Issues in Epidemiology
- If the null value is included in a 95% confidence interval, then the corresponding p value is, by definition, greater than 0.05.
- If the null value is not included, the association is considered to be statistically significant.
- WHAT IS THE NULL VALUE for Odds Ratios and Relative Risks (Rate Ratios)?
55Statistical Issues in Epidemiology
- Test-based CI for either OR or RR
- NOTE: variance for either RR or OR may be estimated using the chi-square test statistic.
Miettinen, Am J Epidemiol 103:226-235, 1976
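The formula itself did not survive on this slide. The test-based interval is usually written as below; this is the standard textbook form, given here as an assumption about what the slide showed. Here $\chi$ is the square root of the chi-square statistic from the 2 x 2 table and $z = 1.96$ for a 95% interval; the same expression applies with RR in place of OR.

```latex
\[
\mathrm{CI} = \mathrm{OR}^{\,\left(1 \pm z/\chi\right)},
\qquad \chi = \sqrt{\chi^{2}}
\]
```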
56Statistical Issues in Epidemiology
- Taylor series to estimate the variance of ln(OR)
Woolf, Ann Human Gen 19:251-253, 1955
Note: e is a function on your calculator. You need a key marked e^x, and you enter the OR times e raised to the power of the result between the brackets.
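The bracketed formula is missing from this slide. The usual Woolf (Taylor series) interval is OR x exp(+/- z x sqrt(1/a + 1/b + 1/c + 1/d)), and the sketch below assumes that form. The cell labels follow the 2 x 2 layout used earlier (a, b = exposed cases and controls; c, d = unexposed cases and controls), and the example counts are the second TSS table, chosen only for illustration.

```python
from math import exp, sqrt

def woolf_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (Taylor series) confidence interval.
    Variance of ln(OR) is estimated as 1/a + 1/b + 1/c + 1/d.
    Requires all four cells to be non-zero."""
    odds_ratio = (a * d) / (b * c)
    se_ln_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = odds_ratio * exp(-z * se_ln_or)
    upper = odds_ratio * exp(z * se_ln_or)
    return odds_ratio, lower, upper

# Illustrative counts: 30 exposed cases, 71 exposed controls,
# 1 unexposed case, 22 unexposed controls.
or_hat, ci_low, ci_high = woolf_ci(30, 71, 1, 22)
print(f"OR = {or_hat:.1f}, 95% CI {ci_low:.1f} to {ci_high:.1f}")
```

The interval is wide because one cell contains a single subject; the 1/c term dominates the variance.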
57Statistical Issues in Epidemiology
- Taylor series to estimate the variance of ln(RR)
Katz, Biometrics 34:469, 1978
Note: e is a function on your calculator. You need a key marked e^x, and you enter the RR times e raised to the power of the result between the brackets.
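The bracketed formula is also missing here. The usual Taylor series interval for the risk ratio is sketched below, as an assumption about what the slide showed. It uses the cohort reading of the 2 x 2 table: a and b are the exposed subjects with and without disease, c and d the unexposed subjects with and without disease.

```latex
\[
\mathrm{RR} = \frac{a/(a+b)}{c/(c+d)},
\qquad
\widehat{\mathrm{Var}}(\ln \mathrm{RR})
  = \frac{1}{a} - \frac{1}{a+b} + \frac{1}{c} - \frac{1}{c+d}
\]
\[
95\%\ \mathrm{CI}
  = \mathrm{RR} \times
    \exp\!\left(\pm 1.96 \sqrt{\widehat{\mathrm{Var}}(\ln \mathrm{RR})}\right)
\]
```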
58Statistical Issues in Epidemiology
- Inference involves making a generalization about a larger group of individuals on the basis of a subset or sample.
- The p value indicates the probability or likelihood of obtaining a result at least as extreme as that observed in a study by chance alone, assuming that there is truly no association between the study variables.
59Statistical Issues in Epidemiology
- HOWEVER: Before we get to the major result, we need to examine several issues:
- 1. What was the question that this study intended to answer?
- 2. What were the methods used to answer this question?
- 3. Are there errors in the study design that might invalidate the results?
60Statistical Issues in Epidemiology
- For the purposes of critical understanding, we want to consider information that is often not given in the summary.
- Is chance a likely explanation for the results?
- Is selection bias a likely explanation for the results?
- Is information bias a likely explanation for the results?
- Are the authors' conclusions reasonable in terms of the information presented?
61Cancer Case Series
What next with this case-series?
62DES exposure and Vaginal Cancer
- matched-pair analysis (1 case, 4 controls)
- maternal factors and breast fed
- no statistically significant differences in
maternal age
63Matched Case-control
                                   Case   Control   OR
Prior pregnancy loss       Yes       6       5
                           No        2      27
Estrogens this pregnancy   Yes       7       0
                           No        1      32
Breast feeding             Yes       3       3
                           No        3      29
64DES associated with Vaginal Carcinoma
- What are the risks to women exposed to DES?
- How could we determine the risks?
65Cohort studies
- longitudinal or prospective studies
- start with people free of disease, with varying degrees of exposure
- proceed from cause to effect
- two points in time; the individual is the unit of observation and analysis