Title: Cohort Study
1Cohort Study
- Subodh S Gupta
- Dr. Sushila Nayar School of Public Health
- MGIMS, Sewagram
3Origin of word cohort
- The word cohort has its origin in the Latin
cohors
- Cohors (Latin word) refers to warriors and
gives the notion of a group of persons proceeding
together in time
- Group of persons with a common statistical
characteristic, e.g. age, birth date
4Definition and Synonyms
- Definition
- The cohort study is an observational
epidemiological study which, after the manner of
an experiment, attempts to study the relationship
between a purported cause (exposure) and the
subsequent risk of developing disease.
- Synonyms
- Follow-up
- Longitudinal
- Prospective
- Incidence study
5The cohort design
- Groups are exposure based: the group or groups of
persons to be studied are defined in terms of
characteristics manifest prior to the appearance
of the disease under investigation
- The study is conceptually longitudinal: the study
groups so defined are observed over a period of
time to determine the frequency of disease among
them
- A definite beginning and end
6The cohort design
- Efficient for examining common outcomes
- When there is good evidence of an association
between exposure and disease
- When exposure is rare but incidence of disease is
higher among exposed
- When follow-up is easy and the cohort is stable
- When ample funds are available
7The cohort design
- Many different outcomes for same exposure
- The dynamic nature of many risk factors and their
relations in time to disease occurrence can be
captured here (cannot be done in cross-sectional
study and only with difficulty in case-control
study)
- Associations (not cause and effect)
- Estimate incidence within risk factor groups
- Cannot estimate prevalence of risk factor
8Time
[Figure: timeline showing the direction of enquiry in a case-control study]
9Time
[Figure: timeline showing the direction of enquiry in a cohort study]
10Types of cohort study
- Historical/ Retrospective/ Non-concurrent
- Prospective/ Concurrent
The distinction between retrospective and
prospective cohort studies is important, not
because of any conceptual difference or
differences in interpretability of findings, but
because of relevance to some practical issues,
mostly the ability to control confounding.
11Time
[Figure: timeline asking at which point in time the enquiry begins, with the direction of enquiry]
12Time
[Figure: timeline with both exposures and outcomes measured prospectively]
13Time
[Figure: timeline with exposures measured retrospectively and outcomes prospectively]
14Time
[Figure: timeline with both exposures and outcomes measured retrospectively]
15Advantages
- Direct estimate of risk and rate of disease
occurrence over time
- An efficient means of studying rare exposures
- Assess multiple outcomes of a single exposure
- Establish temporal relationship between exposure
and outcome
- Exposure definitely precedes the outcome
- Avoids recall bias, survival bias
- Does not require strict random assignment of
subjects
- Can be done with original data or secondary data
Best observational design to establish association
16Disadvantages
- Very large sample sizes, especially for rare
outcomes
- Expensive and time-consuming
- Attrition problem (loss to follow-up)
- Differences in the quality of measurement of
exposure or disease between the cohorts may introduce
misclassification (information bias)
- Cannot infer causal relation
- Very specific finding
- Complexity of data analysis
- Ethical issues
- Study effects
17Alternate designs and concerns
- Two separate cohorts: exposed and unexposed
subjects
- Omission of non-factor group
- Use of external comparison
- Use of mortality rather than morbidity as outcome
- Event notification arises from routine
statistics, rather than special observations
- Comparison of several groups
- Competing causes of death
18Cohort Study Steps
19Steps in conducting cohort study
- Identification of study population and initial
steps
- Measurement of exposure
- Selection of study and comparison cohorts
- Follow-up (for outcome measurement)
- Data analysis
20Types of cohorts
- Closed or fixed cohorts
- Fixed group of persons followed from a certain
point in time until a defined endpoint
- Starting point: exposure-defining event
- Endpoint: occurrence of the disease, loss to
follow-up, death
- The exposure is an event which occurs only once
- Open or dynamic cohorts
- Subjects may enter or leave the study at any time
- Exposure status may change over time
21Cohorts
- General population cohorts: population groups
offering special resources for follow-up or data
linkage are chosen, and the individuals are
subsequently allocated according to their
exposure status
- Special exposure cohorts: samples chosen on the
basis of a particular exposure
- Exposures may be a particular event, a
permanent state or a reversible state
22General population cohorts (groups offering
special resources)
- Groups with readily available health records
- Certain professional categories
- Obstetric populations
- Volunteer groups
- Geographically identified cohorts
- Record linkage
23Special exposure cohorts (groups offering
special resources)
- Exposed to certain factor or event
- Occupational groups
- Based on qualitative characteristics
24Population-based Cohort Studies
- Advantages
- Estimation of distributions and prevalence rates
of relevant variables
- Risk factor distributions
- Ideal setting in which to carry out unbiased
evaluation of relations
25Selection of comparison group
- Internal comparison
- Only one cohort identified
- Later on, classified into study and comparison
cohort based on exposure
- External comparison
- More than one cohort identified
- e.g. cohort of radiologists compared with
ophthalmologists
- Comparison with general population rates
- If no comparison group is available we can
compare the rates of the study cohort with those of
the general population
- e.g. cancer rate of uranium miners compared with
cancer rate in the general population
26Ideal Cohort
- Stable cohort
- Cooperative cohort
- Committed cohort
- Well informed cohort
27Exposure measurement
- Exposures: exogenous and/or endogenous
- Reference period
- Frequency of follow-up
- Challenge of prospective data collection
- Changes in instrument over time
- Use of repeated measures
- Data collection costs
28Sources of information
- Records
- Cohort members: self-administered questionnaires,
interviews, telephone interviews, mailed
questionnaires
- Medical examination: biomarkers, clinic
examinations, lab tests
- Measures of the environment: level of air
pollution, quality of drinking water, airborne
radiation
- Multiple methods
29Follow-up: Types of outcomes
- Discrete events
- Single events
- Mortality
- First occurrence of a disease or health-related
outcome
- Multiple occurrences
- Disease outcome
- Transition between states of health/ disease
- Transitions between functional states
- Level of a marker
30Examples of short duration Cohort Studies for a
PG dissertation
- Family conflict, adherence, and glycaemic control
in youth with Type 1 diabetes
- Birth cohorts to find out association between
birth weight and hypertension or childhood asthma
31Exercise 1
- An investigator wants to discover whether or not
being overweight in adolescence increases the
risk of cardiovascular mortality in adulthood.
- Assuming historical records are available, would
a prospective or retrospective study be more
practical?
- Who would comprise the investigator's cohort
under study?
- Who would comprise the investigator's exposed and
unexposed groups in this cohort?
32Group Exercise
- Design a Cohort Study
- Outline the steps you will need to take
for this study
- Special efforts you may need to make for follow-up
of the study subjects
- What care you will need to take to reduce
measurement bias
- Calculate the sample size
33Challenges in conducting Cohort Study
34Challenge 1: Multiple dimensions of time in
cohort study
35Challenge 2: Retaining cohort study members
- Loss to follow-up
- Dropouts
- Cannot be traced
- Of more concern are those who cannot be traced: they
may have moved because they have developed the disease
36Effect of Nonresponse
- Nonresponse is a major problem
- Differential nonresponse distorts the
true relationship between exposure and outcome
37Nonresponse: random or selective?
- Exposure data: find out if nonrespondents are
different from the respondents
- Intensive efforts within the study design
- Follow-up of the nonrespondents as well as
respondents
38Challenge 3: Large Modern Cohort Studies
- Huge requirements of resources and manpower
- Management of huge database
- Follow-up
- Exposure information
- Data quality?
- Collection of biologic samples?
39Challenge 4: Long term follow-up
- Operational problems
- Cumulative risk getting closer to one
40Nested case control and Case cohort study
- Nested Case Control Study
- Case Cohort Study
- Can be done in both population-based and
non-population based cohort settings
- Meets the assumption that cases and controls come
from the same population
- Avoids problems related to recall bias
41Cohort Study Analysis
42Standard 2 X 2 table (Relation between exposure
and outcome)
Exposure     Disease present   Disease absent   Total
Exposed      a                 b                a + b
Unexposed    c                 d                c + d
43Two types of measures for rate
- Cumulative incidence: proportion of study
subjects getting the outcome during the study
period
- Incidence rate: new cases / person-time under
observation
44 1. Cumulative incidence rate
- Number of new cases of disease occurring over a
specified period of time in a population at risk
at the beginning of the interval.
45EXAMPLE
- A surveillance system for Hospital acquired
infection among the post-operative patients in a
month.
46Example
[Figure: patient timelines on a 0 to 30 day axis, labelled 9, 6, 14, 14, 24, 19, 14, 4, 5, 19, 21, 6]
472. Incidence density
- Number of new cases of disease occurring over a
specified period of time in a population at risk
throughout the interval.
48- Incidence density requires us to add up the
period of time each individual was present in the
population, and was at risk of becoming a new
case of disease.
- Incidence density characteristically uses as the
denominator person-years at risk. (Time period
can be person-months, days, or even hours,
depending on the disease process being studied.)
49USES OF INCIDENCE DENSITY AND CUMULATIVE INCIDENCE
- Incidence density gives the best estimate of
the true risk of acquiring disease at any moment
in time.
- Cumulative incidence gives the best estimate of
how many people will eventually get the disease
in an enumerated population.
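A minimal sketch of the two measures, assuming a small hypothetical cohort (the `Subject` record and all follow-up values are invented for illustration and do not come from the slides). Cumulative incidence divides new cases by the number at risk at the start; incidence density divides them by the summed person-time at risk.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    days_at_risk: float  # time under observation while still at risk
    became_case: bool    # True if the outcome occurred during follow-up


def cumulative_incidence(subjects):
    """New cases divided by the number at risk at the start of the period."""
    return sum(s.became_case for s in subjects) / len(subjects)


def incidence_density(subjects):
    """New cases divided by the total person-time at risk."""
    cases = sum(s.became_case for s in subjects)
    person_time = sum(s.days_at_risk for s in subjects)
    return cases / person_time


# Hypothetical cohort of five patients followed for up to 30 days.
cohort = [
    Subject(30, False),
    Subject(10, True),   # became a case on day 10, no longer at risk
    Subject(30, False),
    Subject(20, True),   # became a case on day 20
    Subject(10, False),  # withdrew on day 10 without the outcome
]

ci = cumulative_incidence(cohort)  # 2 cases / 5 persons
idens = incidence_density(cohort)  # 2 cases / 100 person-days
print(f"Cumulative incidence: {ci:.2f}")
print(f"Incidence density: {idens:.3f} per person-day")
```

The withdrawn subject still counts as one person in the cumulative incidence denominator but contributes only 10 days to the person-time denominator, which is why the two measures behave differently when follow-up is uneven.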
50Standard 2 X 2 table(Relation between exposure
and outcome)
51l X 2 table(Relation between exposure and
outcome)
52Comparing risks in different groups
- Relative risk OR Risk ratio (RR)
- Attributable risk OR Risk difference (AR)
- Attributable risk percent (AR%)
- Population attributable risk (PAR)
- Population attributable risk percent (PAR%)
- Odds Ratio (OR)
53Relative risk OR Risk ratio
- Ratio of the risk among exposed to the risk among
unexposed: Risk (Exp) / Risk (Unexp)
- Risk of disease among exposed = a / (a + b)
- Risk of disease among unexposed = c / (c + d)
- RR = [a / (a + b)] / [c / (c + d)]
- For null hypothesis, Risk ratio will equal one
- SE
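The risk ratio above can be sketched in a few lines. The SE bullet carries no formula in this transcript, so the confidence interval below assumes the commonly used standard error of ln(RR), sqrt(1/a - 1/(a+b) + 1/c - 1/(c+d)); that choice and the example counts are assumptions, not taken from the slides.

```python
import math


def relative_risk(a, b, c, d, z=1.96):
    """Risk ratio with an approximate 95% CI from a 2x2 table.

    a, b: exposed with / without disease
    c, d: unexposed with / without disease
    The CI is built on the log scale (an assumption of this sketch).
    """
    risk_exp = a / (a + b)
    risk_unexp = c / (c + d)
    rr = risk_exp / risk_unexp
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed develop disease.
rr, rr_lower, rr_upper = relative_risk(a=30, b=70, c=10, d=90)
print(f"RR = {rr:.2f} (95% CI {rr_lower:.2f} to {rr_upper:.2f})")
```

An interval that excludes one is evidence against the null hypothesis of no association.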
54Risk difference vs. Relative risk
[Figure: absolute risks in the exposed and unexposed groups compared on a relative risk scale; values shown: 2, 2, 1]
55Attributable risk OR Risk difference (Absolute
differences in risks or rates)
- Also known as attributable risk
- Risk (Exp) - Risk (Unexp)
- Risk of disease among exposed = a / (a + b)
- Risk of disease among unexposed = c / (c + d)
- Risk difference = a / (a + b) - c / (c + d)
- For null hypothesis, Risk difference will equal
zero
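A short sketch of the risk difference, with an approximate 95% CI. The interval uses the usual large-sample standard error for a difference of two proportions, which is an assumption of this example, as are the counts.

```python
import math


def risk_difference(a, b, c, d, z=1.96):
    """Risk difference (attributable risk) with an approximate 95% CI.

    a, b: exposed with / without disease
    c, d: unexposed with / without disease
    """
    n_exp, n_unexp = a + b, c + d
    p_exp, p_unexp = a / n_exp, c / n_unexp
    rd = p_exp - p_unexp
    # Large-sample SE for a difference of two independent proportions.
    se = math.sqrt(p_exp * (1 - p_exp) / n_exp + p_unexp * (1 - p_unexp) / n_unexp)
    return rd, rd - z * se, rd + z * se


# Hypothetical counts: 30 of 100 exposed and 10 of 100 unexposed develop disease.
rd, rd_lower, rd_upper = risk_difference(a=30, b=70, c=10, d=90)
print(f"Risk difference = {rd:.2f} (95% CI {rd_lower:.2f} to {rd_upper:.2f})")
```

An interval that excludes zero is evidence against the null hypothesis.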
56Risk difference vs. Relative risk
[Figure: risk difference shown as the gap between the absolute risks in the exposed and unexposed groups]
57Attributable risk percent among exposed
- Among exposed, what percent of the total risk for
disease is due to the exposure
- AR% (Exposed)
- [Risk (Exp) - Risk (Unexp)] / Risk (Exp) X 100
- (RR - 1) / RR X 100
- (OR - 1) / OR X 100 (if risk is small)
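The two forms of the attributable risk percent give the same answer, since dividing numerator and denominator by Risk (Unexp) turns the first into the second. A minimal check with hypothetical risks (function names and values are illustrative only):

```python
def ar_percent_from_risks(risk_exp, risk_unexp):
    """Attributable risk percent among the exposed, from the two risks."""
    return (risk_exp - risk_unexp) / risk_exp * 100


def ar_percent_from_rr(rr):
    """The same quantity expressed through the relative risk."""
    return (rr - 1) / rr * 100


# Hypothetical risks: 30% among exposed, 10% among unexposed.
risk_exp, risk_unexp = 0.30, 0.10
ar_pct_risks = ar_percent_from_risks(risk_exp, risk_unexp)
ar_pct_rr = ar_percent_from_rr(risk_exp / risk_unexp)
print(f"AR% from risks: {ar_pct_risks:.1f}")
print(f"AR% from RR:    {ar_pct_rr:.1f}")
```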
58Attributable Risk Percent
[Figure: absolute risk in the exposed, on a relative risk scale, split into risk due to background and risk due to exposure]
59Attributable Risk Percent
[Figure: risk in the exposed, p0 RR, split into background risk p0 and excess risk p0 (RR - 1)]
Attributable risk percent = p0 (RR - 1) / (p0 RR) X 100 = (RR - 1) / RR X 100
60Population attributable risk
- In the general population, how much of the total
risk for disease is due to the risk factor
- PAR = Risk (Total) - Risk (Unexp)
- Risk (Total) = Proportion of population Exp X Risk (Exp)
+ Proportion of population Unexp X Risk (Unexp)
61Population attributable risk percent
- Among the general population, what percent of the
total risk for disease is due to the risk factor
- PAR%
- [Risk (Total) - Risk (Unexp)] / Risk (Total) X
100
- Pe (RR - 1) / [1 + Pe (RR - 1)] X 100
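A small sketch showing that the two expressions for the population attributable risk percent agree, where Pe is the proportion of the population exposed. The function names and the input values are hypothetical.

```python
def population_risk(p_exposed, risk_exp, risk_unexp):
    """Total risk as a weighted average of the exposed and unexposed risks."""
    return p_exposed * risk_exp + (1 - p_exposed) * risk_unexp


def par_percent_from_risks(p_exposed, risk_exp, risk_unexp):
    """[Risk (Total) - Risk (Unexp)] / Risk (Total) X 100."""
    total = population_risk(p_exposed, risk_exp, risk_unexp)
    return (total - risk_unexp) / total * 100


def par_percent_from_rr(p_exposed, rr):
    """Pe (RR - 1) / [1 + Pe (RR - 1)] X 100."""
    excess = p_exposed * (rr - 1)
    return excess / (1 + excess) * 100


# Hypothetical inputs: 25% of the population exposed, risks of 30% and 10%.
pe, risk_exp, risk_unexp = 0.25, 0.30, 0.10
par_pct_risks = par_percent_from_risks(pe, risk_exp, risk_unexp)
par_pct_rr = par_percent_from_rr(pe, risk_exp / risk_unexp)
print(f"PAR% from risks: {par_pct_risks:.1f}")
print(f"PAR% from RR:    {par_pct_rr:.1f}")
```

With the same risks, the population figure is lower than the attributable risk percent among the exposed because only a quarter of this hypothetical population carries the exposure.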
62Population attributable risk percent
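[Figure: population risk on a relative scale, split into background risk and excess risk; labels: RR, (RR - 1)(1 - Pe), Pe (RR - 1), (1 - Pe), Pe, 1]
Population attributable risk percent =
Pe (RR - 1) / [1 + Pe (RR - 1)] X 100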
63Risk Reduction
- Risk (T/t) = a / (a + b)
- Risk (Exp) = c / (c + d)
- RR = Risk (T/t) / Risk (Exp)
- ARR = Risk (Exp) - Risk (T/t)
- RRR = [Risk (Exp) - Risk (T/t)] / Risk (Exp)
- = 1 - Risk (T/t) / Risk (Exp)
- = 1 - RR
- NNT = 1 / ARR
- = 1 / [Risk (Exp) X RRR]
- NNH
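The risk reduction measures follow from one another, as this sketch shows. It keeps the slide's labels, reading T/t as the treated group and Exp as the comparison group; the counts are hypothetical.

```python
def risk_reduction(a, b, c, d):
    """Risk reduction measures from a 2x2 table.

    a, b: treated group with / without the outcome
    c, d: comparison group with / without the outcome
    """
    risk_treated = a / (a + b)
    risk_comparison = c / (c + d)
    rr = risk_treated / risk_comparison
    arr = risk_comparison - risk_treated  # absolute risk reduction
    rrr = arr / risk_comparison           # relative risk reduction, equals 1 - RR
    nnt = 1 / arr                         # number needed to treat
    return {"RR": rr, "ARR": arr, "RRR": rrr, "NNT": nnt}


# Hypothetical trial: 10 of 100 treated and 20 of 100 untreated have the outcome.
measures = risk_reduction(a=10, b=90, c=20, d=80)
for name, value in measures.items():
    print(f"{name} = {value:.2f}")
```

NNT is undefined when the two risks are equal (ARR of zero), so a real analysis would need to guard against that case.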
64Analytical considerations
- Concurrent follow-up
- Varying follow-up dates
- Moving baseline dates
- Withdrawals
- Competing causes of death
66Analytical considerations
- Concurrent follow-up
- Simple risk-based analyses
- Survival analysis
- Varying follow-up dates
- Simple risk analysis for all events up to, but
not exceeding, the minimum elapsed time
- Survival analysis
- Moving baseline dates
- Ignore and measure elapsed time since recruitment
- Survival analysis
- Withdrawals
- Competing causes of failure
67Advanced methods
- Standardization
- Stratification
- Life Tables
- Multivariate analysis and Cox regression
68Exercise 2
- A cohort study to explore the relationship
between visual impairment and the risk of
injuries from falls among the elderly.
- A total of 400 visually impaired (VI) persons >70
yrs are compared against 400 controls without VI.
- Over a 5-year follow-up period, 80 VI persons and
20 non-VI persons have injuries from falls.
- Construct a 2x2 table from the information above
- Calculate the following with their CI
- Cumulative incidence rate for exposed and
unexposed
- Relative risk
- Attributable risk and attributable risk percent
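One way to work through this exercise is sketched below, using only the counts given in the exercise. The confidence interval for the relative risk assumes the log-scale method with SE of ln(RR) = sqrt(1/a - 1/(a+b) + 1/c - 1/(c+d)); the slides do not say which method is intended.

```python
import math

# 2x2 table from the exercise: rows are exposure, columns are injury from falls.
a, b = 80, 320  # visually impaired: injured / not injured
c, d = 20, 380  # not visually impaired: injured / not injured

ci_exposed = a / (a + b)    # 5-year cumulative incidence among VI
ci_unexposed = c / (c + d)  # 5-year cumulative incidence among non-VI
rr = ci_exposed / ci_unexposed
ar = ci_exposed - ci_unexposed
ar_percent = ar / ci_exposed * 100

# Approximate 95% CI for the RR on the log scale (assumed method).
se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
rr_lower = math.exp(math.log(rr) - 1.96 * se_log_rr)
rr_upper = math.exp(math.log(rr) + 1.96 * se_log_rr)

print(f"CI exposed = {ci_exposed:.2f}, CI unexposed = {ci_unexposed:.2f}")
print(f"RR = {rr:.1f} (95% CI {rr_lower:.1f} to {rr_upper:.1f})")
print(f"AR = {ar:.2f}, AR% = {ar_percent:.0f}")
```

Under these assumptions the visually impaired have four times the risk of injury, and about three quarters of their risk is attributable to the exposure.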
70Exercise 3
- A retrospective cohort study to explore the
relationship between perimenopausal exogenous
estrogen use and the risk of coronary heart
disease (CHD).
- A total of 5000 exposed and 5000 unexposed women
are enrolled and followed for 15 years for the
development of myocardial infarction (MI).
- A total of 200 estrogen users and 300 nonusers
had MIs.
71Exercise 3 (Contd.)
- The risk (CI) of a MI among estrogen users
- The risk (CI) of a MI among nonusers of estrogen
- The relative risk (CIR) for MI
- Based on the results of this study is estrogen
use a causative or protective factor for MI?
72Exercise 4
- Shaper et al. (1988)
- A random sample of 7729 middle-aged British men
- Each man was asked, at baseline, about his alcohol
consumption
- Over the next 7.5 years, death certificates were
collected for any subject who died
73Exercise 4 (Contd.)
- Calculate the risk and the relative risk for each
alcohol consumption group.
- Why might the conclusion based on the above table
be misleading? Given adequate funding,
describe how?
74Exercise 5
- In a cohort study of 34387 menopausal women in
Iowa, intakes of certain vitamins were assessed
in 1986. In the period up to the end of 1992,
879 of these women were newly diagnosed with
breast cancer. The table below shows data for two
vitamins, classified according to ranked
categories of intake.
75Exercise 5 (Contd.)
- For each vitamin, calculate the relative rates
(with 95% confidence intervals) taking the
low-consumption group as the base. Do your
results suggest any beneficial (or otherwise)
effect of additional vitamin C or E intake?
76Types of bias
- Selection bias
- Follow-up bias
- Information bias
- Confounding bias
- Post hoc bias
77Selection bias
- Group studied does not reflect the same
distribution of factors (such as age, sex, SES,
behavior etc.) as occurs in the general
population
- Effect of volunteering
- Whole spectrum of independent variables not
represented in the study group
- Presence of incipient disease
- Distribution of covariates
- Survival cohorts: cohorts ascertained long after
exposure
78Example: bias with survival cohort
TRUE COHORT
- Assemble cohort (N = 150)
- Measure outcome: Improved 75, Not improved 75
- Observed improvement 50%, True improvement 50%
SURVIVAL COHORT
- Assemble patients
- Begin follow-up (N = 50); measure outcome: Improved 40, Not improved 10
- Not observed (N = 100); dropouts: Improved 35, Not improved 65
- Observed improvement 80%, True improvement 50%
79Follow-up bias
- Also known as Migration Bias
- In nearly all large studies some members of the
original cohort drop out of the study
- If drop-outs occur randomly, such that
characteristics of lost subjects in one group are
on an average similar to those who remain in the
group, no bias is introduced
- But ordinarily the characteristics of the lost
subjects are not the same
80Example of lost to follow-up
- Exposure: irradiation; disease: cataract
- Full cohort: RR = (50 / 10000) / (100 / 20000) = 1
- After loss to follow-up: RR = (30 / 4000) / (30 / 8000) = 2
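The two relative risks quoted in this example can be checked directly. The sketch assumes the larger denominators (10000 and 20000) describe the full cohort and the smaller ones (4000 and 8000) describe the subjects still under observation; the transcript does not label them, so that reading is an assumption.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk in the exposed divided by risk in the unexposed."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


# Assumed full cohort: irradiated vs non-irradiated, outcome cataract.
rr_full = risk_ratio(50, 10000, 100, 20000)

# Assumed remaining subjects after loss to follow-up.
rr_remaining = risk_ratio(30, 4000, 30, 8000)

print(f"RR in full cohort: {rr_full:.1f}")
print(f"RR after loss to follow-up: {rr_remaining:.1f}")
```

Under that reading, 20 of 50 exposed cases were lost against 70 of 100 unexposed cases. The loss was related to both exposure and outcome, which is what produces an apparent association where none existed.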
81Example: healthy worker effect
- Question: association between formaldehyde exposure
and eye irritation
- Subjects: factory workers exposed to formaldehyde
- Bias: those who suffer most from eye irritation
are likely to leave the job at their own request
or on medical advice
- Result: remaining workers are less affected;
association effect is diluted
82Measurement / (Mis) classification
- Exposure misclassification occurs when exposed
subjects are incorrectly classified as unexposed,
or vice versa
- Disease misclassification occurs when diseased
subjects are incorrectly classified as
non-diseased, or vice versa
83Misclassification bias due to measurement errors
- Systematic bias
- Measurement errors
- Non-differential: observed relative risk biased
towards the null hypothesis
- Differential: this can lead to study results
which cannot be interpreted, because the observed
relative risk may be biased towards the null,
away from the null, or cross over the null value
compared with the true relative risk
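The pull towards the null under non-differential misclassification can be illustrated with expected counts. In this sketch exposure is measured with the same sensitivity and specificity regardless of disease status; the cohort sizes, risks and error rates are all hypothetical.

```python
def observed_rr(n_exp, risk_exp, n_unexp, risk_unexp, sensitivity, specificity):
    """Expected observed RR when exposure is misclassified non-differentially.

    sensitivity: probability that a truly exposed subject is classified exposed
    specificity: probability that a truly unexposed subject is classified unexposed
    Both apply equally to subjects with and without disease.
    """
    cases_exp = n_exp * risk_exp
    cases_unexp = n_unexp * risk_unexp

    # Subjects and cases that end up classified as exposed.
    obs_exp_n = sensitivity * n_exp + (1 - specificity) * n_unexp
    obs_exp_cases = sensitivity * cases_exp + (1 - specificity) * cases_unexp

    # Everyone else is classified as unexposed.
    obs_unexp_n = (n_exp + n_unexp) - obs_exp_n
    obs_unexp_cases = (cases_exp + cases_unexp) - obs_exp_cases

    return (obs_exp_cases / obs_exp_n) / (obs_unexp_cases / obs_unexp_n)


# Hypothetical cohort: 1000 exposed (risk 0.20) and 1000 unexposed (risk 0.10).
true_rr = observed_rr(1000, 0.20, 1000, 0.10, sensitivity=1.0, specificity=1.0)
biased_rr = observed_rr(1000, 0.20, 1000, 0.10, sensitivity=0.8, specificity=0.9)
print(f"True RR: {true_rr:.2f}")
print(f"Observed RR with misclassified exposure: {biased_rr:.2f}")
```

The observed value falls between one and the true relative risk, so the association is understated rather than reversed.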
84Sources of measurement errors
- Selection/ design of the instrument to measure
the exposure
- Omissions in the protocol for use of the
instrument
- Poor execution of the study protocol
- Inherent subject characteristics
- Drift in accuracy of exposure measures over time
- Data processing and creation of exposure variables
85Reassignment to exposure category
- Changes in dichotomous exposure, if not taken
into consideration, will tend to make the strength
of an observed association lower than that which
actually existed
- Latency is likely to be short
- Exposure accumulates over time during the study
- Very accurate results desirable
- Reassignment may not be possible
- Closed cohort as a rule
- Latency is very long
- Duration of follow-up is very long
- Separate examination of outcome in those who
changed exposure status during the study
86Confounding bias
- Other factors which are associated with both
outcome and exposure variables do not have the
same distribution in the exposed and unexposed
group
87Example: confounding
[Figure: confounding triangle linking coffee drinking, smoking and heart disease]
- Coffee drinkers are more likely to smoke
- Smoking increases the risk of heart disease
88Resolving Confounding Bias
- Standardization
- Stratification
- Multivariate adjustment
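Stratification can be sketched with the coffee, smoking and heart disease example. The counts below are invented so that coffee has no effect within either smoking stratum. The pooled estimate uses the Mantel-Haenszel risk ratio, a standard stratified estimator that the slides do not name; it is used here as an assumption.

```python
def crude_rr(strata):
    """Risk ratio after collapsing all strata into a single 2x2 table."""
    a = sum(s[0] for s in strata)
    b = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    d = sum(s[3] for s in strata)
    return (a / (a + b)) / (c / (c + d))


def mantel_haenszel_rr(strata):
    """Mantel-Haenszel pooled risk ratio across strata.

    Each stratum is (a, b, c, d): exposed with / without disease,
    unexposed with / without disease.
    """
    numerator = denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        numerator += a * (c + d) / n
        denominator += c * (a + b) / n
    return numerator / denominator


# Hypothetical data: exposure is coffee drinking, strata are smoking status.
strata = [
    (60, 240, 20, 80),  # smokers: risk 0.20 in both coffee groups
    (5, 95, 15, 285),   # non-smokers: risk 0.05 in both coffee groups
]

rr_crude = crude_rr(strata)
rr_adjusted = mantel_haenszel_rr(strata)
print(f"Crude RR: {rr_crude:.2f}")
print(f"Smoking-adjusted RR: {rr_adjusted:.2f}")
```

The crude estimate suggests coffee nearly doubles the risk, while the stratified estimate is one: the apparent effect comes entirely from smokers being over-represented among the coffee drinkers.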
89Post hoc bias
- Use of data from a cohort study to make
observations that were not part of original study
intent.
90Thank you
91Internal External validity