Title: Epidemiology Made Easy
1Epidemiology Made Easy
- Presented By
- Catherine Tapp, MPH
- August 23, 2005
2Overview of Topics
- General Introduction to Epidemiology
- Dynamics of Disease Transmission
- Measuring the Occurrence of Disease
- Surveillance Overview
- Validity and Reliability
- Statistics
- Epidemiologic Study Designs
- Estimating Risk
- Evaluation
- Chronic Disease Data Sources
- NAACCR Educational CD Overview
3What is Public Health?
- An organized community effort to prevent disease
and promote health (Institute of Medicine, 1988)
- Goals are to reduce the burden of disease,
disability and premature death in a population.
- A group of activities
- Composed of many different disciplines, e.g.,
health education, MCH, biostatistics, lab
science, family planning, nutrition, health
policy development, veterinary health,
EPIDEMIOLOGY, etc.
4What is Epidemiology?
- Derived from the Greek words epi (upon), demos
(people), and logos (study of)
- Epidemiology is the dynamic study of the
distribution, determinants, occurrence and
control of diseases, health, and injuries in
human populations.
- The core science of public health.
- Distribution - Descriptive Epidemiology
- Determinants - Analytic Epidemiology
5What is Studied in Epi.?
- Morbidity: events and factors related to or
caused by disease or disability
- Mortality: events and factors related to death
6Other Definitions of Epi.
- The relationship of disease or health to the
population at risk
- The determination, analysis, and interpretation
of rates
- The study of the patterns of disease occurrence
- Identifying risk factors
8Descriptive Epidemiology
- Examining the distribution of disease in a
population and observing the basic features of
its distribution in terms of person, place and
time.
10Time: When?
- Changing or stable?
- pertussis (whooping cough)
- Seasonal variation?
- influenza
- Clustered (epidemic) or evenly distributed
(endemic)?
- cancer
11Place: Where?
- Geographically restricted or widespread
(pandemic)?
- Relation to water or food supply?
- Multiple clusters or one?
12Person: Whom?
- Age
- Gender
- Ethnicity
- Race
- Socio-economic status
- Behavior
- Genetics
- Occupation
- Religion
- Stress
- Personal habits
- Marital status
- School
- Travel
13Analytic Epidemiology
- Focus on causation of disease by testing a
specific hypothesis about the relationship of a
disease to a cause (risk factor).
14Risk Factors/Determinants
- The presence of certain risk factors may be
associated with increased probability of disease
development.
- What are some examples?
15Underlying Assumption
- Disease or illness does NOT occur RANDOMLY in a
population.
- Not everyone has equal risk, because of
characteristics that predispose us to or protect
us against a variety of diseases.
16Objectives of Epidemiology
- To identify the etiology or cause of a disease
and the risk factors (characteristics that
increase an individual's risk for a disease).
- Ultimate goal is to intervene to reduce morbidity
and mortality
- To determine the extent of disease found in the
community.
- Helps in planning programs, obtaining resources, etc.
- To study the natural history and prognosis of
disease.
- Define the baseline of a disease for comparisons
post intervention
17Objectives Continued
- To evaluate both existing and new preventive and
therapeutic measures and modes of health care
delivery.
- HMOs: better outcomes? Worse?
- PSAs and survival status in prostate cancer
patients
- To provide the foundation for developing public
policy and regulatory decisions relating to
environmental problems.
- Radon in homes
- Occupational risk in workers and required
regulations
18Changing Patterns of Health
- A major role of epidemiology is to provide clues
to changes in population health that take place
over time.
- The goal is to plan for resources for research,
intervention and services.
20Life Expectancy
22Life Expectancy
- In the U.S., life expectancy has increased
dramatically over the past century.
- 1900: avg. life expectancy at birth was 47.3
years
- 1992: avg. life expectancy was 75.8 years
- Mortality reductions were due to improved nutrition,
pasteurization of milk, immunization, smaller
family size, reduced risk of fatal infections,
and prenatal care; now they are due to antibiotics,
better anesthesia, post-op care, etc.
- Gender and race differences
23Epidemiology and Prevention
- A major goal of epidemiology is to identify
high-risk subgroups in the population. Why?
- Identify risk factors that may be modifiable.
- Direct preventive efforts at these groups
- Screening programs for early detection of disease,
e.g., breast cancer, prostate cancer
24Prevention
- Primary Prevention: actions taken to prevent the
development of a disease in a person who is well
and does not have the disease in question.
- Examples: immunization, exercise, removal of an
exposure such as smoking or an environmental
agent
- Active (seatbelt usage) or passive (fluoridation
of public water supplies)
- Our ultimate goal in public health
25Prevention
- Secondary Prevention: the identification of
people who have already developed a disease, at
an early stage in the disease's natural history,
through early detection and early intervention.
- Examples: breast cancer screening, occult blood
in stool for colon cancer
- If disease is identified early, then intervention
measures will be more effective, less costly, and less
invasive
26Prevention
- Tertiary Prevention: activities designed to limit
disability from disease and to
restore function.
- Examples: physical therapy for stroke victims,
cardiac rehab for heart attack victims, halfway
houses for recovering alcoholics
27Prevention
- Population-based approach: a preventive measure
is widely applied to the population (a public
health approach)
- Relatively inexpensive and noninvasive
- Examples: dietary advice, advice against smoking
- High-risk group approach: usually requires a
clinical action to identify the high-risk group
to be targeted
- Expensive, more invasive and inconvenient
- Example: screening for cholesterol in children
from high-risk families
29Epidemiology and Clinical Practice
- Go hand-in-hand
- The practice of medicine is dependent on
population data
- Diagnosis: a population-based process
- Prognosis (the prediction of disease):
population-based
- Selection of therapy: population-based
randomized clinical trials
- Based on how groups of people respond to certain
therapies
30Epidemiology and Clinical Practice - Metaphor
- Imagine a torrential flood caused in part by
a failure in a levee system. This flood is
washing away citizens in record numbers. In such
circumstances, it is the physician's task to
offer life-jackets to citizens one at a time.
The epidemiologist, on the other hand, attempts
to find the flaw in the levee system to prevent
further flooding.
- Fixing the flaw in the levee is a matter of
public health.
31The Epidemiologic Approach
- How does the epidemiologist proceed to identify
the cause of a disease? Via a multi-step
process.
- First: determine if an association exists
between a factor (e.g., sun exposure) or a
characteristic of a person (e.g., moles, fair
skin) and the development of disease (e.g., skin
cancer).
- Second: is this association a causal one? (Not
all associations are causal.)
32Usual Sequence of Epi. Method
Theory or Observation
Review Existing Information
Define/Refine Hypothesis
Preventive Action
Descriptive Studies
Analytic Studies
Collect and Analyze Data
Formulate Conclusions
33From Observation to Preventive Action
- Shoe Leather Epi.
- Edward Jenner and smallpox vaccine
- John Snow (the Father of Epidemiology) and
London cholera epidemic
- Smoking and lung cancer
34Jenner and Smallpox
- Late 18th century: worldwide epidemic with 400,000
deaths a year.
- Those who survived smallpox were then immune
- Jenner observed that dairy maids developed cowpox and
did not contract smallpox during outbreaks
- Hypothesis: cowpox was protective
- Jenner knew nothing about viruses or etiology; he
operated purely on observational data that became
the basis for a preventive intervention.
35Snow and Cholera
- Cholera was a major problem in England in the mid
19th century
- Snow hypothesized that cholera was transmitted
through contaminated water
- Broad Street pump: 600 deaths in one week
- Water was supplied via water supply companies
with intakes from the polluted part of the river
- Lambeth water company moved its intake upstream
- Mortality should then be lower in people getting
water from Lambeth.
38Take Home Message
- Although it is important to understand the biology and
pathogenesis of disease, that knowledge is not always
needed to take preventive action.
39In Summary
- Epidemiology is the study of the distribution and
determinants of disease in populations.
- Public health uses epidemiologic study findings
to prevent and control health problems in human
populations.
- Major causes of mortality have changed radically
over the past century.
- Epidemiology is an invaluable tool in the control
of disease and the human suffering associated
with it.
40Dynamics of Disease Transmission
41The Dynamics of Disease Transmission
- Human disease results from an interaction of the
HOST, the AGENT and the ENVIRONMENT.
- Communicable diseases are most commonly used to
describe the underlying principles of disease
transmission.
- Disease causation is usually described in terms
of two models: the epidemiologic triad/triangle
and the web of causation
42Epi. Triad of Disease
HOST
VECTOR
AGENT
ENVIRONMENT
43Host Characteristics
- Include personal characteristics and behaviors,
genetic predispositions, immunologic and other
susceptibility-related factors that increase or
decrease the likelihood of disease.
- Examples: age, sex, race, religion, customs,
occupation, genetics, marital status, family
background, previous diseases, immune status
44Agent Characteristics
- Biological, physical, or chemical factors whose
presence, absence, or relative amount (too much
or too little) is necessary for the disease to
occur.
- Examples: bacteria, viruses, fungi, poison,
alcohol, smoke, drugs, trauma, radiation, fire
45Environmental Characteristics
- External conditions, other than the agent, that
contribute to the disease process.
- Can be physical, biological, or social in nature.
- Examples: temperature, humidity, altitude,
crowding, housing, neighborhood, water, milk,
food, radiation, air pollution, noise
46Web of Causation
- This model de-emphasizes the agent and stresses the
multiplicity of interactions between the host and
environment.
- Multiple actions and reactions occur between
promoters and inhibitors of disease.
- Examples: diabetes, cancer
47Multiplicity of Factors Underlying Adult-onset
Diabetes
48Modes of Transmission
- Direct
- Person to person
- Indirect
- Common vehicle: single, multiple, or continuous
exposures
- contaminated air, food, or water supply
- Vector
- Mosquito (West Nile virus), deer tick (Lyme
disease)
- Airborne
- dust, droplet nuclei
49Stages of Disease
- Important to recognize the broad spectrum of
disease severity
- Iceberg concept: only the tip is visible, much
like the clinical appearance of disease; the bulk of
the problem may be hidden from view.
- Example: tuberculosis cases are not always
clinically visible. Other examples?
50Iceberg Concept of Infectious Diseases
51Dog Bite Example
451,000 Medically Treated
3.73 Million Nonmedically treated
52Stages of Disease
- Clinical Disease: signs and symptoms
- Nonclinical (inapparent) Disease can include:
- Preclinical disease: not clinically apparent but
destined to progress to clinical disease
- Subclinical disease: not clinically apparent and
clinical disease will not develop; diagnosed by antibody
response or culture
- Persistent (chronic) disease: infection persists
for years or for life
- Latent disease: no active multiplication of the
agent
53Stages of Disease Prevention
54Carrier Status
- Individual has the organism but no antibody
response or no evidence of clinical illness
- Can infect others
- Carrier status may be of limited duration or last
for months or years
- Example: Typhoid Mary carried Salmonella typhi,
worked as a cook in NYC, and caused 10
typhoid fever outbreaks (51 cases and 3 deaths);
she died in 1938
55Reservoirs
- Living organisms or inanimate matter (such as
soil) in which an infectious agent normally lives
and multiplies.
- Essential for infectious agent to maintain and
perpetuate itself.
- Examples: humans, animals and environmental
sources
56The "-emics"
- Endemic: habitual presence or usual occurrence
of disease in a geographic area (e.g., chicken
pox in OK)
- Epidemic: occurrence in a community or region of
a group of illnesses of similar nature, clearly
in excess of normal expectancy, and derived from
a common or propagated source (e.g., obesity in
the U.S.)
- How do we know if there is an excess? Anything
above the normal baseline (ongoing surveillance).
- How much to expect? We really don't know; there is
no clear-cut example.
- Pandemic: a worldwide epidemic (e.g., certain
flu viruses)
57Endemic vs. Epidemic
58Excess of Deaths in London
59Disease Outbreaks
- Common-vehicle exposure (e.g., outbreak in a
group of people who ate a certain food)
- Single exposure (e.g., food served only once)
- Multiple exposures (e.g., food served more than
once and people eat more than once)
- Periodic or continuous (e.g., a water supply is
contaminated with sewage because of leaky pipes)
60Single-exposure, common-vehicle outbreak
- Such outbreaks are explosive: a sudden and rapid
increase in the number of cases of a disease
- The cases are limited to those who share the
common exposure.
- Cases rarely occur in persons who acquire the
disease from a primary case.
61Herd Immunity
- Resistance of a group to an attack by a disease
to which large proportions of the group are
immune.
- If a large percent of the population is immune,
the entire population is likely to be protected,
not just those who are immune.
- Why does it occur? Disease spreads from one
person to another in any community, and once a
certain proportion of immune people is reached, the
likelihood is small that an infected person will
encounter a susceptible person.
62Herd Immunity Continued
- Why is herd immunity so important? It is not
necessary to carry out 100% immunization in a
population to be highly effective; part of the
population will be protected due to herd
immunity.
- Certain conditions must be met
- Transmission must occur within one host species
- Assumes random mixing of the population
- Percentage of population necessary for immunity
varies with each disease (e.g., est. that 94% of
the population requires immunity before the chain
of transmission of measles is interrupted).
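A minimal sketch of why the required percentage differs by disease. It uses the common textbook approximation that the threshold is 1 - 1/R0, where R0 (the average number of people one case infects in a fully susceptible, randomly mixing population) is not named on the slides. The function name and the R0 values below are illustrative assumptions, not figures from this presentation.

```python
def herd_immunity_threshold(r0: float) -> float:
    """Approximate fraction of the population that must be immune.

    Uses the textbook approximation 1 - 1/R0, which assumes random mixing.
    """
    if r0 <= 1:
        return 0.0  # an infection with R0 <= 1 cannot sustain transmission
    return 1.0 - 1.0 / r0


# Hypothetical R0 values, chosen only for illustration.
for label, r0 in [("measles-like (R0 = 17)", 17.0), ("flu-like (R0 = 2)", 2.0)]:
    print(f"{label}: about {herd_immunity_threshold(r0):.0%} must be immune")
```

With an assumed R0 of 17, the approximation gives roughly the 94% quoted for measles; a less contagious agent needs far less coverage.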
63Incubation Period
- Interval from receipt of infection to the time of
onset of clinical illness.
- Person usually feels well
- Person may be infectious during this period
- Different diseases have varying incubation
periods.
- In noninfectious diseases the incubation period
is referred to as the latent period.
- Variability may be due to differences in host
susceptibility, pathogenicity of the agent, or
dose of exposure
64Latent Periods of Bladder Tumors
65Epidemic Curve
- Distribution of times of onset of disease
- In a single-exposure, common-vehicle epidemic,
the epidemic curve represents the distribution of
incubation periods.
- If the infection took place at one point in time,
the interval from that point to the onset of each
case is the incubation period in that person.
66Graphical Representation of an Outbreak
Classic epi. curve
67Outbreak Investigations
- Three critical variables in investigating an
outbreak or epidemic are:
- When did the exposure take place?
- When did the disease begin?
- What was the incubation period for the disease?
68Outline of an Epidemic Investigation
- Preliminary Analysis
- Verify the diagnosis
- Verify the existence of an epidemic
- Learn about the disease using existing
information
- Describe the epidemic with respect to time, place
and person (cases = numerator)
- What population is at risk? (denominator)
- Formulate and test hypotheses
69Outline of an Epidemic Investigation Continued
- Further Investigation and Analysis
- Search for additional cases
- Collect additional data
- Analyze the data (case-control studies)
- Make a decision about the hypotheses considered
- Intervention and follow-up
70Outline of an Epidemic Investigation Continued
- Report of the Investigation
- Discussion of factors leading to the epidemic
- Evaluation of measures used for control of
present epidemic
- Recommendations for future prevention of outbreaks
71Measuring the Occurrence of Disease, Part I:
Measures of Morbidity
72Measuring the Occurrence of Disease
- "One's knowledge of science begins when he can
measure what he is speaking about and express it
in numbers." - Kelvin
73Development of Disease in an Individual
74Transmission of Disease
- Measure the frequency of disease occurrence
- Measure the frequency of deaths
- How are these frequencies measured?
- Rates are used to express the extent of morbidity
and mortality from a disease: how fast the
disease is occurring in a population
- Proportions describe what fraction of the
population is affected
75Measures of Morbidity
- Incidence: the number of new cases of a disease
that occur during a specified period of time in a
population at risk for developing the disease.
- Prevalence: number of affected (diseased)
persons present in the population at a specified
time divided by the number of persons in the
population at that time.
76INCIDENCE
Incidence per _________ = (# of new cases in a
specified time) / (# of persons at risk for
developing the disease during that time period) x _________
77INCIDENCE
- NEW cases: transition from non-disease to
disease (the numerator)
- Measure of events
- Measure of risk
- Can be examined in any population group
- Examples: particular age group, males, females,
occupational group, exposed group of people, etc.
78INCIDENCE
- Denominator: # of people who are at risk for
developing the disease.
- Those individuals included in the denominator
must have the potential to become part of the
group that is counted in the numerator.
- Incidence of uterine cancer: only in women
- A period of time must be specified for incidence
to be a measure of risk.
- Arbitrary: 1 week, month, year(s)
- Must be clearly specified.
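The incidence definition above can be sketched as a small calculation. The function name, the default multiplier of 1,000, and the counts are illustrative assumptions rather than figures from the presentation.

```python
def incidence_rate(new_cases: int, persons_at_risk: int, per: int = 1000) -> float:
    """New cases in the period divided by persons at risk, scaled by `per`."""
    if persons_at_risk <= 0:
        raise ValueError("persons_at_risk must be positive")
    return new_cases / persons_at_risk * per


# Made-up numbers: 30 new cases in one year among 15,000 women at risk.
rate = incidence_rate(new_cases=30, persons_at_risk=15_000)
print(f"{rate:.1f} new cases per 1,000 women at risk per year")
```

Only people who could become a case belong in `persons_at_risk`, and the time period has to be stated alongside the result for it to be read as a measure of risk.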
79PREVALENCE
Prevalence per ______ = (# of cases of disease in
pop. at specified time) / (# of persons in the pop. at
the specified time) x ______
80PREVALENCE
- Numerator: existing cases (old and new) with
differing durations of disease
- NOT a measure of risk but a measure of the
disease burden on the community
- A slice through the population at a point in
time to determine who has disease and who does
not.
- Does not determine when the disease developed
81PREVALENCE
- Point Prevalence: prevalence of disease at a
point in time
- Do you currently have asthma?
- Period Prevalence: prevalence of disease during a
specified period of time (e.g., a single calendar
year)
- Have you had asthma during the last 2 years?
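A minimal sketch contrasting point and period prevalence, using the two asthma survey questions as the model. The function name and the survey counts are hypothetical.

```python
def prevalence(existing_cases: int, population: int, per: int = 1000) -> float:
    """Existing cases (old and new) divided by the population, scaled by `per`."""
    if population <= 0:
        raise ValueError("population must be positive")
    return existing_cases / population * per


# Hypothetical survey of 2,000 people.
surveyed = 2_000
have_asthma_now = 120         # "Do you currently have asthma?"
had_asthma_last_2_years = 180  # "Have you had asthma during the last 2 years?"

point = prevalence(have_asthma_now, surveyed)
period = prevalence(had_asthma_last_2_years, surveyed)
print(f"Point prevalence:  {point:.0f} per 1,000")
print(f"Period prevalence: {period:.0f} per 1,000")
```

Period prevalence is at least as large as point prevalence for the same group, because it also counts people whose disease resolved before the survey date.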
82Incidence and Prevalence
Point prevalence: numerator depends on when the survey
is done.
5 cases (numerators) of a disease. What is the
numerator for incidence in 2000?
83Relationship Between Incidence and Prevalence
New Cases
Incidence (inflow)
Prevalence (water level)
Old Cases
Recovery or death
A continual addition of new cases is increasing
the prevalence, while death and/or cure is
decreasing the prevalence
Former Cases
84PREVALENCE
- A dynamic situation
- A continual addition of new cases increases the
prevalence
- Death and/or cure decreases the prevalence
- Can be seen as a paradox: a new measure is
introduced that enhances survival or detects
disease in more people and thus increases prevalence
(not always bad if death is prevented). Example:
insulin and diabetes
- Valuable for planning health services
85Comparison of Incidence and Prevalence
86Problems with Incidence and Prevalence
Measurements
87Problems with Numerators
- Defining who has the disease
- Differing sets of diagnostic criteria
- Prevalence estimates are affected by the set of
criteria that is used
- Who should be included in the numerator?
- How are cases found?
- Use available data
- Conduct a study that is designed to gather data
- Can involve interviews; many errors can arise
(refer to Gordis table 3-4)
88Problems with Hospital Data
- Data from medical records are very important for
epidemiologic studies.
- Hospital admissions are selective
- Medical records are not designed for research but
for patient care
- Can be incomplete or illegible, with missing data
- Problem defining denominators because there are no
defined catchment areas; therefore it is difficult to
calculate rates
89Problems with Denominators
- Selective undercounting of certain population
groups
- Example: young males in ethnic minority groups
- Everyone in the denominator must have the
potential to enter the numerator (not that
simple)
- Example: hysterectomy and uterine cancer rates
90Corrected rates are higher. Why? Women who had
hysterectomies are removed from the
denominator; the denominator decreases and the rate
gets larger. Trend over time is not significantly
changed.
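A small sketch of the denominator correction described above. All counts are invented for illustration, and the helper name is an assumption.

```python
def rate_per_100k(cases: int, population: int) -> float:
    """Cases divided by population, expressed per 100,000."""
    return cases / population * 100_000


# Invented numbers for one year in one community.
uterine_cancer_cases = 50
all_women = 200_000
women_with_hysterectomy = 40_000  # cannot develop uterine cancer

uncorrected = rate_per_100k(uterine_cancer_cases, all_women)
corrected = rate_per_100k(uterine_cancer_cases, all_women - women_with_hysterectomy)

print(f"Uncorrected: {uncorrected:.2f} per 100,000")
print(f"Corrected:   {corrected:.2f} per 100,000")
```

The numerator is unchanged; only the women who can no longer enter the numerator are taken out of the denominator, so the corrected rate is higher.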
91Measuring the Occurrence of Disease, Part II:
Measures of Mortality
92Mortality Rates
- Rates are used to address the risk of dying
- Same rules as for morbidity apply: numerator/denominator, time factor
- Several types of mortality rates
- Annual mortality rate from all causes
- Age-specific mortality rate
- Disease-specific or cause-specific mortality rate
- Simultaneous restrictions, e.g., age and cause of
death
- When a restriction is placed on a rate, it is
called a specific rate
93Specific Rates
- Stratify populations into more homogeneous
groups (strata) based on the demographic
characteristic thought to be related to the
outcome of interest.
- Examples:
- Age-specific, sex-specific, race-specific
94Age-Specific Mortality Rate
- Provide a broader view of mortality for subgroups
stratified by age
- Numerator and denominator are limited to a
specific age group
- Comparable across populations
95Crude Rates
- Summary statistics that ignore the heterogeneity
of the population under investigation
96Crude Rates
- Advantages
- Actual summary rates
- Easy calculation for international comparisons
- Disadvantages
- Difficult to interpret differences in crude
rates because populations usually vary in
composition (e.g., age)
97Years of Potential Life Lost (YPLL)
- A mortality index to gauge the loss of productive
years in a person who dies.
- Assumes that death at a
younger age results in a greater loss of future
productive years than death at an older age.
- Helps prioritize resources to have maximum impact.
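A minimal sketch of a YPLL calculation. The slides do not give a reference age; 75 is used here as an assumption (65 is another common choice), and the ages at death are made up.

```python
def ypll(ages_at_death: list[int], reference_age: int = 75) -> int:
    """Sum of years lost before `reference_age`; later deaths contribute zero."""
    return sum(max(reference_age - age, 0) for age in ages_at_death)


# Made-up ages at death for four people.
ages = [2, 30, 60, 80]
print(f"YPLL before age 75: {ypll(ages)} years")
```

The death at age 2 contributes 73 years while the death at age 80 contributes none, which is how the index weights deaths at younger ages more heavily.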
98Why look at mortality?
- It is an index of severity of a problem from both
clinical and public health standpoints.
- It can be an index of risk of disease, especially
if the disease is quickly fatal (e.g., pancreatic
cancer, where mortality is a good surrogate for
incidence), but not if the disease is mild and
not fatal.
99Problems with Mortality Data
- Accuracy and trends over time may be influenced
by several factors
- Changes in coding of underlying cause of death on
death certificates, e.g., change to a newly
revised ICD
- Changes in definition of diseases, e.g., AIDS
- Countries and regions vary greatly in the quality
of data on their death certificates
- Whenever we see a time trend of increased or decreased
mortality, we must first ask: Is this real?
100Reported causes of death in early 1900s
- Died suddenly without the aid of a physician
- A mother died in infancy
- Deceased had never been fatally sick
- Died suddenly, nothing serious
- Went to bed feeling well, but woke up dead
101Comparing Mortality in Different Populations
- An important use of mortality data is to compare two
or more populations or one population over time.
- Many characteristics that can affect mortality
(age, gender, race) may differ between populations,
thus making comparisons of rates problematic.
- Methods have been developed for comparing mortality in
differing populations that hold constant (adjust
for) certain characteristics such as age.
- Rate Standardization
102Age Adjustment Methodologies
- Allows comparisons of rates between populations
that differ by age, a common variable that can
influence the rate
- Direct Method
- Indirect Method (Standardized Mortality Ratios)
103Adjusted Rates
- Advantages
- Summary statements
- Differences in group composition removed,
allows unbiased comparison
- Disadvantages
- Fictional rates
- Absolute magnitude is dependent on the standard
population that is chosen
- Trends in subgroups can be masked
104Direct Age Adjustment
- Requires age-specific rates in the population
- The age for each case
- The population at risk for each age group in the
pop.
- Requires an age structure of a standard pop.
- Requires a standard population, to which the
estimated age-specific rates can be applied
- The choice of standard can affect the magnitude of
the age-adjusted rates
- 1940 vs. 1970 vs. 2000 U.S. Standard Population
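A minimal sketch of the direct method, assuming two age groups and a made-up standard population (not the actual 1940, 1970, or 2000 U.S. standard). Function names and all counts are hypothetical. Both populations are given the same age-specific death rates but different age structures.

```python
def crude_rate(deaths: dict, population: dict, per: int = 100_000) -> float:
    """Total deaths divided by total population, scaled by `per`."""
    return sum(deaths.values()) / sum(population.values()) * per


def direct_adjusted_rate(deaths: dict, population: dict, standard: dict,
                         per: int = 100_000) -> float:
    """Apply each age-specific rate to the standard population, then sum."""
    expected = sum(deaths[g] / population[g] * standard[g] for g in standard)
    return expected / sum(standard.values()) * per


# Hypothetical data: identical age-specific rates, different age structures.
deaths_a = {"<65": 10, "65+": 100}
pop_a = {"<65": 10_000, "65+": 2_000}    # younger population
deaths_b = {"<65": 2, "65+": 500}
pop_b = {"<65": 2_000, "65+": 10_000}    # older population
standard = {"<65": 6_000, "65+": 4_000}  # made-up standard population

for name, d, p in [("A", deaths_a, pop_a), ("B", deaths_b, pop_b)]:
    print(f"Population {name}: crude {crude_rate(d, p):.1f}, "
          f"age-adjusted {direct_adjusted_rate(d, p, standard):.1f} per 100,000")
```

The crude rates differ widely only because population B is older; after adjustment to the same standard the two rates coincide. Choosing a different standard would change the adjusted values, which is why they are described as fictional rates.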
105Interpreting Observed Changes in Mortality
- An increase or decrease in mortality could be
artifactual or real
- If artifactual, could be a result of problems
with the numerator or denominator (Table 3-22 pg.
56)
- If real, what are some possible explanations?
(Table 3-23)
106Overview of Surveillance
A state of continuing watchfulness: the
systematic collection, analysis and dissemination
of data on adverse health outcomes occurring in a
defined area. - Centers for Disease Control
and Prevention
107Purpose of Public Health Surveillance
- Assess public health status
- Define public health priorities
- Evaluate programs
- Stimulate research
108Why Surveillance Data is Important
- Detect and control disease outbreaks
- Determine disease etiology and natural history
- Monitor disease trends
- Detect changes in health practice and health
behaviors
109Why Surveillance Data is Important Continued
- Evaluate the effectiveness of intervention
programs, policies and activities
- Detect need for changes in provision of health
care services
- Determine appropriate and efficient allocation of
resources and personnel, and development of
appropriate policies
110Cancer Surveillance
- The foundation for a national, comprehensive
strategy to reduce illness and death from cancer.
111Cancer Surveillance
- Guide planning and evaluation of cancer control
programs and interventions
- Are prevention measures making a difference?
- Determine cancer patterns in various populations
- Identify geographic and/or ethnic variations
- Monitor cancer trends over time
- Advance clinical, epidemiological, and health
services research
- Help prioritize health resource allocation
113"Good surveillance does not necessarily ensure
the making of right decisions, but it reduces the
chances of the wrong ones." - A. Langmuir, MD,
MPH, former Director of Epidemiology for CDC
114- Since cancer is the second leading cause of
death in Arkansas, it is essential that specific
information concerning this group of diseases be
collected, analyzed and reported.
115Heart Disease and Cancer Mortality Arkansas,
1979 - 1998
Age-adjusted to 1940 U.S. population
116Cancer Registry
- Cornerstone of cancer surveillance
- Best tool to measure the nation's progress
against cancer
"The unique role of the central cancer registry
is to be the eyes through which cancer control
problems can be seen." - Thomas C. Tucker
117National Cancer Registries
- CDC NPCR (National Program of Cancer
Registries)
- Cancer Registries Amendment Act in 1992
- Every state plus certain territories has a cancer
registry in place
- SEER (Surveillance, Epidemiology, and End Results),
NCI
- Gathers in-depth data on cancer cases in specific
locations
- NAACCR (North American Association of Central
Cancer Registries)
- Standard-setting organization; certification
119Data Utilization
- Data Linkages
- Mortality
- BreastCare data
- Occupational cohorts
- Site specific fact sheets
- Prostate Cancer Foundation
- Media inquiries
- Presentations
- General concern and curiosity
- Cluster Investigations
- Grants
- Community profiles
- Komen
- Cancer coalition
- Cancer Plan
- Research
- Cancer cluster investigations
- Interventions
- Data Requests
- GIS
120Data Resources
- NAACCR: www.naaccr.org
- SEER: http://seer.cancer.gov/
- ACS: www.cancer.org
- NCI: www.cancer.gov
- CDC/NPCR: www.cdc.gov/cancer
- Various Publications
- CINA, USCS, Facts & Figures, etc.
- ACCR's Homepage: www.healthyarkansas.com/arkcancer/arkcancer.html
- ACCR On-line Cancer Data Query System:
http://cancer-rates.info
121Validity and Reliability of Diagnostic and Screening
Tests
122Validity and Reliability
- Important to identify who has the disease and who
does not: a challenge in both public health and
clinical settings.
- Quality of screening and diagnostic tests is
critical
- Main question to ask of all tests is, "How good
is the test in separating populations of people
with and without the disease in question?"
123Screening
- Screening is the identification of unrecognized
disease by application of rapid tests to separate
apparently well persons who probably have the
disease from those who probably do not.
- A screening test is not intended to be
diagnostic.
- Persons with positive results should be referred
for diagnosis and treatment.
124Screening
- Main concern is with asymptomatic healthy
individuals.
- Theoretically, if a disease has not yet reached
the threshold of clinical visibility (still at an
early stage), the chances of cure are good.
- The screening method should be reliable and cost
effective.
- Treatment should be possible and facilities
should be made available to those who require
treatment.
125Screening vs. Diagnostic
- Screening tests aim to detect unknown disease in
an otherwise well-appearing person; it is the
search for disease that has not been determined
to be, or suspected of being, present.
- Test examples: temperature, Pap smear, mammogram,
FOBT, PSA
- Diagnostic tests aim to test persons who have a
symptom or other evidence of potential disease.
- Test examples: chest x-ray, biopsy, blood/urine
test, colonoscopy
126Quality of Screening Tests
- Depends on
- Validity: ability of the test to distinguish
between who has a disease and who does not
- A perfect test would be perfectly valid
- Reliability: repeatability of a test
- A perfectly reproducible method of disease
ascertainment would produce the same results
every time it was used in the same individual.
127Validity
- Components (expressed as percentages)
- Sensitivity: the ability of the test to identify
correctly those who HAVE the disease (the search
for diseased persons)
- Specificity: the ability of the test to identify
correctly those who DO NOT HAVE the disease (the
search for well persons)
- Sensitivity and specificity quantify a test's
accuracy in the presence of known disease status
- Note: When calculating sensitivity or
specificity, another more definitive test (gold
standard) is used to know who really has or does
not have the disease, e.g., FOBT then colonoscopy
w/ biopsy (the gold standard will determine true
presence of cancer)
1282 X 2 Table
129Disease Screening
130Disease Screening
131Screening Measures
- Sensitivity = A / (A + C) or TP / (TP + FN)
- Specificity = D / (B + D) or TN / (TN + FP)
- Prevalence = (A + C) / N
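The three measures above can be sketched directly from the cells of a 2 x 2 table, taking A = true positives, B = false positives, C = false negatives, and D = true negatives, as the formulas imply. The function name and the counts are hypothetical.

```python
def screening_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, and prevalence from 2 x 2 table counts."""
    n = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),  # A / (A + C)
        "specificity": tn / (tn + fp),  # D / (B + D)
        "prevalence": (tp + fn) / n,    # (A + C) / N
    }


# Hypothetical screening of 1,000 people, 100 of whom truly have the disease.
result = screening_measures(tp=80, fp=100, fn=20, tn=800)
for name, value in result.items():
    print(f"{name}: {value:.1%}")
```

In this made-up example the test finds 80 of the 100 diseased people and correctly clears 800 of the 900 well people.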
132Screening
- The ideal screening test would identify all
tested subjects as True Positives or True
Negatives.
- Ideally, we want a test to have 100% sensitivity
and 100% specificity.
- A rise in sensitivity is usually accompanied
by a drop in specificity, and vice versa.
133False Positives
- The complement of specificity is the false
positive value: people who are well but are
classified as having the disease being screened
for. (1 - specificity = FP)
- These cases are usually brought in for further
investigation; this can cause problems.
- Burden on health care system/costly
- Needless anxiety
- Difficulty of removing label of disease
- Problems in employment
134False Negatives
- The complement of sensitivity is the false
negative value: people who actually have the disease
but are determined to be well.
- (1 - sensitivity = FN)
- Diseased persons are erroneously missed
- If the disease is serious and early intervention is
successful (e.g., cancer), a missed case could be fatal.
- Importance of false negative results depends on
- Nature and severity of disease being screened for
- Effectiveness of available measures
- Whether effectiveness is greater when intervention is early
135Reliability (Repeatability) of Tests
- Can the results of a test be replicated?
- If a test is valid but NOT reliable, results are meaningless
- Factors that contribute to variation between test results:
- Intra-subject variation: variation within individual subjects
- E.g., blood pressure variation in an individual
- Inter-observer variation: variation between those reading test results
136Inter-observer Variation
- Two observers may not always arrive at the same results
- The extent or magnitude of disagreement is important; if disagreement is large, the results are less meaningful or less reliable
- Variation between observers can be quantified by calculating a percent agreement
137Kappa Statistic (k)
- Two observers can be expected to agree about certain subjects solely as a function of chance
- Answers the question: To what extent do the results agree beyond what we would expect by chance alone?
- Kappa is a chance-corrected measure of repeatability
- Kappa = 1: Complete agreement
- Kappa > 0.75: Excellent agreement
- Kappa 0.40 to 0.75: Intermediate to good agreement
- Kappa < 0.40: Poor agreement
138Kappa Statistic (k)
Kappa = [(% observed agreement) - (% agreement expected by chance alone)] / [100% - (% agreement expected by chance alone)]
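A minimal sketch of the kappa calculation for two observers who each rate the same subjects as positive or negative. It works in proportions rather than percentages, and it assumes the usual way of estimating chance agreement from each observer's overall share of positive and negative readings; the function name and counts are hypothetical.

```python
def kappa(both_pos, obs1_only, obs2_only, both_neg):
    """Chance-corrected agreement between two observers.

    both_pos / both_neg: subjects on which the observers agree.
    obs1_only / obs2_only: subjects only observer 1 (or 2) called positive.
    """
    n = both_pos + obs1_only + obs2_only + both_neg
    observed = (both_pos + both_neg) / n
    # Share of subjects each observer called positive
    p1_pos = (both_pos + obs1_only) / n
    p2_pos = (both_pos + obs2_only) / n
    # Agreement expected if the two sets of readings were unrelated
    expected = p1_pos * p2_pos + (1 - p1_pos) * (1 - p2_pos)
    return (observed - expected) / (1 - expected)

# Hypothetical readings of 100 films by two observers
k = kappa(both_pos=40, obs1_only=10, obs2_only=5, both_neg=45)
print(f"kappa = {k:.2f}")
```

In this example the observers agree on 85% of subjects, 50% agreement is expected by chance, and kappa falls in the intermediate-to-good range.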
139Relation Between Validity and Reliability
- Graphically, a narrow curve indicates that the results are quite reliable (repeatable); if the curve is far from the true value, the results are not valid.
- A broad curve indicates low reliability, but the results are valid if clustered around the true value.
- Goal is to achieve results that are both valid and reliable.
140Relation Between Validity and Reliability
Valid but not reliable
Both valid and reliable
Reliable but invalid
141Statistics The Basics
142Definitions
- Statistics: a branch of applied mathematics that utilizes procedures for condensing, describing, analyzing and interpreting sets of information.
- Biostatistics: a subset of statistics used to handle health-relevant information.
143Types of Statistics
- Descriptive statistics: methods of producing quantitative summaries of information
- Measures of central tendency
- Measures of dispersion
- Inferential statistics: methods of making generalizations about a larger group based on information about a subset (sample) of that group
144Populations and Samples
- Before the statistical test to use is determined, we must know if the information represents a population or a sample
- A population is an aggregate of cases, things, people, etc.
- A sample is a subset which should be representative of a population if selected randomly (i.e., each subject should have the same chance for selection as every other subject)
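A simple random sample, in which every subject has the same chance of selection, can be sketched with Python's standard library. The population of 1,000 numbered records, the sample size, and the seed are assumptions made for illustration.

```python
import random

# Hypothetical population: record IDs 1 through 1000
population = list(range(1, 1001))

# A fixed seed only makes the illustration repeatable
rng = random.Random(2005)

# Draw 50 distinct subjects; each ID is equally likely to be chosen
sample = rng.sample(population, k=50)
print(sorted(sample)[:10])
```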
145Classification of Data
- Qualitative: non-numeric or categorical (what type?)
- Examples: gender, race/ethnicity
- Quantitative: numeric, either discrete or continuous (how much?)
- Examples: age, temperature, blood pressure
146Classification of Data
- Discrete: having a fixed number of values
- Ordinal (ordered), nominal (unordered)
- Examples: marital status, blood type, number of children in a family, number of attacks of asthma/day
- Continuous: having an infinite number of values in theory; can take any value within a given range
- Examples: height, weight, temperature, heart rate
147Hint
- Qualitative (categorical) data are discrete
- Quantitative (numerical) data may be discrete or
continuous
148Qualitative Data: Nominal
- Data which fall into mutually exclusive categories (discrete) for which there is no natural order
- Examples: race/ethnicity, gender, blood group, marital status, ICD-10 codes, dichotomous data (binomial data) such as HIV+ or HIV-, yes or no, absent or present
149Qualitative Data: Ordinal
- Data which fall into mutually exclusive categories (discrete data) which have a rank or graded order.
- Examples: grades, socioeconomic status, stage of disease, tumor size, low - medium - high
150Quantitative Data: Interval
- Data which are measured by standard units
- The scale measures not only that one data point is different from another, but by how much
- Examples:
- Number of days since onset of illness (discrete)
- Temperature in Fahrenheit or Celsius (continuous)
151Descriptive Statistics
- Get a feel for the data.
- Assess the quality of the data
- Type of variables
- Summary statistics
- Distribution
- Pictorial representation
152Descriptive Statistics
- Used as a first step to look at health-related outcomes
- Examine numbers of cases to identify an increase (epidemic)
- Examine patterns of cases to see who gets sick (demographic variables) and where and when they get sick (space/time variables)
153Descriptive Statistics
- Measures of central tendency
- Mean
- Median
- Mode
- Measures of dispersion
- Variance
- Standard deviation (square root of the variance): 1sd, 2sd, 3sd
- Percentiles: 25th, 50th, 75th, 95th
- Range: largest value - smallest value
154Mean
- Most commonly used measure of central tendency
- Arithmetic average
- The mean is affected by extreme values/sensitive
to outliers
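A small sketch of how a single extreme value pulls the mean. The ages are made up for illustration.

```python
from statistics import mean

# Hypothetical ages of five patients
ages = [30, 32, 35, 37, 41]
mean_without = mean(ages)

# Add one extreme value and recompute
ages_with_outlier = ages + [95]
mean_with = mean(ages_with_outlier)

print(mean_without, mean_with)
```

One 95-year-old moves the mean from 35 to 45, even though five of the six patients are 41 or younger.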
155Mean - Example
156Median
- The value which divides a ranked set into two equal parts
- Order the data
- If n is even, take the mean of the two middle observations
- If n is odd, the median is the middle observation
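The steps above can be sketched as a short function. The input values are hypothetical.

```python
def median(values):
    """Order the data, then take the middle observation (odd n)
    or the mean of the two middle observations (even n)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_example = median([7, 1, 5, 3, 9])   # ordered: 1, 3, 5, 7, 9
even_example = median([7, 1, 5, 3])     # ordered: 1, 3, 5, 7
print(odd_example, even_example)
```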
157Median - Example
158Mode
- The number which occurs the most frequently in a set.
- Example: 1, 2, 2, 2, 3, 4, 5, 5, 6, 7, 8; Mode = 2
159(No Transcript)
160Example
161(No Transcript)
162Variance and Standard Deviation
- Measures of dispersion (or scatter) of the values around the mean
- If the numbers are near the mean, the variance is small
- If the numbers are far from the mean, the variance is large
- The standard deviation is an estimate of the variability of observations; it is a summary of how widely dispersed the values are around the mean.
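A minimal sketch of both measures, computed from squared distances to the mean. The slides do not say whether to divide by n or by n - 1, so the function supports both conventions as an assumption; the data are hypothetical.

```python
from math import sqrt

def variance(values, sample=True):
    """Average squared distance from the mean.

    sample=True divides by n - 1 (data are a sample);
    sample=False divides by n (data are the whole population).
    """
    n = len(values)
    m = sum(values) / n
    sum_sq = sum((x - m) ** 2 for x in values)
    return sum_sq / (n - 1 if sample else n)

data = [2, 4, 4, 4, 5, 5, 7, 9]           # mean is 5
pop_var = variance(data, sample=False)
pop_sd = sqrt(pop_var)                    # standard deviation = square root of variance
sample_var = variance(data)
print(pop_var, pop_sd, round(sample_var, 3))
```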
163Variance and Standard Deviation
164Standard Deviation
165Percentiles
A set of divisions that produce exactly 100 equal parts in a series of continuous values, such as children's heights or weights.
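Several conventions exist for computing a percentile from a small data set. The sketch below uses the nearest-rank method, which is one assumption among several possible ones, and the heights are made up.

```python
from math import ceil

def percentile(values, p):
    """Nearest-rank percentile: the smallest ordered value with at
    least p percent of the observations at or below it (p in 1..100)."""
    ordered = sorted(values)
    rank = max(1, ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical heights (cm) of ten children
heights = [98, 101, 103, 104, 106, 108, 110, 112, 115, 120]
p25, p50, p75, p95 = (percentile(heights, p) for p in (25, 50, 75, 95))
print(p25, p50, p75, p95)
```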
166Range
- The difference between the largest and smallest values in a distribution
- Example: 1, 2, 2, 2, 3, 4, 5, 6, 7, 8
- Range = 8 - 1 = 7
167Rates
- The frequency of defined events in a specified population for a given time period.
- A rate is a proportion.
- A proportion is a ratio in which the numerator is included in the denominator.
- Usually expressed as fractions, decimals or percentages.
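A short sketch of a rate calculation. Scaling to "per 100,000 population" is an assumption added here as a common way to make small proportions readable; the case count and population are hypothetical.

```python
def rate(events, population, per=100_000):
    """Events in a specified population over a given period,
    scaled to a convenient base (assumed here: per 100,000)."""
    return events * per / population

# Hypothetical: 150 new cases in one year in a population of 500,000
proportion = 150 / 500_000
cases_per_100k = rate(150, 500_000)
print(f"{proportion:.4%} of the population, or {cases_per_100k:.0f} per 100,000")
```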
168Frequency Diagram
169Histogram
170Histogram vs. Bar Chart
- There is a difference between the two
- A histogram shows the distribution of a continuous variable; thus, there should not be any gaps between the bars.
- A bar chart shows the distribution of a discrete variable or a categorical one and will have spaces between the bars.
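The distinction shows up in how the data are counted before anything is drawn. In this text-only sketch, a continuous variable is grouped into adjoining intervals (so bars would touch), while a categorical variable is counted per separate category. The lead concentrations, bin width, and blood types are hypothetical.

```python
from collections import Counter

# Continuous variable: hypothetical lead concentrations
lead = [1.2, 2.8, 3.1, 3.9, 4.4, 5.0, 5.7, 6.3, 7.9, 9.6]
bin_width = 2.0

# Each value falls into one adjoining interval: [0, 2), [2, 4), ...
hist = Counter(int(x // bin_width) for x in lead)
for i in sorted(hist):
    low = i * bin_width
    print(f"[{low:.0f}, {low + bin_width:.0f})  {'#' * hist[i]}")

# Categorical variable: each category is its own separate bar
blood_type = ["O", "A", "A", "B", "O", "AB", "O", "A"]
bars = Counter(blood_type)
for category, count in bars.items():
    print(f"{category:<3} {'#' * count}")
```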
171Histogram Example
Lead Concentration
172(No Transcript)
173(No Transcript)
174COHORT STUDIES
175Epidemiologic Studies
- Identify new diseases
- Identify populations at risk for a disease
- Identify possible causative agents of disease
- Identify factors or behaviors that increase risk
of a disease
176Study Designs
- Means to assess possible causes by gathering and analyzing evidence.
- The key to any epidemiologic study is in the definition of what constitutes a case and what constitutes an exposure.
177Types of Study Designs
- Analytic Studies (to test hypotheses)
- Experimental studies
- Randomized clinical trials
- Observational studies
- Cohort studies (prospective study)
- Case-control studies
- Cross-sectional studies
178Cohort Study Design
- Investigator starts with a group of individuals apparently free of disease
- The cohort is divided on the basis of exposure
- Those exposed to the possible risk factor
- Those not exposed to the risk factor
- The cohort is followed through time to determine the outcome of interest
- Measure/compare the incidence of disease or rate of death from disease in the two groups
- If there is a positive association, the incidence rate in the exposed group will be greater.
179TIME (diagram)
Begin in the present with the Exposure and No Exposure groups; follow each group forward through time to Disease or No Disease.
180Incidence rates of disease
Exposed = a / (a + b)
Not Exposed = c / (c + d)
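A minimal sketch of the comparison, assuming the usual cohort table layout implied by these formulas (a = exposed with disease, b = exposed without disease, c = not exposed with disease, d = not exposed without disease). The counts are made up for illustration.

```python
def cohort_incidence(a, b, c, d):
    """Return (incidence in exposed, incidence in not exposed).

    Assumed layout: a, b = exposed with / without disease;
    c, d = not exposed with / without disease.
    """
    return a / (a + b), c / (c + d)

# Hypothetical cohort: 3,000 exposed and 5,000 not exposed
exposed, not_exposed = cohort_incidence(a=84, b=2916, c=87, d=4913)
print(f"Exposed: {exposed * 1000:.1f} per 1,000")
print(f"Not exposed: {not_exposed * 1000:.1f} per 1,000")
print("Positive association" if exposed > not_exposed else "No positive association")
```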
181(No Transcript)
182(No Transcript)
183Types of Cohort Studies
- Concurrent Cohort Study (prospective or longitudinal)
- Retrospective Cohort Study (historical cohort or non-concurrent prospective study)
- Both designs are identical: comparing exposed and non-exposed populations
- The only difference is calendar time.
184Concurrent Cohort Study
- Investigator identifies original study population at the beginning of the study.
- The individuals are followed prospectively through time until disease develops or does not develop.
- Disadvantages:
- Requires long follow-up time (years)
- Expensive
- Age of investigator
185Retrospective Cohort Study
- The cohort is defined from historical data and followed up for disease up to the present time
- Can telescope the frame of calendar time for the study and obtain results sooner
- Disadvantage:
- Quality depends on the historical data that are available both to define exposure and to identify the outcome.
186Types of Cohort Studies
- In a concurrent cohort study design, exposure and non-exposure status are ascertained as they occur in the study; groups are followed up for several years into the future and incidence is measured.
- In a retrospective cohort study design, exposure is ascertained from past records, and outcome (development or no development of disease) is ascertained from existing records at the beginning of the study.
187Assessment of Exposure
- Techniques used to measure exposure include questionnaires (age, smoking habits), laboratory tests (cholesterol, hemoglobin), physical measurements (height, weight, blood pressure) and various special procedures (EKG, x-rays)
- Quantifying exposures includes information such as date of onset, frequency of exposure, and duration and intensity of exposure
188Assessment of Exposure
- Changes in exposure status frequently occur.
- Example: smokers quit smoking and people switch occupations
- This information must be incorporated into the analysis and interpretation.
189Sources of Error in Epi. Studies
- Bias: systematic preferences built into the study design
- Confounding: occurs when a variable included in the study design is related to both the outcome of interest and the exposure; this can lead to false conclusions.
- Examples: gambling and lung cancer; drinking coffee and lung cancer
190Potential Biases
- Bias in the assessment of the outcome
- Person deciding on whether disease developed or not may be aware of exposure status of a subject and may be potentially biased (blinding may be helpful)
- Information bias
- Incomparable quality of information between exposed and non-exposed subjects
191Potential Biases
- Biases from non-response and loss to follow-up
- Non-participation and non-response can lead to bias
- Loss to follow-up can introduce bias. If people with the disease are lost, interpretation of results is difficult.
- Analytic bias
- Preconceived notions of investigators may unintentionally introduce bias
- Epidemiologists have to work with dirty data. The trick is to do it with a clean mind.
192Advantages of a Cohort Study
- Can assess multiple outcomes (effects) of a single exposure
- It is suitable for the study of rare exposures
- Can demonstrate a temporal relationship between exposure and disease
- If prospective, minimizes bias in the ascertainment of exposure
- Allows direct measurement of incidence of disease in the exposed and non-exposed population
193Disadvantages of a Cohort Study
- Is inefficient for the evaluation of a rare disease, unless a large sample size is obtained
- Disease process may already be underway and not known at the onset of the study
- If prospective, can be very expensive and time consuming
- If retrospective, requires the availability of existing records
- Validity of the result can be affected by loss to follow-up and tracking study subjects
194To Review
- The objective of the cohort study is to test a hypothesis regarding the causation of disease.
- The group of persons to be studied (cohort) is defined in terms of characteristics manifest prior to appearance of the disease being investigated.
- The defined study groups are observed over a period of time to determine and compare the frequency of disease among them.
195Exercise
- How would you design a cohort study of the
association between preterm delivery and
cigarette smoking?
196Results
- An exposed and a non-exposed group would first be identified, e.g., women presenting to a local county health department for prenatal care would be classified by smoking status: smokers and non-smokers.
- These women would be followed to determine whether or not preterm delivery occurred.
- The rates of preterm delivery would be compared among the smokers and non-smokers.
197CASE-CONTROL STUDIES
198Types of Study Designs
- Analytic Studies (to test hypotheses)
- Experimental studies
- Randomized clinical trials
- Observational studies
- Cohort studies (prospective study)
- Case-control studies
- Cross-sectional studies
199CASE CONTROL STUDIES
- Definition: comparison of exposure frequencies between persons with a specified illness or injury (CASES) and other persons (CONTROLS).
- The hallmark of this study design is that it begins with people with the disease (cases) and compares them to people without the disease (controls).
200TIME (diagram)
Begin in the present with Disease (cases) and No Disease (controls); look back in time to determine Exposure or No Exposure in each group.
201Proportions Exposed
Cases Exposed = a / (a + c)
Controls Exposed = b / (b + d)
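A minimal sketch of the case-control comparison, assuming the table layout implied by these formulas (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls). The counts are hypothetical.

```python
def proportion_exposed(a, b, c, d):
    """Return (proportion of cases exposed, proportion of controls exposed).

    Assumed layout: a, c = exposed / unexposed cases;
    b, d = exposed / unexposed controls.
    """
    return a / (a + c), b / (b + d)

# Hypothetical study: 200 cases and 400 controls
cases_exposed, controls_exposed = proportion_exposed(a=112, b=176, c=88, d=224)
print(f"Cases exposed: {cases_exposed:.0%}")
print(f"Controls exposed: {controls_exposed:.0%}")
```

Here exposure is more common among cases (56%) than among controls (44%).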
202SELECTION OF CASES
203Diagnostic Criteria
- Establish a set of objective diagnostic criteria for case selection
- Best to use standardized, expert criteria for disease diagnosis, e.g., ICD code
204Criteria for Eligibility
- Establish a set of inclusion and exclusion criteria for cases
- Some reasons may exist to exclude cases:
- Existence of a chronic disease other than the disease under study
- Mental problems that might preclude exposure ascertainment
205Incident vs. Prevalent Cases
- Usually best to use incident (newly diagnosed) cases
- The inclusion of prevalent cases may introduce bias and other problems with determining that exposure preceded disease.
206Sources of Cases
- Representativeness of cases is very important!
- Hospital-based cases are the most commonly used (depends on the disease being studied)
- Random sample of the general population
- Highly representative
- Very time intensive and labor intensive
- Cases can be identified from other sources such as cancer registries, ambulatory care facilities, medical insurance companies, retirement groups, etc.
207SELECTION OF CONTROLS
- Challenge
- If a case-control study is conducted and more
exposure is found in the cases than in the
controls, then we would like to be able to
conclude that there is an association between the
exposure and the disease in question. The way
the controls are selected is a major determinant
of whether such a conclusion is valid.
208Controls
- Controls are intended to provide the frequency of exposure (risk factor) among people without the disease in the population from which cases were identified.
- Criteria for eligibility: inclusion and exclusion criteria should be the same for both cases and controls
209Types of Controls
- Hospital-based Controls
- Individuals seeking medical care at the same hospital as the cases for a condition believed to be unrelated to the disease being studied
- The hospital population may be very different from the general population, so results are less generalizable
- Neighborhood Controls
- Usually of similar socioeconomic status
- Similar environment
210Types of Controls Cont.
- General Population Controls
- Controls selected from a random sample of the general population, e.g., via random digit dialing
- Provides highly representative controls
- Costly method, refusal rates, lack of phone coverage
- Other Controls
- Friends, relatives, spouses, colleagues, co-workers