Title: Summary of Molecular Cancer Epidemiology
1Summary of Molecular Cancer Epidemiology
- EPI243 Molecular Cancer Epidemiology
- Zuo-Feng Zhang, MD, PhD
2Molecular Epidemiology
- The goal of molecular epidemiology is to supplement and integrate, not to replace, existing methods
- Molecular epidemiology can be used to enhance the capacity of epidemiology to understand disease in terms of the interaction of environment and heredity.
3Molecular Epidemiology
- Studies utilizing biological markers of exposure, disease, and susceptibility
- Studies that apply current and future generations of biomarkers in epidemiologic research.
5Tasks for Molecular Epidemiologist
- The major tasks are
- to reduce misclassification of exposure,
- to assess the effect of exposure on the target tissue,
- to measure susceptibility/inherited predisposition to cancer,
- to establish the link between environmental exposures and gene mutations,
- to assess gene-environment interaction,
- to set up prevention/intervention strategies.
6High Throughput Techniques
- Microarray technology
- DNA chips
- cDNA array format
- In situ synthesized oligonucleotide format (Affymetrix)
- Proteomics
- Tissue arrays
- These are powerful high-throughput methods for studying gene expression, but they are not the answers themselves
- Individual targets/patterns identified need to be validated
- In epidemiological studies, these methods can be used to identify specific exposure-induced molecular changes, to support individual risk assessment, etc.
8Proteomics
- Examines protein-level expression in a high-throughput manner
- Used to identify protein markers/patterns associated with disease/function
- Different formats
- SELDI-TOF (surface-enhanced laser desorption ionization time-of-flight): the protein-chip arrays, the mass analyzer, and the data-analysis software
- 2D-PAGE coupled with MALDI-TOF (matrix-assisted laser desorption ionization time-of-flight)
- Antibody-based formats
10Fig 1
[Figure: panel A, GTE (20 µg/ml); panel B, GTE (40 µg/ml). Axes: pI and MW (kDa), with numbered spots (1-20). Legend labels: Time (24 hr, 48 hr) and GTE.]
11Tissue Array
- Provides a new high-throughput tool for studying gene dosage and protein expression patterns in a large number of individual tissues, allowing rapid and comprehensive molecular profiling of cancer and other diseases without exhausting limited tissue resources.
- A typical tissue array application is the search for oncogene amplifications in large tumor tissue panels. Large-scale studies involving tumors of differing stages and grades are necessary to validate putative markers more efficiently and ultimately to correlate genotypes with phenotypes.
- Also applicable to any medical research discipline in which paraffin-embedded tissues are used, including structural, developmental, and metabolic studies.
12Bladder Array
Gelsolin
HE
13DNA Methylation
- DNA methylation plays an important role in normal cellular processes, including X chromosome inactivation, imprinting control, and transcriptional regulation of genes
- It is found predominantly on cytosine residues in CpG dinucleotides (CpG islands), producing 5-methylcytosine
- CpG islands are frequently located in or around transcription start sites
14DNA Methylation (Contd)
- Aberrant DNA methylation is one of the most common features of human neoplasia
- Two major potential mechanisms for aberrant DNA methylation in carcinogenesis
- Silencing of tumor suppressor genes (e.g., the p16 gene)
- Point mutation: C-to-T transition (e.g., the p53 gene)
Source: Royal Society of Chemistry
15Promoter-Region Methylation
- Promoter-region CpG island methylation
- Is rare in normal cells
- Occurs in virtually every type of human neoplasm
- Is associated with inappropriate transcriptional silencing
- Is an early event in tumor progression
- In tumor suppressor genes
- Most tumor suppressor genes are under-methylated in normal cells but methylated in tumor cells. Methylation is often correlated with a decreased level of gene expression and can be found in premalignant lesions
16DNA methyltransferases
- DNMTs catalyze the transfer of a methyl group (CH3) from S-adenosylmethionine (SAM) to the carbon-5 position of cytosine, producing 5-methylcytosine
- Several DNA methyltransferases have been discovered, including DNMT1, DNMT3a, and DNMT3b
17NORMAL CIN 1 CIN 2
CIN 3
NORMAL LGSIL HG SIL
HGSIL
18Additional Molecular Event
Exposure to Carcinogen
Precancerous Intraepithelial Lesions, (PIN,
CIN, PaIN..)
Cancer
Birth
Surrogate End Point Markers
Markers for Exposure
Markers of Effect
Tumor Markers
Genetic Suscep. Marker
CHEMOPREVENTION
22Case-Control Studies
- Disease end-point is the major interest
- Clinic (hospital)-based or population-based case-control studies
- Inclusion of both questionnaire data and biological specimens
- Biological markers can be measured and compared between cases and controls, while other variables can be treated as either confounding factors or effect modifiers
23Prospective Cohort Studies
- Exposure is measured before the outcome
- The source population is defined
- The participation rate is high if specimens are available for all subjects and follow-up is complete
24Nested Case-Control Study
- The biomarker can be measured in specimens matched on storage duration
- The case-control set can be analyzed in the same laboratory batch, reducing the potential for bias introduced by sample degradation and laboratory drift
25Case-Case Study Design
- Case-only, case-series, etc.
- Studies of cases without controls
- Can be employed to evaluate etiological heterogeneity when studying tumor markers and exposure
- May be used to assess statistical gene-environment or gene-gene interactions
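A minimal sketch of a case-only interaction estimate, assuming the genotype and the exposure are independent in the source population (the key assumption of this design, not stated on the slide). Under that assumption, the genotype-exposure odds ratio among cases approximates the multiplicative interaction. The counts are the case counts from the IFNA17 x HBsAg table on slide 62, with the third and fourth rows of that block read as HBsAg-positive; the function name `case_only_or` is hypothetical.

```python
import math

def case_only_or(g1e1, g1e0, g0e1, g0e0, z=1.96):
    """Genotype-exposure odds ratio among cases only, with a Woolf 95% CI.

    g1e1: cases with variant genotype and exposure
    g1e0: cases with variant genotype, unexposed
    g0e1: cases with reference genotype, exposed
    g0e0: cases with reference genotype, unexposed
    """
    odds_ratio = (g1e1 * g0e0) / (g1e0 * g0e1)
    se_log = math.sqrt(1 / g1e1 + 1 / g1e0 + 1 / g0e1 + 1 / g0e0)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Cases only (slide 62): IFNA17 RI+RR vs II, HBsAg positive vs negative.
cor, lo, hi = case_only_or(g1e1=107, g1e0=50, g0e1=20, g0e0=13)
print(f"case-only OR = {cor:.2f} ({lo:.2f}-{hi:.2f})")
```

The case-only estimate differs from the crude ORint of 1.81 reported on slide 62 because genotype and HBsAg status are not exactly independent among the controls in that table, which illustrates why the independence assumption matters.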
26Intervention Studies
- In studies of smoking cessation interventions, we can measure serum cotinine or protein or DNA adducts (exposure), or p53 mutation, dysplasia, and cell proliferation (intermediate markers for disease)
- Compliance with the intervention can be measured, for example by assaying serum β-carotene in a randomized trial of β-carotene.
27Intervention Studies
- Susceptibility markers (GSTM1) can also be used
to determine whether the randomization is
successful (comparable intervention and control
arms)
28Family Studies
- Does familial aggregation exist for a specific disease or characteristic?
- Is the aggregation due to genetic factors, environmental factors, or both?
- If a genetic component exists, how many genes are involved and what is their mode of inheritance?
- What is the physical location of these genes and what is their function?
29Sample Size and Power
- False positive (alpha level, or Type I error). The alpha levels traditionally used and accepted are 0.01 and 0.05. The smaller the alpha level, the larger the required sample size.
30Power or Sample Size Estimate for Case-Control
Studies
- Alpha level (false positive): 0.05
- Beta level (false negative; 1 - beta = power): 0.20
- Delta level: proportion of exposure in controls and in cases, or the expected odds ratio
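The three inputs above can be combined into a sample size estimate. The following is a sketch only, assuming an unmatched case-control design with one control per case, a two-sided test, and the standard normal-approximation formula for comparing two proportions. The exposure prevalence of 0.25 in controls and the expected odds ratio of 2.0 are illustrative assumptions, not values from the slides, and the function names are hypothetical.

```python
import math
from statistics import NormalDist

def exposure_in_cases(p0, odds_ratio):
    """Exposure proportion in cases implied by the control proportion and the OR."""
    return p0 * odds_ratio / (1 + p0 * (odds_ratio - 1))

def cases_needed(p0, odds_ratio, alpha=0.05, power=0.80):
    """Number of cases (with an equal number of controls) for an unmatched study."""
    p1 = exposure_in_cases(p0, odds_ratio)
    p_bar = (p0 + p1) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided alpha
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p0 * (1 - p0))
    ) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)

# Illustrative inputs (assumed): 25% of controls exposed, expected OR of 2.0.
n_cases = cases_needed(p0=0.25, odds_ratio=2.0)
n_cases_strict = cases_needed(p0=0.25, odds_ratio=2.0, alpha=0.01)
print(n_cases, n_cases_strict)
```

Lowering alpha from 0.05 to 0.01 increases the required number of cases, which is the relationship stated on the previous slide.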
31Interaction Assessment
                        Factor B
                    Absent   Present
Factor A  Absent    RR00     RR01
          Present   RR10     RR11
32Sample Size Consideration for Interaction
Assessment
- Evaluation of interaction requires a substantial increase in study size. For example, in a case-control study, assessing interaction involves comparing the sizes of the odds ratios (relating exposure and disease) in different strata of the effect modifier, rather than merely testing whether the overall odds ratio differs from the null value of 1.0.
33Introduction
- Sample collection procedures, such as handling, labeling, processing, aliquoting, storage, and transportation, may affect the results of the study
- If case samples are handled differently from control samples, differential misclassification may occur
34Information linked to Sample
- Time and date of collection
- Recent diet and supplement use
- Reproductive information (menstrual cycle)
- Recent smoking
- Current medication use
- Recent medical illness
- Storage conditions
35Quality Assurance
- Systematic Application of optimum procedures to
ensure valid, reproducible, and accurate results
36 -70 °C freezers
37Types of Biospecimens Blood
- Skilled technicians and precise procedures are important when performing phlebotomy, because painful, prolonged, or repeated attempts at venepuncture can cause patient discomfort or injury and result in less than optimum quality or quantity of sample.
38Types of Biospecimens Blood
- Plasma
- Serum
- Lymphocytes
- Erythrocytes
- Platelets
39Urine Collection
- Urine is an ultrafiltrate of the plasma. It can be used to evaluate and monitor metabolic disease processes, exposure to xenobiotic agents, mutagenicity, exfoliated cells, DNA adducts, etc.
40Tissue Collections
- Confirming clinical diagnosis by histological analysis
- Examining tumor characteristics at the chromosome and molecular level
41Laboratory Techniques with Tissue
tissue
RT-PCR
42Adipose Tissue
- Adipose tissue sampling is quite feasible for subjects and involves low risk. The tissue offers a relatively stable deposit of triglycerides and fat-soluble substances such as fat-soluble vitamins (vitamins A and D). It represents the greatest reservoir of carotenoids and reflects long-term dietary intake of essential fatty acids.
43Bronchoalveolar Lavage (BAL)
- BAL is used to assess and quantify asbestos exposures
- Induced sputum samples and BAL fluid (BALF) can also provide sufficient DNA for PCR assays.
44Exhaled Air
- To evaluate exposure to different substances, particularly solvents such as benzene and styrene
- To be used as a source of exposure and susceptibility markers (caffeine breath test for P450 1A2 activity)
- Breath urea (presence of urease-positive organisms such as H. pylori)
45Hair
- Easily available biological tissue whose typical morphology may reflect disease conditions within the body
- Provides a permanent record of trace elements associated with normal and abnormal metabolism
- A source for assessing occupational and environmental exposure to toxic metals
46Nail Clippings
- Toenail or fingernail clippings are obtained in a very easy and comfortable way.
- They do not require special processing, storage, or shipping conditions and are thus suitable for large epidemiological studies
47Buccal cells
- Noninvasive
- Good for PCR-analysis
- Can measure both germline and somatic mutations
48Saliva
- It is an efficient, painless, and relatively inexpensive source of biological materials for certain assays
- It provides a useful tool for measuring endogenous and xenobiotic compounds
49Breast Milk
- Measuring hormones, exposures to chemicals and biological contaminants (aflatoxin), and selenium levels
- Cells of interest
50Feces
- Certain cells of interest
- Infectious markers
- Oncogenes
51Semen
- Evaluate the effects of exposures on endocrine and reproductive factors.
- Sexual abstinence for at least 2 days but not exceeding 7 days.
- Samples should reach the lab within one hour.
52Storage
- Freezers may fail, so the facility needs 24-hour monitoring through a computerized alarm system that alerts personnel and activates backup equipment.
- Monitoring for fire, power loss, leakage, etc.
53Shipping
- Sample shipping requirements depend on the time, distance, climate, season, method of transport, applicable regulations, type of specimen, and markers to be assayed.
- Polyurethane boxes containing dry ice are used to ship and transport samples that require low temperature. For samples that require very low temperature, a liquid nitrogen container can be used.
- The quantity of dry ice should be carefully calculated based on the estimated duration of the trip.
54Safety
- Protect specimen from contamination
- Worker safety (HIV, HBV)
55Biomarker in Epidemiology Biomarkers of
Biological Agents
- HPV DNA by PCR-based assays
- HPV infection is often transient, especially in
young women so that repeated sampling is required
to assess persistent HPV infections
58Biomarker in Epidemiology Biomarkers of
Biological Agents
- HBV infection by serological assays.
- There are serological markers that distinguish
between past and persistent infections. HBV DNA
detection in sera further refines the assessment
of exposure.
60Background: Metabolism of aflatoxin B1
61Main Effects of HBsAg, AFB1 levels, and IFNA17 on liver cancer development
Variable  Level         Cases N (%)    Controls N (%)  Crude OR (95% CI)  Age/sex-adjusted OR (95% CI)  Fully adjusted OR (95% CI)
HBsAg     -             72 (35.3)      312 (75.4)      1                  1                             1
          +             132 (64.7)     102 (24.6)      5.61 (3.90-8.07)   5.21 (3.60-7.53)              5.68 (3.80-8.51)
AFB1      Mean (SD)     508.1 (328.7)  426.2 (250.4)
          <247          33 (18.1)      94 (24.9)       1                  1                             1
          247.1-388.8   46 (25.3)      94 (24.9)       1.39 (0.82-2.37)   1.38 (0.81-2.37)              1.15 (0.61-2.14)
          388.9-545     42 (23.1)      95 (25.2)       1.26 (0.74-2.16)   1.27 (0.74-2.20)              1.19 (0.64-2.21)
          >545.1        61 (33.5)      94 (24.9)       1.85 (1.11-3.08)   1.75 (1.04-2.94)              1.63 (0.90-2.96)
          p(trend)                                     0.031              0.055                         0.109
IFNA17    II            33 (17.4)      94 (24.5)       1                  1                             1
          RI            104 (54.7)     193 (50.4)      1.54 (0.97-2.44)   1.49 (0.93-2.38)              1.67 (0.95-2.93)
          RR            53 (27.9)      96 (25.1)       1.57 (0.94-2.64)   1.58 (0.93-2.68)              1.99 (1.06-3.73)
          p(HW)=0.878
          p(trend)                                     0.104              0.102                         0.037
          RI+RR         157 (82.6)     289 (75.5)      1.55 (1.00-2.41)   1.52 (0.97-2.38)              1.77 (1.04-3.03)
Model includes age, sex, BMI, education, alcohol consumption, tobacco smoking, HBsAg, imputed AFB1 levels, anti-HCV
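Two entries in this table can be checked directly from the counts. The sketch below assumes the crude odds ratio uses a Woolf (log-based) 95% confidence interval and that p(HW) comes from a 1-df chi-square goodness-of-fit test on the control genotypes; the slides do not state either method, and the function names are hypothetical. The second HBsAg row is read as HBsAg-positive.

```python
import math

def crude_odds_ratio(exp_cases, exp_controls, unexp_cases, unexp_controls, z=1.96):
    """Crude odds ratio with a Woolf (log-based) confidence interval."""
    odds_ratio = (exp_cases * unexp_controls) / (exp_controls * unexp_cases)
    se_log = math.sqrt(
        1 / exp_cases + 1 / exp_controls + 1 / unexp_cases + 1 / unexp_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

def hardy_weinberg_p(n_hom1, n_het, n_hom2):
    """P value of a 1-df chi-square test for Hardy-Weinberg proportions."""
    n = n_hom1 + n_het + n_hom2
    p = (2 * n_hom1 + n_het) / (2 * n)  # frequency of the first allele
    q = 1 - p
    observed = (n_hom1, n_het, n_hom2)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

# HBsAg: 132 positive cases, 102 positive controls, 72 negative cases, 312 negative controls.
hbsag_or, hbsag_lo, hbsag_hi = crude_odds_ratio(132, 102, 72, 312)
# IFNA17 genotypes among controls: II = 94, RI = 193, RR = 96.
hw_p = hardy_weinberg_p(94, 193, 96)
print(f"HBsAg crude OR = {hbsag_or:.2f} ({hbsag_lo:.2f}-{hbsag_hi:.2f})")
print(f"IFNA17 p(HW), controls = {hw_p:.3f}")
```

Under these assumptions the output should agree with the crude HBsAg estimate of 5.61 (3.90-8.07) and with p(HW) = 0.878, which suggests the Hardy-Weinberg test was run on the controls.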
62Interaction between HBV and AFB1 and IFNA17
Factor             HBsAg  Cases N (%)  Controls N (%)  Crude OR (95% CI)    Age/sex-adjusted OR (95% CI)  Fully adjusted OR (95% CI)
AFB1
<247               -      12 (6.6)     69 (18.4)       1                    1                             1
247.1-388.8        -      19 (10.4)    67 (17.8)       1.63 (0.74-3.62)     1.64 (0.73-3.65)              1.72 (0.73-4.08)
388.9-545          -      15 (8.2)     71 (18.9)       1.22 (0.53-2.78)     1.22 (0.53-2.80)              1.34 (0.55-3.27)
>545.1             -      17 (9.3)     77 (20.5)       1.27 (0.57-2.85)     1.26 (0.56-2.82)              1.15 (0.48-2.74)
<247               +      21 (11.5)    25 (6.6)        4.83 (2.08-11.23)    4.61 (1.97-10.80)             6.43 (2.56-16.16)
247.1-388.8        +      27 (14.8)    27 (7.2)        5.75 (2.55-12.96)    5.30 (2.34-12.02)             4.68 (1.92-11.38)
388.9-545          +      27 (14.8)    24 (6.4)        6.47 (2.84-14.74)    6.20 (2.70-14.21)             6.65 (2.72-16.25)
>545.1             +      44 (24.2)    16 (4.3)        15.82 (6.84-36.57)   13.75 (5.90-32.06)            16.72 (6.60-42.38)
ORint (1) (95% CI)                                     0.73 (0.24-2.24)     0.70 (0.23-2.18)              0.42 (0.12-1.45)
ORint (2) (95% CI)                                     1.10 (0.35-3.49)     1.10 (0.35-3.52)              0.77 (0.22-2.70)
ORint (3) (95% CI)                                     2.58 (0.82-8.12)     2.38 (0.75-7.55)              2.27 (0.65-7.92)
IFNA17
II                 -      13 (6.8)     66 (17.3)       1                    1                             1
RI+RR              -      50 (26.3)    220 (57.6)      1.15 (0.59-2.25)     1.14 (0.58-2.23)              1.34 (0.64-2.82)
II                 +      20 (10.5)    27 (7.1)        3.76 (1.64-8.62)     3.49 (1.51-8.04)              3.99 (1.54-10.32)
RI+RR              +      107 (56.3)   69 (18.1)       7.87 (4.04-15.34)    7.17 (3.66-14.06)             9.18 (4.34-19.43)
ORint (95% CI)                                         1.81 (0.71-4.62)     1.81 (0.71-4.63)              1.71 (0.60-4.92)
Model includes age, sex, BMI, education, alcohol consumption, tobacco smoking, imputed AFB1 levels, anti-HCV
(1) ORint for AFB1 (247.1-388.8 fmol/mg) and HBsAg
(2) ORint for AFB1 (388.9-545 fmol/mg) and HBsAg
(3) ORint for AFB1 (>545.1 fmol/mg) and HBsAg
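The crude ORint in the IFNA17 block can be rebuilt from the joint odds ratios. This is a sketch under two assumptions: the third and fourth IFNA17 rows are HBsAg-positive, and ORint is the ratio of the joint odds ratio to the product of the two single-factor odds ratios (departure from a multiplicative model). The relative excess risk due to interaction (RERI), an additive-scale measure, is added for comparison and is not reported on the slide; all function names are hypothetical.

```python
def joint_odds_ratios(cases, controls, reference=(0, 0)):
    """Crude odds ratio of every cell relative to the doubly unexposed reference cell."""
    ref_odds = cases[reference] / controls[reference]
    return {cell: (cases[cell] / controls[cell]) / ref_odds for cell in cases}

def interaction_measures(joint):
    """Multiplicative interaction OR and additive RERI from joint odds ratios."""
    or10, or01, or11 = joint[(1, 0)], joint[(0, 1)], joint[(1, 1)]
    or_int = or11 / (or10 * or01)   # 1.0 means no multiplicative interaction
    reri = or11 - or10 - or01 + 1   # 0.0 means no additive interaction
    return or_int, reri

# Keys are (IFNA17 RI+RR carrier, HBsAg positive); counts come from the table above.
cases = {(0, 0): 13, (1, 0): 50, (0, 1): 20, (1, 1): 107}
controls = {(0, 0): 66, (1, 0): 220, (0, 1): 27, (1, 1): 69}

joint = joint_odds_ratios(cases, controls)
or_int, reri = interaction_measures(joint)
print({cell: round(value, 2) for cell, value in joint.items()})
print(f"ORint = {or_int:.2f}, RERI = {reri:.2f}")
```

With these counts the joint odds ratios and the ORint should match the crude column (1.15, 3.76, 7.87, and 1.81). The adjusted columns cannot be reproduced this way because they come from a regression model with covariates.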
63Interaction between HBsAg and IFNA17 stratified by AFB1
AFB1     HBsAg  IFNA17          Cases N  Controls N  Crude OR (95% CI)    Age/sex-adjusted OR (95% CI)  Fully adjusted OR (95% CI)
<388.9   -      II              8        26          1                    1                             1
         -      RI+RR           20       99          0.66 (0.26-1.66)     0.63 (0.24-1.62)              0.70 (0.24
         +      II              9        13          2.25 (0.70-7.19)     2.04 (0.62-6.74)              2.07 (0.52-8.18)
         +      RI+RR           37       37          3.25 (1.30-8.11)     2.81 (1.10-7.19)              3.45 (1.21-9.83)
                ORint (95% CI)                       2.20 (0.58-8.38)     2.20 (0.56-8.70)              2.39 (0.50-11.45)
>388.9   -      II              5        34          1                    1                             1
         -      RI+RR           25       104         1.63 (0.58-4.60)     1.62 (0.58-4.59)              2.09 (0.64-6.86)
         +      II              11       9           8.31 (2.29-30.10)    8.07 (2.21-29.42)             9.22 (2.08-40.86)
         +      RI+RR           57       27          14.35 (5.05-40.77)   13.88 (4.80-40.09)            21.80 (6.36-74.75)
                ORint (95% CI)                       1.06 (0.25-4.44)     1.06 (0.25-4.45)              1.13 (0.22-5.81)
Model includes age, sex, BMI, education, alcohol consumption, tobacco smoking, HCV
72Biomarker of Dietary Intake
- Whether it is a good indicator of intake
- Whether it is a long- or short-term marker
- Whether there is a need for multiple measurements
- Whether it is acceptable to the researcher and the subject
- Whether it is compatible with the study design
76Main component of green tea: catechins
(-)-Epigallocatechin gallate ((-)EGCg)
77 PHIP DNA Adducts
78P32 postlabeling
81Susceptibility Markers
- Susceptibility markers are a group of biological markers that may make an individual susceptible to cancer.
- These markers may be genetically inherited or determined, or acquired.
- They are independent of environmental exposures.
82Biomarker of Genetic Susceptibility
- High risk genes
- Low risk genes
83Genetic Susceptibility to Cancer
010205
84McCarthy MI, Nature Reviews Genetics, 2008
85DNA damage repaired
Defective DNA repair gene
If DNA damage not repaired
If lose cell cycle control
86Non-homologous Recombination
homologous recombination
BRCA1
BRCA2
Damage recognition cell cycle delay response (DRCCD)
ATM
CHEK2 (RAD53)
BRCA1
88Baseline characteristics of each study
                 LA Study                               Taixing City Study                                            MSKCC study
                 Lung cancer  UADT cancer  Controls     Stomach cancer  Esophageal cancer  Liver cancer  Controls     Bladder cancer  Controls
                 cases (%)    cases (%)    (%)          cases (%)       cases (%)          cases (%)     (%)          cases (%)       (%)
Total            611          601          1040         206             218                204           415          233             204
Age range        32-59        20-59        17-65        30-82           30-84              22-83         21-84        32-84           17-80
Age, mean        52.2         50.3         49.9         61.5            60.6               53.8          57.7         64.8            42.0
Gender
Males            303 (49.6)   391 (74.2)   623 (59.9)   138 (67.0)      141 (64.7)         159 (77.9)    287 (69.2)   206 (83.4)      156 (77.2)
Females          308 (50.4)   136 (25.8)   417 (40.1)   68 (33.0)       77 (35.3)          45 (22.1)     128 (30.8)   41 (16.6)       46 (22.8)
Education
< High school    265 (43.4)   240 (45.5)   300 (28.9)   204 (99.5)      215 (100.0)        204 (100.0)   405 (97.6)   95 (40.8)       34 (16.7)
> High school    346 (56.6)   287 (54.5)   739 (71.1)   1 (0.5)         0 (0.0)            0 (0.0)       10 (2.4)     138 (59.2)      170 (83.3)
Smoking
Never            110 (18.0)   164 (31.1)   491 (47.3)   92 (45.8)       94 (43.1)          85 (44.3)     217 (52.4)   42 (17.3)       92 (46.0)
Ever             501 (82.0)   363 (68.9)   548 (52.7)   109 (54.2)      117 (53.7)         107 (55.7)    197 (47.9)   201 (82.7)      108 (54)
89Associations between 8q24 SNPs and smoking-related cancers
LA: Lung, UADT (squam), Oroph., Larynx, Naso.
90Associations between 8q24 SNPs and smoking-related cancers
Taixing: Esoph., Stomach, Liver; MSKCC: Bladder
91Association between 8q24 and 7 smoking-related cancer sites, stratified by smoking status
98TP53 Mutations in Bladder Cancer
BP changes         Reported, n=200   Current study
Transitions
GC→AT              41.0              37.5
(at CpG)           14.0              12.5
AT→GC              10.0              15.0
Transversions
GC→TA              13.0              12.5
GC→CG              19.0              10.0
AT→TA              3.0               0.0
AT→CG              2.0               2.5
Deletion/Insert.   12.0              10.0
99Smoking and TP53 Mutations in Bladder Cancer
Smoking   TP53+   TP53-   OR     95% CI
No        8       24      1.00
Yes       58      83      6.27   1.29-30.2
Adjusted for age, gender, and education
100Cigarettes/day and TP53 Mutations in Bladder Cancer
Cig/day   TP53+   TP53-   OR     95% CI
No        8       24      1.00
1-20      8       21      2.07   0.22-19.9
21-40     36      47      5.50   1.08-28.2
>40       17      18      10.4   1.90-56.8
Trend P = 0.003
Adjusted for age, gender, and education
101Years of Smoking and TP53 Mutations in Bladder Cancer
Years of smoking   TP53+   TP53-   OR     95% CI
No                 8       24      1.00
1-20               5       10      5.64   0.82-38.7
21-40              42      58      6.45   1.24-33.4
>40                14      18      6.20   1.17-32.8
Trend P = 0.041
Adjusted for age, gender, and education
102Association Studies of Genetic Factors
- 1st generation
- Very small studies (<100 cases)
- Usually not an epidemiologic study design; 1-2 SNPs
- 2nd generation
- Small studies (100-500 cases)
- More epi focus; a few SNPs
- 3rd generation
- Large molecular epi studies (>500 cases)
- Proper epi design; pathways
- 4th generation
- Consortium-based pooled analyses (>2000 cases)
- GxE analyses
- 5th generation
- Post-GWS studies
Boffetta, 2007
103Issues in genetic association studies
- Many genes
- 25,000 genes, many of which can be candidates
- Many SNPs
- 12,000,000 SNPs; the ability to predict functional SNPs is limited
- Methods to select SNPs
- Only functional SNPs in a candidate gene
- Systematic screen of SNPs in a candidate gene
- Systematic screen of SNPs in an entire pathway
- Genomewide screen
- Systematic screen for all coding changes
104Potential of GWAS
105Kingsmore, 2008
106Post-GWAS Epidemiology
- Functional SNP analysis
- Pathway-based analysis
- Deep sequencing and fine mapping
- Gene-Environmental Interaction