Title: Lung Cancer:
1Lung Cancer
An interdisciplinary perspective on lung cancer, or how to eliminate the pain and suffering of cancer by the year 2015.
Translational Research in Clinical Oncology, November 10, 2003
Neil Caporaso, MD, Genetic Epidemiology Branch, NCI
2Premise
Translational medicine and molecular epidemiology are natural partners. We need both to meaningfully advance prevention and treatment of major cancers.
Definitions
Molecular Epidemiology: using biomarkers in population studies
Translational Medicine: optimizing information flow between basic and clinical science (bench - bedside)
3Classic epidemiology
E = exposure
D = disease
E -> D: Tobacco -> lung cancer
4Molecular epidemiology: add biomarkers of exposure
E -> D
E measured by a biomarker of exposure, e.g., urine cotinine
5Molecular epidemiology: using biomarkers for both E and D
Historic rationale for molecular epidemiology: we enter the "black box", or gain mechanistic insight
E -> D
6Classic clinical trial
D = disease
O = outcome
D -> O: Lung cancer -> cure or death
7Translational medicine: understand molecular pathology
D -> O
D characterized by a biomarker of disease, e.g., p53 mutations, P16 methylation, telomere alterations, etc.
8Why is lung cancer a focus for molecular epidemiology?
1- Lung cancer is the leading cause of cancer mortality in men and women in the United States
2- Tobacco also accounts for many other cancers and is of key importance to public health
3- Lung cancer genetics is a historic theme in molecular epidemiology/complex disease
4- Treatment is problematic
5- Screening is controversial
6- Lung cancer is a paradigm for the genetics of complex disease
7- Understanding of tobacco as an etiologic agent is an opportunity to unravel the fundamental nature of GxE (gene-environment interaction) in relation to cancer
8- The clearest example of an exposure with an important behavioral component strongly associated with cancer
9- Chance to apply new technological tools and resources
10- Improved basic understanding is needed
9- Tobacco and public health
- Tobacco is the major cause of preventable morbidity and mortality in the Western world
- 1 in 5 US deaths (450,000/yr, 3M worldwide)
- 10 million tobacco-related deaths/annum by 2030 (WHO estimate)
- 30% of all cancer, 8 major sites, all difficult to treat
- Tobacco-related disease costs Medicare $10B/yr and Medicaid $13B/yr
- In spite of widespread knowledge of the health consequences of smoking:
- - rates in US adolescents are increasing
- - declines in adults have leveled off
- - individual smoking cessation is difficult
Proctor RN. Tobacco and the global lung cancer epidemic. Nature, Oct 2001
10- Epidemiologists use 5 criteria to support causal relations (e.g., tobacco and lung cancer)
- High relative risk (odds ratio)
- Consistency
- Dose response
- Temporal relationship
- Plausible mechanism
11- 7 Key questions involving lung cancer
- Why do people begin to smoke?
- Why do people persist in smoking?
- Why can't people quit smoking?
- What determines who gets lung cancer?
- What genetic lesions characterize LC?
- Is it possible to effectively screen LC?
- Is it possible to effectively treat LC?
12Chain of events that must occur to result in death from lung cancer (population perspective)
Start smoking
Persist in smoking / Can't quit
Host Susceptibility / Molecular lesions
Can't detect early
Can't treat
13- Evidence for hereditary variation in lung cancer
- Lung cancer kindreds exist
- - Tomizawa 1997, see GELC
- Case-control studies identify increased risks in case relatives
- - Tokuhata 1963, Ooi 1986, Shaw 1991, Bromen 2000, Lynch 1986
- Segregation analysis
- - Sellers 1990
- Population databases
- - Cannon-Albright 1994
- Twin studies (note: results mixed)
- - Morison 1994, Paul 1987, Braun 1994, 1995
- Animal models
- - Manenti 1997, 1999; Dragani 1995
- Many plausible polymorphic candidate genes
14Lung cancer risk and Family History
No. rel. w/LC   Controls   Cases   OR (95% CI)
0               466        393     1.0
1               78         119     1.7 (1.2-2.4)
2               8          20      2.9 (1.2-6.6)
Adjusted for gender, smoking, passive smoking, and the number of 1st degree relatives
Shaw GL, Falk RT, et al. J Clin Epid 1990
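As a rough check on the family-history table, the crude (unadjusted) odds ratios can be recomputed from the case and control counts on the slide. The sketch below is only an illustration: it uses the standard cross-product odds ratio with a Woolf (log-based) interval, which is an assumption and not necessarily the method behind the adjusted estimates reported on the slide, so the intervals are not expected to match exactly.

```python
import math

def odds_ratio_ci(cases_exp, controls_exp, cases_ref, controls_ref, z=1.96):
    """Crude odds ratio with a Woolf (log-based) confidence interval."""
    or_ = (cases_exp * controls_ref) / (controls_exp * cases_ref)
    se_log = math.sqrt(
        1 / cases_exp + 1 / controls_exp + 1 / cases_ref + 1 / controls_ref
    )
    lower = math.exp(math.log(or_) - z * se_log)
    upper = math.exp(math.log(or_) + z * se_log)
    return or_, lower, upper

# Counts from the slide: number of relatives with lung cancer -> (controls, cases)
table = {0: (466, 393), 1: (78, 119), 2: (8, 20)}
ref_controls, ref_cases = table[0]

results = {}
for n_relatives in (1, 2):
    controls, cases = table[n_relatives]
    results[n_relatives] = odds_ratio_ci(cases, controls, ref_cases, ref_controls)
    or_, lower, upper = results[n_relatives]
    print(f"{n_relatives} relative(s): crude OR {or_:.2f} ({lower:.2f}-{upper:.2f})")
```

The crude odds ratios come out near 1.8 and 3.0, close to the adjusted values of 1.7 and 2.9 shown on the slide.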
15- Twin studies
- Concordance for lung cancer
                MZ           DZ
                (n=5933)     (n=7554)
Obs pairs       10           21
Exp pairs       3.4          5.3
O/E             3.0          4.0
(95% CI)        (1.6-5.6)    (2.4-5.8)
- Strong evidence for a familial effect that may or may not be genetic
- Braun, Caporaso, Page, Hoover, Lancet, 1994
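The observed/expected (O/E) ratios in the twin table follow directly from the pair counts. A minimal sketch of that arithmetic, using only the numbers on the slide (the confidence intervals are not recomputed here because the slide does not state how they were derived):

```python
# (observed concordant pairs, expected concordant pairs) from the slide
pairs = {"MZ": (10, 3.4), "DZ": (21, 5.3)}

# O/E ratio per zygosity group
oe = {zygosity: obs / exp for zygosity, (obs, exp) in pairs.items()}

for zygosity, ratio in oe.items():
    print(f"{zygosity}: O/E = {ratio:.1f}")
```

Both groups show more concordant pairs than expected, and the excess is no larger in MZ than in DZ twins, which is consistent with the slide's caution that the familial effect may or may not be genetic.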
16Issues
Is there evidence that heredity influences common cancers?
If it does, is it important?
If it is important, why haven't we demonstrated it?
What is to be done?
17Genes that contribute to cancer fall in 2 categories
                  Single           Susceptibility
Study design      family           population
Type              linkage          association
Allele freq       rare             common
No. of genes      one/few          many
D and G freq      rare             common
Risk              high             low
Role of E         low              high
Attrib. risk      low              high
Concept           deterministic    probabilistic
Type of search    anonymous        directed
Example           BRCA1            CYP1A1/NAT2
Caporaso and Goldstein 1995
18Pharmacogenetics perspective..
- 20-95% of the variability in drug disposition and effect is under genetic control (Evans and McLeod, NEJM 2003)
Example
Drug Metabolizing Enzymes - drug clearance/AUC: AA, Aa, aa
Drug Receptors - receptor sensitivity: BB, Bb, bb
Combined influence of 2 single polymorphism genotypes (3 x 3 = 9) results in 9 combinations of drug phenotypes:
AABB / AABb / AAbb / AaBB / AaBb / Aabb / aaBB / aaBb / aabb
Thus, we expect a wide range in
- efficacy
- toxicity
- therapeutic index (efficacy/toxicity index)
- carcinogen activation/deactivation
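The count of 9 combined genotypes is simply the Cartesian product of the three genotypes at each of the two loci. A short sketch that enumerates them; the helper `n_combined_genotypes` is a hypothetical name introduced here to show how quickly the count grows with more biallelic loci:

```python
from itertools import product

# Three genotypes at each of two biallelic loci, as on the slide
enzyme_genotypes = ["AA", "Aa", "aa"]    # drug metabolizing enzyme locus
receptor_genotypes = ["BB", "Bb", "bb"]  # drug receptor locus

# Every pairing of one enzyme genotype with one receptor genotype
combined = [a + b for a, b in product(enzyme_genotypes, receptor_genotypes)]
print(" / ".join(combined))

def n_combined_genotypes(n_loci):
    """Number of combined genotypes across n independent biallelic loci."""
    return 3 ** n_loci

print(n_combined_genotypes(2), n_combined_genotypes(5))
```

With only five such loci the number of combined genotypes already reaches 243, which is one reason a wide range of phenotypes is expected.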
19Starter paradigm for identifying candidate genes in lung cancer
Smoking causes most lung cancer
Carcinogens in tobacco must be metabolically activated
Metabolic alteration is under genetic control
20Processing is often under hereditary control
Example: tobacco
nicotine (CYP2A6)
aryl amines (NAT2)
PAH (CYP1A1, GSTM1)
nitrosamines (CYP2A6/13, CYP2E1)
21Observed frequency histogram: debrisoquine metabolic ratio
[Figure: frequency histogram with groups labeled "D/D and D/d" and "d/d"; x-axis: ln metabolic ratio (debrisoquine/4-OH debrisoquine)]
Caporaso, Pickle, Bale, Ayesh, Hetzel, Idle 1989
22Meta-analyses: Lung cancer
Gene     Studies     OR (95% CI)      Author
CYP1A1   22          1.2 (0.9-1.5)    Houlston 2000
         4 MspI      1.7 (1.3-2.3)    d'Errico 1999
         3 exon7     2.3 (1.4-3.7)    d'Errico 1999
CYP2D6   16          1.3 (1.0-1.6)    d'Errico 1999
         13          1.3 (0.9-2.0)    Christensen 1997
MPO      6           0.7 (0.4-0.8)    Kantarci 2002
GSTM1    23          1.1 (1.0-1.3)    Houlston 1999
         13(C)       1.2 (1.1-1.4)    d'Errico 1999
recent large study null
23Meta-analyses: Bladder cancer
Gene     Studies     OR (95% CI)      Ref.
NAT2     22          1.4 (1.2-1.6)    Marcus 2000a
         21          1.3 (1.1-1.6)    Johns 2000
GSTM1    15          1.5 (1.3-1.9)    Johns 2000
         9 (C)       1.5 (1.3-1.8)    d'Errico 1999
24- Genetic Association Studies
- Hirschhorn et al., Genetics in Medicine 2002
- - 1986-2000
- 600 reported associations (133 diseases and 268 genes)
- 166 studied 3 or more times
- only 6 consistently reproduced
- - DVT and F5 (Arg506Gln)
- - Graves' disease and CTLA4 (Thr17Ala)
- - Type 1 diabetes and INS (5' VNTR)
- - HIV/AIDS and CCR5 (32bp ins/del)
- - Alzheimer's and APOE (e 2/3/4)
- - Creutzfeldt-Jakob and PRNP (Met129Val)
- None involving cancer
- Key reasons cited
- - 1) population stratification
- - 2) linkage disequilibrium
- - 3) GxE / GxG
- - 4) power/publication bias/first study effect
25What are the challenges in demonstrating genes involved in common cancers?
1 Population stratification
2 Multiple comparisons
3 Power
4 Failure to consider pathways
5 SNP strategy
26Key challenges .. population stratification
27Ethnic variation in GSTM1 null genotype frequency among Caucasian groups
Country    City             Frequency   Ref.
France     Clermont-Fer.    0.46        Baranova 97
France     Nice             0.50        Fontana 97
Italy      Parma            0.50        Alberti 96
Sweden     Huddinge         0.50        Hou 95
USA        San Francisco    0.50        Wiencke 97
Czech      Prague           0.52        Topinka 1997
UK         Edinburgh        0.53        Harrison 97
Germany    U Dortmund       0.54        Kiempes 96
Denmark    U Aarhus         0.55        Autrup 95
Spain      Barcelona        0.59        Lafuente 95
28Population stratification
Definition: in a mixed population, a trait will demonstrate an association with any allele that is by chance more frequent in one ethnic group, e.g. Pima Indians and Gm haplotype as risk for diabetes (Knowler 1988)
Previously suggested solutions:
1- limit studies to homogeneous ethnic groups
2- use family controls (and TDT)
3- use genetic markers to identify ethnicity
NEW SOLUTION: Population stratification is a confounding issue. Usual confounding strategies work.
Wacholder S, Rothman N, Caporaso N. JNCI 2000
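The point that stratification behaves like ordinary confounding can be illustrated with a small worked example. The counts below are hypothetical (they are not from the talk or from the cited paper): two ethnic groups differ in both allele frequency and disease risk, and the allele has no effect within either group. The crude odds ratio is inflated, while a standard stratified estimator (Mantel-Haenszel, used here as one example of a "usual confounding strategy") returns the null value.

```python
def odds_ratio(a, b, c, d):
    """a, b = carrier cases, carrier controls; c, d = non-carrier cases, controls."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel summary odds ratio across strata of (a, b, c, d) counts."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Hypothetical counts, 1000 people per group, no allele effect within a group:
# group 1: allele frequency 20%, disease risk 10%
# group 2: allele frequency 60%, disease risk 30%
strata = [
    (20, 180, 80, 720),
    (180, 420, 120, 280),
]

# Ignoring ethnicity: pool the two groups into one 2x2 table
pooled = tuple(sum(column) for column in zip(*strata))
crude = odds_ratio(*pooled)

# Accounting for ethnicity: stratified estimate
adjusted = mantel_haenszel_or(strata)

print(f"crude OR = {crude:.2f}, stratified OR = {adjusted:.2f}")
```

In this toy setting the crude odds ratio is about 1.67 even though the within-group odds ratios are both 1.0, and stratifying on group removes the spurious association.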
29Key challenges .. multiple comparisons?
30Early metabolic gene studies in lung cancer
Gene                 Exposure       Examples
Phase I
CYP1A1               PAH            Kawajiri 1990, Nakachi 1991
CYP2C9               B(a)P          London 1997
CYP2D6               NNK(?)         Idle 1984, Caporaso 1990
CYP2E1               nitrosamines   Uematsu 1991, Kato 1992
Phase II
NAT2                 HAA/AA         Cascorbi 1996, Martinez 1995
GSTM1                PAH            Seidegard 1986, McWilliams 1995
GSTT1                PAH            Deakin 1996, Trizna 1995
GSTP1                PAH            Saarikoski 1998, Ryberg 1997
epoxide hydrolase    PAH            Heckbert 1992
31Other gene categories in lung cancer
Gene category       Rationale                    Examples
Phase 1             activates carcinogen         CYP1A1 (others)
Phase 2             eliminates carcinogen        GSTM1 (others)
Nutrient            diet/nutrients implicated    MTHFR (4 others)
Hormone             gender differences(?)        CYP1B1 (4 others)
Immune              biologically plausible       IGF2 (3 others)
DNA Repair          biologically plausible       XRCC1 (9 others)
Oxidative injury    biologically plausible       MnSOD (3)
Methyltrans.        DNA methylation              TMPT (4)
Behavior            influence smoking            CYP2A6 (16)
Oncogene            somatic mutations common     Hras vtr (3 others)
TSG                 LOH common                   p53 (5 others)
Cytogenetic         3p, 11q etc. sites           many
Animal models       12p12.1                      PAS1
32Example: newly described DNA Repair Genes
Gene     Polymorphism            Allele Freq   Author/Year
HOGG1    C>G, codon 326          0.43          Kohno 1998
MGMT     G>A, codon 160          0.08          Edara 1996
XPF      C>T at nt 1135          0.08          Fan 1999
MnSOD    -9 of signal peptide    0.50          Ambrosone 1999
XRCC1    G>A, codon 280          0.08          Lunn 1999
XRCC3    C>T, codon 241          0.38          Shen 1998
XRCC9    Exon 7, codon 297       0.01          -
XRCC1    G>A, codon 299          0.30          Duell 2000
ERCC1    A>C at nt 8092          0.27          Chen 2000
XPD      C>T, exon 6             0.37          Duell 2000
RAD50    Exon 5, codon 191       0.01          -
APE1     Exon 5, codon 148       0.38          Hadi 2000
33Finding SNPs in Silico
- PubMed: Published Literature
- - http://www.ncbi.nlm.nih.gov/entrez/query.fcgi
- dbSNP: NCBI Database
- - http://www.ncbi.nlm.nih.gov/SNP/
- LocusLink: NCBI Database linked to RefSeq
- - http://www.ncbi.nlm.nih.gov/LocusLink/
- SNP pipeline: CGAP-GAI search of EST/Unigene
- - http://lpgws.nci.nih.gov:82/perl/snp/snp_cgi.pl
- Leelab SNP Database: UCLA search of EST/Unigene
- - http://www.bioinformatics.ucla.edu/snp/
- HG Base: International Database
- - http://hgbase.cgr.ki.se/
- I-SNP: Curated collection by pathways
- - http://www-dcs.nci.nih.gov/pedonc/ISNP/
34Multiple Comparisons
- The problem
- - Many gene families
- - Many genes within each family
- - Many SNPs within each gene
- - Technical capacity to conduct high throughput genotyping
- Therefore there are many new candidates to test, but the number of true associations is limited
35Multiple Comparisons
- SOLUTIONS
- Bonferroni type adjustment
- - set the per-test significance level to the overall level divided by the number of tests performed
- - p = 5 x 10^-8 produces a genome-wide false positive rate of 5% (Risch and Merikangas 1996)
- Bayesian approaches
- - adjust p-value based on positive predictive value
- Reduce number of comparisons
- - group genes
- - test pathways / construct scores for pathways
- - use haplotypes
- Multidisciplinary approach
- - link with functional studies
- Reproduce findings
- - network studies / form consortia
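Two of the solutions above reduce to short formulas. The sketch below shows a Bonferroni threshold and the positive predictive value of a "significant" finding. The inputs are assumptions chosen for illustration: one million tests is the number for which 0.05/n equals the 5 x 10^-8 quoted on the slide, and the prior probability (1%) and power (80%) are hypothetical values, not figures from the talk.

```python
def bonferroni_threshold(overall_alpha, n_tests):
    """Per-test significance level that keeps the family-wise rate at overall_alpha."""
    return overall_alpha / n_tests

def positive_predictive_value(prior, power, alpha):
    """Probability that a significant association is real.

    prior: assumed fraction of tested variants that are truly associated
    power: probability of detecting a true association
    alpha: per-test significance level
    """
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)

# Assumed: one million tests at an overall 5% level
threshold = bonferroni_threshold(0.05, 1_000_000)

# Hypothetical scenario: 1 in 100 candidate SNPs truly associated, 80% power
ppv_nominal = positive_predictive_value(prior=0.01, power=0.8, alpha=0.05)
ppv_strict = positive_predictive_value(prior=0.01, power=0.8, alpha=0.0001)

print(f"Bonferroni threshold: {threshold:.1e}")
print(f"PPV at alpha=0.05:   {ppv_nominal:.2f}")
print(f"PPV at alpha=0.0001: {ppv_strict:.2f}")
```

Under these assumed inputs, most nominally significant findings at alpha = 0.05 would be false positives, while a stricter threshold or a higher prior (for example from functional data or pathway knowledge) raises the predictive value sharply.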
36Key challenges .. power
37Key challenges .. consider pathways: example of behavior
38Major genetic polymorphisms and the metabolism of PAHs
Gene          Role
Activation
- CYP1A1      B(a)P > phenolic metabolites, B(a)P-7,8-diol
- CYP1B1
- EP          forms highly reactive ()-anti-BPDE
- NQO1
Detoxification
- GSTM 1      inactivates BPDE
- GSTM 2,3    inactivates BPDE
- GSTP 1      inactivates BPDE
Regulation
- AhR         regulation
- arnt        regulation
We may need to assess all or most of the genes in a pathway to understand the real contribution of heredity
Caporaso in press, Shields 1998, Lerman 1997, 1998, 1999a
Kriek 2000, Bartsch 2000, Shimada 2001
39Evidence supporting a role of genetics on behavior?
Animal work
1- genetic factors determine differences between species
2- single genes have an important influence on behavior, e.g. ALDH2: flushing reaction associated with reduced alcohol intake
3- sensitivity and toxicity are genetically based
4- inbred species and transgenic models consistent
Crabbe 1994
Twin studies
1 twins reared apart (adoption studies)
2 broad heritability estimates from MZ and DZ concordance
Heath 1993, Bouchard 1990, Kendler 1990, 1992, Swan 1990, Carmelli 1996
Human gene studies
1 Linkage: Bergen 1999
2 Candidate genes: Blum 1995, Comings 2000
40Key cancer-causing exposures have major behavioral components
1- Tobacco
2- Alcohol
3- Diet
and account for a substantial proportion of cancer mortality in the US
41Starter paradigm in lung cancer
Smoking causes most lung cancer
Chemicals are metabolically altered
Metabolic alteration is under genetic control
42Starter paradigm in behavior
Genes influence smoking
Smoking causes most lung cancer
Chemicals are metabolically altered
Metabolic alteration is influenced by hereditary variation
43Genetic factors contribute to lung cancer
[Diagram: overly simplistic model relating genetic factor, Smoking, and Lung cancer]
44Genetic factors contribute in a complex manner to lung cancer
[Diagram: more realistic model relating genetic factors, initiation, persistence, Smoking, and Lung cancer]
45Candidate genes and smoking
Gene      Name                        Allele            Main effect
SLC6A3    dopamine transporter        9/ vs. /          p=0.04
DRD2      dopamine receptor 2         Taq1 A1/ vs /     null
DRD4      dopamine receptor 4         7/ v /            null null
TH        tyrosine hydroxylase        1-5 repeats       null
5-HTT     serotonin transporter       44bp del          null
TPH       tryptophan hydroxylase      A, C alleles      null
CYP2D6    debrisoquine hydroxylase    G/P               null
DBH       dopamine B hydroxylase      -                 null
MAO-B     monoamine oxidase B         -                 null
Lerman 1998, 1999, 2000, 2001; Shields 1998; Caporaso 2001
Caporaso in press, Shields 1998, Lerman 1997, 1998, 1999a
46Key challenges .. SNPs
47Role of SNPs (single nucleotide polymorphisms)?
- Advantages
- - widely and increasingly available
- - annotation proceeding rapidly
- - amenable to high throughput
- - may use to identify haplotypes
- Disadvantages
- - less informative than microsatellites
- - may not coincide with actual disease gene (LD problem)
- - linkage disequilibrium (LD) varies widely in the genome
- - robust specimen processing, informatics and analytical resources required
- - for a genome-wide strategy, pooling is required
- - unclear how many SNPs are needed and how best to select them
GTAACCGT GTCACCGT
Boehnke 2000, Risch 1998, Bansal 2001, Collins 1998, Camp 1997
48NCI-SNP 500
- Growing Core of Candidate Genes
- Standard for Intra/Extramural PIs
- Immediate Application
- Enlarging Resource over Time
- Target 500 SNPs in next months
- Immediate Relevance for Molecular Epidemiology
- Marriage of Bio-Informatics and Validation
49http://snp500cancer.nci.nih.gov
CGAP Resource
50Bio-informatic Analysis
SNP of interest
Linked SNPs
Frequencies for genotypes and alleles
Haplotype estimation (E-M)
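Haplotype estimation by E-M (expectation-maximization) can be sketched for the simplest case of two biallelic SNPs. This is a generic textbook version written for illustration, not the implementation behind the resource named on the slide; the function name, the 0/1/2 genotype coding, and the toy data are all assumptions. Only double heterozygotes are ambiguous in phase, so the E-step splits them between the two possible haplotype pairs in proportion to the current frequency estimates, and the M-step recounts haplotypes.

```python
from collections import Counter

HAPLOTYPES = ("00", "01", "10", "11")  # allele at SNP1, then allele at SNP2

def em_haplotype_freqs(genotypes, n_iter=50):
    """Estimate two-SNP haplotype frequencies from unphased genotypes.

    genotypes: list of (g1, g2); each value is the count (0, 1 or 2)
    of allele "1" carried at that SNP.
    """
    counts = Counter(genotypes)
    n_chromosomes = 2 * len(genotypes)
    n_double_het = counts.pop((1, 1), 0)  # the only phase-ambiguous genotype

    # Haplotype counts that can be read off without ambiguity
    fixed = dict.fromkeys(HAPLOTYPES, 0.0)
    for (g1, g2), n in counts.items():
        alleles1 = "00" if g1 == 0 else "11" if g1 == 2 else "01"
        alleles2 = "00" if g2 == 0 else "11" if g2 == 2 else "01"
        for a1, a2 in zip(alleles1, alleles2):
            fixed[a1 + a2] += n

    freqs = dict.fromkeys(HAPLOTYPES, 0.25)  # uninformative starting values
    for _ in range(n_iter):
        # E-step: probability a double heterozygote carries 00/11 rather than 01/10
        cis = freqs["00"] * freqs["11"]
        trans = freqs["01"] * freqs["10"]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        expected = dict(fixed)
        expected["00"] += n_double_het * w
        expected["11"] += n_double_het * w
        expected["01"] += n_double_het * (1 - w)
        expected["10"] += n_double_het * (1 - w)
        # M-step: update frequencies from the expected haplotype counts
        freqs = {h: expected[h] / n_chromosomes for h in HAPLOTYPES}
    return freqs

# Hypothetical toy sample: the unambiguous genotypes only ever show 00 and 11
sample = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
freqs = em_haplotype_freqs(sample)
print({h: round(f, 3) for h, f in freqs.items()})
```

In the toy sample the two double heterozygotes are resolved almost entirely as 00/11 pairs, because those are the only haplotypes supported by the unambiguous individuals.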
51 candidate genes or whole genome search?
52Strategy Spectrum
- Whole genome search
- - > 10,000 SNPs
- - expensive and unproven
- - theoretically comprehensive
- - pooling/high throughput/LD
- - use of haplotypes etc. requires work
- Candidate Gene
- - 1-10 genes
- - traditional
- - straightforward
- - focus on function
53- How to select SNPs..candidate genes
- A few guidelines
- - Use a study of appropriate design and size
- - Select relevant pathways, taking into account the disease and the exposures
- - Include sufficient SNPs to cover each gene
- - Take haplotypes into consideration
- - Make sure SNPs are appropriate with regard to function and allele frequency
- - Plan to confirm positive findings in other studies and settings
54Genetic Epidemiology of Lung Cancer Study
- Design
- - 2000 cases from 13 hospitals
- - 2000 population-based controls
- - Matched by age and gender
- Catchment area
- - 5 cities and surrounding municipalities in Lombardy region, Italy
- 2 questionnaires
- - CAPI and self-administered
- Biospecimen collection
- - WB, PBMC, RBC, serum, plasma, buffy coat, DNA, RNA, blood cards
- - Fresh frozen lung tissue, paraffin blocks, tissue slides
- Data base
- - Cluster Microsoft SQL, Oracle-like server
55History of control recruitment and corresponding response rates
Date              Characteristics                   Resp. rate
Jan-March 01      Phone survey                      30%
Sep 01-Feb 02     Pilot study with physicians       49%
Feb-March 02      Pilot study with compensation     73%
Apr 02-to date    Main study with compensation      76%
56Reasons for refusals in phone survey
57Study Design Controls
- 2000 lung cancer / 2000 controls / 500 sibs
Never smokers   Former   Current
35%             30%      35%
n=700           n=600    n=700
Test smoking persistence
Test smoking initiation
58Relation of Tobacco Use to Lung Cancer in Early Data From the Italian Case-control Study
[Figure: odds ratio of lung cancer (95% CI) by pack-years of smoking; 384 controls, 407 cases]
59- Large studies provide key advantages
- - incorporate new technologies and disciplines
- - test diverse hypotheses
- - lower marginal costs
- - bring interdisciplinary expertise to bear
- - use resources efficiently
- - get full scientific value from large study platforms
- Large studies should do everything possible to incorporate multiple domains to create a setting for the best science.
60Molecular epidemiology: integrative view
MANY NEW CLASSES OF HYPOTHESES TO TEST
[Diagram nodes: Gene, behavior, E, D, Tumor tissue, outcome]
61Molecular epidemiology: starting point
Start smoking
Persist in smoking / Can't quit
Host Susceptibility / Molecular lesions
Can't detect early
Can't treat
62Molecular epidemiology: integrative goals
Start smoking
Persist in smoking / Can't quit
Host Susceptibility / Molecular lesions
Can't detect early
Can't treat
63Conclusion
- To meaningfully advance understanding of the genetics of complex diseases like lung cancer, we advocate larger studies that incorporate multiple study hypotheses because:
- - Substantial size ensures good power
- - Subgroups will be well represented
- - Provides a substantial tissue resource (tissue/blood)
- - Will be well-positioned to use advanced technology
- - Provide a mechanistic correlation with genetic findings (somatic/germline)
- - Size affords economy of scale: numbers, shared infrastructure, efficiency
- - Enhanced collaboration with other groups
64Collaborators
Genetic Epidemiology of Lung Cancer and Smoking
Maria Teresa Landi Co-investigator Peggy
Tucker Michael Alavanja Lynn Goldin Glen
Morgan Andrew Bergen Sholom Wacholder Jay
Lubin Pharmacogenetics Studies Curt Harris Bob
Hoover Peter Shields Peggy Tucker Rashmi
Sinha Stephan Ambs Population
stratification China Lung Sholom
Wacholder Rose Yang Nat Rothman Linda M
Brown Montse Garcia-Closas BJ Stone, Qing Lan
Twin studies Tobacco Genetics Miles
Braun Sophia, Wang, Caryn Lerman Bob
Hoover Peter Shields Mayo Lung SNP
500/CGF Curt Harris Andrew Bergen Bill
Travis Steve Chanock Meredith Yaeger