Title: Genome Sequencing Technology and The Cancer Genome Atlas (TCGA)
1. Genome Sequencing Technology and The Cancer Genome Atlas (TCGA)
- Brad Ozenberger, PhD
- Program Director, The Cancer Genome Atlas
- Program Director, Technology Development
- National Human Genome Research Institute
- National Institutes of Health
- September 29, 2009
2. Collins et al., Nature 4/24/03
3. Human Genome Project Sequencing Centers
Slide credit: Eric Green, NHGRI
4. Emulsion-Based Clonal Amplification
Prepare adapter-carrying library DNA
Mix DNA library and capture beads (limited dilution)
Create water-in-oil emulsion (PCR reagents, emulsion oil, micro-reactors)
Perform emulsion PCR
Break micro-reactors; isolate DNA-containing beads
- Generation of millions of clonally amplified sequencing templates on each bead
- No cloning and colony picking
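Why limited dilution matters can be sketched with a simple loading model. The example below assumes that the number of template molecules per micro-reactor follows a Poisson distribution with mean `lam`. That is a common simplification, not something stated on the slide, and the `lam` values and function names are hypothetical. A micro-reactor that starts with exactly one molecule yields a clonal bead; one that starts with several yields a mixed bead.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a micro-reactor holds exactly k template molecules."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def occupancy(lam: float) -> dict:
    """Fractions of micro-reactors with zero, one, or several templates."""
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    return {"empty": empty, "clonal": single, "mixed": 1.0 - empty - single}

def clonal_share(lam: float) -> float:
    """Among micro-reactors that contain any template, the share that is clonal."""
    frac = occupancy(lam)
    return frac["clonal"] / (1.0 - frac["empty"])

if __name__ == "__main__":
    # Hypothetical template-to-reactor ratios, chosen only for illustration.
    for lam in (0.1, 0.5, 1.0):
        print(f"lam={lam}: clonal share of loaded reactors = {clonal_share(lam):.3f}")
```

Under this model, a lower ratio leaves more micro-reactors empty but makes a larger share of the loaded ones clonal, which is the trade-off that limited dilution accepts.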
5. Sequencing by cyclic synthesis (SBS)
Cycle 1: Add sequencing reagents
First base incorporated
Remove unincorporated bases
Detect signal
Cycle 2-n: Add sequencing reagents and repeat
Do this on a microarray with hundreds of thousands or millions of different molecules on a surface!
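The cycle described above can be sketched as a toy simulation. This sketch assumes a chemistry in which exactly one base is incorporated per cycle, as with reversible terminators. The slide does not name a specific chemistry, and `sequence_by_synthesis` is a hypothetical name. Strand orientation and signal errors are ignored.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def sequence_by_synthesis(templates, n_cycles):
    """Run n_cycles of add / incorporate / wash / detect on all templates in parallel."""
    reads = ["" for _ in templates]
    for cycle in range(n_cycles):
        for i, template in enumerate(templates):
            if cycle >= len(template):
                continue  # this template has already been read to its end
            # Incorporate the single base complementary to the next template position.
            incorporated = COMPLEMENT[template[cycle]]
            # Washing away unincorporated bases is implicit: only one base is kept.
            # Detect the signal and record it as the base call for this cycle.
            reads[i] += incorporated
    return reads

if __name__ == "__main__":
    print(sequence_by_synthesis(["ACGT", "TTGA"], 3))
```

The inner loop stands in for the array surface: every template receives the same reagents in the same cycle, so read length grows with the number of cycles while throughput grows with the number of molecules on the surface.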
7. Applied Biosystems SOLiD (2007)
Polonator (2009)
8. NHGRI Centers - Installed base and experience
3 large-scale sequencing centers: Washington University, The Broad Institute, Baylor College of Medicine
9. University of North Carolina, Chapel Hill: J. Michael Ramsey
Nanopore Sequencing
For the sensor configuration shown at left, with
electrodes in the walls of a nanopore, modeling
shows that the distributions of current values
for each nucleotide will be sufficiently
different to allow for rapid sequencing.
M. Di Ventra (2006) Nano Lett. 6, 779-782
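The idea that sufficiently different current distributions allow base calling can be illustrated with a small maximum-likelihood sketch. Everything numeric here is hypothetical: the Gaussian shape, the means and standard deviations in `CURRENT_MODEL`, and the arbitrary units are assumptions for illustration, not values from the cited modeling work.

```python
import math

# Hypothetical (mean, standard deviation) of the transverse current per nucleotide,
# in arbitrary units. These are NOT values from the cited paper.
CURRENT_MODEL = {"A": (1.0, 0.2), "C": (2.0, 0.2), "G": (3.0, 0.2), "T": (4.0, 0.2)}

def log_likelihood(x, mean, sd):
    """Log of the Gaussian density at x."""
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))

def call_base(samples):
    """Pick the nucleotide whose distribution best explains repeated current samples."""
    scores = {
        base: sum(log_likelihood(x, mean, sd) for x in samples)
        for base, (mean, sd) in CURRENT_MODEL.items()
    }
    return max(scores, key=scores.get)

def call_sequence(samples_per_base):
    """Call one base for each group of current samples."""
    return "".join(call_base(samples) for samples in samples_per_base)

if __name__ == "__main__":
    print(call_sequence([[1.1, 0.9], [3.2, 2.9], [3.9, 4.2]]))
```

The more the four distributions overlap, the more samples per nucleotide are needed for a confident call, which is why the separation between distributions bears directly on sequencing speed.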
10. Cancer in the U.S.
- >10,000,000 in the U.S. have cancer (1 in 30 people)
- 1,400,000 people were diagnosed in 2008
- 700,000 will die from cancer this year (a death every 45 seconds)
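The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch uses only the numbers quoted above and assumes a 365-day year. The variable names are illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap year

deaths_per_year = 700_000          # from the slide
living_with_cancer = 10_000_000    # from the slide (stated as a lower bound)
prevalence_ratio = 30              # "1 in 30 people"

# Average interval between deaths implied by the annual total.
seconds_per_death = SECONDS_PER_YEAR / deaths_per_year

# Population implied by combining the prevalence count with the 1-in-30 ratio.
implied_population = living_with_cancer * prevalence_ratio

print(f"One death every {seconds_per_death:.1f} seconds")
print(f"Implied U.S. population: {implied_population:,}")
```

The annual total works out to roughly one death every 45 seconds, and the prevalence figures imply a population of about 300 million, so the slide's numbers are consistent with each other.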
11. Cancer: A Disease of the Genome
- Goals of NHGRI Cancer Genomics Research
- Catalog genomic changes in tumors
- Identify new targets for therapy
- Enable individualized therapy based on the genomic signature of a tumor
- Develop technologies to investigate heterogeneous and small tumor samples
12. NCI-NHGRI Partnership
Cancer Genomics Ultimate Goal: Create a comprehensive public catalog of all genomic alterations present at significant frequency for all major cancer types
National Cancer Advisory Board Report - Feb, 2005
Integrate
13. TCGA Project Pipeline
Supplementary Figure 1 (pipeline diagram). Figure labels: Tissue Sample; Pathology QC; DNA and RNA Isolation, QC; Sequencing; Expression, CNA and LOH, Epigenetics; Data and Results Storage, QC; Integrative Analysis; Comprehensive Knowledge of a Cancer
Stages: Process, Analysis, Data, Results
Components: BCR, GSCs, CGCCs, DCC, Collaborators
14. TCGA Pilot Components
15. TCGA: Connecting multiple sources, experiments, and data types
Three Cancers - TCGA Pilot: glioblastoma multiforme (brain), squamous carcinoma (lung), serous cystadenocarcinoma (ovarian)
Biospecimen Core Resource with more than 13 Tissue Source Sites
7 Cancer Genomic Characterization Centers
3 Genome Sequencing Centers
Data Coordinating Center
16. GBM Findings
- September 2008: TCGA published its study of glioblastoma (GBM) in Nature, reporting the discovery of new mutations and confirming many "maybes"
- Data types integrated across labs and across the genome, transcriptome, epigenome, clinical data and outcomes
- Performed in-depth, integrated characterization of the tumor genomes of 206 GBM patients
- Identified three genes and three core biological pathways commonly altered in GBM tumors
- Discovered a possible mechanism by which GBM tumors become resistant to temozolomide (TMZ)
17. TCGA: By the numbers
- GBM Project
  - 262 cases complete for gene exp., SNP array, methylation
  - Sequencing: 144 cases / 1,300 genes, 56 cases / 6,000 genes, 12 whole genomes
- Ovarian Project
  - 379 cases undergoing comprehensive analysis
  - Sequencing: 238 cases / 6,000 genes, 12 whole genomes
- >8 Terabases of sequence data generated in 4 months (equivalent to 1,000 Human Genome Projects)
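The scale of the last figure can be made concrete with back-of-envelope arithmetic. The slide does not say how the "1,000 Human Genome Projects" equivalence was computed, so this sketch only derives what the quoted numbers imply. It assumes 4 months is about 120 days and a human genome of roughly 3.1 billion bases; both are assumptions, not figures from the slide.

```python
TOTAL_BASES = 8e12        # ">8 Terabases", taken at its lower bound
HGP_EQUIVALENTS = 1_000   # figure quoted on the slide
DAYS = 4 * 30             # assumption: 4 months is about 120 days
HAPLOID_GENOME = 3.1e9    # assumption: approximate human genome size in bases

# Raw sequence attributed to one "Human Genome Project" by the slide's equivalence.
bases_per_hgp = TOTAL_BASES / HGP_EQUIVALENTS

# Average daily output across the four months.
bases_per_day = TOTAL_BASES / DAYS

# How many times one HGP-equivalent would cover the assumed genome size.
fold_per_hgp = bases_per_hgp / HAPLOID_GENOME

print(f"Bases per HGP-equivalent: {bases_per_hgp:.2e}")
print(f"Average output per day:   {bases_per_day:.2e}")
print(f"Genome-folds per HGP-equivalent: {fold_per_hgp:.1f}")
```

Under these assumptions, the network averaged tens of gigabases per day, and each "Human Genome Project" in the comparison corresponds to about 8 gigabases of raw sequence.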
18. OVARIAN
Lost BRCA1 germline indel
NF1-EFCAB5 fusion gene, probably inactivating; validated by RNA-seq
Courtesy of Gad Getz, Broad Institute
19. Status: TCGA Pilot Program
- Set up and made functional all parts of the TCGA network (10 centers, over 150 scientists) and developed a pipeline from samples to data availability
- Built an unprecedented team of scientists, oncologists, pathologists, bioethicists, technologists and bioinformaticists, and a working pipeline from sample to data release
- Set a high bar for sample quality and percentage of tumor nuclei, which drove data quality
- Implemented 2nd generation sequencing methods, with intensive effort on computational methods
- WE CAN DO IT
20. TCGA expansion
- Project will scale the production-level pipeline to 20-25 tumor types
- Increased emphasis on an analysis pipeline
- Implementation of 2nd generation genome sequencing technologies
- Specific goals
  - Standards for biospecimen acquisition: high quality of all aspects of samples, clinical information and data
  - Complete genome characterization of each cancer case
  - Two levels of data integration and analysis: advanced approaches and tools for visualization and management of data
21. Potential Tumors for TCGA
22. Game Changing Research
- NIH to accelerate the TCGA program with funds provided by the American Recovery and Reinvestment Act (ARRA); total commitment of $275 million over the next 2 years
- During two years of ARRA funding, TCGA will complete comprehensive genome characterization of 10 tumor types and initiate 10 more
- The Cancer Genome Atlas project will forever change genomic and cancer research: technology and analysis, tumor biomarkers, therapeutic targets, individualized approaches to therapy, and new hope for cancer patients
23. Many to acknowledge
- The cancer patients who contribute specimens
- My staff and TCGA teammates at NIH
- The 100 researchers who work daily on The Cancer Genome Atlas project
- cancergenome.nih.gov