Sequencing by Ligation on Polony Beads

About This Presentation
Title:

Sequencing by Ligation on Polony Beads

Description:

Molecular Genomic Imaging Center (CEGS), Harvard / Wash U. George Church, Rob Mitra, Greg Porreca, Jay Shendure.

Slides: 25
Provided by: George900

Transcript and Presenter's Notes

1
Sequencing by Ligation on Polony Beads
Molecular Genomic Imaging Center (CEGS), Harvard / Wash U
George Church, Rob Mitra, Greg Porreca, Jay Shendure
Personal Genomics, Stem Cells, ELSI
with Nick Reppas, Kun Zhang, Shawn Douglas, Mike
Wang, Abraham Rosenbaum, Agencourt
Synthetic Biology
2
Polymerase colony
2 vs. 1 immobilized primer
in situ polonies vs. emulsion PCR beads
single-molecule vs. multi-molecule detection
dNTP extension (SBE) vs. ligation (SBL) (>3X: error 1e-6, 1/10 cost of ABI, E. coli)
  • Single chromosomes: haplotyping (Zhang)
  • Single cells: full sequence (Zhang, Martiny)
  • Single RNA molecules: RNA splicing (Zhu, Varma)

Shendure, Porreca, Mitra, Church
3
Polony Sequencing Overview
  • 1. In vitro construction of a complex mate-paired library
  • 2. Template amplification to one micron beads by emulsion PCR
  • 3. Cyclic Array Sequencing by Ligation (SBL)

4
In vitro construction of a complex, mate-paired library
[Diagram: 1 kb genomic fragment; MmeI; paired genomic tags, Tag 1 and Tag 2 (17 to 18 bp each)]
Common sequences: 43 bp, 32 bp, 25 bp
Other diagram labels: Fisseq-F, Fisseq-R, Left, Right, T30, Mid, Seq1, Seq2
Total 134-136 bp amplicon
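The amplicon total on this slide looks consistent with the three common sequences plus the two genomic tags. A minimal sketch of that arithmetic follows. It assumes the amplicon is made of exactly those five pieces, which the slide does not state outright, and the function and constant names are illustrative.

```python
# Lengths read off the slide; treating them as the only amplicon parts is an assumption.
COMMON_SEQUENCES_BP = (43, 32, 25)   # the three common sequences
TAG_LENGTH_RANGE_BP = (17, 18)       # each genomic tag is 17 to 18 bp
N_TAGS = 2                           # Tag 1 and Tag 2


def amplicon_length_range(common=COMMON_SEQUENCES_BP,
                          tag_range=TAG_LENGTH_RANGE_BP,
                          n_tags=N_TAGS):
    """Return (shortest, longest) amplicon length in bp."""
    fixed = sum(common)
    shortest_tag, longest_tag = tag_range
    return fixed + n_tags * shortest_tag, fixed + n_tags * longest_tag


print(amplicon_length_range())  # (134, 136)
```

Under that assumption the range matches the 134-136 bp stated on the slide.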
5
Template Amplification
  • Emulsion PCR to 1 micron beads
  • Dressman et al., PNAS '03

6
Enrichment by Hybridization
7
One of 750 megapixel frames of gel-immobilized 1.0 micron beads; 0.3 micron pixels, 4 colors
8
Sequencing by Ligation (SBL) with fluorescent
combinatorial 9-mers
Probe                    Excitation (nm)   Emission (nm)
5'-Cy5-nnnnAnnnn-3'      647               700
5'-Cy3-nnnnGnnnn-3'      555               605
5'-TR-nnnnCnnnn-3'       572               630
5'-Cy3Cy5-nnnnTnnnn-3'   555               700
5'PO4
ACUCAUC (3)TAGAGT???
?????????????TGAGTAG(5)
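Each fluorescent 9-mer on this slide fixes one central base and carries one label, so the observed label identifies the probe base. A small sketch of that lookup follows. It assumes the probe's central base pairs with the queried template position, so the template base is its complement. The dictionary and function names are illustrative and not taken from the presentation.

```python
# Label -> central base of the 9-mer, as listed on the slide.
PROBE_CENTRAL_BASE = {
    "Cy5": "A",
    "Cy3": "G",
    "TR": "C",
    "Cy3Cy5": "T",
}

# Watson-Crick complements (assumption: the probe base pairs with the template base).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def probe_base(label):
    """Central base of the 9-mer that carries the given label."""
    return PROBE_CENTRAL_BASE[label]


def template_base(label):
    """Template base at the queried position, assuming complementary pairing."""
    return COMPLEMENT[probe_base(label)]


for label in PROBE_CENTRAL_BASE:
    print(label, probe_base(label), template_base(label))
```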
9
Why low error rates?
Goal of Resequencing: Discovery of Uncommon Variation
Consensus Accuracy    False Positives (E. coli)   False Positives (Human)
1E-3                  4,000                       3,000,000
1E-4 (Bermuda/ABI)    400                         300,000
1E-6 (Polony-SBL)     4                           3,000
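The false-positive counts on this slide follow from multiplying the consensus error rate by the genome size. The sketch below reproduces them. The genome sizes (4e6 bp for E. coli, 3e9 bp for human) are the round figures implied by the slide's own numbers, not exact genome lengths, and the names are illustrative.

```python
# Round genome sizes implied by the slide's numbers (assumption, not exact lengths).
GENOME_BP = {"E. coli": 4_000_000, "Human": 3_000_000_000}


def expected_false_positives(error_rate, genome_bp):
    """Expected number of wrong consensus calls across one genome."""
    return error_rate * genome_bp


for rate in (1e-3, 1e-4, 1e-6):
    counts = {name: round(expected_false_positives(rate, size))
              for name, size in GENOME_BP.items()}
    print(rate, counts)
```

At 1e-6 this gives about 4 false positives in E. coli and about 3,000 in a human genome, which is why the lower error rate matters when looking for uncommon variation.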
10
Genome engineering: Select for cross-feeding
Second Passage
First Passage
Δtrp/ΔtyrA pair of genomes shows the best co-growth (syntrophs)
Reppas, Lin
11
Co-evolution of cross-feeding: Trp- Tyr- genome pair
12
860,000 independent mate-pairing events
1 kb genomic fragment
980 ± 96 bp
13
Aberrations in mate-pair distance indicative of
rearrangements
1,974,001 (MG1655)
1,978,000 (MG1655)
confirmed 776 bp deletion via tandem 8 bp repeats
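One way to picture how mate-pair distances point to rearrangements: map both tags to the reference, measure their separation, and flag pairs that fall far from the expected distribution. The sketch below assumes a mean of 980 bp and a spread of 96 bp (figures from the mate-pairing slide) and a 3-standard-deviation cutoff. The cutoff and the function name are illustrative choices, not the authors' stated method.

```python
def flag_aberrant_pairs(distances_bp, mean_bp=980.0, sd_bp=96.0, n_sd=3.0):
    """Return indices of mate-pairs whose mapped distance is more than
    n_sd standard deviations away from the expected mean."""
    return [i for i, d in enumerate(distances_bp)
            if abs(d - mean_bp) > n_sd * sd_bp]


# A pair spanning a 776 bp deletion maps about 776 bp farther apart on the reference.
observed = [980, 1000, 980 + 776, 900]
print(flag_aberrant_pairs(observed))  # [2]
```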
14
Base-calling Tetrahedron
Vertices: C, A, T, G
Fluorescent SBL data quality measured by distance to the 4 vertices.
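A hedged sketch of the tetrahedron idea: if the four channel intensities for a bead are normalized to sum to 1, the point lies in a tetrahedron whose corners are the four pure-base signals, and the nearest corner gives the call. The channel order, the normalization, and the use of Euclidean distance are assumptions made for illustration. The slide only says that quality is measured by distance to the 4 vertices.

```python
import math

BASES = ("A", "C", "G", "T")  # assumed channel order


def call_base(intensities):
    """Return (base, distance) for the tetrahedron vertex nearest to the
    normalized 4-channel intensity vector. Smaller distance = cleaner call."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("need a positive total intensity")
    point = [x / total for x in intensities]
    best_base, best_dist = None, math.inf
    for i, base in enumerate(BASES):
        vertex = [1.0 if j == i else 0.0 for j in range(len(BASES))]
        dist = math.dist(point, vertex)
        if dist < best_dist:
            best_base, best_dist = base, dist
    return best_base, best_dist


print(call_base([0, 10, 0, 0]))   # pure signal in the second channel
print(call_base([8, 1, 1, 0]))    # mostly first channel, some cross-talk
```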
15
Raw Error Rate
[Plot; axis marks Q40, Q30, Q20]
Mean accuracy 99.5%. Best 50% of base-calls are 99.9% accurate.
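The Q20, Q30 and Q40 marks are Phred-style quality values, where Q = -10 log10(error probability). The sketch below converts between accuracy and Q to relate them to the accuracies quoted on this slide. The function names are illustrative.

```python
import math


def phred_from_accuracy(accuracy):
    """Phred-style quality for a given per-base accuracy (0 < accuracy < 1)."""
    return -10.0 * math.log10(1.0 - accuracy)


def accuracy_from_phred(q):
    """Per-base accuracy implied by a Phred-style quality value."""
    return 1.0 - 10.0 ** (-q / 10.0)


for q in (20, 30, 40):
    print(f"Q{q}: accuracy {accuracy_from_phred(q):.4%}")
print(f"99.5% accuracy is about Q{phred_from_accuracy(0.995):.1f}")
```

By this definition Q20, Q30 and Q40 correspond to 99%, 99.9% and 99.99% accuracy, so a mean accuracy of 99.5% sits a little above Q20.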
16
Consensus error rates
17
Mutation Discovery in Engineered Evolved Trp- Strain
Position    Type         Gene     Location     ABI Confirmation   Comments
986,334     T > G        ompF     TATA box     ✓                  Only in evolved strain
931,960     8 bp del     lrp      frameshift   ✓                  Only in evolved strain
1,976,500   776 bp del   insB_5   IS element   ✓                  MG1655 heterogeneity
3,957,960   C > T        ppiC     5' UTR       ✓                  MG1655 heterogeneity
4,654,533   T > C        cI       Glu > Glu    ✓                  λ heterogeneity
4,647,960   T > C        ORF61    Lys > Gly    ✓                  λ heterogeneity
985,797     T > G        ompF     Glu > Ala    (in progress)
454,864     T > C        tig      Gly > Gly    (in progress)
4,648,691   G > A        exo      Phe > Phe    (in progress)

18
Cost comparison projection
                  ABI     2004       Jun 2005    2006    >2007
bp/expt           -       2e7        3e7         3e8     60e9
Complexity (bp)   -       74         4e6         3e9     6e9
Avg Fold Cov      8       3e5        6           0.1     10
Pix per bp        -       300        1724        333     1
Read-length       900     14 (SBE)   25 (pair)   35      42
$ / Q20 kb        8e-1    -          8e-2        4e-2    1e-5
$ / 1X 3e9 b      2e6     -          2e5         5e4     1e2 (2e3)
Indel Error       5e-3    0.6        1e-3        1e-3    1e-3
Subst Error       4e-3    4e-6       1e-3        1e-3    1e-3
3X Cons Err       1e-4    -          1e-6        3e-7    1e-7
Kb / min          0.8     360        27          1e3     1e6
Pix / sec         -       2e5        2e6         6e6     2e7
Enz $ / mg        -       8          8           8       0.4
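The "Avg Fold Cov" figures on this slide can be checked against the "bp/expt" and "Complexity (bp)" rows, since fold coverage is bases sequenced divided by target size. The sketch below does that division for the four polony columns using the values as listed on the slide. The names are illustrative, and the slide's own coverage figures are rounded.

```python
# Values as listed on the slide for the polony columns.
BP_PER_EXPT = {"2004": 2e7, "Jun 2005": 3e7, "2006": 3e8, ">2007": 60e9}
COMPLEXITY_BP = {"2004": 74, "Jun 2005": 4e6, "2006": 3e9, ">2007": 6e9}


def fold_coverage(bp_sequenced, complexity_bp):
    """Average fold coverage = bases sequenced / size of the target."""
    return bp_sequenced / complexity_bp


for column in BP_PER_EXPT:
    cov = fold_coverage(BP_PER_EXPT[column], COMPLEXITY_BP[column])
    print(f"{column:>9}: {cov:.3g}X")
```

The results (roughly 2.7e5, 7.5, 0.1 and 10) are close to the 3e5, 6, 0.1 and 10 listed in the coverage row.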
19
Challenges in $2000 genome
                  >2007       Notes
bp/expt           60e9        20X of 3e9, 10X diploid
Complexity (bp)   6e9         Automated 96-well libraries
Avg Fold Cov      10          (Currently align .4 pix .1 micron)
Pix per bp        1           Sensitivity align CCD slide?
Read-length       42          Is 34 enough? (next slide)
$ / Q20 kb        1e-5        (20X 3e9)
$ / 1X 3e9 b      1e2 (2e3)   Need haplotyping too? (slide after next)
Indel Error       1e-3
Subst Error       1e-3
3X Cons Err       1e-7
Kb / min          1e6
Pix / sec         2e7         Current camera is 3e7, but stage is 2e6
Enz $ / mg        0.4         Realized for many recombinant proteins
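A rough worked example of the throughput targets on this slide: 60e9 bases is 20X of a 3e9 bp genome or 10X of a 6e9 bp diploid genome, and at 1 pixel per base the imaging time is set by the pixel rate. The sketch assumes that "Pix per bp" already accounts for all imaging needed per base, which the slide does not spell out. The names are illustrative.

```python
TARGET_BP = 60e9  # bp/expt target from the slide


def fold_coverage(bp_sequenced, genome_bp):
    """Fold coverage of a genome of the given size."""
    return bp_sequenced / genome_bp


def imaging_seconds(bp_sequenced, pix_per_bp, pix_per_sec):
    """Imaging time if every base costs pix_per_bp pixels (assumption)."""
    return bp_sequenced * pix_per_bp / pix_per_sec


print(fold_coverage(TARGET_BP, 3e9))          # coverage of a 3e9 bp genome
print(fold_coverage(TARGET_BP, 6e9))          # coverage of a 6e9 bp diploid genome
print(imaging_seconds(TARGET_BP, 1, 2e7))     # at the 2e7 pix/sec target
print(imaging_seconds(TARGET_BP, 1, 2e6))     # at the 2e6 pix/sec stage limit
```

Under these assumptions the run takes about 3,000 seconds at the target rate and about 30,000 seconds if the stage limits throughput to 2e6 pixels per second.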
20
Human Resequencing with Mate-Paired 17 bp Tags (simulation)
Assume paired 17-mers (i.e. read full tag length) with 750-1150 bp distance distribution (980 ± 96 bp observed)

Exact Matching (34/34)                            Zero   Unique   Multiple
Paired, no substitutions                          ----   94.4     5.6
Paired, one substitution                          98.3   0.5      1.3
Unpaired, no substitutions                        98.8   0.3      0.9

Single Substitution or Exact (33/34 or 34/34)     Zero   Unique   Multiple
Paired, no substitutions                          ----   90.4     9.7
Paired, one substitution                          ----   92.8     7.2
Unpaired, no substitutions                        96.0   1.5      2.5
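The Zero / Unique / Multiple columns can be read as the number of places a tag pair fits in the reference under the distance constraint. The toy sketch below classifies one pair by exact matching on a single strand, measuring distance from the start of the first tag to the start of the second. The single-strand restriction, that distance definition, and the function names are simplifications chosen for illustration and are not the simulation behind the slide.

```python
def find_all(reference, tag):
    """Every start index of tag in reference (overlapping hits allowed)."""
    hits, start = [], reference.find(tag)
    while start != -1:
        hits.append(start)
        start = reference.find(tag, start + 1)
    return hits


def classify_pair(reference, tag1, tag2, min_dist=750, max_dist=1150):
    """Return 'zero', 'unique' or 'multiple' for exact paired placements."""
    placements = [
        (i, j)
        for i in find_all(reference, tag1)
        for j in find_all(reference, tag2)
        if min_dist <= j - i <= max_dist
    ]
    if not placements:
        return "zero"
    return "unique" if len(placements) == 1 else "multiple"


# Tiny made-up reference with short tags and a scaled-down distance window.
ref = "ACGTACGTAA" + "C" * 20 + "TTGACTGA" + "CCCCC" + "TTGACTGA"
print(classify_pair(ref, "ACGTACGTAA", "TTGACTGA", 20, 35))  # unique
print(classify_pair(ref, "ACGTACGTAA", "TTGACTGA", 20, 50))  # multiple
print(classify_pair(ref, "ACGTACGTAA", "TTGACTGA", 0, 10))   # zero
```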
21
Single chromosome molecule haplotypes
GM10835
rs3778973 rs1557917 rs39284 rs10500042 rs4717028
C G C G C
T A T A T
153Mb
TT137 CT2 (TC1)
CC131
22
Amplifying and sequencing whole genomes from single cells
Escherichia, Prochlorococcus
Zhang, Martiny, Chisholm, Church, unpub.
No template control
φ29 real-time amplification
Affymetrix quantitation of 2 independent amplifications
23
Polymerase colony
2 vs. 1 immobilized primer
in situ polonies vs. emulsion PCR beads
single-molecule vs. multi-molecule detection
dNTP extension (SBE) vs. ligation (SBL) (>3X: error 1e-6, 1/10 cost of ABI, E. coli)
  • Single chromosomes: haplotyping (Zhang)
  • Single cells: full sequence (Zhang, Martiny)
  • Single RNA molecules: RNA splicing (Zhu, Varma)

Shendure, Porreca, Mitra, Church
24
Roundtable I
Shared Resources, STTR
Polymerase libraries: NEB, MJR, ABI, Fuller
CCDs: spectra, cost, pixels, sensitivity, speed
software
Cancer Genome: 12500 NCAB, clonal? enrichment, MRD, accuracy, read length
Cost estimates: distribute template spreadsheet