Title: Deep Sequencing
1Deep Sequencing
2IV. Background--Deep Sequencing
3Illumina Sequencing Technology
- Relies on the attachment of randomly fragmented
and amplified DNA (clonal clusters) to a planar,
optically transparent surface (flow cell).
- High-sensitivity fluorescence detection is
achieved using laser excitation and total
internal reflection optics.
- Sequence reads are aligned against a reference
genome and genetic differences are called using
proprietary software.
4Illumina Cluster Station
- Amplification of clonal clusters from single
molecule fragments covalently bound to the flow
cell surface.
- Software-controlled system.
- Walk-away automation.
- Runs in 5 hours (<1 hour of hands-on time).
- Generates clonal clusters for up to eight samples
in parallel.
- This is where DNA or RNA is prepared for
sequencing-by-synthesis on the Genome Analyzer.
5Paired-End Module
- After completion of the first read, the templates
can be regenerated in situ to enable a second >50
bp read from the opposite end of the fragments.
- The Paired-End Module directs the regeneration
and amplification operations to prepare the
templates for the second round of sequencing.
- First, the newly sequenced strands are stripped
off and the complementary strands are bridge
amplified to form clusters.
- Once the original templates are cleaved and
removed, the reverse strands undergo
sequencing-by-synthesis.
- The second round of sequencing occurs at the
opposite end of the templates, generating >50 bp
reads for a total of >10 Gb of paired-end data
per run.
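The ">10 Gb per run" figure can be sanity-checked with simple arithmetic. The sketch below is illustrative only: it assumes the 96-120 million clusters per flow cell quoted on the flow cell slide, reads of exactly 50 bp (the slides say ">50 bp"), and one read from each end of every cluster. The function name is hypothetical.

```python
def paired_end_yield_gb(clusters, read_length_bp, reads_per_cluster=2):
    """Total sequenced bases in gigabases (1 Gb = 1e9 bases).

    Assumes every cluster yields `reads_per_cluster` reads of equal length,
    with no filtering losses (an idealisation).
    """
    return clusters * read_length_bp * reads_per_cluster / 1e9


# Cluster counts taken from the flow cell slide; 50 bp is a lower bound.
low = paired_end_yield_gb(96_000_000, 50)
high = paired_end_yield_gb(120_000_000, 50)
print(f"{low:.1f}-{high:.1f} Gb per run")
```

With these assumptions the estimate brackets 10 Gb, which is consistent with the quoted total once reads are longer than 50 bp.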
6Illumina Flow Cell
- One (1) flow cell.
- Eight (8) channels.
- Each channel can run up to twelve (12)
differently tagged libraries (Multiplexed
Sequencing).
- Input requirement: 0.1-1.0 µg (single- and
paired-end reads), 10 µg (Mate Pair reads).
- 1.4-mm wide channels.
- Relies on the binding of randomly fragmented DNA
to attached oligos on a planar, optically
transparent surface (flow cell).
- 96-120 million reads (clusters) per flow cell,
each containing approximately 1,000 copies of the
same template.
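To see what these numbers mean for a single library, the following sketch divides the flow cell output across channels and multiplexed libraries. It assumes reads are spread evenly over all eight channels and over every library in a channel, which real runs only approximate. The constant and function names are hypothetical.

```python
CHANNELS_PER_FLOW_CELL = 8
MAX_LIBRARIES_PER_CHANNEL = 12


def reads_per_library(reads_per_flow_cell,
                      libraries_per_channel=MAX_LIBRARIES_PER_CHANNEL):
    """Approximate reads available to one library, assuming an even split."""
    if not 1 <= libraries_per_channel <= MAX_LIBRARIES_PER_CHANNEL:
        raise ValueError("a channel holds between 1 and 12 tagged libraries")
    reads_per_channel = reads_per_flow_cell // CHANNELS_PER_FLOW_CELL
    return reads_per_channel // libraries_per_channel


for total in (96_000_000, 120_000_000):
    single = reads_per_library(total, libraries_per_channel=1)
    twelve = reads_per_library(total)
    print(f"{total:,} reads/flow cell: {single:,} per channel, "
          f"{twelve:,} per library at 12-plex")
```

Under these assumptions a channel carries 12-15 million reads, or roughly 1.0-1.25 million reads per library when twelve libraries share it.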
7Work Flow
8Sequencing Technology Overview
9Sequencing Technology Overview
10Sequencing Technology Overview
11Sequencing Technology Overview
12VI. The Different Sequencing Platforms Offered
by the DGML.
15Single-read sequencing for shorter reads to
achieve quicker turnaround times.
16Paired-end libraries (200-500 bp) are best for
detecting large and small insertions and deletions
(indels), inversions, and other rearrangements.
Good for characterizing repetitive sequence
elements and filling gaps.
17Mate pair libraries (2-5 kb) are best for de
novo assembly, including genome scaffolding
and genome finishing. Best for identifying
large structural variants.
18 19RNA-Seq
- Full sequence of all RNAs
- mRNAs, miRNAs, etc.
- Quantifiable
- Alternate splice variants
- Alternate poly(A) variants
- Rare transcripts
- Novel transcripts
- Full coverage
- Great depth
- 36 cycles
20Digital Gene Expression (DGE)-Tag Profiling
- Great depth
- Quantifiable
- Only 16 cycles required
Contains an MmeI site
MmeI cuts 17 bp downstream
Restriction site every 256 bp on average; will
cut 99% of mRNAs. Can also use DpnII.
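The "every 256 bp" figure is what one expects for a 4-base recognition site in random sequence (4^4 = 256). The sketch below works through that arithmetic. It assumes equal base frequencies and treats positions as independent, which is only an approximation, and all function names are hypothetical. It does not reproduce the slide's 99% figure, which depends on the real length and composition of mRNAs.

```python
import math


def expected_site_spacing(site_length):
    """Mean distance between occurrences of a fixed site in random DNA."""
    return 4 ** site_length


def fraction_cut(transcript_length, site_length=4):
    """Probability that a random transcript contains at least one site."""
    p = 1 / 4 ** site_length
    positions = max(transcript_length - site_length + 1, 0)
    return 1 - (1 - p) ** positions


def length_for_fraction(target, site_length=4):
    """Shortest transcript length whose cut probability reaches `target`."""
    p = 1 / 4 ** site_length
    positions = math.ceil(math.log(1 - target) / math.log(1 - p))
    return positions + site_length - 1


print(expected_site_spacing(4))            # 256
for length in (500, 1000, 2000):
    print(length, round(fraction_cut(length), 3))
print(length_for_fraction(0.99))
```

Under this toy model a transcript needs to be a little over 1 kb long before the chance of carrying at least one 4-base site reaches 99%.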
21Illumina Tag Sequencing vs. Microarrays
't Hoen et al. (2008). Nucleic Acids Res. 36(21): e141.
- Solexa/Illumina deep sequencing technology and
five different microarray platforms were
compared.
- The comparison used hippocampal expression
profiles of wild-type and transgenic
δC-doublecortin-like kinase mice.
- Illumina DGE: 2.4 million sequence tags per
sample, spanning four orders of magnitude.
- Results were highly reproducible, even across
laboratories.
- Found differential expression of 3,179
transcripts with an estimated FDR of 8.5%.
- Overlap was most significant for Affymetrix.
- Changes in expression observed by deep sequencing
were larger than observed by microarrays or
quantitative PCR.
- While undetectable by microarrays, antisense
transcription was found for 51% of all genes and
alternative polyadenylation for 47%.
22miRNAs, non-coding RNAs, anti-sense RNAs
Isolate small RNA
- Universal Platform: Analyze any small RNA without
prior sequence or secondary structure
information.
- Customizable Size Selection: Investigate any
small RNA between 17 and 35 nucleotides.
- Broad Dynamic Range: Profile >10 million small
RNA sequences in each flow cell channel.
23 24ChIP-Seq and RNA-IP-Seq
25Genomic DNA Methylation
- Generate genomic DNA libraries with and without
bisulfite conversion.
- Continue with Paired-end and/or Mate-Pair
sequencing.
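The reason for sequencing libraries with and without bisulfite conversion is that bisulfite treatment converts unmethylated cytosines so that they read as T, while methylated cytosines still read as C. The toy sketch below illustrates that comparison on a single strand. It assumes complete conversion, no sequencing errors, and a read that is already aligned to the reference. The function names and the example sequence are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Simulate bisulfite treatment: unmethylated C reads as T."""
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


def call_methylated(reference, converted_read):
    """Positions where the reference has C and the converted read kept C."""
    return [
        i
        for i, (ref_base, read_base) in enumerate(
            zip(reference.upper(), converted_read.upper())
        )
        if ref_base == "C" and read_base == "C"
    ]


reference = "ACGTCCGA"            # made-up sequence
read = bisulfite_convert(reference, methylated_positions={1})
print(read)                        # ACGTTTGA
print(call_methylated(reference, read))  # [1]
```

A real analysis also has to handle the reverse strand, incomplete conversion, and alignment of C-depleted reads, none of which this sketch attempts.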
26Multiplexed Sequencing
- Allows multiple samples to be sequenced in a
single channel.
- Great money saver!
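Multiplexing works because each library carries its own tag, so reads from a shared channel can be sorted back to their samples afterwards. The sketch below shows that sorting step in its simplest form. It assumes each read arrives as an (index sequence, insert sequence) pair and that tags must match exactly. The tag sequences, sample names, and function name are all hypothetical and are not Illumina's actual index set or software.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Group insert sequences by sample using exact barcode matches.

    reads:    iterable of (index_seq, insert_seq) tuples
    barcodes: dict mapping barcode sequence -> sample name
    Reads with an unknown barcode go to the "undetermined" bin.
    """
    bins = defaultdict(list)
    for index_seq, insert_seq in reads:
        sample = barcodes.get(index_seq.upper(), "undetermined")
        bins[sample].append(insert_seq)
    return dict(bins)


# Made-up 6-nt tags for two libraries sharing one channel.
barcodes = {"ATCACG": "input_control", "CGATGT": "chip_sample"}
reads = [
    ("ATCACG", "GGCTAAGT"),
    ("CGATGT", "TTGACCAA"),
    ("CGATGT", "ACCGTTAG"),
    ("NNNNNN", "GATTACAG"),
]
for sample, inserts in demultiplex(reads, barcodes).items():
    print(sample, len(inserts))
```

Production pipelines usually tolerate one mismatch in the tag; exact matching is used here only to keep the example short.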
27 28Third-party Genome Analyzer Data Analysis Tools
Genome Alignment Browsers
- Gbrowse - Genomic Browsing. Generic Model Organism Database Project. http://www.gmod.org/wiki/index.php/Gbrowse
- UCSC Browser - Genome browsing and comprehensive annotation. University of California, Santa Cruz. http://genome.ucsc.edu/goldenPath/help/customTrack.html
- Staden Tools (GAP4, TGAP) - Alignment and Visualization for Small Data Sets. James Bonfield (initially developed by Rodger Staden), Wellcome Trust Sanger Institute. http://sourceforge.net/projects/staden/
Alignment and Polymorphism Detection
- BFAST - Blat-like Fast Accurate Search Tool. Nils Homer, Stanley F. Nelson and Barry Merriman, University of California, Los Angeles. http://genome.ucla.edu/bfast
- MAQ - Mapping and Assembly with Quality. Heng Li, Sanger Centre. http://maq.sourceforge.net/maq-man.shtml
- Bowtie - An ultrafast memory-efficient short read aligner. Ben Langmead and Cole Trapnell, Center for Bioinformatics and Computational Biology, University of Maryland. http://bowtie-bio.sourceforge.net/
Genomic Assembly
- Velvet - De novo assembly of short reads. Daniel Zerbino and Ewan Birney, EMBL-EBI. http://www.ebi.ac.uk/~zerbino/velvet/
- SSAKE - Assembly of short reads. Rene Warren, et al., British Columbia Cancer Agency. http://bioinformatics.oxfordjournals.org/cgi/content/full/23/4/500
- Euler - Genomic Assembly. Pavel Pevzner and Mark Chaisson, University of California, San Diego. http://nbcr.sdsc.edu/euler/
ChIP Sequencing
- ChIP-Seq Peak Finder. Barbara Wold, Caltech, and Rick Myers, Stanford University. http://woldlab.caltech.edu/html/software/
Digital Gene Expression
- Comparative Count Display. Alex Lash, NIH. ftp://ftp.ncbi.nlm.nih.gov/pub/sage/obsolete/bin/ccd/
- SAGE DGED Tool. Cancer Genome Anatomy Project. http://cgap.nci.nih.gov/SAGE/SDGED_Wizard?METHODSS10,LS10ORGHs
29Microarrays
- RNA: $470 / array
- miRNA: $480 / array
- CpG Island: $530 / array
- CGH: $580 / array
- Note: Prices may drop depending on volume.
30Sequencing
- Library construction (optional service)
  - $1000 / sample for single reads
  - $1250 / sample for paired-end reads
- Kits
  - DNA Prep (10 samples): $500
  - ChIPseq (10 samples): $350
  - Small RNA Prep Kits (10 samples): $350
- Sequencing
  - Single-end runs: $1000 / lane
  - Paired-end runs: $1700 / lane
- Multiplexing
  - Multiplexing samples makes the total cost
to an investigator who performs a typical
experiment such as ChIPseq quite reasonable. For
example, with the purchase of a single (10
sample) $500 ChIPseq kit and the ability to
barcode the non-IP-selected and ChIP-selected DNA
with two different DNA linkers, an investigator
can obtain ChIP sequences and control genomic
sequences in a single $1000 run. Given the costs
of whole genome tiling arrays, few investigators
would opt to do this experiment by ChIP-on-chip.
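The saving described in the multiplexing example can be written out as a short calculation. The sketch below uses the figures quoted in that example (a 500 kit and a 1000 single-end lane), assumes they are US dollars, and charges the full kit price to the experiment even though the kit covers ten samples. The function name is hypothetical.

```python
def experiment_cost(kit_price, lane_price, lanes):
    """Total cost of one kit plus the lanes needed for the experiment."""
    return kit_price + lane_price * lanes


KIT_PRICE = 500     # kit price quoted in the multiplexing example
LANE_PRICE = 1000   # single-end run, per lane

# ChIP sample and input control barcoded and pooled into one lane.
multiplexed = experiment_cost(KIT_PRICE, LANE_PRICE, lanes=1)
# Same two libraries run in separate lanes.
separate = experiment_cost(KIT_PRICE, LANE_PRICE, lanes=2)

print(f"multiplexed: ${multiplexed}, separate lanes: ${separate}, "
      f"saving: ${separate - multiplexed}")
```

Pooling the two libraries saves one lane fee in this scenario; the trade-off is that each library receives only a share of the lane's reads.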
31Contact us.
http://dms.dartmouth.edu/dgml/
Lab: 603-653-9978
You can contact anyone in the pipeline and you
will be properly directed.
32The END