GigAssembler - PowerPoint PPT Presentation

About This Presentation
Title:

GigAssembler

Description:

GigAssembler. Currently available single-molecule next-gen sequencing platforms monitor the sequencing of single DNA ...

Slides: 31
Provided by: Taejoo9

Transcript and Presenter's Notes

Title: GigAssembler


1
GigAssembler
2
Genome Assembly: A big picture
http://www.nature.com/scitable/content/anatomy-of-whole-genome-assembly-20429
3
GigAssembler: Preprocessing
  • Decontaminating and repeat masking.
  • Aligning mRNAs, ESTs, BAC ends, and paired reads
    against the initial sequence contigs.
  • psLayout → BLAT
  • Creating an input directory (folder) structure.

4
http://www.triazzle.com
The image is from http://www.dangilbert.com/port_fun.html
Reference: Jones NC, Pevzner PA, An Introduction to
Bioinformatics Algorithms, MIT Press
5
RepBase and RepeatMasker
6
GigAssembler: Build merged sequence contigs
(rafts)
7
Sequencing quality (Phred Score)
8
Sequencing quality (Phred Score)
Base-calling Error Probability
http://en.wikipedia.org/wiki/Phred_quality_score
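The relation between a Phred score and the base-calling error probability named on this slide can be sketched as follows. This is a minimal illustration of the standard definition Q = -10 log10(P); the function names are hypothetical and not taken from the presentation.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Base-calling error probability P for a Phred score Q: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred score Q for an error probability P: Q = -10 * log10(P)."""
    if not 0 < p <= 1:
        raise ValueError("error probability must be in (0, 1]")
    return -10 * math.log10(p)


# Each step of 10 in Q divides the error probability by 10.
for q in (10, 20, 30, 40):
    print(f"Q{q}: P(error) = {phred_to_error_prob(q):g}")
```

Under this definition, Q20 corresponds to one expected error in 100 base calls and Q30 to one in 1,000.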
9
GigAssembler: Build merged sequence contigs
(rafts)
10
GigAssembler: Build merged sequence contigs
(rafts)
11
GigAssembler: Build sequenced clone contigs
(barges)
12
GigAssembler: Build a raft-ordering graph
13
GigAssembler: Build a raft-ordering graph
  • Add information from mRNAs, ESTs, paired plasmid
    reads, and BAC end pairs, building a bridge.
  • Different data types get different weights (mRNA
    highest).
  • Bridges that conflict with the graph as constructed
    so far are rejected.
  • Build a sequence path through each raft.
  • Fill the gaps with Ns:
    • 100 Ns between rafts
    • 50,000 Ns between bridged barges
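The gap-filling step in the list above can be sketched as follows. This is only an illustration under stated assumptions: the two gap sizes come from the slide, while the function names, the nested-list input layout, and the toy sequences are hypothetical and are not GigAssembler's actual data structures.

```python
RAFT_GAP = 100        # Ns inserted between rafts (value from the slide)
BARGE_GAP = 50_000    # Ns inserted between bridged barges (value from the slide)


def join_with_gaps(sequences, gap_size):
    """Concatenate sequences, separating neighbours with a run of Ns."""
    return ("N" * gap_size).join(sequences)


def build_scaffold(barges):
    """Join ordered rafts within each barge, then join the bridged barges.

    `barges` is a list of barges; each barge is a list of raft sequences
    already placed in their final order and orientation (an assumption).
    """
    barge_sequences = [join_with_gaps(rafts, RAFT_GAP) for rafts in barges]
    return join_with_gaps(barge_sequences, BARGE_GAP)


# Toy input: two bridged barges, the first holding two rafts.
scaffold = build_scaffold([["ACGT", "GGCC"], ["TTAA"]])
print(len(scaffold))  # 4 + 100 + 4 + 50000 + 4 = 50112
```

The Ns mark gaps of unknown sequence; the fixed run lengths are placeholders and do not estimate the true gap size.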

14
Bellman-Ford algorithm
http://compprog.wordpress.com/2007/11/29/one-source-shortest-path-the-bellman-ford-algorithm/
15
Find the shortest path to all nodes.
Take every edge and try to relax it (N - 1 times,
where N is the count of nodes).
[Figure: graph with nodes A, B, C, D, E and edge weights 5, -2, 6, 8, -3, 7, -4, 7, 2, 9]
16
Find the shortest path to all nodes.
Take every edge and try to relax it (N - 1 times,
where N is the count of nodes).
[Figure: graph with nodes A, B, C, D, E and edge weights 5, -2, 6, 8, -3, 7, -4, 7, 2, 9]
17
Find the shortest path to all nodes.
Take every edge and try to relax it (N - 1 times,
where N is the count of nodes).
[Figure: the same graph with A marked START; B, C, D, and E are labeled Inf.]
18
Find the shortest path to all nodes.
Take every edge and try to relax it (N - 1 times,
where N is the count of nodes).
[Figure: the same graph; A = 0 (START), B = 6 (from A), C = Inf., D = 7 (from A), E = Inf.]
19
Find the shortest path to all nodes.
Take every edge and try to relax it (N - 1 times,
where N is the count of nodes).
[Figure: the same graph; A = 0 (START), B = 6 (from A), C = 4 (from D), D = 7 (from A), E = 2 (from B)]
20
Find the shortest path to all nodes.
Take every edge and try to relax it (N - 1 times,
where N is the count of nodes).
[Figure: the same graph; A = 0 (START), B = 2 (from C), C = 4 (from D), D = 7 (from A), E = 2 (from B)]
21
Find the shortest path to all nodes.
Take every edge and try to relax it (N - 1 times,
where N is the count of nodes).
[Figure: the same graph; A = 0 (START), B = 2 (from C), C = 4 (from D), D = 7 (from A), E = -2 (from B)]
22
Answer: A-D-C-B-E
[Figure: the same graph; A = 0 (START), B = 2 (from C), C = 4 (from D), D = 7 (from A), E = -2 (from B)]
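The walkthrough on the preceding slides can be reproduced with a short Bellman-Ford sketch. The edge weights are the ten values shown in the figure, but the transcript does not record which nodes each edge connects, so the endpoints and directions below are an assumption chosen to match the distance labels on the slides. The function names are hypothetical.

```python
import math

# (source, target, weight). Weights come from the slides; endpoints and
# directions are ASSUMED so that the labels "6 (from A)", "4 (from D)",
# "2 (from C)" and "-2 (from B)" are reproduced.
EDGES = [
    ("A", "B", 6), ("A", "D", 7),
    ("B", "C", 5), ("B", "D", 8), ("B", "E", -4),
    ("C", "B", -2),
    ("D", "C", -3), ("D", "E", 9),
    ("E", "A", 2), ("E", "C", 7),
]


def bellman_ford(nodes, edges, start):
    """Return (distance, predecessor) dicts for shortest paths from start."""
    dist = {node: math.inf for node in nodes}
    pred = {node: None for node in nodes}
    dist[start] = 0
    # Relax every edge up to N - 1 times, where N is the number of nodes.
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                pred[v] = u
                changed = True
        if not changed:  # nothing improved in this pass, so we can stop early
            break
    # If any edge can still be relaxed, a negative cycle is reachable.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("graph contains a negative-weight cycle")
    return dist, pred


def path_to(pred, target):
    """Follow predecessor links back from target to the start node."""
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = pred[node]
    return "-".join(reversed(path))


dist, pred = bellman_ford(list("ABCDE"), EDGES, "A")
print(dist)
print(path_to(pred, "E"))
```

Because this sketch updates distances in place within a pass, its intermediate values may differ from the slide-by-slide labels; with the assumed edges, the final distances and the path to E agree with the last slide.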
23
Next-generation sequencing
24
Mardis ER, Annu. Rev. Genomics Hum. Genet., 2008
Illumina
25
Mardis ER, Annu. Rev. Genomics Hum. Genet., 2008
Illumina
26
Mardis ER, Annu. Rev. Genomics Hum. Genet., 2008
Roche/454
27
Mardis ER, Annu. Rev. Genomics Hum. Genet., 2008
SOLiD
28
An example of single-molecule DNA sequencing, from Helicos
(approx. 1 billion reads / run)
Pushkarev, D., N.F. Neff, and S.R. Quake. Nat Biotechnol (2009) 27, 847-50
Harris, T.D., et al. Science (2008) 320, 106-9
29
Mapping program
Trapnell C, Salzberg SL, Nat. Biotech., 2009
30
Two strategies in mapping
Trapnell C, Salzberg SL, Nat. Biotech., 2009