Using Network Processors in Genomics

Provided by: csVu
1
Using Network Processors in Genomics
  • Herbert Bos, Kaiming Huang
  • {herbertb,khuang}@liacs.nl
  • Leiden Universiteit, Netherlands
  • Vrije Universiteit, Netherlands
  • http://www.liacs.nl/herbertb/projects/biocomp/

2
Case study: BLAST
  • search nucleotide/protein database for query
  • BLAST discovers similarity rather than exact match
  • two main phases:
    • scoring (registering where query and DNA database match)
    • alignment (dynamic programming)
  • only the first phase runs on NPUs
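
The slides do not say which dynamic-programming method the alignment phase uses. As a stand-in, here is a minimal sketch of Smith-Waterman local alignment scoring, a standard dynamic-programming technique for this kind of problem. The function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions made for illustration and are not BLAST's actual parameters.

```python
def local_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman score of the best local alignment between a and b.

    Illustrative only: the scoring values are assumed, not taken from BLAST.
    """
    prev = [0] * (len(b) + 1)          # previous row of the DP table
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)       # current row of the DP table
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            # A cell never drops below 0, which is what makes the alignment local.
            cur[j] = max(0, prev[j - 1] + step, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best


print(local_alignment_score("acg", "tacgt"))  # 3: "acg" aligns exactly inside "tacgt"
```

Each cell depends on three neighbours from the current and previous rows, so the cost grows with the product of the two sequence lengths. That is one plausible reason to restrict this phase to the regions the scoring phase has already flagged.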

7
Window matching
  • naïve approach: roughly W·N·M comparisons
  • does not scale
  • string search algorithms: Aho-Corasick
  • all windows matched at the same time
  • shifting genome one nucleotide at a time
  • matching algorithm transformed into a DFA
  • DFA may be quite large
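
To make the cost of the naïve approach concrete, here is a minimal sketch that cuts a query into windows and compares every window against every genome position. The function names are hypothetical. Reading W, N and M as window length, number of windows and genome length is an assumption, since the slide does not define them.

```python
def windows(query, w):
    """Return every length-w substring (window) of the query, in order."""
    return [query[i:i + w] for i in range(len(query) - w + 1)]


def naive_scan(genome, wins):
    """Compare each window against each genome position, one slice at a time."""
    hits = []
    for pos in range(len(genome)):          # M genome positions
        for win in wins:                    # N windows
            if genome[pos:pos + len(win)] == win:   # up to W character comparisons
                hits.append((pos, win))
    return hits


wins = windows("acgccga", 3)
print(wins)                         # ['acg', 'cgc', 'gcc', 'ccg', 'cga']
print(naive_scan("tacgcga", wins))  # [(1, 'acg'), (2, 'cgc'), (4, 'cga')]
```

The nested loops revisit every genome position once per window. A multi-pattern automaton avoids this by reading each nucleotide only once.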

8
Aho-Corasick
  • Alphabet: acgt
  • Window size: 3
  • Query: acgccga
  • Windows: acg, cgc, gcc, ccg, cga
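
The following sketch builds an Aho-Corasick automaton for these five windows and flattens it into a dense DFA table with one row per state and one column per nucleotide. It is an illustration of the general technique, not the authors' IXP1200 code. The names `build_dfa` and `scan` are hypothetical. The short input `tacgcga` is the example string that appears later in the slides.

```python
from collections import deque

ALPHABET = "acgt"


def build_dfa(patterns, alphabet=ALPHABET):
    """Build a trie, then derive failure links and a full transition table."""
    goto, fail, out = [{}], [0], [[]]
    for pat in patterns:                      # 1. insert every window into the trie
        s = 0
        for ch in pat:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(pat)

    delta = [dict() for _ in goto]            # 2. dense table: delta[state][char]
    for ch in alphabet:
        delta[0][ch] = goto[0].get(ch, 0)
    queue = deque(goto[0].values())           # breadth-first, shallow states first
    while queue:
        s = queue.popleft()
        out[s] = out[s] + out[fail[s]]        # inherit matches from the failure state
        for ch in alphabet:
            if ch in goto[s]:
                t = goto[s][ch]
                fail[t] = delta[fail[s]][ch]
                delta[s][ch] = t
                queue.append(t)
            else:
                delta[s][ch] = delta[fail[s]][ch]
    return delta, out


def scan(text, delta, out, state=0):
    """Feed the text through the DFA one character at a time."""
    hits = []
    for i, ch in enumerate(text):
        state = delta[state][ch]
        for pat in out[state]:
            hits.append((i - len(pat) + 1, pat))
    return hits


delta, out = build_dfa(["acg", "cgc", "gcc", "ccg", "cga"])
print(len(delta), len(delta) * len(ALPHABET))  # 13 states, 52 table entries
print(scan("tacgcga", delta, out))             # [(1, 'acg'), (2, 'cgc'), (4, 'cga')]
```

The table has one entry per state and symbol, so it grows with the number of distinct window prefixes. This is consistent with the remark that the DFA may be quite large for a long query.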

9
Aho-Corasick
[Figure: automaton built from the five windows, with states 0 to 12 and transitions labelled a, c, g, t. Example input: tacgcga]
12
IXPBlast Architecture
[Figure: block diagram with the labels Gbps ports, NPU (IXP1200), six MEs (microengines), scratch, DRAM, Control Processor, StrongARM, Pentium, PCI bus]
18
IXPBlast packet handling
[Figure: diagram with elements numbered 0 to 31]
  • packets read and processed in batches of 100,000
  • spilling must be taken into account
  • currently no feedback
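
The slides do not define "spilling". The sketch below assumes it refers to windows that straddle the boundary between two packets. Under that assumption, one way to avoid losing such matches is to carry the last w-1 nucleotides of each packet into the next. The packet size and function names are hypothetical, and the batching of 100,000 packets is not modelled.

```python
def packets(genome, size):
    """Yield (offset, payload) pairs, cutting the genome into fixed-size packets."""
    for off in range(0, len(genome), size):
        yield off, genome[off:off + size]


def scan_packets(genome, wins, w, size):
    """Scan packet by packet, carrying w-1 trailing characters across boundaries."""
    hits = []
    carry = ""
    for off, payload in packets(genome, size):
        buf = carry + payload
        base = off - len(carry)               # genome position of buf[0]
        for i in range(len(buf) - w + 1):
            if buf[i:i + w] in wins:
                hits.append((base + i, buf[i:i + w]))
        # Keep only the tail that could still start a window in the next packet.
        carry = buf[-(w - 1):] if w > 1 else ""
    return hits


wins = {"acg", "cgc", "gcc", "ccg", "cga"}
print(scan_packets("tacgcga", wins, 3, 2))  # [(1, 'acg'), (2, 'cgc'), (4, 'cga')]
```

With 2-nucleotide packets every window in this input crosses a packet boundary, yet the result equals a scan of the unsplit string. An automaton-based scanner could get the same effect by keeping its current state between packets instead of a character buffer.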

19
Results
  • 232 MHz IXP1200 vs. 1.8 GHz Pentium-4
  • 1611-nucleotide query (MyD88)
  • 1.4 GB genome (Zebrafish)
  • IXP1200: 90 sec with DFA
  • IXP1200: 129 sec with trie
  • P4: 132 sec with trie
  • number of matches: 524856
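
For a rough sense of scale, the timings above can be converted into scan throughput. This is a back-of-the-envelope sketch that assumes 1 GB means 10^9 bytes and that the whole 1.4 GB is scanned in each run. Neither assumption is stated in the slides.

```python
GENOME_BYTES = 1.4e9  # assumption: 1 GB = 10**9 bytes

timings_sec = {
    "IXP1200, DFA": 90,
    "IXP1200, trie": 129,
    "P4, trie": 132,
}


def throughput_mb_per_s(seconds):
    """Average scan rate in MB/s for one pass over the genome."""
    return GENOME_BYTES / seconds / 1e6


for name, secs in timings_sec.items():
    print(f"{name}: {throughput_mb_per_s(secs):.1f} MB/s")

# Ratio of the P4 trie run to the IXP1200 DFA run.
speedup = timings_sec["P4, trie"] / timings_sec["IXP1200, DFA"]
print(f"speedup: {speedup:.2f}x")
```

Under these assumptions the IXP1200 with the DFA sustains roughly 15.6 MB/s against roughly 10.6 MB/s for the Pentium-4, which is about a 1.47x difference.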

21
Conclusions
  • NPUs are useful in other application domains
  • Newer hardware is expected to perform much better
  • Throughput processors
  • Adapting our current approach to use BLAST
    tricks/heuristics

22
Network processors
  • geared for high throughput
  • used exclusively in network systems
  • example: intrusion detection
  • similar to looking for genes in genomes
  • differences

[Figure: Radisys IXP1200 board]
23
Application domain: Genomics
  • example: search genome for occurrences of patterns
  • similar problems as IDS: poor performance on GPP (cannot exploit parallelism)
  • throughput-driven
  • how about FPGAs?
  • how about clusters?
  • NPU:
    • easier to program than FPGAs
    • cheaper than cluster computing
    • on the desktop: IP never leaves the room