Hardware Acceleration of Sequence Alignment Algorithms

1
Hardware Acceleration of Sequence Alignment
Algorithms

Laiq Hasan
Zaid Al-Ars
May 30, 2007
Computer Engineering Lab
TU Delft

2
Outline

Introduction
Bioinformatics Overview
Major Research Areas
Sequence Alignment Methods
Global/local Methods
Exact/approximate Methods
Hardware Acceleration of Smith-Waterman (S-W)
Algorithm
FPGAs
SIMD Solutions
Profiling Results
Future Work

3
Introduction

Bioinformatics
A new hybrid field
Partly molecular biology and partly computer
science

Compute 3D protein structure from protein sequence
Sequence Alignment: a way of arranging the
primary sequences of DNA, RNA, or protein to
identify regions of similarity
Find genetic relationships between sequences /
species
Identify and classify genes on the genome
4
Sequence Alignment Methods
Some form of S-W is also at the core of FASTA and
BLAST, so accelerating S-W in hardware can later
improve the performance of FASTA and BLAST as
well.

Global: end-to-end alignment
Local: looks for small internal stretches of
similarity

Approximate Methods
Exact Methods
5
Dot Plot

The most basic method of comparing two sequences
The sequences to be compared are arranged along
the margins of a matrix
A dot is placed at every point in the matrix
where the two sequences are identical
A diagonal stretch of dots indicates regions
where the two sequences are similar
For clarity, dots are marked in the table as xs
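
A minimal sketch of the dot plot described above, in Python. The function names `dot_plot` and `render` and the `.` filler for empty cells are illustrative assumptions, not part of the presentation; matches are marked with x as on the slide.

```python
def dot_plot(seq1, seq2):
    """Build the dot matrix: one row per residue of seq2, one column per residue of seq1.

    A cell holds "x" where the two residues are identical and "." otherwise.
    """
    return [["x" if a == b else "." for a in seq1] for b in seq2]


def render(seq1, seq2):
    """Return the dot plot as text, with the sequences along the margins."""
    lines = ["  " + " ".join(seq1)]
    for residue, row in zip(seq2, dot_plot(seq1, seq2)):
        lines.append(residue + " " + " ".join(row))
    return "\n".join(lines)


if __name__ == "__main__":
    # Diagonal runs of x in the printed matrix indicate similar regions.
    print(render("GATTACA", "GCATTAC"))
```

Both the time and the memory needed grow with the product of the two sequence lengths, since every pair of residues is compared once and stored.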

6
Dot Plot (Continued)
7
Needleman-Wunsch (N-W) Algorithm

Straightforward Dynamic Programming (DP)
algorithm to find the optimal global alignment of
two sequences A and B
The algorithm is based on finding the elements of
a matrix H according to
$$H_{i,j} = \max\{\, H_{i-1,j-1} + S_{i,j},\; H_{i-1,j} - d,\; H_{i,j-1} - d \,\}$$
where S_{i,j} is the similarity score of comparing
A_i to B_j and d is the gap penalty.
The matrix is initialized with H_{0,0} = 0

8
N-W Example

The two sequences to be globally aligned are
G A A T T C A G T T A (sequence 1) and G G A T C G
A (sequence 2),
so M = 11 and N = 7 (the lengths of sequence 1
and sequence 2, respectively)
A simple scoring scheme is assumed, where
S_{i,j} = 1 if the residue at position i of sequence
1 is the same as the residue at position j of
sequence 2 (match score), otherwise
S_{i,j} = 0 (mismatch score)
d = 0 (gap penalty)
Three steps in dynamic programming
1. Initialization Step
2. Matrix Fill Step
3. Trace back
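
The three steps above can be sketched in Python as follows. This is an illustrative sketch, not the presenters' code: the function name `needleman_wunsch`, the tie-breaking order in the trace back, and the standard form of the recurrence (diagonal move plus similarity score, vertical or horizontal move minus the gap penalty) are assumptions. The default scores are the ones from the example (match 1, mismatch 0, gap penalty 0).

```python
def needleman_wunsch(seq1, seq2, match=1, mismatch=0, gap=0):
    """Global alignment of seq1 and seq2; returns (score, aligned1, aligned2)."""
    m, n = len(seq1), len(seq2)

    def score(i, j):
        # Similarity score S[i][j] for residue i of seq1 and residue j of seq2.
        return match if seq1[i - 1] == seq2[j - 1] else mismatch

    # 1. Initialization: H[0][0] = 0, first row and column accumulate gap penalties.
    H = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        H[i][0] = H[i - 1][0] - gap
    for j in range(1, n + 1):
        H[0][j] = H[0][j - 1] - gap

    # 2. Matrix fill: each cell takes the best of its three predecessors.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            H[i][j] = max(H[i - 1][j - 1] + score(i, j),
                          H[i - 1][j] - gap,
                          H[i][j - 1] - gap)

    # 3. Trace back from the bottom-right cell to the top-left cell.
    top, bottom = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and H[i][j] == H[i - 1][j - 1] + score(i, j):
            top.append(seq1[i - 1])
            bottom.append(seq2[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and H[i][j] == H[i - 1][j] - gap:
            top.append(seq1[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(seq2[j - 1])
            j -= 1
    return H[m][n], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    print(needleman_wunsch("GAATTCAGTTA", "GGATCGA"))
```

With this scoring scheme the score is simply the number of matched residues in the alignment, which is 6 for the two example sequences. Several alignments can reach that score; which one is returned depends on the tie-breaking order in the trace back.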

9
N-W Example (Matrix Fill Step)
10
N-W Example (Trace back)
11
Smith-Waterman (S-W) Algorithm

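To obtain the local S-W alignment, the N-W
recurrence for H_{i,j} is modified as follows
$$H_{i,j} = \max\{\, 0,\; H_{i-1,j-1} + S_{i,j},\; H_{i-1,j} - d,\; H_{i,j-1} - d \,\}$$
Initialization and matrix fill proceed as in N-W,
using the modified recurrence.
The difference lies in the trace back, which starts
at the highest-scoring cell and stops at a cell
with score 0.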
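
A hedged Python sketch of S-W local alignment follows. The function name `smith_waterman` and the default scores (match 2, mismatch -1, gap penalty 1) are assumptions chosen for illustration and are not taken from the presentation. The sketch uses the usual local-alignment conventions: cell scores are clamped at 0, and the trace back starts at the highest-scoring cell and stops when a 0 is reached.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=1):
    """Local alignment of a and b; returns (best_score, aligned_a, aligned_b)."""
    m, n = len(a), len(b)

    def score(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Matrix fill: like N-W, but no cell is allowed to drop below 0.
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            H[i][j] = max(0,
                          H[i - 1][j - 1] + score(i, j),
                          H[i - 1][j] - gap,
                          H[i][j - 1] - gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a 0 is reached.
    out_a, out_b = [], []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + score(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] - gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("TTACGTTT", "GGACGTGG"))
```

For the two sample strings in the `__main__` block, the only shared stretch is ACGT, so the local alignment covers those four residues and ignores the unrelated flanks.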

12
S-W Example
A = AGGTAC and B = CAGCGTTG
Assume that
The best score found in the matrix is 6, and the
corresponding optimal local alignment is
13
Comparison of various sequence alignment
algorithms.

Algorithm   Local/Global   Search Method   Time Complexity   Space Complexity
Dotplot     Global         Basic           O(MN)             O(MN)
N-W         Global         DP              O(MN)             O(MN)
S-W         Local          DP              O(MN)             O(MN)
FASTA       Local          Heuristic       O(MN)             O(MN)
BLAST       Local          Heuristic       O(MN)             O(20^w + MN)
14
Hardware Acceleration of S-W

FPGAs
Reconfigurable data processing devices
Advantageous for implementing massively parallel
algorithms such as Smith-Waterman

SIMD
An acronym for Single-Instruction stream,
Multiple-Data stream
Traditionally, an array of processing elements
that all execute the same instruction on
different data
More recently, the term most commonly refers to
vector instructions on uniprocessors,
typically used for multimedia

15
Using FPGA Custom Instructions

To improve the computational processing time of
the S-W algorithm
Write the S-W algorithm in pure software
Replace the computationally intensive portion
with an FPGA custom instruction
Compare the processing runtime between the pure
software and the hardware acceleration versions
Calculate the percentage of runtime improvement.
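
The last two steps can be sketched as a small timing harness. This is an assumption-laden illustration: the helper names `best_runtime` and `runtime_improvement_percent` are hypothetical, and the improvement is assumed to be defined as the runtime saved relative to the pure-software runtime. The presentation does not state which definition it uses.

```python
import time


def best_runtime(fn, *args, repeats=5):
    """Return the shortest wall-clock time (in seconds) over several runs of fn(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


def runtime_improvement_percent(t_software, t_accelerated):
    """Percentage of the pure-software runtime saved by the accelerated version."""
    return 100.0 * (t_software - t_accelerated) / t_software


if __name__ == "__main__":
    # Hypothetical runtimes in seconds, for illustration only.
    print(runtime_improvement_percent(10.0, 4.0))
```

Under this definition a run that drops from 10 s to 4 s is a 60% improvement, which corresponds to a speedup factor of 2.5.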

16
Runtime Reconfiguration (RTR)

RTR is an approach that reuses hardware for
operations too large to fit in the FPGA, or
customizes the resources according to the
length of the input sequences, targeting
performance and/or power benefits.
Highly desirable for sequence alignment,
particularly for very long sequences.
High performance can be achieved using
off-the-shelf FPGA boards.

17
Systolic Arrays

Two input vectors, M and N
Each processing cell holds a value, U_{i,j},
usually the result of a predefined
algorithm executed within the cell
Theoretically a perfect hardware implementation
of S-W
Capable of extracting the maximum parallelism
Practical problems with memory accesses
Sequences of different lengths also make
the design of an efficient systolic array
difficult
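
The parallelism a systolic array can extract comes from the data dependencies of the S-W matrix: a cell depends only on its upper, left, and upper-left neighbours, so all cells on one anti-diagonal (i + j constant) are independent of each other. The Python sketch below simulates that wavefront order sequentially. It is a software illustration only, not a hardware description; the function name `sw_score_wavefront` and the scores (match 2, mismatch -1, gap penalty 1) are assumptions.

```python
def sw_score_wavefront(a, b, match=2, mismatch=-1, gap=1):
    """Compute the best S-W score by sweeping anti-diagonals.

    Returns (best_score, steps), where steps is the number of anti-diagonals
    processed, i.e. the number of time steps an idealized array with one
    processing element per residue of `a` would need.
    """
    m, n = len(a), len(b)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best, steps = 0, 0
    for k in range(2, m + n + 1):          # k = i + j identifies one anti-diagonal
        lo, hi = max(1, k - n), min(m, k - 1)
        # Every cell in this inner loop reads only anti-diagonals k-1 and k-2,
        # so in hardware they could all be updated in the same clock cycle.
        for i in range(lo, hi + 1):
            j = k - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,
                          H[i - 1][j] - gap,
                          H[i][j - 1] - gap)
            best = max(best, H[i][j])
        steps += 1
    return best, steps


if __name__ == "__main__":
    print(sw_score_wavefront("TTACGTTT", "GGACGTGG"))
```

The sweep needs M + N - 1 steps instead of the M x N cell updates of a sequential fill, which is the source of the speedup. It also shows why unequal sequence lengths are awkward: on the short anti-diagonals at the start and end of the sweep, most processing elements sit idle.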

18
SIMD Solutions

Micro Grained Array Processor (MGAP)
A general-purpose, fine-grained architecture
Capable of solving computationally
intensive problems in molecular biology
efficiently and inexpensively
Kestrel Parallel Processor
A single-board coprocessor with a 512-element
linear array of 8-bit SIMD processing elements
Designed and built at the University of
California at Santa Cruz
Motivated by work on the Human Genome Project
and other bioinformatics applications
Graphics Processing Units (GPUs)
Single-chip processors (using SIMD instructions),
used primarily for computing 3D functions
Their inexpensive, high-performance SIMD architecture
is a good match for bioinformatics applications,
e.g. the S-W algorithm for sequence alignment

19
Comparison of the work reviewed in literature
20
Profiling Results
21
Future Work

Identifying a common measure to compare different
acceleration methods in the literature.
Designing a CCU for the maxSWCandidate function
identified in the profiling results
Implementation on FPGA
Looking for techniques that exploit maximum
parallelism in the S-W algorithm

22
Thanks
