Title: Hardware Acceleration of Sequence Alignment Algorithms
1 Hardware Acceleration of Sequence Alignment Algorithms
- Laiq Hasan
- Zaid Al-Ars
- May 30, 2007
- Computer Engineering Lab
- TU Delft
2 Outline
- Introduction
- Bioinformatics Overview
- Major Research Areas
- Sequence Alignment Methods
- Global/local Methods
- Exact/approximate Methods
- Hardware Acceleration of Smith-Waterman (S-W) Algorithm
- FPGAs
- SIMD Solutions
- Profiling Results
- Future Work
3 Introduction
- Bioinformatics
- A new hybrid field
- Partly molecular biology and partly computer
science
- Compute 3D protein structure from protein sequence
- Sequence Alignment: a way of arranging the primary sequences of DNA, RNA, or protein to identify regions of similarity
- Find genetic relationships between sequences / species
- Identify and classify genes on the genome
4 Sequence Alignment Methods
- Global: end-to-end alignment
- Local: looks for small internal stretches of similarity
- Exact Methods
- Approximate Methods
- Some form of S-W is also the core of FASTA and BLAST, so accelerating S-W in hardware is even more useful: it can later be used to improve the performance of FASTA and BLAST as well.
5 Dot Plot
- The most basic method of comparing two sequences
- The sequences to be compared are arranged along the margins of a matrix
- A dot is placed at every point in the matrix where the two sequences are identical
- A diagonal stretch of dots indicates regions where the two sequences are similar
- For clarity, dots are marked in the table as x's
6 Dot Plot (Continued)
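The dot plot construction described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `dot_plot` and `print_dot_plot` are hypothetical, and the two sequences passed in at the end are simply the ones used in the N-W example later in the slides.

```python
def dot_plot(seq1, seq2):
    """Return one string per residue of seq1: 'x' where it equals the
    residue of seq2 in that column, '.' elsewhere."""
    return ["".join("x" if a == b else "." for b in seq2) for a in seq1]


def print_dot_plot(seq1, seq2):
    """Print the matrix with seq2 along the top margin and seq1 along the left."""
    print("  " + " ".join(seq2))
    for residue, row in zip(seq1, dot_plot(seq1, seq2)):
        print(residue + " " + " ".join(row))


# Diagonal runs of x's mark stretches where the two sequences are similar.
print_dot_plot("GGATCGA", "GAATTCAGTTA")
```

The matrix has one cell per pair of residues, which is where the O(MN) time and space cost of the method comes from.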
7 Needleman-Wunsch (N-W) Algorithm
- Straightforward Dynamic Programming (DP) algorithm to find the optimal global alignment of two sequences A and B
- The algorithm is based on finding the elements of a matrix H according to
  H_{i,j} = \max\{ H_{i-1,j-1} + S_{i,j},\; H_{i-1,j} - d,\; H_{i,j-1} - d \}
- where S_{i,j} is the similarity score of comparing A_i to B_j and d is the gap penalty
- The matrix is initialized with H_{0,0} = 0
8 N-W Example
- The two sequences to be globally aligned are
- G A A T T C A G T T A (sequence 1)
- G G A T C G A (sequence 2)
- so M = 11 and N = 7 (the lengths of sequence 1 and sequence 2, respectively)
- A simple scoring scheme is assumed, where
- S_{i,j} = 1 if the residue at position i of sequence 1 is the same as the residue at position j of sequence 2 (match score), otherwise
- S_{i,j} = 0 (mismatch score)
- d = 0 (gap penalty)
- Three steps in dynamic programming
- 1. Initialization Step
- 2. Matrix Fill Step
- 3. Trace back
9 N-W Example (Matrix Fill Step)
10 N-W Example (Trace back)
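The three steps of the N-W example (initialization, matrix fill, trace back) can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function name `needleman_wunsch` is hypothetical, the recurrence used is the standard N-W one (diagonal move plus similarity score, or a vertical/horizontal move minus the gap penalty d), and the default scores are the ones stated in the example (match 1, mismatch 0, d = 0).

```python
def needleman_wunsch(seq1, seq2, match=1, mismatch=0, gap=0):
    """Global alignment; returns (score, aligned_seq1, aligned_seq2)."""
    m, n = len(seq1), len(seq2)

    # 1. Initialization: H[0][0] = 0, borders accumulate the gap penalty.
    H = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        H[i][0] = H[i - 1][0] - gap
    for j in range(1, n + 1):
        H[0][j] = H[0][j - 1] - gap

    def score(i, j):
        return match if seq1[i - 1] == seq2[j - 1] else mismatch

    # 2. Matrix fill: each cell takes the best of its three predecessors.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            H[i][j] = max(H[i - 1][j - 1] + score(i, j),
                          H[i - 1][j] - gap,
                          H[i][j - 1] - gap)

    # 3. Trace back from the bottom-right cell to the top-left cell.
    a1, a2 = [], []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and H[i][j] == H[i - 1][j - 1] + score(i, j):
            a1.append(seq1[i - 1]); a2.append(seq2[j - 1])
            i -= 1; j -= 1
        elif i > 0 and H[i][j] == H[i - 1][j] - gap:
            a1.append(seq1[i - 1]); a2.append("-")
            i -= 1
        else:
            a1.append("-"); a2.append(seq2[j - 1])
            j -= 1
    return H[m][n], "".join(reversed(a1)), "".join(reversed(a2))


result = needleman_wunsch("GAATTCAGTTA", "GGATCGA")
print(result[0])   # optimal global score
print(result[1])
print(result[2])
```

With this scoring scheme the score simply counts matching columns, so several different alignments can be optimal; the trace back above returns one of them, which need not be the one drawn on the slide.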
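11 Smith-Waterman (S-W) Algorithm
- To obtain the local S-W alignment, H_{i,j} of the N-W algorithm is modified as follows
  H_{i,j} = \max\{ 0,\; H_{i-1,j-1} + S_{i,j},\; H_{i-1,j} - d,\; H_{i,j-1} - d \}
- Initialization and matrix fill otherwise follow the same steps as N-W.
- The difference lies in the trace back, which starts at the highest-scoring cell and stops at a cell with score 0.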
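A minimal sketch of the local variant follows. The function name `smith_waterman` and the default scores (match 2, mismatch -1, gap 1) are assumptions chosen for illustration, not the values used on the slides. The only changes relative to a global N-W fill are the extra 0 in the maximum and a trace back that runs from the best cell until a zero is reached.

```python
def smith_waterman(seq1, seq2, match=2, mismatch=-1, gap=1):
    """Local alignment; returns (best_score, aligned_seq1, aligned_seq2).
    The default scoring values are illustrative assumptions."""
    m, n = len(seq1), len(seq2)
    H = [[0] * (n + 1) for _ in range(m + 1)]  # borders stay at 0
    best, best_pos = 0, (0, 0)

    def score(i, j):
        return match if seq1[i - 1] == seq2[j - 1] else mismatch

    # Matrix fill: same as N-W, but scores are never allowed below 0.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            H[i][j] = max(0,
                          H[i - 1][j - 1] + score(i, j),
                          H[i - 1][j] - gap,
                          H[i][j - 1] - gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the highest-scoring cell until a 0 is reached.
    a1, a2 = [], []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        if H[i][j] == H[i - 1][j - 1] + score(i, j):
            a1.append(seq1[i - 1]); a2.append(seq2[j - 1])
            i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] - gap:
            a1.append(seq1[i - 1]); a2.append("-")
            i -= 1
        else:
            a1.append("-"); a2.append(seq2[j - 1])
            j -= 1
    return best, "".join(reversed(a1)), "".join(reversed(a2))


print(smith_waterman("AGGTAC", "CAGCGTTG"))
```

The score and alignment printed here depend on the assumed scoring values, so they should not be expected to reproduce the numbers of the slide example.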
12 S-W Example
A = A G G T A C and B = C A G C G T T G
Assume that
The best score found in the matrix is 6, and the
corresponding optimal local alignment is
13 Comparison of various sequence alignment algorithms
Algorithm  Local/Global  Search Method  Time Complexity  Space Complexity
Dotplot    Global        Basic          O(MN)            O(MN)
N-W        Global        DP             O(MN)            O(MN)
S-W        Local         DP             O(MN)            O(MN)
FASTA      Local         Heuristic      O(MN)            O(MN)
BLAST      Local         Heuristic      O(MN)            O(20^w + MN)
14 Hardware Acceleration of S-W
- FPGAs
- Reconfigurable data processing devices
- Advantageous for implementing massively parallel algorithms such as Smith-Waterman
- SIMD
- An acronym for Single-Instruction stream, Multiple-Data stream
- An array of processing elements, like a multi-core architecture
- However, the most common recent meaning of SIMD is for uniprocessors, i.e. vector instructions typically used for multimedia
15 Using FPGA Custom Instructions
- To improve the computational processing time of the S-W algorithm
- Write the S-W algorithm in pure software
- Replace the computationally intensive portion with an FPGA custom instruction
- Compare the processing runtime between the pure software and the hardware-accelerated versions
- Calculate the percentage of runtime improvement
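The last two steps above come down to simple arithmetic, sketched here. The function names and the example timings are hypothetical. The third helper applies Amdahl's law, which is not mentioned on the slides but is the usual way to estimate how much the whole program can gain when only its computationally intensive portion is moved to hardware.

```python
def runtime_improvement_percent(t_software, t_accelerated):
    """Percentage of runtime saved by the accelerated version."""
    return 100.0 * (t_software - t_accelerated) / t_software


def speedup(t_software, t_accelerated):
    """How many times faster the accelerated version runs."""
    return t_software / t_accelerated


def amdahl_speedup(accelerated_fraction, kernel_speedup):
    """Overall speedup when only `accelerated_fraction` of the original
    runtime is sped up by a factor `kernel_speedup` (Amdahl's law)."""
    return 1.0 / ((1.0 - accelerated_fraction)
                  + accelerated_fraction / kernel_speedup)


# Hypothetical timings in seconds, for illustration only.
t_sw, t_hw = 10.0, 2.5
print(runtime_improvement_percent(t_sw, t_hw))
print(speedup(t_sw, t_hw))
print(amdahl_speedup(accelerated_fraction=0.5, kernel_speedup=2.0))
```

The Amdahl estimate is why profiling comes first: the larger the share of runtime spent in the replaced portion, the more the custom instruction can improve the total.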
16 Runtime Reconfiguration (RTR)
- RTR is an approach to reuse hardware for operations that are too large to fit in the FPGA, or to customize the resources according to the length of the input sequences, targeting performance and/or power benefits.
- Highly desirable for sequence alignment, particularly for very long sequences.
- High performance can be achieved using off-the-shelf FPGA boards.
17 Systolic Arrays
- Two vector array inputs, M and N
- The processing cells have a value, U_{ij}
- This value is usually the result of a predefined algorithm executed within the cell
- Theoretically perfect hardware implementation of S-W
- Capable of extracting the maximum parallelism
- Practical problems with memory accesses
- Also, sequences of different lengths make the design of an efficient systolic array difficult
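The parallelism a systolic array extracts from S-W comes from the data dependencies of the matrix fill: a cell depends only on its left, upper, and upper-left neighbours, so all cells on one anti-diagonal (i + j constant) can be computed in the same step. The sketch below simulates that wavefront order in software and checks it against the ordinary row-by-row fill. The function names and the scoring defaults are assumptions for illustration, and the inner loop runs sequentially here where hardware would run one processing cell per iteration.

```python
def sw_score_row_major(seq1, seq2, match=2, mismatch=-1, gap=1):
    """Reference S-W score computed row by row."""
    m, n = len(seq1), len(seq2)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best = 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s,
                          H[i - 1][j] - gap, H[i][j - 1] - gap)
            best = max(best, H[i][j])
    return best


def sw_score_wavefront(seq1, seq2, match=2, mismatch=-1, gap=1):
    """S-W score computed one anti-diagonal at a time.
    Returns (best_score, number_of_wavefront_steps)."""
    m, n = len(seq1), len(seq2)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best, steps = 0, 0
    for k in range(2, m + n + 1):          # anti-diagonal: i + j == k
        steps += 1
        # Cells below are mutually independent: each one reads only
        # anti-diagonals k-1 and k-2, which are already complete.
        for i in range(max(1, k - n), min(m, k - 1) + 1):
            j = k - i
            s = match if seq1[i - 1] == seq2[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s,
                          H[i - 1][j] - gap, H[i][j - 1] - gap)
            best = max(best, H[i][j])
    return best, steps


print(sw_score_row_major("AGGTAC", "CAGCGTTG"))
print(sw_score_wavefront("AGGTAC", "CAGCGTTG"))
```

With one processing cell per residue of the shorter sequence, the fill takes M + N - 1 steps instead of M x N cell updates. The length of each anti-diagonal varies with the sequence lengths, which relates to the design difficulty noted in the last bullet.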
18 SIMD Solutions
- Micro Grained Array Processor (MGAP)
- A general-purpose fine-grained architecture
- Capable of solving computationally intensive problems in molecular biology efficiently and inexpensively
- Kestrel Parallel Processor
- A single-board coprocessor with a 512-element linear array of 8-bit SIMD processing elements
- Designed and built at the University of California at Santa Cruz
- Motivated by the work on the Human Genome Project and other bioinformatics applications
- Graphics Processing Units (GPUs)
- Single-chip processors (using SIMD instructions), used primarily for computing 3D functions
- Inexpensive, high-performance SIMD architecture
- A good match for bioinformatics applications, e.g. the S-W algorithm for sequence alignment
19 Comparison of the work reviewed in literature
20 Profiling Results
21 Future Work
- Identifying a common measure to compare different acceleration methods in the literature
- Designing a CCU for the maxSWCandidate function identified in the profiling results
- Implementation on FPGA
- Looking for techniques that exploit maximum parallelism in the S-W algorithm
22 Thanks