Title: SPIRE2005
1 Normalized Similarity of RNA Sequences
Rolf Backofen, Danny Hermelin, Gad M. Landau, Oren Weimann
2 RNA sequences
3 RNA sequences
[Figure: RNA secondary structure with base pairs C-G, G-C, A-U and U-A.]
4 LCS of Strings
S1 = CGUAGUACCACAGUGUGGC
S2 = AGCAGCCUCGGGCGGAAUU
Global LCS: Hirschberg 1977
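As a reminder of what the global LCS of two plain strings computes, here is a minimal sketch of the textbook quadratic dynamic program. It is an illustration only and not necessarily the algorithm cited on the slide; the function name `lcs_length` is hypothetical, and arcs are ignored.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of a longest common subsequence of a and b.

    Textbook DP in O(len(a) * len(b)) time, keeping only two rows.
    """
    prev = [0] * (len(b) + 1)          # row for the prefix of a seen so far
    for ch in a:
        cur = [0]                      # column 0: empty prefix of b
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                cur.append(prev[j - 1] + 1)             # extend a match
            else:
                cur.append(max(prev[j], cur[j - 1]))    # skip one character
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    s1 = "CGUAGUACCACAGUGUGGC"
    s2 = "AGCAGCCUCGGGCGGAAUU"
    print(lcs_length(s1, s2))
```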
5 LCS of RNA sequences
- Left arc match
- Right arc match
RNA Global LCS: Klein 1998
6 Global Similarity - LCS
7 Local Similarity - Normalized LCS
- Report the most similar substring pair according to some scoring scheme.
- In our case, we look for the substrings (with their arcs) that maximize the ratio of their LCS to their total length.
- Can be viewed as a measure of the density of the matches.
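To make the "density of matches" view concrete, here is a brute-force sketch for plain strings (arcs are ignored). It assumes the score of a substring pair is its LCS divided by the sum of the two substring lengths; the names `best_normalized_pair` and `min_lcs` are hypothetical. The `min_lcs` threshold is an assumption added here because, without it, a single matching character already reaches the maximum possible ratio of 1/2.

```python
from itertools import combinations


def lcs_length(a: str, b: str) -> int:
    """Textbook LCS length, two-row dynamic program."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, bj in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if ch == bj else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def best_normalized_pair(s1: str, s2: str, min_lcs: int = 1):
    """Try every pair of non-empty substrings and keep the densest one.

    Returns (score, substring of s1, substring of s2), where
    score = LCS / (len(a) + len(b)). Exponentially slower than the
    algorithms on the slides; only meant for very short inputs.
    """
    best = (0.0, "", "")
    for i1, j1 in combinations(range(len(s1) + 1), 2):
        a = s1[i1:j1]
        for i2, j2 in combinations(range(len(s2) + 1), 2):
            b = s2[i2:j2]
            common = lcs_length(a, b)
            if common >= min_lcs:                 # assumed threshold
                score = common / (len(a) + len(b))
                if score > best[0]:
                    best = (score, a, b)
    return best


if __name__ == "__main__":
    print(best_normalized_pair("CGUAGU", "AGCAGC", min_lcs=3))
```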
8 Local Similarity in Strings
- Local edit distance: O(nm), Smith and Waterman 1981
- Normalized LCS: O(mn log n), Arslan and Pevzner 2001
- Normalized LCS for sparse matrices: O(rL log log n), Efraty and Landau 2004
9 Our Result
- A novel local similarity metric for comparing RNA sequences.
- An algorithm for computing this metric.
- As fast as the global algorithm (in contrast to the case of strings).
10 Definitions
- A chain is a sequence of matches that is strictly increasing in rows and columns.
- The length of a chain from match (i,j) to match (i',j') is (i'-i)+(j'-j).
- A k-chain(i,j) is the shortest chain of k matches starting from (i,j).
- The normalized value of k-chain(i,j) is k divided by its length.
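The definitions above can be checked on small examples with a few helper functions. This is a sketch under the assumption that a match is a pair (row, column) and that the length of a chain is (i'-i)+(j'-j) for its first match (i,j) and last match (i',j'); all function names are hypothetical. Under this formula a single match has length 0, so the sketch treats its normalized value as undefined.

```python
from typing import List, Tuple

Match = Tuple[int, int]  # (row, column) of a matching pair of characters


def is_chain(matches: List[Match]) -> bool:
    """True if the matches strictly increase in both rows and columns."""
    return all(i1 < i2 and j1 < j2
               for (i1, j1), (i2, j2) in zip(matches, matches[1:]))


def chain_length(matches: List[Match]) -> int:
    """Length from the first match (i, j) to the last match (i', j')."""
    (i, j), (i_end, j_end) = matches[0], matches[-1]
    return (i_end - i) + (j_end - j)


def normalized_value(matches: List[Match]) -> float:
    """Number of matches divided by the chain length."""
    length = chain_length(matches)
    if length == 0:
        raise ValueError("a single match has length 0 under this definition")
    return len(matches) / length


if __name__ == "__main__":
    chain = [(0, 0), (1, 2), (3, 3)]
    print(is_chain(chain), chain_length(chain), normalized_value(chain))
```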
11 General idea
- Construct (k+1)-chain(i,j) by concatenating (i,j) to k-chain(i',j').
[Figure: match matrix of the strings "abcadecfhc" (columns) and "aggbfhecgggfdef" (rows).]
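For plain strings without arcs, the general idea can be sketched as a dynamic program that fills the match matrix from bottom right to top left, once per value of k. The sketch below is an illustration under stated assumptions, not the authors' algorithm: it uses the chain length (i'-i)+(j'-j), runs in O(nm) time per k, and the name `shortest_k_chains` is hypothetical.

```python
import math
from typing import Dict, Tuple


def shortest_k_chains(s1: str, s2: str) -> Dict[int, Tuple[int, Tuple[int, int]]]:
    """For every feasible k, return (length, start match) of a shortest k-chain.

    best[i][j] holds the smallest i'+j' at which a chain of the previous
    size can end when it starts at some match (p, q) with p >= i and q >= j.
    """
    n, m = len(s1), len(s2)
    best = [[math.inf] * (m + 1) for _ in range(n + 1)]
    result = {}
    k = 0
    while True:
        k += 1
        new_best = [[math.inf] * (m + 1) for _ in range(n + 1)]
        shortest = None
        for i in range(n - 1, -1, -1):          # bottom to top
            for j in range(m - 1, -1, -1):      # right to left
                end = math.inf
                if s1[i] == s2[j]:
                    # A 1-chain ends where it starts; a longer chain is
                    # (i, j) followed by the best smaller chain below-right.
                    end = i + j if k == 1 else best[i + 1][j + 1]
                    length = end - (i + j)
                    if end < math.inf and (shortest is None or length < shortest[0]):
                        shortest = (length, (i, j))
                new_best[i][j] = min(end, new_best[i + 1][j], new_best[i][j + 1])
        if shortest is None:                    # no chain of k matches exists
            return result
        result[k] = shortest
        best = new_best


if __name__ == "__main__":
    for k, (length, start) in shortest_k_chains("abcadecfhc", "aggbfhecgggfdef").items():
        print(k, length, start)
```

Dividing each k by the reported length (for k of at least 2) gives the normalized value of the best k-chain.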
12 Decomposing k-Chains
13 Decomposing k-Chains (non arc match)
Best (k-1)-Chain
14 Decomposing k-Chains (mismatch)
15 Decomposing k-Chains (right arc match)
Best k-Chain
16 Decomposing k-Chains (left arc match)
17 Decomposing k-Chains (left arc match I)
18 Example 2-Chain
19 Decomposing k-Chains (left arc match II)
20 Decomposing k-Chains (left arc match II)
k lcs
Best (k-lcs)-Chain
21 Decomposing k-Chains (left arc match III)
k lcs
22 Example 3-Chain
23 The Algorithm (Given R1, R2)
- Run Klein's algorithm to get the LCS of every arc in R1 with every arc in R2.
- For k = 1, 2, ..., n:
  - Construct all k-chains from bottom right to top left using DP.
- Report the best k-chain.
- Total time: as fast as global LCS.
24 The DP
25 Thank you very much for your attention