SPIRE2005 - PowerPoint PPT Presentation


Title: SPIRE2005


1


Local Alignment of RNA Sequences with Arbitrary Scoring Schemes
Rolf Backofen, Danny Hermelin, Gad M. Landau, Oren Weimann
2
RNA sequences
3
RNA sequences
[Figure: RNA secondary structure showing paired bases (C-G, C-G, G-C, A-U, U-A, C-G) and unpaired bases]
4
RNA sequences
[Figure: RNA secondary structure showing paired bases (C-G, C-G, A-U, G-C, U-A, C-G) and unpaired bases]
5
Alignment of Strings
S1: U C A C C G _ A _ G
S2: U C G C G G U A U G
Global Alignment
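The global alignment of two plain strings can be sketched with the classic dynamic program below. This is only an illustration: the unit costs (`sub`, `gap`) and the function name `global_alignment_cost` are assumptions, not taken from the slides, and arcs are ignored entirely.

```python
def global_alignment_cost(s1: str, s2: str, sub: int = 1, gap: int = 1) -> int:
    """Minimum cost of aligning all of s1 with all of s2 (plain strings, no arcs).

    The cost values are illustrative assumptions, not the talk's scoring scheme.
    """
    n, m = len(s1), len(s2)
    # cost[i][j] = cheapest alignment of the prefixes s1[:i] and s2[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * gap          # s1[:i] aligned against gaps only
    for j in range(1, m + 1):
        cost[0][j] = j * gap          # s2[:j] aligned against gaps only
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = cost[i - 1][j - 1] + (0 if s1[i - 1] == s2[j - 1] else sub)
            cost[i][j] = min(diag, cost[i - 1][j] + gap, cost[i][j - 1] + gap)
    return cost[n][m]


if __name__ == "__main__":
    print(global_alignment_cost("UCACCGAG", "UCGCGGUAUG"))
```

The alignment shown on the slide has two mismatched columns and two gap columns, so under these assumed unit costs the optimum for S1 and S2 is at most 4.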
6
Alignment of RNA sequences
A A G G C C C U G A U
A G A C C G U U
[Figure: the two sequences drawn with their arcs]
7
Alignment of RNA sequences
A A G G C C C U G A U
A G A C C G U U
[Figure: the two sequences drawn with their arcs]
8
Alignment of RNA sequences
A A G G C C C U G A U
A G A C C G U U
RNA Global Alignment via tree edit distance: [SZ 1989], [K 1998], [DMRW 2006]
Theorem: All these algorithms compute the edit distance between any two arcs, provided we match these arcs.
9
The Alignment graph
[Figure: alignment graph; horizontal sequence U C A C C G A G, vertical sequence U C G C G G U A U G]
Theorem: There is a one-to-one correspondence between all paths in the alignment graph and all alignments of substrings of R1 and R2.
10
The Alignment graph
[Figure: alignment graph; horizontal sequence U C A C C G A G, vertical sequence U C G C G G U A U G]
Theorem: There is a one-to-one correspondence between all paths in the alignment graph and all alignments of substrings of R1 and R2.
11
The Alignment graph
[Figure: alignment graph; horizontal sequence U C A C C G A G, vertical sequence U C G C G G U A U G]
12
The Alignment graph
[Figure: alignment graph; horizontal sequence U C A C C G A G, vertical sequence U C G C G G U A U G]
13
The Alignment graph
[Figure: alignment graph; horizontal sequence U C A C C G A G, vertical sequence U C G C G G U A U G]
Theorem: There is a one-to-one correspondence between all paths in the alignment graph and all alignments of substrings of R1 and R2 in which all arcs are deleted.
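For the arc-free case named in this theorem, the correspondence between paths and alignments can be sketched as follows. The conventions here are assumptions made for illustration: vertex (i, j) means "i characters of R1 and j characters of R2 consumed", the graph is a plain grid with horizontal, vertical and diagonal edges, and the function names are hypothetical.

```python
from typing import List, Tuple

Vertex = Tuple[int, int]


def path_to_alignment(r1: str, r2: str, path: List[Vertex]) -> Tuple[str, str]:
    """Turn a path in the (assumed) arc-free grid graph into a gapped alignment."""
    top, bottom = [], []
    for (i, j), (i2, j2) in zip(path, path[1:]):
        step = (i2 - i, j2 - j)
        if step == (1, 1):        # diagonal edge: R1[i] aligned with R2[j]
            top.append(r1[i])
            bottom.append(r2[j])
        elif step == (1, 0):      # R1[i] aligned with a gap
            top.append(r1[i])
            bottom.append("-")
        elif step == (0, 1):      # gap aligned with R2[j]
            top.append("-")
            bottom.append(r2[j])
        else:
            raise ValueError(f"not a grid edge: {(i, j)} -> {(i2, j2)}")
    return "".join(top), "".join(bottom)


def alignment_to_path(top: str, bottom: str, start: Vertex = (0, 0)) -> List[Vertex]:
    """Inverse direction: read the alignment column by column."""
    i, j = start
    path = [(i, j)]
    for a, b in zip(top, bottom):
        i += a != "-"             # advance in R1 unless the column has a gap there
        j += b != "-"             # advance in R2 unless the column has a gap there
        path.append((i, j))
    return path
```

Because a path may start and end at any vertex, the same code covers alignments of substrings: a path starting at (1, 0) aligns a substring of R1 that skips its first character.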
14
The Alignment graph
[Figure: alignment graph; horizontal sequence U C A C C G A G, vertical sequence U C G C G G U A U G]
15
The Alignment graph
[Figure: alignment graph; horizontal sequence U C A C C G A G, vertical sequence U C G C G G U A U G]
Theorem: There is a one-to-one correspondence between HEAVIEST paths in the alignment graph and OPTIMAL alignments of substrings of R1 and R2.
16
The Local Alignment algorithms
  • We use the alignment graph to compute the local
    similarity between two RNA sequences according to
    two well-known metrics:
  • Smith-Waterman: the highest-scoring alignment
    between any pair of substrings of the input RNAs.
  • Its normalized version.

17
Standard Local Similarity (Smith-Waterman)
[Figure: sequences U C A C C G A G (horizontal) and U C G C G G U A U G (vertical); axes labeled n and m]
  • The score is computed via a dynamic program:
  • Score(i,j) = max { Score(i',j') + weight of the incoming edge from (i',j'), 0 }, taken over all incoming edges
Time complexity O(mn)
one run of a global algorithm
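The recurrence on this slide can be sketched for the simplest case. The code below is an assumption-laden illustration, not the talk's algorithm: it ignores arcs, so every vertex has only the three grid edges (diagonal, horizontal, vertical), and the weights `match`, `mismatch` and `gap` as well as the function name are hypothetical.

```python
def local_similarity(r1: str, r2: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1) -> int:
    """Weight of the heaviest path in an arc-free alignment grid (Smith-Waterman)."""
    n, m = len(r1), len(r2)
    # score[i][j] = heaviest path ending at vertex (i, j); 0 means "start here"
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            w_diag = match if r1[i - 1] == r2[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + w_diag,  # incoming diagonal edge
                score[i - 1][j] + gap,         # incoming edge that skips r1[i-1]
                score[i][j - 1] + gap,         # incoming edge that skips r2[j-1]
                0,                             # empty path: restart at (i, j)
            )
            best = max(best, score[i][j])
    return best


if __name__ == "__main__":
    print(local_similarity("UCACCGAG", "UCGCGGUAUG"))
```

The table has (n+1)(m+1) cells and each cell looks at a constant number of edges, which matches the O(mn) term on the slide for this arc-free simplification.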
18
Normalized Local Similarity
  • The weakness of the Smith-Waterman approach [AP
    2001].
  • Solution: look for the substrings (with
    their arcs) that maximize
  • and some given value.

19
Normalized Local Similarity
[Figure: sequences U C A C C G A G (horizontal) and U C G C G G U A U G (vertical)]
  • Again, a dynamic program.
  • Define Length(k,i,j) to be the length of the
    shortest path that ends at vertex (i,j) and has
    weight equal to k.
  • The best k/Length(k,i,j) over all i,j,k is the
    normalized score.
20
Normalized Local Similarity
  • Again, a dynamic program.
  • Define Length(k,i,j) to be the length of the
    shortest path that ends at vertex (i,j) and has
    weight equal to k.
For every k,i,j compute Length(k,i,j) = min { Length(k,i,j), Length(k-w,i',j') + (j-j' + i-i') }, where w is the weight of the incoming edge from (i',j').
Time complexity
one run of a global algorithm
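A minimal sketch of the Length(k,i,j) table, under several assumptions that are not in the slides: arcs are ignored (plain grid edges only), all edge weights are integers, the default weights are arbitrary, and the "given value" mentioned earlier in the talk is left out, so the function simply returns the best k/Length(k,i,j). The function name is hypothetical.

```python
from fractions import Fraction


def normalized_local_similarity(r1: str, r2: str, match: int = 2,
                                mismatch: int = -1, gap: int = -1) -> Fraction:
    """Best k / Length(k, i, j) over all k > 0 in an arc-free alignment grid."""
    n, m = len(r1), len(r2)
    # length[i][j][k] = length of the shortest path of weight k ending at (i, j).
    # Every vertex starts with the empty path: weight 0, length 0.
    length = [[{0: 0} for _ in range(m + 1)] for _ in range(n + 1)]
    best = Fraction(0)
    for i in range(n + 1):
        for j in range(m + 1):
            incoming = []
            if i > 0:
                incoming.append((i - 1, j, gap))
            if j > 0:
                incoming.append((i, j - 1, gap))
            if i > 0 and j > 0:
                w = match if r1[i - 1] == r2[j - 1] else mismatch
                incoming.append((i - 1, j - 1, w))
            cell = length[i][j]
            for pi, pj, w in incoming:
                step = (i - pi) + (j - pj)   # edge length: 1 for a gap, 2 for a diagonal
                for k_prev, len_prev in length[pi][pj].items():
                    k, cand = k_prev + w, len_prev + step
                    if cand < cell.get(k, cand + 1):
                        cell[k] = cand       # shorter path of weight k found
            for k, ln in cell.items():
                if k > 0 and ln > 0:
                    best = max(best, Fraction(k, ln))
    return best
```

Rows are processed top to bottom and each row left to right, so every predecessor table is complete before it is read. Using a dictionary per vertex avoids fixing the range of k in advance; `Fraction` keeps the ratios exact.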
21
Open Problems
[Figure: sequences U C A C C G A G (horizontal) and U C G C G G U A U G (vertical)]
  • Arc deletion
  • Improve global tree edit distance
22
Thank you very much for your attention