Sequence comparison: More dynamic programming

1
Sequence comparison: More dynamic programming
  • Genome 559: Introduction to Statistical and
    Computational Genomics
  • Prof. William Stafford Noble

2
One-minute responses
  • WAY TOO FAST. Please walk around more during
    sample problems. I was completely lost.
  • Today I felt a bit lost. Most times I was still
    trying to figure out one slide or problem, while
    the class was on the next one.
  • It was fast today, but after the reading I was
    prepared to take things more quickly and I
    understood things much better today.
  • I enjoyed class today. I thought it moved at a
    great pace.
  • I thought the pace was good today.
  • I liked the pace of the lecture even though you
    said we spent too much time on the dynamic
    programming, it gave me time to understand.
  • The pace is great and gives me time to explore.
  • I thought this lecture built nicely on the last
    lecture. I struggled last class but it clicked
    today.

3
One-minute responses
  • The matrix exercise was very helpful, even though
    I'm not fully clear on how it works yet.
  • I found today's class time much more
    understandable.
  • I struggled a little bit to grasp the matrix, but
    by the end I had it. The pace and numerous
    examples helped.
  • The DP matrix was simple to grasp after computing
    one or two matrix values, so the portion of the
    lecture could go faster.
  • I like the sample problems.
  • Dynamic programming reminded me of sudoku, which
    was fun.
  • Going through the alignment table helped a lot.
  • It was nice to do examples with DNA sequences.
  • I'm feeling a lot better about it all. I really
    like going through examples.
  • Again, the small steps with programming problems
    helped, although the first problem was overly
    challenging (when explained in a different way it
    was fine).
  • I was a little confused when writing the program.
    I think more practice is required. The practice
    problems will help.
  • Today's class was much better since we had
    appropriate reading first. The sample problems
    were interesting since they actually relate to
    biology.

4
One-minute responses
  • Is there a place to get more samples of simple
    code to use to help see patterns of how this
    works? Or is there plenty in the book?
  • There are lots of examples in the book. And of
    course, you can easily find lots of examples on
    the web. For a reference book with examples, try
    Python Cookbook, by Martelli, Ravenscroft and
    Ascher.
  • I'm a little fuzzy about how dynamic programming
    differs from other sorts of programming, but
    everything else was really clear.
  • The term "dynamic programming" predates
    computers. There is no relationship between this
    use of the word "programming" and what we are
    learning to do in Python.

5
DP matrix
          G    A    A    T    C
     0   -4   -8  -12  -16  -20
C   -4   -5   -9  -13  -12   -6
A   -8   -4    5    1   -3   -7
T  -12   -8    1    0   11    7
A  -16  -12    2   11    7    6
C  -20  -16   -2    7   11   17

6
Three legal moves
  • A diagonal move aligns a character from the left
    sequence with a character from the top sequence.
  • A vertical move introduces a gap in the sequence
    along the top edge.
  • A horizontal move introduces a gap in the
    sequence along the left edge.
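
One way to get a feel for the three moves is to replay a list of moves from the upper left corner and watch the alignment grow. The following is only an illustrative sketch: the function name `apply_moves` and the move labels "diag", "vert" and "horiz" are assumptions made for this example, not names from the course.

```python
def apply_moves(top, left, moves):
    """Build an alignment by replaying moves from the upper left corner.

    top   -- sequence written along the top edge of the matrix
    left  -- sequence written along the left edge of the matrix
    moves -- list of "diag", "vert" or "horiz" (hypothetical labels)
    """
    top_aln, left_aln = [], []
    i = j = 0  # i counts characters used from left, j from top
    for move in moves:
        if move == "diag":
            # Align one character from each sequence.
            top_aln.append(top[j])
            left_aln.append(left[i])
            i += 1
            j += 1
        elif move == "vert":
            # Moving down: gap in the sequence along the top edge.
            top_aln.append("-")
            left_aln.append(left[i])
            i += 1
        elif move == "horiz":
            # Moving right: gap in the sequence along the left edge.
            top_aln.append(top[j])
            left_aln.append("-")
            j += 1
        else:
            raise ValueError("unknown move: " + move)
    return "".join(top_aln), "".join(left_aln)


print(apply_moves("GAATC", "CATAC",
                  ["diag", "diag", "vert", "diag", "horiz", "diag"]))
# prints ('GA-ATC', 'CATA-C')
```

Each move adds exactly one column to the alignment, which is why a path through the matrix and an alignment are two views of the same thing.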

7
DP matrix
GA-ATC
CATA-C

          G    A    A    T    C
     0   -4   -8  -12  -16  -20
C   -4   -5   -9  -13  -12   -6
A   -8   -4    5    1   -3   -7
T  -12   -8    1    0   11    7
A  -16  -12    2   11    7    6
C  -20  -16   -2    7   11   17

8
DP matrix
GAAT-C
CA-TAC

          G    A    A    T    C
     0   -4   -8  -12  -16  -20
C   -4   -5   -9  -13  -12   -6
A   -8   -4    5    1   -3   -7
T  -12   -8    1    0   11    7
A  -16  -12    2   11    7    6
C  -20  -16   -2    7   11   17

9
DP matrix
GAAT-C
C-ATAC

          G    A    A    T    C
     0   -4   -8  -12  -16  -20
C   -4   -5   -9  -13  -12   -6
A   -8   -4    5    1   -3   -7
T  -12   -8    1    0   11    7
A  -16  -12    2   11    7    6
C  -20  -16   -2    7   11   17

10
DP matrix
GAAT-C
-CATAC

          G    A    A    T    C
     0   -4   -8  -12  -16  -20
C   -4   -5   -9  -13  -12   -6
A   -8   -4    5    1   -3   -7
T  -12   -8    1    0   11    7
A  -16  -12    2   11    7    6
C  -20  -16   -2    7   11   17

11
Multiple solutions
  • When a program returns a sequence alignment, it
    may not be the only best alignment.

GA-ATC
CATA-C

GAAT-C
CA-TAC

GAAT-C
C-ATAC

GAAT-C
-CATAC
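
A program can report every best alignment by following all moves that could have produced each score, rather than just one. The sketch below is a hedged illustration, not the course's code: the names `fill`, `all_alignments` and `S` are hypothetical, and it uses the substitution scores and gap penalty of -5 from the AAG/AGC example elsewhere in these slides, because the scores behind the GAATC/CATAC matrix are not stated.

```python
# Substitution scores from the AAG/AGC example; S[a][b] is the score for a vs b.
S = {
    "A": {"A": 2, "C": -7, "G": -5, "T": -7},
    "C": {"A": -7, "C": 2, "G": -7, "T": -5},
    "G": {"A": -5, "C": -7, "G": 2, "T": -7},
    "T": {"A": -7, "C": -5, "G": -7, "T": 2},
}


def fill(top, left, s, d):
    """Fill the DP matrix; rows follow `left`, columns follow `top`."""
    F = [[0] * (len(top) + 1) for _ in range(len(left) + 1)]
    for j in range(1, len(top) + 1):
        F[0][j] = F[0][j - 1] + d
    for i in range(1, len(left) + 1):
        F[i][0] = F[i - 1][0] + d
        for j in range(1, len(top) + 1):
            F[i][j] = max(F[i - 1][j - 1] + s[left[i - 1]][top[j - 1]],
                          F[i - 1][j] + d,
                          F[i][j - 1] + d)
    return F


def all_alignments(top, left, s, d):
    """Return every optimal alignment as a (top, left) pair of strings."""
    F = fill(top, left, s, d)
    results = []

    def walk(i, j, top_aln, left_aln):
        if i == 0 and j == 0:
            results.append((top_aln, left_aln))
            return
        # Follow every move that could have produced F[i][j].
        if i > 0 and j > 0 and \
                F[i][j] == F[i - 1][j - 1] + s[left[i - 1]][top[j - 1]]:
            walk(i - 1, j - 1, top[j - 1] + top_aln, left[i - 1] + left_aln)
        if i > 0 and F[i][j] == F[i - 1][j] + d:
            walk(i - 1, j, "-" + top_aln, left[i - 1] + left_aln)
        if j > 0 and F[i][j] == F[i][j - 1] + d:
            walk(i, j - 1, top[j - 1] + top_aln, "-" + left_aln)

    walk(len(left), len(top), "", "")
    return results


for top_aln, left_aln in all_alignments("AAG", "AGC", S, -5):
    print(top_aln)
    print(left_aln)
    print()
```

For AAG and AGC this finds two equally good alignments. The number of optimal alignments can grow quickly with sequence length, so real tools usually report just one.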
12
DP in equation form
  • Align sequences x and y.
  • F is the DP matrix, s is the substitution matrix,
    and d is the linear gap penalty.

13
DP in equation form
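
Written out, the recurrence takes the standard form for global alignment with a linear gap penalty. This is a sketch under two assumptions: d is a negative number added once per gap (as in the example with a gap penalty of -5), and i indexes the rows of F while j indexes the columns, with x_i and y_j the corresponding characters.

```latex
\begin{aligned}
F(0,0) &= 0, \qquad F(i,0) = i\,d, \qquad F(0,j) = j\,d, \\
F(i,j) &= \max
\begin{cases}
F(i-1,\,j-1) + s(x_i, y_j) & \text{(diagonal move)} \\
F(i-1,\,j) + d & \text{(vertical move)} \\
F(i,\,j-1) + d & \text{(horizontal move)}
\end{cases}
\end{aligned}
```

Each of the three cases corresponds to one of the three legal moves into cell (i, j).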
14
A simple example
Find the optimal alignment of AAG and AGC. Use a
gap penalty of d = -5.

     A    C    G    T
A    2   -7   -5   -7
C   -7    2   -7   -5
G   -5   -7    2   -7
T   -7   -5   -7    2

          A    A    G

A
G
C

15
A simple example
Find the optimal alignment of AAG and AGC. Use a
gap penalty of d = -5.

     A    C    G    T
A    2   -7   -5   -7
C   -7    2   -7   -5
G   -5   -7    2   -7
T   -7   -5   -7    2

          A    A    G
     0
A
G
C

16
A simple example
Find the optimal alignment of AAG and AGC. Use a
gap penalty of d = -5.

     A    C    G    T
A    2   -7   -5   -7
C   -7    2   -7   -5
G   -5   -7    2   -7
T   -7   -5   -7    2

          A    A    G
     0   -5  -10  -15
A   -5
G  -10
C  -15

17
A simple example
Find the optimal alignment of AAG and AGC. Use a
gap penalty of d = -5.

     A    C    G    T
A    2   -7   -5   -7
C   -7    2   -7   -5
G   -5   -7    2   -7
T   -7   -5   -7    2

          A    A    G
     0   -5  -10  -15
A   -5    2   -3   -8
G  -10   -3   -3   -1
C  -15   -8   -8   -6
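
The filled matrix can be reproduced in a few lines of Python. This is a minimal sketch rather than the course's own code; `fill_matrix` and `S` are hypothetical names, and the scores are copied from the substitution matrix on this slide.

```python
# Substitution scores from the slide; S[a][b] is the score for a vs b.
S = {
    "A": {"A": 2, "C": -7, "G": -5, "T": -7},
    "C": {"A": -7, "C": 2, "G": -7, "T": -5},
    "G": {"A": -5, "C": -7, "G": 2, "T": -7},
    "T": {"A": -7, "C": -5, "G": -7, "T": 2},
}


def fill_matrix(top, left, s, d):
    """Return the DP matrix as a list of rows.

    Rows correspond to `left` and columns to `top`; d is the (negative)
    score added for each gap.
    """
    n_rows, n_cols = len(left) + 1, len(top) + 1
    F = [[0] * n_cols for _ in range(n_rows)]
    # First row and first column: nothing but gaps.
    for j in range(1, n_cols):
        F[0][j] = F[0][j - 1] + d
    for i in range(1, n_rows):
        F[i][0] = F[i - 1][0] + d
    # Every other cell is the best of the three legal moves into it.
    for i in range(1, n_rows):
        for j in range(1, n_cols):
            diagonal = F[i - 1][j - 1] + s[left[i - 1]][top[j - 1]]
            vertical = F[i - 1][j] + d
            horizontal = F[i][j - 1] + d
            F[i][j] = max(diagonal, vertical, horizontal)
    return F


for row in fill_matrix("AAG", "AGC", S, -5):
    print(row)
```

Running it prints the four rows of the matrix, ending with [-15, -8, -8, -6], so the score of the optimal alignment is -6.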
18
Traceback
  • Start from the lower right corner and trace back
    to the upper left.
  • Each arrow introduces one character at the end of
    each aligned sequence.
  • A horizontal move puts a gap in the left
    sequence.
  • A vertical move puts a gap in the top sequence.
  • A diagonal move uses one character from each
    sequence.
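
The traceback rules can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `traceback` is hypothetical, the matrix `F` is typed in by hand from the AAG/AGC example, and when several moves tie the sketch simply prefers diagonal, then vertical, then horizontal, so it returns only one of the optimal alignments.

```python
S = {
    "A": {"A": 2, "C": -7, "G": -5, "T": -7},
    "C": {"A": -7, "C": 2, "G": -7, "T": -5},
    "G": {"A": -5, "C": -7, "G": 2, "T": -7},
    "T": {"A": -7, "C": -5, "G": -7, "T": 2},
}

# Filled DP matrix for top = "AAG", left = "AGC", gap penalty d = -5.
F = [
    [0, -5, -10, -15],
    [-5, 2, -3, -8],
    [-10, -3, -3, -1],
    [-15, -8, -8, -6],
]


def traceback(F, top, left, s, d):
    """Walk from the lower right corner to the upper left corner."""
    i, j = len(left), len(top)
    top_aln, left_aln = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and \
                F[i][j] == F[i - 1][j - 1] + s[left[i - 1]][top[j - 1]]:
            # Diagonal move: one character from each sequence.
            top_aln.append(top[j - 1])
            left_aln.append(left[i - 1])
            i -= 1
            j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + d:
            # Vertical move: gap in the top sequence.
            top_aln.append("-")
            left_aln.append(left[i - 1])
            i -= 1
        else:
            # Horizontal move: gap in the left sequence.
            top_aln.append(top[j - 1])
            left_aln.append("-")
            j -= 1
    # Columns were collected from the end of the alignment, so reverse them.
    return "".join(reversed(top_aln)), "".join(reversed(left_aln))


print(traceback(F, "AAG", "AGC", S, -5))
# prints ('AAG-', '-AGC')
```

Changing the order in which ties are checked would return a different, equally good alignment.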

19
A simple example
Find the optimal alignment of AAG and AGC. Use a
gap penalty of d = -5.
  • Start from the lower right corner and trace back
    to the upper left.
  • Each arrow introduces one character at the end of
    each aligned sequence.
  • A horizontal move puts a gap in the left
    sequence.
  • A vertical move puts a gap in the top sequence.
  • A diagonal move uses one character from each
    sequence.

          A    A    G
     0   -5
A         2   -3
G                   -1
C                   -6

20
A simple example
Find the optimal alignment of AAG and AGC. Use a
gap penalty of d = -5.
  • Start from the lower right corner and trace back
    to the upper left.
  • Each arrow introduces one character at the end of
    each aligned sequence.
  • A horizontal move puts a gap in the left
    sequence.
  • A vertical move puts a gap in the top sequence.
  • A diagonal move uses one character from each
    sequence.

          A    A    G
     0   -5
A         2   -3
G                   -1
C                   -6

AAG-
-AGC

AAG-
A-GC
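
A quick way to check a traceback is to score the resulting alignment column by column and compare the total with the lower right corner of the matrix. The sketch below assumes the substitution matrix and gap penalty from this example; `score_alignment` and `S` are hypothetical names.

```python
S = {
    "A": {"A": 2, "C": -7, "G": -5, "T": -7},
    "C": {"A": -7, "C": 2, "G": -7, "T": -5},
    "G": {"A": -5, "C": -7, "G": 2, "T": -7},
    "T": {"A": -7, "C": -5, "G": -7, "T": 2},
}


def score_alignment(top_aln, left_aln, s, d):
    """Sum the score of each column of a gapped alignment."""
    if len(top_aln) != len(left_aln):
        raise ValueError("aligned sequences must have the same length")
    total = 0
    for a, b in zip(top_aln, left_aln):
        if a == "-" or b == "-":
            total += d          # a gap in either sequence costs d
        else:
            total += s[a][b]    # otherwise look up the substitution score
    return total


print(score_alignment("AAG-", "-AGC", S, -5))  # prints -6
print(score_alignment("AAG-", "A-GC", S, -5))  # prints -6
```

Both alignments score -6, the value in the lower right corner, which confirms that they are equally good.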
21
Traceback problem 1
          G    A    A    T    C
     0   -4   -8  -12  -16  -20
C   -4   -5   -9  -13  -12   -6
A   -8   -4    5    1   -3   -7
T  -12   -8    1    0   11    7
A  -16  -12    2   11    7    6
C  -20  -16   -2    7   11   17

Write down the alignment corresponding to the
circled score (the 5 in the first A row and first
A column).

22
Solution 1
GA
CA

          G    A    A    T    C
     0   -4   -8  -12  -16  -20
C   -4   -5   -9  -13  -12   -6
A   -8   -4    5    1   -3   -7
T  -12   -8    1    0   11    7
A  -16  -12    2   11    7    6
C  -20  -16   -2    7   11   17

Write down the alignment corresponding to the
circled score (the 5 in the first A row and first
A column).

23
Traceback problem 2
          G    A    A    T    C
     0   -4   -8  -12  -16  -20
C   -4   -5   -9  -13  -12   -6
A   -8   -4    5    1   -3   -7
T  -12   -8    1    0   11    7
A  -16  -12    2   11    7    6
C  -20  -16   -2    7   11   17

Write down three alignments corresponding to the
circled score (the -7 in the first A row and the
C column).

24
Solution 2
GAATC
CA---

          G    A    A    T    C
     0   -4   -8  -12  -16  -20
C   -4   -5   -9  -13  -12   -6
A   -8   -4    5    1   -3   -7
T  -12   -8    1    0   11    7
A  -16  -12    2   11    7    6
C  -20  -16   -2    7   11   17

Write down three alignments corresponding to the
circled score (the -7 in the first A row and the
C column).

25
Solution 2
GAATC
CA---

GAATC
C-A--

          G    A    A    T    C
     0   -4   -8  -12  -16  -20
C   -4   -5   -9  -13  -12   -6
A   -8   -4    5    1   -3   -7
T  -12   -8    1    0   11    7
A  -16  -12    2   11    7    6
C  -20  -16   -2    7   11   17

Write down three alignments corresponding to the
circled score (the -7 in the first A row and the
C column).

26
Solution 2
GAATC
CA---

GAATC
C-A--

GAATC
-CA--

          G    A    A    T    C
     0   -4   -8  -12  -16  -20
C   -4   -5   -9  -13  -12   -6
A   -8   -4    5    1   -3   -7
T  -12   -8    1    0   11    7
A  -16  -12    2   11    7    6
C  -20  -16   -2    7   11   17

Write down three alignments corresponding to the
circled score (the -7 in the first A row and the
C column).