Title: Lecture I Sequence Comparison
1Lecture I Sequence Comparison
Bioinformatics: Where Mathematics Meets Molecules
- MAA North Central Section Summer Seminar
2Dot Plot A Visual Alignment
3Dot Plot Sliding Windows
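A minimal sketch of a sliding-window dot plot, for illustration only. The function names `dot_plot` and `render`, and the `window` and `threshold` parameters, are assumptions made for this example and are not taken from the slides. A cell is marked when the two windows starting at that position agree in at least `threshold` places.

```python
def dot_plot(a, b, window=1, threshold=None):
    """Return a boolean matrix: cell (i, j) is True when the windows
    starting at a[i] and b[j] agree in at least `threshold` positions."""
    if threshold is None:
        threshold = window  # by default every position in the window must match
    rows = len(a) - window + 1
    cols = len(b) - window + 1
    plot = []
    for i in range(rows):
        row = []
        for j in range(cols):
            matches = sum(a[i + k] == b[j + k] for k in range(window))
            row.append(matches >= threshold)
        plot.append(row)
    return plot


def render(plot):
    """Draw the matrix as text, one '*' per dot."""
    return "\n".join("".join("*" if cell else "." for cell in row) for row in plot)


if __name__ == "__main__":
    print(render(dot_plot("ATGGCCTC", "ACGGCTC", window=2)))
```

A larger window with a threshold below the window size tolerates a few mismatches while still suppressing isolated chance matches, which is the usual motivation for sliding windows.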
4DotPlots on the Web
- DOTLET: http://www.isrec.isb-sib.ch/java/dotlet/Dotlet.html
- DNA DOT: http://arbl.cvmbs.colostate.edu/molkit/dnadot/
- DOTTER: http://www.cgb.ki.se/cgb/groups/sonnhammer/Dotter.html
8Edit Distance
- Linguistically constrained edit distance 5
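As a point of reference, here is a sketch of the plain (unconstrained) edit distance, where each insertion, deletion, or substitution costs 1. It does not model the linguistic constraint mentioned on the slide; the function name `edit_distance` and the unit costs are assumptions for this example.

```python
def edit_distance(a, b):
    """Minimum number of single-character insertions, deletions, and
    substitutions needed to turn string a into string b."""
    # prev[j] holds the distance between the current prefix of a and b[:j]
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if ca == cb else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("TREE", "REED"))
```

Only two rows of the table are kept at a time, since each cell depends only on the current and previous rows.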
9A Linguistic Tree
10Alignment With Gaps
TREE-    TREE    TRE-E    T-REE
-REED    REED    -REED    REE-D
Indel (insertion or deletion): biologists' term for a gap
11Global Sequence Alignment
- Use gaps to help align matching characters
- Compute similarity between words A and B
12Alignment Scoring
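TREE-    TREE    TRE-E    T-REE
-REED    REED    -REED    REE-D
Column counts for each alignment:
- TREE- / -REED: 3 matches, 2 gaps
- TREE / REED: 1 match, 3 mismatches
- TRE-E / -REED: 2 matches, 1 mismatch, 2 gaps
- T-REE / REE-D: 3 mismatches, 2 gaps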
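The column counts behind an alignment score can be tallied mechanically. The sketch below counts matches, mismatches, and gap columns, then combines them with a linear score. The weights (match +1, mismatch -1, gap -2) are assumptions chosen for illustration and are not the values used on the slide.

```python
def count_columns(top, bottom, gap="-"):
    """Count (matches, mismatches, gaps) over the columns of an alignment."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    matches = mismatches = gaps = 0
    for x, y in zip(top, bottom):
        if x == gap or y == gap:
            gaps += 1
        elif x == y:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, gaps


def alignment_score(top, bottom, match=1, mismatch=-1, gap_penalty=-2):
    """Linear score; the default weights are illustrative assumptions."""
    m, x, g = count_columns(top, bottom)
    return m * match + x * mismatch + g * gap_penalty


if __name__ == "__main__":
    for top, bottom in [("TREE-", "-REED"), ("TREE", "REED"),
                        ("TRE-E", "-REED"), ("T-REE", "REE-D")]:
        print(top, bottom, count_columns(top, bottom), alignment_score(top, bottom))
```

Under these assumed weights the alignment TREE- / -REED scores highest of the four, but a different choice of weights can change the ranking.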
13Alignment Paths
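Alignment of two sequences = path through matrix
[Figure: matrix with the sequences ATGGCCTC and ACGGCTC along its axes; the alignment below is drawn as a path through the matrix.]
A-TGGCCTC
ACGG-C-TC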
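The correspondence between an alignment and a matrix path can be made explicit: each column moves one step down, right, or diagonally. A small sketch, assuming the first sequence indexes rows and the second indexes columns (the function name `alignment_to_path` is hypothetical):

```python
def alignment_to_path(top, bottom, gap="-"):
    """Convert a gapped alignment into the list of matrix cells (i, j)
    it visits, starting from (0, 0)."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    i = j = 0
    path = [(0, 0)]
    for x, y in zip(top, bottom):
        if x == gap and y == gap:
            raise ValueError("a column cannot contain two gaps")
        if x != gap:
            i += 1  # consumed a character of the first sequence
        if y != gap:
            j += 1  # consumed a character of the second sequence
        path.append((i, j))
    return path


if __name__ == "__main__":
    print(alignment_to_path("A-TGGCCTC", "ACGG-C-TC"))
```

The path ends at the cell (8, 7), the lengths of the two ungapped sequences, and every gap column shows up as a purely horizontal or vertical step.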
14Dynamic Programming
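- Bellman's principle: an optimal path $(i_0,j_0) \to (i_m,j_n)$ that passes through $(i,j)$ is composed of optimal subpaths $(i_0,j_0) \to (i,j)$ and $(i,j) \to (i_m,j_n)$
- Recursive computation of cost function (Needleman-Wunsch):
$$D_{i,j} = \max\{\, D_{i-1,j} + d(A_i, -),\; D_{i,j-1} + d(-, B_j),\; D_{i-1,j-1} + d(A_i, B_j) \,\}$$
[Figure: cell (i, j) is computed from its neighbors (i-1, j), (i, j-1), and (i-1, j-1).]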
15Example
0
Initialization D1,j ? j Di,1 ? i
? -3 ? -4
Optimal alignment:
ATGGCCTC
ACGGC-TC
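The Needleman-Wunsch recursion and its initialization can be sketched as follows. This is an illustrative implementation, not the lecture's own: the scoring values (match +1, mismatch -1, gap -2) are assumptions rather than the parameters of the slide's example, and the matrix is indexed from 0 rather than 1, so row 0 and column 0 hold the gap-only prefixes.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment. Returns (score, aligned_a, aligned_b).
    Scoring values are illustrative assumptions."""
    m, n = len(a), len(b)
    D = [[0] * (n + 1) for _ in range(m + 1)]
    # Initialization: a prefix aligned against nothing costs one gap per character.
    for i in range(1, m + 1):
        D[i][0] = i * gap
    for j in range(1, n + 1):
        D[0][j] = j * gap
    # Fill: each cell is the best of a gap in b, a gap in a, or a (mis)match.
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            D[i][j] = max(D[i - 1][j] + gap, D[i][j - 1] + gap, D[i - 1][j - 1] + s)
    # Traceback from the bottom-right corner to the origin.
    top, bottom = [], []
    i, j = m, n
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and D[i][j] == D[i - 1][j - 1] + s:
            top.append(a[i - 1]); bottom.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and D[i][j] == D[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return D[m][n], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    print(needleman_wunsch("ATGGCCTC", "ACGGCTC"))
```

When several alignments tie for the best score, the traceback above prefers diagonal moves, so it returns one optimal alignment rather than all of them.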
16Local Sequence Alignment
Modified recursion (Smith-Waterman)
Initialization: $D_{1,j} = D_{i,1} = 0$
$$D_{i,j} = \max\{\, D_{i-1,j} + d(A_i, -),\; D_{i,j-1} + d(-, B_j),\; D_{i-1,j-1} + d(A_i, B_j),\; 0 \,\}$$
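A sketch of the Smith-Waterman variant: the extra 0 in the maximum lets an alignment restart anywhere, and the traceback begins at the highest-scoring cell and stops at the first 0. As an assumption for illustration, the scoring values are match +1, mismatch -1, gap -2, and the matrix is indexed from 0.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-2):
    """Local alignment. Returns (best_score, aligned_a, aligned_b).
    Scoring values are illustrative assumptions."""
    m, n = len(a), len(b)
    D = [[0] * (n + 1) for _ in range(m + 1)]  # first row and column stay 0
    best, best_pos = 0, (0, 0)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            D[i][j] = max(0, D[i - 1][j] + gap, D[i][j - 1] + gap, D[i - 1][j - 1] + s)
            if D[i][j] > best:
                best, best_pos = D[i][j], (i, j)
    # Traceback from the best cell until the score drops to 0.
    top, bottom = [], []
    i, j = best_pos
    while i > 0 and j > 0 and D[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if D[i][j] == D[i - 1][j - 1] + s:
            top.append(a[i - 1]); bottom.append(b[j - 1])
            i -= 1; j -= 1
        elif D[i][j] == D[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-")
            i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    print(smith_waterman("TREE", "REED"))
```

For TREE and REED the best local alignment under these assumed scores is the shared substring REE, in contrast to the gapped global alignment of the full words.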
17Scoring Protein Alignments
18Multiple Sequence Alignment
Image from http://bioinformatics.ubc.ca/resources/tutorials/?ubicdocument_id26
- Exact: higher-dimensional DP
- Heuristic: greedy (progressive) algorithm
19BLAST
- Basic Local Alignment Search Tool
- Local alignments of a query sequence against the entire NCBI database
- Nucleotide and protein combinations
- Heuristic algorithm
- High-Scoring Segment Pairs (HSP)
- Word size parameter
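To give a feel for how a word size parameter drives a heuristic search, here is a toy seed-and-extend sketch. It is a deliberate simplification and not BLAST's actual algorithm: it seeds on exact words only and extends only while characters match exactly, whereas BLAST scores extensions with a substitution matrix and a drop-off rule. The function names `word_hits` and `extend_hit` are hypothetical.

```python
from collections import defaultdict


def word_hits(query, subject, word_size=3):
    """Find all (query_pos, subject_pos) pairs where an exact word of
    length word_size occurs in both sequences."""
    index = defaultdict(list)
    for i in range(len(query) - word_size + 1):
        index[query[i:i + word_size]].append(i)
    hits = []
    for j in range(len(subject) - word_size + 1):
        for i in index.get(subject[j:j + word_size], []):
            hits.append((i, j))
    return hits


def extend_hit(query, subject, i, j, word_size=3):
    """Extend a word hit in both directions without gaps, stopping at the
    first mismatch (toy rule). Returns the matched segment."""
    start_i, start_j = i, j
    while start_i > 0 and start_j > 0 and query[start_i - 1] == subject[start_j - 1]:
        start_i -= 1
        start_j -= 1
    end_i, end_j = i + word_size, j + word_size
    while end_i < len(query) and end_j < len(subject) and query[end_i] == subject[end_j]:
        end_i += 1
        end_j += 1
    return query[start_i:end_i]


if __name__ == "__main__":
    q, s = "ATGGCCTC", "ACGGCTC"
    for i, j in word_hits(q, s):
        print((i, j), extend_hit(q, s, i, j))
```

A smaller word size finds more seeds and is more sensitive but slower; a larger one is faster but can miss weak similarities.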
20Matching Sequence Summary
21Alignment Scores
22BLASTn Alignment
23BLASTp and bl2seq
24BLAST Statistics
- http://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html
- Extreme Value Distribution
- Raw score -> bit score -> E-value
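The chain from raw score to bit score to E-value can be sketched with the standard Karlin-Altschul formulas, S' = (λS - ln K) / ln 2 and E = K m n e^(-λS) = m n 2^(-S'). The values of λ and K below are placeholders chosen for illustration; in practice they depend on the scoring system, and m and n stand for the query length and the database length.

```python
import math


def bit_score(raw_score, lam, K):
    """Normalize a raw score S to bits: S' = (lam * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(K)) / math.log(2)


def e_value(raw_score, m, n, lam, K):
    """Expected number of chance hits scoring at least S: E = K m n exp(-lam S)."""
    return K * m * n * math.exp(-lam * raw_score)


def e_value_from_bits(bits, m, n):
    """Same quantity computed from the bit score: E = m n 2^(-S')."""
    return m * n * 2.0 ** (-bits)


if __name__ == "__main__":
    lam, K = 0.267, 0.041      # placeholder parameters (assumption)
    m, n = 250, 50_000_000     # hypothetical query and database lengths
    S = 100                    # hypothetical raw score
    bits = bit_score(S, lam, K)
    print(bits, e_value(S, m, n, lam, K), e_value_from_bits(bits, m, n))
```

Because the bit score absorbs λ and K, E-values computed from bit scores are comparable across different scoring systems.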