Pairwise Alignments Part 1

Pairwise Alignments Part 1, Biology 224, Instructor: Tom Peavy, Sept 8

1
Pairwise Alignments, Part 1
  • Biology 224
  • Instructor: Tom Peavy
  • Sept 8

<PowerPoint slides based on Bioinformatics and
Functional Genomics by Jonathan Pevsner>
2
Pairwise alignments in the 1950s
β-corticotropin (sheep): ala gly glu asp asp glu
Corticotropin A (pig):   asp gly ala glu asp glu
Oxytocin:    CYIQNCPLG
Vasopressin: CYFQNCPRG
Early alignments revealed
--differences in amino acid sequences between species
--differences in amino acids responsible for distinct functions
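As a minimal sketch of what such an early comparison amounts to, the two nine-residue hormones on this slide can be compared position by position in Python. The function name `differences` is illustrative and not from the slides. No gaps are needed because both peptides have the same length.

```python
# Peptide sequences as given on the slide.
oxytocin = "CYIQNCPLG"
vasopressin = "CYFQNCPRG"


def differences(a, b):
    """Return (position, residue_a, residue_b) for each mismatch (1-based)."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


diffs = differences(oxytocin, vasopressin)
for position, ox, vp in diffs:
    print(f"position {position}: oxytocin {ox}, vasopressin {vp}")
```

The two peptides differ at only two of nine positions, which is the kind of observation that linked specific residues to distinct functions.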
3
Pairwise sequence alignment is the most
fundamental operation of bioinformatics
  • It is used to decide if two proteins (or genes)
    are related structurally or functionally
  • It is used to identify domains or motifs that
    are shared between proteins
  • It is the basis of BLAST searching (next week)
  • It is used in the analysis of genomes

5
Pairwise alignment: protein sequences can be more
informative than DNA
  • protein is more informative (20 vs 4
    characters)
  • many amino acids share related biophysical
    properties
  • codons are degenerate: changes in the third
    position often do not alter the amino acid
    that is specified
  • protein sequences offer a longer look-back
    time (relatedness over millions or billions
    of years)
  • (note issue of convergent evolution)
  • DNA sequences can be translated into protein,
    and then used in pairwise alignments
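The point about codon degeneracy can be checked directly against the standard genetic code. The sketch below builds the codon table from the conventional TCAG ordering and finds the two-base prefixes for which every choice of third base gives the same amino acid. All names (`CODON_TABLE`, `fourfold_boxes`, and so on) are illustrative, not from the slides.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, in TCAG order for first, second, and third base.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def third_position_variants(codon):
    """Map each codon sharing the first two bases to its amino acid."""
    prefix = codon[:2]
    return {prefix + base: CODON_TABLE[prefix + base] for base in BASES}


# Group codons by the amino acid they encode.
codons_per_aa = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_per_aa[aa].append(codon)

# Prefixes where any third base yields the same amino acid.
fourfold_boxes = [
    "".join(prefix)
    for prefix in product(BASES, repeat=2)
    if len({CODON_TABLE["".join(prefix) + base] for base in BASES}) == 1
]

print(third_position_variants("GGT"))
print(fourfold_boxes)
```

A DNA alignment counts every third-position change in these boxes as a mismatch, while a protein alignment of the same region sees no change at all.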

6
Pairwise alignment: protein sequences can be more
informative than DNA
DNA can be translated into six potential
proteins
5' CAT CAA
5' ATC AAC
5' TCA ACT
5' CATCAACTACAACTCCAAAGACACCCTTACACATCAACAAACCTACCCAC 3'
3' GTAGTTGATGTTGAGGTTTCTGTGGGAATGTGTAGTTGTTTGGATGGGTG 5'
5' GTG GGT
5' TGG GTA
5' GGG TAG
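A six-frame translation of the 50-base sequence on this slide can be sketched as follows. It uses the standard genetic code; the frame labels `+1` to `+3` and `-1` to `-3` and the function names are assumptions made for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, in TCAG order for first, second, and third base.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Complement each base, then reverse, to read the other strand 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def six_frames(dna):
    """Return the three forward and three reverse reading-frame translations."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Top strand from the slide, written 5' to 3'.
top_strand = "CATCAACTACAACTCCAAAGACACCCTTACACATCAACAAACCTACCCAC"
frames = six_frames(top_strand)
for label, protein in frames.items():
    print(label, protein)
```

The first two codons of each frame match the triplets shown on the slide (CAT CAA, ATC AAC, TCA ACT on the top strand; GTG GGT, TGG GTA, GGG TAG on the bottom strand).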
7
Pairwise alignment: protein sequences can be more
informative than DNA
  • Many times, DNA alignments are appropriate
    --to confirm the identity of a cDNA
    --to study noncoding regions of DNA
    --to study DNA polymorphisms
    --to study molecular evolution (synonymous vs nonsynonymous)
    --example: Neanderthal vs modern human DNA

Query 181 catcaactacaactccaaagacacccttacacccactaggatatcaacaaacctacccac 240
Sbjct 189 catcaactgcaaccccaaagccacccct-cacccactaggatatcaacaaacctacccac 247
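The Query/Sbjct pair above can be summarized by counting identical columns, mismatches, and gaps. This is a minimal sketch; the function name `alignment_stats` is illustrative, and dividing by the number of alignment columns is one of several conventions for percent identity.

```python
# Aligned rows as shown on the slide; '-' marks a gap.
query = "catcaactacaactccaaagacacccttacacccactag" + "gatatcaacaaacctacccac"
sbjct = "catcaactgcaaccccaaagccacccct-caccca" + "ctaggatatcaacaaacctacccac"


def alignment_stats(a, b, gap="-"):
    """Count identities, mismatches, and gap columns in two aligned rows."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    stats = {"columns": len(a), "identities": 0, "mismatches": 0, "gaps": 0}
    for x, y in zip(a, b):
        if x == gap or y == gap:
            stats["gaps"] += 1
        elif x == y:
            stats["identities"] += 1
        else:
            stats["mismatches"] += 1
    return stats


stats = alignment_stats(query, sbjct)
percent_identity = 100.0 * stats["identities"] / stats["columns"]
print(stats, f"{percent_identity:.1f}% identity")
```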
8
Definitions
Pairwise alignment: The process of lining up two
or more sequences to achieve maximal levels of
identity (and conservation, in the case of amino
acid sequences) for the purpose of assessing the
degree of similarity and the possibility of
homology.
9
Definitions
Homology: Similarity attributed to descent from a
common ancestor.
Identity: The extent to which two (nucleotide or
amino acid) sequences are invariant.
RBP        26 RVKENFDKARFSGTWYAMAKKDPEGLFLQDNIVAEFSVDETGQMSATAKGRVRLLNNWD- 84
                K         GTW  MA        L     A   V  T             L  W
glycodelin 23 QTKQDLELPKLAGTWHSMAMA-TNNISLMATLKAPLRVHITSLLPTPEDNLEIVLHRWEN 81
10
Definitions
Conservation: Changes at a specific position of
an amino acid (or, less commonly, DNA) sequence
that preserve the physico-chemical properties of
the original residue.
Similarity: The extent to which nucleotide or
protein sequences are related. It is based upon
identity plus conservation.
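To make "identity plus conservation" concrete, the sketch below scores the RBP/glycodelin fragment shown with the identity definition. The residue groups in `GROUPS` are a simplified physico-chemical classification assumed here for illustration; real alignment programs usually decide conservation from a substitution matrix, so the similarity figure from this sketch will not necessarily match theirs.

```python
# Aligned fragment from the Definitions slide; '-' marks a gap.
rbp = "RVKENFDKARFSGTWYAMAKKDPEGLFLQDNIVA" + "EFSVDETGQMSATAKGRVRLLNNWD-"
glycodelin = "QTKQDLELPKLAGTWHSMAMA-TN" + "NISLMATLKAPLRVHITSLLPTPEDNLEIVLHRWEN"

# Assumed grouping: aliphatic, aromatic, polar, basic, acidic, and three singletons.
GROUPS = ["AVLIM", "FYW", "STNQ", "KRH", "DE", "C", "G", "P"]
GROUP_OF = {aa: idx for idx, members in enumerate(GROUPS) for aa in members}


def identity_and_similarity(a, b, gap="-"):
    """Classify each column as identical, conserved, different, or gap."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    counts = {"columns": len(a), "identical": 0, "conserved": 0,
              "different": 0, "gaps": 0}
    for x, y in zip(a, b):
        if x == gap or y == gap:
            counts["gaps"] += 1
        elif x == y:
            counts["identical"] += 1
        elif GROUP_OF[x] == GROUP_OF[y]:
            counts["conserved"] += 1
        else:
            counts["different"] += 1
    return counts


counts = identity_and_similarity(rbp, glycodelin)
identity_pct = 100.0 * counts["identical"] / counts["columns"]
similarity_pct = 100.0 * (counts["identical"] + counts["conserved"]) / counts["columns"]
print(counts, f"identity {identity_pct:.0f}%", f"similarity {similarity_pct:.0f}%")
```

Similarity is never lower than identity, because every identical column also counts toward similarity.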
11
Definitions: two types of homology
Orthologs: Homologous sequences in different
species that arose from a common ancestral gene
during speciation; they may or may not be
responsible for a similar function.
Paralogs: Homologous sequences within a single
species that arose by gene duplication.
13
Pairwise GLOBAL alignment of retinol-binding
protein from human (top) and rainbow trout (O.
mykiss)
human   1 .MKWVWALLLLA.AWAAAERDCRVSSFRVKENFDKARFSGTWYAMAKKDP  48
trout   1 MLRICVALCALATCWA...QDCQVSNIQVMQNFDRSRYTGRWYAVAKKDP  47

human  49 EGLFLQDNIVAEFSVDETGQMSATAKGRVRLLNNWDVCADMVGTFTDTED  98
trout  48 VGLFLLDNVVAQFSVDESGKMTATAHGRVIILNNWEMCANMFGTFEDTPD  97

human  99 PAKFKMKYWGVASFLQKGNDDHWIVDTDYDTYAVQYSCRLLNLDGTCADS 148
trout  98 PAKFKMRYWGAASYLQTGNDDHWVIDTDYDNYAIHYSCREVDLDGTCLDG 147

human 149 YSFVFSRDPNGLPPEAQKIVRQRQEELCLARQYRLIVHNGYCDGRSERNLL 199
trout 148 YSFIFSRHPTGLRPEDQKIVTDKKKEICFLGKYRRVGHTGFCESS...... 192
14
Pairwise GLOBAL alignment of retinol-binding
protein and β-lactoglobulin
  1 MKWVWALLLLAAWAAAERDCRVSSFRVKENFDKARFSGTWYAMAKKDPEG  50 RBP
  1 ...MKCLLLALALTCGAQALIVT..QTMKGLDIQKVAGTWYSLAMAASD.  44 lactoglobulin

 51 LFLQDNIVAEFSVDETGQMSATAKGRVR.LLNNWD..VCADMVGTFTDTE  97 RBP
 45 ISLLDAQSAPLRV.YVEELKPTPEGDLEILLQKWENGECAQKKIIAEKTK  93 lactoglobulin

 98 DPAKFKMKYWGVASFLQKGNDDHWIVDTDYDTYAV...........QYSC 136 RBP
 94 IPAVFKIDALNENKVL........VLDTDYKKYLLFCMENSAEPEQSLAC 135 lactoglobulin

137 RLLNLDGTCADSYSFVFSRDPNGLPPEAQKIVRQRQ.EELCLARQYRLIV 185 RBP
136 QCLVRTPEVDDEALEKFDKALKALPMHIRLSFNPTQLEEQCHI....... 178 lactoglobulin
25% identity; 32% similarity
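A global alignment forces both sequences to be aligned end to end. The sketch below is a basic Needleman-Wunsch implementation with a linear gap penalty. It is not the program that produced the alignments on these slides, and the scoring values (match +1, mismatch -1, gap -2) are assumptions chosen for illustration; protein alignments normally use a substitution matrix instead.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment by dynamic programming; returns (score, row_a, row_b)."""
    n, m = len(a), len(b)

    def sub(x, y):
        return match if x == y else mismatch

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]),  # align two residues
                score[i - 1][j] + gap,                          # gap in b
                score[i][j - 1] + gap,                          # gap in a
            )

    # Trace back from the bottom-right corner, preferring the diagonal move.
    row_a, row_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(a[i - 1], b[j - 1]):
            row_a.append(a[i - 1])
            row_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            row_a.append(a[i - 1])
            row_b.append("-")
            i -= 1
        else:
            row_a.append("-")
            row_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(row_a)), "".join(reversed(row_b))


best, top, bottom = needleman_wunsch("ACGT", "AGT")
print(best)
print(top)
print(bottom)
```

The table has (n + 1) x (m + 1) cells, so time and memory grow with the product of the two sequence lengths.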
15
RBP and β-lactoglobulin are homologous
proteins that share related three-dimensional
structures
β-lactoglobulin (P02754)
retinol-binding protein (NP_006735)
16
Gaps
Positions at which a letter is paired with a
null are called gaps. Gap scores are
typically negative. Since a single mutational
event may cause the insertion or deletion of
more than one residue, the presence of a gap
is ascribed more significance than the length
of the gap. In BLAST, it is rarely necessary
to change gap values from the default.
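One common way to give the presence of a gap more weight than its length is an affine gap penalty: a large cost for opening a gap plus a small cost for each gapped position. The sketch below assumes an opening cost of 11 and a per-position cost of 1, values chosen for illustration. Conventions differ between tools, so check the defaults of the program and scoring matrix you actually use.

```python
from itertools import groupby


def affine_gap_cost(length, gap_open=11, gap_extend=1):
    """Cost of one gap: a fixed opening cost plus a cost per gapped position."""
    if length <= 0:
        return 0
    return gap_open + gap_extend * length


def gap_run_lengths(aligned, gap="-"):
    """Lengths of each consecutive run of gap characters in one aligned row."""
    return [len(list(run)) for char, run in groupby(aligned) if char == gap]


def total_gap_penalty(row_a, row_b):
    """Sum the affine cost of every gap run in both rows of an alignment."""
    return sum(
        affine_gap_cost(length)
        for row in (row_a, row_b)
        for length in gap_run_lengths(row)
    )


one_long_gap = affine_gap_cost(5)
five_short_gaps = 5 * affine_gap_cost(1)
print(one_long_gap, five_short_gaps)
```

Under these assumed values, a single five-residue gap costs far less than five separate one-residue gaps, which fits the idea that one mutational event can insert or delete several residues at once.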
17
Should distantly related species have more
gaps than closely related species (or
genes)? What about their relationship with
regard to sequence identity?
18
There are 3 Principal Methods of
Pair-wise Sequence Alignment
  1. Dot Matrix Analysis (e.g. Dotlet, Dotter, Dottup)
  2. Dynamic Programming (DP) algorithm
  3. Word or k-tuple methods (e.g. FASTA, BLAST)
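The first and third methods share one idea: find positions where short words of the two sequences match exactly. The sketch below indexes all words of length k in one sequence and reports the matches as dot-matrix coordinates. It is a simplified illustration, not how Dotlet, FASTA, or BLAST are implemented, and the function names are assumptions.

```python
from collections import Counter, defaultdict


def word_hits(a, b, k=2):
    """Return (i, j) pairs (0-based) where a[i:i+k] equals b[j:j+k]."""
    index = defaultdict(list)
    for j in range(len(b) - k + 1):
        index[b[j:j + k]].append(j)
    return [
        (i, j)
        for i in range(len(a) - k + 1)
        for j in index.get(a[i:i + k], [])
    ]


def render_dot_matrix(a, b, hits):
    """Draw a text dot plot: rows follow a, columns follow b, '*' marks a hit."""
    marked = set(hits)
    lines = ["  " + b]
    for i, residue in enumerate(a):
        cells = "".join("*" if (i, j) in marked else "." for j in range(len(b)))
        lines.append(f"{residue} {cells}")
    return "\n".join(lines)


def diagonal_counts(hits):
    """Count hits per diagonal (i - j); a crowded diagonal suggests an alignment."""
    return Counter(i - j for i, j in hits)


hits = word_hits("CYIQNCPLG", "CYFQNCPRG", k=2)
print(render_dot_matrix("CYIQNCPLG", "CYFQNCPRG", hits))
print(diagonal_counts(hits))
```

Word methods gain speed by extending only around diagonals with many hits, while dynamic programming evaluates every cell of the matrix.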

20
Exons and Introns