Title: Pairwise sequence alignment (practice)
1Pairwise sequence alignment (practice)
2Manual Work: Write down the amino acid sequences
derived from all six possible reading frames
of gtseq1: ACTGTCGC
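To check the manual work, the six-frame translation can be sketched in a few lines of Python. This is a minimal sketch, assuming the standard genetic code and an input containing only A, C, G and T; the function names and the frame labels (F1 to F3, R1 to R3) are illustrative and do not come from the slides.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: standard table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; trailing 1-2 bases are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def six_frames(seq):
    """Return a dict of the three forward (F) and three reverse (R) translations."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"F{offset + 1}"] = translate(seq[offset:])
        frames[f"R{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in sorted(six_frames("ACTGTCGC").items()):
        print(label, protein)
```

Stop codons are written as `*`, and any codon containing a character other than A, C, G or T is written as `X`.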
3- Open the following website:
- http://biotools.umassmed.edu/cgi-bin/biobin/transeq
- Choose all frames
- Remove X
4Translate all possible reading frames of the
following sequence
TATGTCTCTCACCAACAAGAACGTCGTTTTCGTGGCCGGTCTGGGCGGCATTGGCCTGGAC
ACCAGCCGGGAGTTGGTCAAGCGTAATCTGAAGAACCTGGTCATCCTGGATCGCATTGAC
AATCCGGCTGCCATTGCCGAACTGAAGGCAATCAATCCCAAGGTGACCATCACCTTCTAT
CCCTACGATGTGACTGTGCCCGTCGCTGAGACCACCAAGCTCCTGAAGACCATCTTTGCC
CAGGTGAAGACAATCGATGTCCTGATCAACGGTGCTGGCATCCTGGACGATCATCAGATC
GAGCGCACCATTGCCGTTAACTACACGGGCCTGGTCAACACCACCACAGCCATTCTGGAC
TTCTGGGACAAGCGCAAGGGCGGCCCAGGCGGCATCATTTGCAACATTGGCTCCGTCACC
GGTTTCAATGCCATCTACCAGGTGCCCGTTTACTCCGGCTCCAAGGCGGCGGTGGTTAAC
TTTACCTCCTCCCTGGCGAAACTGGCTCCCATTACTGGTGTCACTGCTTACACTGTCAAT
CCTGGCATCACCAGGACCACTCTGGTCCACAAATTCAACTCGTGGCTGGATGTGGAGCCC
CGTGTGGCGGAGAAGCTGCTCGAGCATCCCACCCAGACCTCTCAGCAGTGCGCCGAGAAC
TTTGTGAAGGCCATTGAGCTGAACAAGAACGGTGCCATCTGGAAGTTGGATTTGGGCACC
TTGGAGCCCATCACATGGACCCAGCACTGGGACTCGGGCATCTAA
Which reading frame(s) is(are) likely to be the
true reading frame(s)?
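One common heuristic for this question is to look for the frame with the longest stretch of codons uninterrupted by a stop codon. The sketch below is only an illustration of that heuristic, assuming the standard stop codons TAA, TAG and TGA; the function names are hypothetical, and a long open stretch suggests a coding frame but does not prove it.

```python
STOPS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_open_run(seq, offset):
    """Longest run of consecutive non-stop codons, reading from `offset`."""
    best = run = 0
    for i in range(offset, len(seq) - 2, 3):
        if seq[i:i + 3] in STOPS:
            run = 0          # a stop codon ends the current open stretch
        else:
            run += 1
            best = max(best, run)
    return best


def open_runs_by_frame(seq):
    """Map each of the six frames to its longest stop-free run (in codons)."""
    seq = "".join(seq.upper().split())  # drop whitespace and line breaks
    rc = reverse_complement(seq)
    runs = {}
    for offset in range(3):
        runs[f"F{offset + 1}"] = longest_open_run(seq, offset)
        runs[f"R{offset + 1}"] = longest_open_run(rc, offset)
    return runs


if __name__ == "__main__":
    # Toy input; paste the exercise sequence here to compare its six frames.
    for frame, length in sorted(open_runs_by_frame("ATGAAATAGCCC").items()):
        print(frame, length)
```

On a real coding sequence, one frame typically stands out with a much longer open stretch than the other five.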
6DOT-PLOT: Give the coordinates of the boxes to
be filled.
Window size 5, stringency 5
Window size 5, stringency 2
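The effect of the two settings can be explored with a small sliding-window dot plot. This is a sketch under stated assumptions: stringency is taken to mean the minimum number of matching positions inside the window, and a dot is reported at the 1-based start position of the window in each sequence. Tools differ on both conventions, so the coordinates may need shifting to match a given exercise grid.

```python
def dotplot(seq_a, seq_b, window=5, stringency=5):
    """Return (row, column) pairs whose windows share >= `stringency` matches.

    Coordinates are 1-based window start positions (an assumed convention).
    """
    dots = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(seq_a[i + k] == seq_b[j + k] for k in range(window))
            if matches >= stringency:
                dots.append((i + 1, j + 1))
    return dots


if __name__ == "__main__":
    a = b = "ACGTACGT"  # toy sequences, not taken from the slides
    print(dotplot(a, b, window=5, stringency=5))
    print(len(dotplot(a, b, window=5, stringency=2)))
```

Lowering the stringency can only add dots, which is why the second setting fills more boxes than the first.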
7Nucleic Acid Dot Plots (http://www.vivo.colostate.edu/molkit/dnadot/)
Copy DNA sequences
Compare Horse (NM_001164018) and Chicken
(NM_001081704) hemoglobin
Window size must be an odd number
Number of mismatches allowed
9Use the Dotplot to compare chicken ovomucoid
(NM_001112662) to itself
GCACCGGCAGCCGCCTGCAGAGCCGGGCAGTACCTCACCATGGCCATGGC
AGGCGTCTTCGTGCTGTTCTCTTTCGTGCTTTGTGGCTTCCTCCCAGAT
GCTGCCTTTGGGGCTGAGGTGGACTGCAGTAGGTTTCCCAACGCTACAG
ACAAGGAAGGCAAAGATGTATTGGTTTGCAACAAGGACCTCCGCCCCATC
TGTGGTACCGATGGAGTCACTTACACCAACGATTGCTTGCTGTGTGCCT
ACAGCATAGAATTTGGAACCAATATCAGCAAAGAGCACGATGGAGAATG
CAAGGAAACTGTTCCTATGAACTGCAGTAGTTATGCCAACACGACAAGCG
AGGACGGAAAAGTGATGGTCCTCTGCAACAGGGCCTTCAACCCCGTCTG
TGGTACTGATGGAGTCACCTACGACAATGAGTGTCTGCTGTGTGCCCAC
AAAGTAGAGCAGGGGGCCAGCGTTGACAAGAGGCATGATGGTGGATGTA
GGAAGGAACTTGCTGCTGTTGACTGCAGCGAGTACCCTAAGCCTGACTGC
ACGGCAGAAGACAGACCTCTCTGTGGCTCCGACAACAAAACATATGGCA
ACAAGTGCAACTTCTGCAATGCAGTCGTGGAAAGCAACGGGACTCTCAC
TTTAAGCCATTTTGGAAAATGCTGAATATCAGAGCTGAGAGAATTCACCA
CAGGATCCCCACTGGCGAATCCCAGCGAGAGGTCTCACCTCGGTTCATC
TCGCACTCTGGGGAGCTCAGCTCACTCCCGATTTTCTTTCTCAATAAAC
TAAATCAGCAACAAAAAAAAAA
10What do these parallel lines represent?
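One way to investigate the lines numerically is to count, for a self-comparison, how many matching windows fall on each diagonal. The sketch below assumes exact matching of fixed-length windows and uses a toy sequence with one duplicated segment; the function name and the toy input are illustrative only. A diagonal at offset 0 is the main diagonal, and any other well-populated offset marks a segment that occurs again that many positions further along.

```python
from collections import Counter


def diagonal_counts(seq, window=5):
    """Count exact window matches of `seq` against itself, per diagonal offset.

    The offset is (column start - row start); offset 0 is the main diagonal.
    """
    n = len(seq) - window + 1
    counts = Counter()
    for i in range(n):
        for j in range(n):
            if seq[i:i + window] == seq[j:j + window]:
                counts[j - i] += 1
    return counts


if __name__ == "__main__":
    toy = "ACGTTGCA" * 2  # an 8-base segment repeated twice (toy data)
    for offset, hits in sorted(diagonal_counts(toy).items()):
        print(offset, hits)
```

For the toy input, matches appear on the main diagonal and on the two diagonals 8 positions away from it, mirroring the parallel lines seen in a self dot plot of a sequence with internal repeats.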
11LALIGN - finds multiple matching subsegments in
two sequences
Part of the FASTA package of sequence analysis
programs.
LALIGN compares two protein or DNA sequences
for local or global similarity and shows the
local sequence alignments.
http://www.ch.embnet.org/software/LALIGN_form.html
12Choose method
default matrix
Set scoring matrix and gap penalties
Paste your sequence
13Open http://www.ch.embnet.org/software/LALIGN_form.html
- Use the sequences below and perform a global
alignment with an opening gap penalty of 5 and an
extending gap penalty of 0.
- gtseq1
- GCGACTGTTCCTATGAACTGCAGTAGTTATGCCAACACGACAAGCGAGGA
CGGAAAAGTGAGTCTGTGGTACTGATGGAGTCACCTACGACGCGAGGACG
CCAGGTG
- gtseq2
- GCGAGGACGGAAAAGTG
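For intuition about what a global alignment does, here is a minimal Needleman-Wunsch sketch. It is not LALIGN's implementation: it assumes a simple scoring scheme (match +1, mismatch -1) and a linear gap penalty of -2 per gap position, rather than the separate opening and extending penalties set on the web form. All names and parameter values are illustrative.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b). Scoring values are assumptions.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap          # leading gaps in b
    for j in range(1, m + 1):
        score[0][j] = j * gap          # leading gaps in a
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

Because every position of both sequences must be included, a short sequence aligned against a much longer one is forced to carry many gap positions.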
14GLOBAL DOESN'T ALWAYS WORK.
GLOBAL
15SOLUTION: LOCAL
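The local alternative can be sketched with a minimal Smith-Waterman routine. As before, this is an illustration under assumed scoring (match +1, mismatch -1, linear gap -2), not the algorithm or parameters used by LALIGN, and the names are hypothetical. The key difference from a global alignment is that cell scores never drop below zero, so the best-scoring subsegment is reported on its own.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-2):
    """Local alignment with a linear gap penalty.

    Returns (best_score, aligned_a, aligned_b) for one best local alignment.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(0,                       # restart the alignment here
                              score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached.
    top, bottom = [], []
    i, j = best_pos
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    long_seq = "ACAAGCGAGGACGGAAAAGTGAGTCTG"   # excerpt-style toy input
    short_seq = "GCGAGGACGGAAAAGTG"
    print(smith_waterman(long_seq, short_seq))
```

When the short sequence occurs inside the long one, the local alignment recovers that segment without penalizing the unmatched flanks.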
16Write down the amino acid sequences derived from
all six possible reading frames
of gtseq1: ACTGTCGC
gtseqRC: GCGACAGT
Forward: gt_1 TV, gt_2 LS, gt_3 CR
Reverse: gt_1 AT, gt_2 RQ, gt_3 DS