Medical Natural Sciences Year 2: Introduction to Bioinformatics

About This Presentation

Author: heringa
Last modified by: heringa
Created: 2/3/2003 6:56:45 PM
Document presentation format: On-screen Show

Slides: 56
Provided by: heri4
Transcript and Presenter's Notes

1
Medical Natural Sciences Year 2: Introduction to Bioinformatics
Lecture 8: Multiple sequence alignment (II)
Centre for Integrative Bioinformatics VU
2
Progressive multiple sequence alignment
  • Accuracy is very important
  • Problem: errors are propagated through the
    progressive steps
  • "Once a gap, always a gap" (Feng & Doolittle, 1987)

3
Progressive multiple alignment - general principle
[Figure: pairwise alignment scores (1-2, 1-3, ..., 4-5) fill a 5×5 similarity matrix; the scores are converted to distances to build a guide tree, which directs the multiple alignment. Iteration possibilities are indicated.]
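The chain on this slide (pairwise scores, scores to distances, guide tree) can be sketched in a few lines. This is only an illustration under stated assumptions: the conversion d = 1 - score / max score, the average-linkage clustering, and the score values themselves are all made up for the example and are not taken from the lecture or from any particular program.

```python
from itertools import combinations

def scores_to_distances(scores):
    """Turn pairwise similarity scores into distances.
    Assumed conversion (not from the slides): d = 1 - score / max score."""
    top = max(scores.values())
    return {frozenset(pair): 1.0 - s / top for pair, s in scores.items()}

def guide_tree(names, dist):
    """Average-linkage clustering; returns the merge order as nested tuples."""
    clusters = [(name, [name]) for name in names]  # (subtree, leaves below it)

    def average_distance(a, b):
        pairs = [(x, y) for x in a[1] for y in b[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Join the two clusters that are closest on average.
        a, b = min(combinations(clusters, 2),
                   key=lambda ab: average_distance(*ab))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(((a[0], b[0]), a[1] + b[1]))
    return clusters[0][0]

# Hypothetical pairwise scores for five sequences (illustrative values only).
scores = {
    ("1", "2"): 90, ("1", "3"): 60, ("2", "3"): 55, ("4", "5"): 80,
    ("1", "4"): 20, ("1", "5"): 25, ("2", "4"): 22, ("2", "5"): 18,
    ("3", "4"): 30, ("3", "5"): 28,
}
distances = scores_to_distances(scores)
tree = guide_tree(["1", "2", "3", "4", "5"], distances)
print(tree)
```

The nested tuples give the order in which a progressive method would align the sequences: the most similar pair first, the most distant group last.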
4
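Multiple alignment profiles (Gribskov et al., 1987)
[Figure: a profile column at alignment position i holds one value per residue type A, C, D, ..., W, Y (e.g. 0.3, 0.1, 0, ..., 0.3, 0.3), together with position-dependent gap penalties (e.g. 0.5, 1.0).]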
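A profile of this kind can be sketched as one dictionary of residue frequencies per alignment column. The gap-penalty rule used here (scale a base penalty by the fraction of sequences without a gap in that column) is an assumption for illustration, not the scheme of Gribskov et al.; the toy alignment is invented.

```python
from collections import Counter

def build_profile(alignment, gap="-"):
    """One dict per column: residue -> frequency, plus the gap fraction."""
    n = len(alignment)
    profile = []
    for column in zip(*alignment):
        counts = Counter(column)
        gaps = counts.pop(gap, 0)
        freqs = {res: c / n for res, c in counts.items()}
        freqs["gap"] = gaps / n
        profile.append(freqs)
    return profile

def position_gap_penalties(profile, base=1.0):
    """Assumed rule: gaps are cheaper where the column already contains gaps."""
    return [base * (1.0 - column["gap"]) for column in profile]

# Toy alignment of four sequences (illustrative only).
alignment = ["ACD-W", "ACDYW", "A-DYW", "CCD-Y"]
profile = build_profile(alignment)
penalties = position_gap_penalties(profile)
print(profile[0], penalties)
```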
5
Clustal, ClustalW, ClustalX
  • CLUSTAL W/X (Thompson et al., 1994) uses the
    Neighbour Joining (NJ) algorithm (Saitou and Nei,
    1987), widely used in phylogenetic analysis, to
    construct a guide tree.
  • Sequence blocks are represented by profiles, in
    which the individual sequences are additionally
    weighted according to the branch lengths in the
    NJ tree.
  • Further carefully crafted heuristics include:
    (i) local gap penalties;
    (ii) automatic selection of the amino acid
    substitution matrix;
    (iii) automatic gap penalty adjustment;
    (iv) a mechanism to delay alignment of sequences
    that appear to be distant at the time they are
    considered.
  • CLUSTAL (W/X) does not allow iteration (compare
    Hogeweg and Hesper, 1984; Corpet, 1988; Gotoh, 1996;
    Heringa, 1999, 2002).
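The branch-length weighting mentioned above can be sketched as follows. The idea shown is that each branch length is shared equally among the sequences below that branch, and a sequence's weight is the sum of its shares along the path from the root. The tree encoding as nested `(child, branch_length)` tuples, the rooted example tree and its branch lengths are all hypothetical, and no normalisation step is included.

```python
def leaves(node):
    """Names of all sequences below a node; node = (child, branch_length)."""
    child, _ = node
    if isinstance(child, str):
        return [child]
    return [name for sub in child for name in leaves(sub)]

def branch_weights(node, inherited=0.0, weights=None):
    """Share each branch length equally among the leaves below it."""
    if weights is None:
        weights = {}
    child, length = node
    share = inherited + length / len(leaves(node))
    if isinstance(child, str):
        weights[child] = share
    else:
        for sub in child:
            branch_weights(sub, share, weights)
    return weights

# Hypothetical rooted tree ((A:0.2, B:0.2):0.3, C:0.5).
root = (((("A", 0.2), ("B", 0.2)), 0.3), ("C", 0.5)), 0.0
weights = branch_weights(root)
print(weights)
```

In this example A and B are close relatives and split their common branch, so each gets a lower weight than the more isolated sequence C.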

6
Sequence weighting: pair-wise alignment quality
versus sequence identity (Vogt et al., JMB 249,
816-831, 1995)
7
Pair-wise sequence alignment (more than just
string matching)
[Figure: global dynamic programming. The sequences MDAGSTVILCFVG and MDAASTILCGS, related by evolution, are compared in a search matrix using an amino acid exchange matrix and gap penalties (open, extension).]
MDAGSTVILCFVG-
MDAAST-ILC--GS
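Global dynamic programming as pictured here can be sketched with a small Needleman-Wunsch implementation. Two simplifications are assumed for brevity: a match/mismatch score stands in for the amino acid exchange matrix, and a single linear gap penalty stands in for the open/extension scheme named on the slide. The resulting alignment therefore need not match the one shown above.

```python
def needleman_wunsch(a, b, score=lambda x, y: 1 if x == y else -1, gap=-1):
    """Global alignment with a linear gap penalty (simplifying assumption)."""
    n, m = len(a), len(b)
    # F[i][j] = best score for aligning a[:i] with b[:j].
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i][j] = max(F[i - 1][j - 1] + score(a[i - 1], b[j - 1]),
                          F[i - 1][j] + gap,
                          F[i][j - 1] + gap)
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and F[i][j] == F[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return F[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))

toy_score, toy_a, toy_b = needleman_wunsch("ACGT", "ACT")
slide_score, slide_a, slide_b = needleman_wunsch("MDAGSTVILCFVG", "MDAASTILCGS")
print(slide_a)
print(slide_b)
```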
8
Integrating Primary and Predicted Secondary
Structure data for Multiple Alignment
Victor Simossis, Jaap Heringa
Centre for Integrative Bioinformatics VU (IBIVU),
Vrije Universiteit Amsterdam, The Netherlands
9
Using secondary structure in multiple alignment
Structure is more conserved than sequence
  • 10 years of SS prediction method development:
    Q3 improvement of 5%
  • 10 years of MA method development: the difference
    in Q3 can be 40%

10
Using secondary structure in multiple alignment
Secondary structure prediction: Q3 76%. SS
prediction now good enough (?)
11
Secondary structure-induced alignment iteration
12
Flavodoxin-cheY multiple alignment: Praline with
pre-processing
1fx1        -PKALIVYGSTTGNT-EYTAETIARQLANAG-YEVDSRDAASVEAGGLFEGFDLVLLGCSTWGDDSI------ELQDDFIPLF-DSLEETGAQGRKVACF
FLAV_DESDE  MSKVLIVFGSSTGNT-ESIaQKLEELIAAGG-HEVTLLNAADASAENLADGYDAVLFgCSAWGMEDL------EMQDDFLSLF-EEFNRFGLAGRKVAAf
FLAV_DESVH  MPKALIVYGSTTGNT-EYTaETIARELADAG-YEVDSRDAASVEAGGLFEGFDLVLLgCSTWGDDSI------ELQDDFIPLF-DSLEETGAQGRKVACf
FLAV_DESSA  MSKSLIVYGSTTGNT-ETAaEYVAEAFENKE-IDVELKNVTDVSVADLGNGYDIVLFgCSTWGEEEI------ELQDDFIPLY-DSLENADLKGKKVSVf
FLAV_DESGI  MPKALIVYGSTTGNT-EGVaEAIAKTLNSEG-METTVVNVADVTAPGLAEGYDVVLLgCSTWGDDEI------ELQEDFVPLY-EDLDRAGLKDKKVGVf
2fcr        --KIGIFFSTSTGNT-TEVADFIGKTLGA---KADAPIDVDDVTDPQALKDYDLLFLGAPTWNTG----ADTERSGTSWDEFLYDKLPEVDMKDLPVAIF
FLAV_AZOVI  -AKIGLFFGSNTGKT-RKVaKSIKKRFDDET-MSDA-LNVNRVS-AEDFAQYQFLILgTPTLGEGELPGLSSDCENESWEEFL-PKIEGLDFSGKTVALf
FLAV_ENTAG  MATIGIFFGSDTGQT-RKVaKLIHQKLDG---IADAPLDVRRAT-REQFLSYPVLLLgTPTLGDGELPGVEAGSQYDSWQEFT-NTLSEADLTGKTVALf
FLAV_ANASP  SKKIGLFYGTQTGKT-ESVaEIIRDEFGN---DVVTLHDVSQAE-VTDLNDYQYLIIgCPTWNIGEL--------QSDWEGLY-SELDDVDFNGKLVAYf
FLAV_ECOLI  -AITGIFFGSDTGNT-ENIaKMIQKQLGK---DVADVHDIAKSS-KEDLEAYDILLLgIPTWYYGE--------AQCDWDDFF-PTLEEIDFNGKLVALf
4fxn        -MK--IVYWSGTGNT-EKMAELIAKGIIESG-KDVNTINVSDVNIDELL-NEDILILGCSAMGDEVL-------EESEFEPFI-EEIS-TKISGKKVALF
FLAV_MEGEL  MVE--IVYWSGTGNT-EAMaNEIEAAVKAAG-ADVESVRFEDTNVDDVA-SKDVILLgCPAMGSEEL-------EDSVVEPFF-TDLA-PKLKGKKVGLf
FLAV_CLOAB  -MKISILYSSKTGKT-ERVaKLIEEGVKRSGNIEVKTMNLDAVD-KKFLQESEGIIFgTPTYYAN---------ISWEMKKWI-DESSEFNLEGKLGAAf
3chy        ADKELKFLVVDDFSTMRRIVRNLLKELGFN--NVEEAEDGVDALNKLQAGGYGFVI---SDWNMPNM----------DGLELL-KTIRADGAMSALPVLM

1fx1        GCGDS-SY-EYFCGA-VDAIEEKLKNLGAEIVQD---------------------GLRIDGD--PRAARDDIVGWAHDVRGAI--------
FLAV_DESDE  ASGDQ-EY-EHFCGA-VPAIEERAKELgATIIAE---------------------GLKMEGD--ASNDPEAVASfAEDVLKQL--------
FLAV_DESVH  GCGDS-SY-EYFCGA-VDAIEEKLKNLgAEIVQD---------------------GLRIDGD--PRAARDDIVGwAHDVRGAI--------

13
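PRALINE: using secondary structure for alignment
[Figure: dynamic programming search matrix for two sequences annotated with their secondary structure. Amino acid exchange weights matrices are selected by secondary structure state pair (H-H, C-C, E-E), with a default matrix otherwise.]
MDAGSTVILCFV
HHHCCCEEEEEE
MDAASTILCGS
HHHHCCEEECC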
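The matrix selection pictured on this slide can be sketched as a lookup keyed on the two secondary structure states: when both positions share a state (H, E or C), a state-specific exchange matrix is used, otherwise a default one. The match/mismatch values below are hypothetical placeholders for real exchange matrices, and the function names are invented for the example.

```python
def flat_matrix(match, mismatch):
    """Hypothetical stand-in for a real amino acid exchange matrix."""
    return lambda a, b: match if a == b else mismatch

# Illustrative values only; real matrices would hold 20x20 exchange weights.
SS_MATRICES = {"H": flat_matrix(4, -1),
               "E": flat_matrix(5, -2),
               "C": flat_matrix(3, 0)}
DEFAULT = flat_matrix(2, -1)

def exchange_weight(res_a, ss_a, res_b, ss_b):
    """Use the state-specific matrix only when both states agree."""
    matrix = SS_MATRICES[ss_a] if ss_a == ss_b else DEFAULT
    return matrix(res_a, res_b)

# Sequences and secondary structure strings from the slide.
seq1, ss1 = "MDAGSTVILCFV", "HHHCCCEEEEEE"
seq2, ss2 = "MDAASTILCGS", "HHHHCCEEECC"

# Search matrix: one exchange weight per residue pair.
search_matrix = [[exchange_weight(a, sa, b, sb) for b, sb in zip(seq2, ss2)]
                 for a, sa in zip(seq1, ss1)]
print(search_matrix[0])
```

The filled search matrix can then be handed to an ordinary dynamic programming routine in place of a single-matrix score.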
14
Flavodoxin-cheY using predicted secondary structure
1fx1        -PK-ALIVYGSTTGNTEYTAETIARQLANAG-YEVDSRDAASVEAGGLFEGFDLVLLGCSTWGDDSI------ELQDDFIPLFDS-LEETGAQGRKVACF
            e eeee b ssshhhhhhhhhhhhhhttt eeeee stt tttttt seeee b ee sss ee ttthhhhtt ttss tt eeeee
FLAV_DESVH  MPK-ALIVYGSTTGNTEYTaETIARELADAG-YEVDSRDAASVEAGGLFEGFDLVLLgCSTWGDDSI------ELQDDFIPLFDS-LEETGAQGRKVACf
            e eeeeee hhhhhhhhhhhhhhh eeeeee eeeeee hhhhhh eeeee
FLAV_DESGI  MPK-ALIVYGSTTGNTEGVaEAIAKTLNSEG-METTVVNVADVTAPGLAEGYDVVLLgCSTWGDDEI------ELQEDFVPLYED-LDRAGLKDKKVGVf
            e eeeeee hhhhhhhhhhhhhh eeeeee hhhhhh eeeeeee hhhhhh eeeeee
FLAV_DESSA  MSK-SLIVYGSTTGNTETAaEYVAEAFENKE-IDVELKNVTDVSVADLGNGYDIVLFgCSTWGEEEI------ELQDDFIPLYDS-LENADLKGKKVSVf
            eeeeee hhhhhhhhhhhhhh eeeee eeeee hhhhhhh h eeeee
FLAV_DESDE  MSK-VLIVFGSSTGNTESIaQKLEELIAAGG-HEVTLLNAADASAENLADGYDAVLFgCSAWGMEDL------EMQDDFLSLFEE-FNRFGLAGRKVAAf
            eeee hhhhhhhhhhhhhh eeeee hhhhhhhhhhheeeee hhhhhhh hh eeeee
2fcr        --K-IGIFFSTSTGNTTEVADFIGKTLGAK---ADAPIDVDDVTDPQALKDYDLLFLGAPTWNTGAD----TERSGTSWDEFLYDKLPEVDMKDLPVAIF
            eeeee ssshhhhhhhhhhhhhggg b eeggg s gggggg seeeeeee stt s s s sthhhhhhhtggg tt eeeee
FLAV_ANASP  SKK-IGLFYGTQTGKTESVaEIIRDEFGND--VVTL-HDVSQAE-VTDLNDYQYLIIgCPTWNIGEL--------QSDWEGLYSE-LDDVDFNGKLVAYf
            eeeee hhhhhhhhhhhh eee hhh hhhhhhheeeeee hhhhhhhhh eeeeee
FLAV_ECOLI  -AI-TGIFFGSDTGNTENIaKMIQKQLGKD--VADV-HDIAKSS-KEDLEAYDILLLgIPTWYYGEA--------QCDWDDFFPT-LEEIDFNGKLVALf
            eee hhhhhhhhhhhh eee hhh hhhhhhheeeee hhhhh eeeeee
FLAV_AZOVI  -AK-IGLFFGSNTGKTRKVaKSIKKRFDDET-MSDA-LNVNRVS-AEDFAQYQFLILgTPTLGEGELPGLSSDCENESWEEFLPK-IEGLDFSGKTVALf
            eee hhhhhhhhhhhhh hhh hhhhhhheeeee hhhhhhhhh eeeeee
FLAV_ENTAG  MAT-IGIFFGSDTGQTRKVaKLIHQKLDG---IADAPLDVRRAT-REQFLSYPVLLLgTPTLGDGELPGVEAGSQYDSWQEFTNT-LSEADLTGKTVALf
            eeee hhhhhhhhhhhh hhh hhhhhhheeeee hhhhh eeeee
4fxn        ----MKIVYWSGTGNTEKMAELIAKGIIESG-KDVNTINVSDVNIDELLNE-DILILGCSAMGDEVL------E-ESEFEPFIEE-IST-KISGKKVALF
            eeeee ssshhhhhhhhhhhhhhhtt eeeettt sttttt seeeeee btttb ttthhhhhhh hst t tt eeeee
FLAV_MEGEL  M---VEIVYWSGTGNTEAMaNEIEAAVKAAG-ADVESVRFEDTNVDDVASK-DVILLgCPAMGSEEL------E-DSVVEPFFTD-LAP-KLKGKKVGLf
            hhhhhhhhhhhhhh eeeee hhhhhhhh eeeee eeeee
FLAV_CLOAB  M-K-ISILYSSKTGKTERVaKLIEEGVKRSGNIEVKTMNL-DAVDKKFLQESEGIIFgTPTY-YANI--------SWEMKKWIDE-SSEFNLEGKLGAAf
            eee hhhhhhhhhhhhhh eeeeee hhhhhhhhhh eeee hhhhhhhhh eeeee
3chy        ADKELKFLVVDDFSTMRRIVRNLLKELGFNN-VEEAEDGV-DALNKLQAGGYGFVISD---WNMPNM----------DGLELLKTIRADGAMSALPVLMV
            tt eeee s hhhhhhhhhhhhhht eeeesshh hhhhhhhh eeeee s sss hhhhhhhhhh ttttt eeee

1fx1        GCGDS-SY-EYFCGAVDAIEEKLKNLGAEIVQD---------------------GLRIDGD--PRAARDDIVGWAHDVRGAI--------
            eee s ss sstthhhhhhhhhhhttt ee s eeees gggghhhhhhhhhhhhhh
FLAV_DESVH  GCGDS-SY-EYFCGAVDAIEEKLKNLgAEIVQD---------------------GLRIDGD--PRAARDDIVGwAHDVRGAI--------
            eee hhhhhhhhhhhh eeeee eeeee hhhhhhhhhhhhhh
FLAV_DESGI  GCGDS-SY-TYFCGAVDVIEKKAEELgATLVAS---------------------SLKIDGE--P--DSAEVLDwAREVLARV--------
            eee hhhhhhhhhhhh eeeee hhhhhhhhhhh
FLAV_DESSA  GCGDS-DY-TYFCGAVDAIEEKLEKMgAVVIGD---------------------SLKIDGD--P--ERDEIVSwGSGIADKI--------
            hhhhhhhhhhhh eeeee e eee
FLAV_DESDE  ASGDQ-EY-EHFCGAVPAIEERAKELgATIIAE---------------------GLKMEGD--ASNDPEAVASfAEDVLKQL--------
            e hhhhhhhhhhhhhh eeeee ee hhhhhhhhhhh
2fcr        GLGDAEGYPDNFCDAIEEIHDCFAKQGAKPVGFSNPDDYDYEESKSVRD-GKFLGLPLDMVNDQIPMEKRVAGWVEAVVSETGV------
            eee ttt ttsttthhhhhhhhhhhtt eee b gggs s tteet teesseeeettt ss hhhhhhhhhhhhhhhht
FLAV_ANASP  GTGDQIGYADNFQDAIGILEEKISQRgGKTVGYWSTDGYDFNDSKALR-NGKFVGLALDEDNQSDLTDDRIKSwVAQLKSEFGL------
            hhhhhhhhhhhhhh eeee hhhhhhhhhhhhhhhh
FLAV_ECOLI  GCGDQEDYAEYFCDALGTIRDIIEPRgATIVGHWPTAGYHFEASKGLADDDHFVGLAIDEDRQPELTAERVEKwVKQISEELHLDEILNA
            hhhhhhhhhhhhhh eeee hhhhhhhhhhhhhhhhhh
FLAV_AZOVI  GLGDQVGYPENYLDALGELYSFFKDRgAKIVGSWSTDGYEFESSEAVVD-GKFVGLALDLDNQSGKTDERVAAwLAQIAPEFGLS--L--
            e hhhhhhhhhhhhhh eeeee hhhhhhhhhhh
FLAV_ENTAG  GLGDQLNYSKNFVSAMRILYDLVIARgACVVGNWPREGYKFSFSAALLENNEFVGLPLDQENQYDLTEERIDSwLEKLKPAV-L------
            hhhhhhhhhhhhhhh eeee hhhhhhh hhhhhhhhhhhh
4fxn        G-----SYGWGDGKWMRDFEERMNGYGCVVVET---------------------PLIVQNE--PDEAEQDCIEFGKKIANI---------
            e eesss shhhhhhhhhhhhtt ee s eeees ggghhhhhhhhhhhht
FLAV_MEGEL  G-----SYGWGSGEWMDAWKQRTEDTgATVIGT----------------------AIVNEM--PDNAPE-CKElGEAAAKA---------
            hhhhhhhhhhh eeeee eeee h hhhhhhhh
FLAV_CLOAB  STANSIA-GGSDIALLTILNHLMVK-gMLVYSG----GVAFGKPKTHLG-----YVHINEI--QENEDENARIfGERiANkV--KQIF--
            hhhhhhhhhhhhhh eeeee hhhh hhh hhhhhhhhhhhh h
3chy        -----------TAEAKKENIIAAAQAGASGY-------------------------VVK----P-FTAATLEEKLNKIFEKLGM------
            ess hhhhhhhhhtt see ees s hhhhhhhhhhhhhhht
15
Iteration
Convergence
Limit cycle
Divergence
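The three outcomes listed above can be told apart by remembering every state the iteration has visited. The sketch below is a generic illustration, not the procedure of any alignment program: `step` is a hypothetical function that maps one state (for example an alignment stored as a tuple of strings) to the next, and "divergence" here simply means that no state repeated within the iteration budget.

```python
def classify_iteration(step, start, max_iter=50):
    """Run step() repeatedly and report how the iteration behaves.
    States must be hashable, e.g. an alignment as a tuple of strings."""
    seen = {start: 0}          # state -> iteration at which it first appeared
    state = start
    for i in range(1, max_iter + 1):
        state = step(state)
        if state in seen:
            period = i - seen[state]
            kind = "convergence" if period == 1 else "limit cycle"
            return kind, period
        seen[state] = i
    # No state repeated within the budget (treated as divergence here).
    return "divergence", None

# Toy update rules standing in for a realignment step.
converging = classify_iteration(lambda x: x // 2, 8)
cycling = classify_iteration(lambda x: (x + 1) % 3, 0)
diverging = classify_iteration(lambda x: x + 1, 0, max_iter=10)
print(converging, cycling, diverging)
```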
16
Flavodoxin-cheY multiple alignment / secondary
structure iteration: cheY SSEs
3chy-AA SEQUENCE  AA   ADKELKFLVVDDFSTMRRIVRNLLKELGFNNVEEAEDGVDALNKLQAGGYGFVISDWNMP
3chy-ITERATION-0  PHD  EEEEEEE HHHHHHHHHHHHHHHHH E HHHHHHHHHH HHHEEE
3chy-ITERATION-1  PHD  EEEEEEEE HHHHHHHHHHHHHHH HHHHHHHH EEEEEE
3chy-ITERATION-2  PHD  EEEEEEEE HHHHHHHHHHHHHH HHHHHHHHH EEEEEE
3chy-ITERATION-3  PHD  EEEEEEEE HHHHHHHHHHHHHH EEE HHHHHH EEEEE
3chy-ITERATION-4  PHD  EEEEEEEE HHHHHHHHHHHHHH HHHHHHH EEEEE
3chy-ITERATION-5  PHD  EEEEEEEE HHHHHHHHHHHHHH EEE HHHHHH EEEEE
3chy-ITERATION-6  PHD  EEEEEEEE HHHHHHHHHHHHHH HHHHHHHH EEEEEE
3chy-ITERATION-7  PHD  EEEEEEEE HHHHHHHHHHHHHH EEE HHHHHH EEEEE
3chy-ITERATION-8  PHD  EEEEEEEE HHHHHHHHHHHHHH HHHHHHH EEEEEE
3chy-ITERATION-9  PHD  EEEEEEEE HHHHHHHHHHHHHH HHHHHHHHHH EEEEE

3chy-AA SEQUENCE  AA   NMDGLELLKTIRADGAMSALPVLMVTAEAKKENIIAAAQAGASGYVVKPFTAATLEEKLNKIFEKLGM
3chy-ITERATION-0  PHD  HHHHHHEEEEEE HHHHHHHHHHHHHHHHH HHHHHHHHHHHHHH
3chy-ITERATION-1  PHD  HHHHHHEEEEEE HHH HHHHHHHHHHHHHHHHHH EEE HHHHHHHHHHHHHH
3chy-ITERATION-2  PHD  HHHHHHEEEEEE HHHHHHHHHHHHHHHHHH EEE HHHHHHHHHHHHHH
3chy-ITERATION-3  PHD  HHHHHHHHHHHH HHHHHHHHHHHHHHHHHH EEE HHHHHHHHHHHHHH
3chy-ITERATION-4  PHD  HHHHH EEEEE HHHHHHHHHHHHHHHHH EEE HHHHHHHHHHHHHH
3chy-ITERATION-5  PHD  HHHHHHHH EEEEE HHHHHHHHHHHHHHHH EEE HHHHHHHHHHHHHH
3chy-ITERATION-6  PHD  HHHHHHHH EEEEE HHHHHHHHHHHHHHHH EEEE HHHHHHHHHHHHHH
3chy-ITERATION-7  PHD  HHHHHHHH EEEEEE HHHHHHHHHHHHHHHH EEE HHHHHHHHHHHHHH
3chy-ITERATION-8  PHD  HHHHHHHH EEEEE HHHHHHHHHHHHHHHH EEE HHHHHHHHHHHHHH
3chy-ITERATION-9  PHD  HHHHHHHH EEEEE HHHHHHHHHHHHHHH EEEE HHHHHHHHHHHHHH
26
Secondary structure prediction-based alignment
  • Evaluation using the HOMSTRAD database of
    structural alignments
  • Compared to PHD, secondary structure
    prediction/MSA iteration improves both alignment
    and secondary structure prediction by 3-4%
  • Iteration can be controlled by an MSA sum-of-pairs
    score and a secondary structure prediction
    consistency score
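A sum-of-pairs score of the kind mentioned in the last bullet can be sketched as follows: every column contributes the summed score of all residue pairs it contains. The identity scoring, the gap penalty of -1 and the rule that gap-gap pairs are ignored are assumptions made for this illustration; a real implementation would use an exchange matrix and its own gap treatment.

```python
from itertools import combinations

def sum_of_pairs(alignment, score, gap_score=-1, gap="-"):
    """Sum pairwise scores over all columns of a multiple alignment."""
    total = 0
    for column in zip(*alignment):
        for a, b in combinations(column, 2):
            if a == gap and b == gap:
                continue                      # assumed: gap-gap pairs are ignored
            total += gap_score if gap in (a, b) else score(a, b)
    return total

def identity(a, b):
    """Toy scoring: 1 for identical residues, 0 otherwise."""
    return 1 if a == b else 0

# Toy alignment (illustrative only).
sp = sum_of_pairs(["AC-", "ACG", "TCG"], identity)
print(sp)
```

Tracking this value from one iteration to the next gives a simple stopping signal: iteration can halt once the score no longer improves.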

27
Symmetry-derived secondary structure prediction
using multiple sequence alignments (SymSSP)
Victor Simossis, Jaap Heringa
Centre for Integrative Bioinformatics VU (IBIVU),
Vrije Universiteit Amsterdam, The Netherlands
28
Praline: profile pre-processing
"Once a gap, always a gap": use information from
all sequences right from the start (Heringa,
1999, 2002; Kleinjung et al., 2002)
29
Progressive multiple alignment
[Figure: pairwise alignment scores (1-2, 1-3, ..., 4-5) fill a 5×5 similarity matrix, from which a guide tree is derived; the guide tree directs the multiple alignment.]
30
Progressive multiple alignment
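[Figure: step-by-step progressive alignment along the guide tree (distance axis d), starting from sequences 1 and 3 and successively adding sequences 2, 5 and 4.]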
31
Profile pre-processing
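[Figure: pairwise alignments (scores 1-2, 1-3, ..., 4-5) are computed for all sequences. For key sequence 1, the alignments with sequences 2 to 5 are stacked into a pre-alignment, which is condensed into a pre-profile with one column of residue values (A, C, D, ..., Y) per position. Labels: Pi, Px.]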
32
Profile pre-processing
[Figure: the same pre-processing with each sequence in turn as key sequence. Pre-alignments and pre-profiles (A, C, D, ..., Y) are shown for key sequences 1, 2 and 5.]
33
Pre-profile alignment
[Figure: the pre-profiles of sequences 1 to 5 (each over A, C, D, ..., Y) are progressively aligned to give the final alignment of sequences 1-5.]
34
Pre-profile alignment
[Figure: the pre-alignments for key sequences 1 to 5, each containing all five sequences, shown alongside the final alignment of sequences 1-5.]
35
Pre-profile alignment: alignment consistency
[Figure: the pre-alignments for key sequences 1 to 5 are compared at one position. Residue labels shown: Ala131; A131, A131, L133, C126, A131.]
36
Flavodoxin-cheY consistency scores (prepro0)
1fx1          --7899999999999TEYTAETIARQL8776-6657777777777777553799VL999ST97775599989-435566677798998878AQGRKVACF
FLAV_DESVH    -46788999999999TEYTAETIAREL7777-7757777777777777553799VL999ST97775599989-435566677798998878AQGRKVACF
FLAV_DESDE    -47899999999999999999999988776695658888777777778763YDAVL999SAW9877789877753556666669777776789GRKVAAF
FLAV_DESGI    -46788999999999TEGVAEAIAKTL9997-76678888777777887539DVVL999ST987776--9889546667776697776557777888888
FLAV_DESSA    93677799999999999999999999988759765777888888888876399999999STW77765--9999536666677797998779999999999
4fxn          -878779999999999999999999776666967567788888888888777999999988777776--9889577788888897773237888888888
FLAV_MEGEL    9776779999999999999999997777766-665666677788899976799999999987777669--887362334466695555455778888888
2fcr          --87899999999999TEVADFIGK996541900300000112233355679DLLF99999855312888111224555555407777777888888888
FLAV_ANASP    -47899LFYGTQTGKTESVAEIIR9777653922356677777777897779999999999988843--9998555778777899998879999999999
FLAV_ECOLI    997789999GSDTGNTENIAKMIQ8774222922456678889999995569999999999755553----99262225555495777767778999999
FLAV_AZOVI    --79IGLFFGSNTGKTRKVAKSIK99887759657577888888999777899999999999877761112222222244555-5555555778999999
FLAV_ENTAG    94789999999999999999999998755229223234555555555555688899999998875521111111133477777-7777777999999999
FLAV_CLOAB    -86999ILYSSKTGKTERVAK9997555555057678887888887777765778899998522223--9888342234455597777777777777777
3chy          0122222223333335666665555555222922222222222221112163335555755553222888877674533344493332222222222222
Avrg Consist  8667778888888889999999998776554844455566666666665557888888888766544887666334445566586666556778888888
Conservation  0125538675848969746963946463343045244355446543473516658868567554455000000314365446505575435547747759

1fx1          G888799955555559888888888899777----7777797787787978---555555566776555677777778888799------
FLAV_DESVH    G888799955555559888888888899777----7777797787787978---555555566776555677777778888799------
FLAV_DESDE    A88878685555555999988888889998879--8777788-98777777--8555555554433245667777777777599------
FLAV_DESGI    87775977755555677777777777777778---88888887667778777775555555555542424667888887777--------
FLAV_DESSA    977768777555556777777777777777767887777777778888-978985555555556536556888888888877--------
4fxn          867777555555552666666666555555577887767999877777977777665555555555444466666666555798------
FLAV_MEGEL    857777566666652555677777888888868997788898877655867788554433322222221223223355557--------
2fcr          877773573333333777766667777765533333333333333322833333333332244444567777777888777633------
FLAV_ANASP    977773775333344777888888777777733334444444444433833333344444444444455577777788777734------
FLAV_ECOLI    977743786444444777788888888888833334444444444444244444555554555775667788888888877734110000
FLAV_AZOVI    977763553333334666666677777777733334444444444444823333555555555555455588888888777772311----
FLAV_ENTAG    977773886555555866666666677666633333333333333322123333344444444455555665566666555582------
FLAV_CLOAB    766627222222212444444444455555587882222222222222111111122222222222344443333333233399------
3chy          222227222222224111355431113324578-87778997666556877776322222222222322222323344444422------
Avrg Consist  866656564444444666666666666666656665555565555555655565444443444443344455666666666666889999
Conservation  73663057433334163464534444746710000011010011000000010434744645443225474454448434301000000
Iteration 0: SP 135136.00, AvSP 10.473, SId 3838, AvSId 0.297
37
Flavodoxin-cheY consistency scores (prepro1500)
1fx1          -42444IVYGSTTGNTEYTAETIARQL886666666577777775667888DLVLLGCSTW77766----995476666769-77888788AQGRKVACF
FLAV_DESVH    -34444IVYGSTTGNTEYTAETIAREL776666666577777775667888DLVLLGCSTW77766----995476666769-77888788AQGRKVACF
FLAV_DESSA    -33444IVYGSTTGNTET99999888777655777668888899666686YDIVLFGCSTW77777----996466666779-88SL98ADLKGKKVSVF
FLAV_DESGI    -34444IVYGSTTGNTEGVA9999999999765555677777886666678DVVLLGCSTW77777----995466666779-88887688888KKVGVF
FLAV_DESDE    -44777IVFGSSTGNTE988777666655566777778899999777777YDAVLFGCSAW88877----997587777779-8887766777GRKVAAF
4fxn          -32222IVYWSGTGNTE8888888876666778888888888NI8888586DILILGCSA888888------8-8888886--66665378ISGKKVALF
FLAV_MEGEL    -12222IVYWSGTGNTEAMA8888888888888888555555555555485DVILLGCPAMGSE77------572222288--8888755588GKKVGLF
2fcr          -41456IFFSTSTGNTTEVA999998865432222765554443244779YDLLFLGAPT944411999-111112454441-8DKLPEVDMKDLPVAIF
FLAV_ANASP    -00456LFYGTQTGKTESVAEII987755323322427776666623589YQYLIIGCPTW55532--999843678W988899998888888GKLVAYF
FLAV_AZOVI    -42445LFFGSNTGKTRKVAKSIK87777434333536666665467777YQFLILGTPTLGEG862222222222355558-45666666888KTVALF
FLAV_ENTAG    -266IGIFFGSDTGQTRKVAKLIHQKL6664664424DVRRATR88888SYPVLLLGTPT88888644444444446WQEF8-8NTLSEADLTGKTVALF
FLAV_ECOLI    -51114IFFGSDTGNTENIAKMI987743311111555555588355599YDILLLGIPT954431----88355225544--44666666779KLVALF
FLAV_CLOAB    -63666ILYSSKTGKTERVAKLIE63333333333333333333366LQESEGIIFGTPTY63--6--------66SWE33333333333333GKLGAAF
3chy          ADKELKFLVVDDFSTMRRIVRNLLKELGFNNVEEAEDGVDALNKLQ-AGGYGFVI---SDWNMPNM----------DGLEL--LKTIRADGAMSALPVLM
Avrg Consist  9334459999999999999999988776655555555666667756667889999999999767658888775555566668967777677889999999
Conservation  0236428675848969746963946463344354312564565414344366588685675544550000003144654460055575345547747759

1fx1          G98879-89-999877977--7788899999999955--88888-9988887798999777778766553344588776666222266899899
FLAV_DESVH    G98879-89-999877977--7788899999999955--88888-9988887798999777778766553344588776666222266899899
FLAV_DESSA    G98878-688688888-88--88999999999999979988888887788889-89-9787777666756645577776666654466899899
FLAV_DESGI    G98879-898688888987--788888999GATLV7698899-9998789888-8899787878776663122477788888333276899899
FLAV_DESDE    AS8888-68-888888899--9999999999988888-99988888988778897888776668854222212255555555333277999999
4fxn          GS2228-228222222222--2388888888888888888888888888888888888887778866765535577555533221288888888
FLAV_MEGEL    G4888--28-8888882MD--AWKQRTEDTGATVI77---------------------77222--224444222222244222112--------
2fcr          GLGDA5-8Y5DNFC88-88--8877777777777765444555555555544385555777774465333357799999987555333899899
FLAV_ANASP    GTGDQ5-GY5899999-99--99EEKISQRGG99975555544444444433284444466665555555556666676666433333899899
FLAV_AZOVI    GLGDQ5-885777555-55--55555788888888555555555555555554855555555555666555555888855555544442--288
FLAV_ENTAG    GLGDQL-NYSKNFVSA-MR--ILYDLVIARGACVVG8888EGYKFSFSAA6664NEFVGLPLDQEN88888EERIDSWLE88842242688688
FLAV_ECOLI    GC99549784688888987997777777778888855444444444444444114444777774455775567788888887433322100100
FLAV_CLOAB    STANS6366663333333333336666666666666666663333363366336663333336EDENARIFGERIANKVKQI333333666666
3chy          VTAEA---KKENIIAA-----------AQAGAS-------------------------GYVVK-----PFTAATLEEKLNKIFEKLGM------
Avrg Consist  9988779787777777777997788888888888866777777777767766677777676667766655455577776666433355788788
Conservation  746640037154545706300354534444745753000001010010000000010683760144442335574454448434301000000
Iteration 0   SP 136702.00   AvSP 10.654   SId 3955   AvSId 0.308
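The SP value reported with each alignment above is presumably a sum-of-pairs score. The following is a minimal sketch of that idea only: the function names, the toy match/mismatch scoring, and the flat gap penalty are all assumptions for illustration, not the scoring scheme behind the figures on the slide.

```python
from itertools import combinations


def toy_score(a, b):
    # Hypothetical scoring: 1 for identical residues, 0 otherwise.
    return 1.0 if a == b else 0.0


def sum_of_pairs(alignment, score, gap_penalty=-1.0):
    """Sum the pairwise scores of every column over all sequence pairs."""
    n_cols = len(alignment[0])
    if any(len(row) != n_cols for row in alignment):
        raise ValueError("all rows of an alignment must have equal length")
    total = 0.0
    for col in range(n_cols):
        column = [row[col] for row in alignment]
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # gap against gap carries no information
            if a == "-" or b == "-":
                total += gap_penalty  # residue against gap
            else:
                total += score(a, b)
    return total


toy_alignment = ["IVYGS", "IVY-S", "IFFGS"]
sp = sum_of_pairs(toy_alignment, toy_score)
```

Dividing such a total by the number of scored pairs would give an average in the spirit of AvSP, although the exact normalisation used on the slide is not stated.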
38
Consistency iteration
Pre-profiles
Multiple alignment positional consistency scores
39
Pre-profile update iteration
Pre-profiles
Multiple alignment
40
Strategies for multiple sequence alignment
  • Profile pre-processing
  • Secondary structure-induced alignment
  • Globalised local alignment
  • Matrix extension
  • Objective: try to avoid (early) errors

41
Globalised local alignment
1. Local (SW) alignment (substitution matrix M + gap penalties Po, Pe)
2. Global (NW) alignment (no M or Po, Pe)
Double dynamic programming
42
M = BLOSUM62, Po = 0, Pe = 0
43
M = BLOSUM62, Po = 12, Pe = 1
44
M = BLOSUM62, Po = 60, Pe = 5
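To see how the three penalty settings above change a local (SW) alignment, here is a rough sketch of Smith-Waterman scoring with affine gaps. It assumes that Po and Pe are the gap-opening and gap-extension penalties and that a gap of length k costs Po + k·Pe; the toy match/mismatch function stands in for BLOSUM62, and all names are hypothetical.

```python
NEG_INF = float("-inf")


def toy_score(x, y):
    # Hypothetical stand-in for a substitution matrix such as BLOSUM62.
    return 2.0 if x == y else -1.0


def local_score_affine(a, b, score, gap_open, gap_extend):
    """Best local alignment score; a gap of length k costs gap_open + k * gap_extend."""
    n, m = len(a), len(b)
    best_at = [[0.0] * (m + 1) for _ in range(n + 1)]      # best score ending at (i, j)
    gap_in_a = [[NEG_INF] * (m + 1) for _ in range(n + 1)]  # ends with b[j-1] against a gap
    gap_in_b = [[NEG_INF] * (m + 1) for _ in range(n + 1)]  # ends with a[i-1] against a gap
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            gap_in_a[i][j] = max(best_at[i][j - 1] - gap_open - gap_extend,
                                 gap_in_a[i][j - 1] - gap_extend)
            gap_in_b[i][j] = max(best_at[i - 1][j] - gap_open - gap_extend,
                                 gap_in_b[i - 1][j] - gap_extend)
            diagonal = best_at[i - 1][j - 1] + score(a[i - 1], b[j - 1])
            # The 0.0 floor is what makes the alignment local.
            best_at[i][j] = max(0.0, diagonal, gap_in_a[i][j], gap_in_b[i][j])
            best = max(best, best_at[i][j])
    return best


seq_a, seq_b = "ACDEFG", "ACDFG"
scores = {(po, pe): local_score_affine(seq_a, seq_b, toy_score, po, pe)
          for po, pe in [(0, 0), (12, 1), (60, 5)]}
```

With free gaps the whole of `seq_b` is matched across the missing E; once the penalties exceed what a gapped alignment can earn, only the ungapped block ACD survives, so higher penalties yield shorter, more fragmented local alignments.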
45
Strategies for multiple sequence alignment
  • Profile pre-processing
  • Secondary structure-induced alignment
  • Globalised local alignment
  • Matrix extension
  • Objective: try to avoid (early) errors

46
Integrating alignment methods and alignment
information with T-Coffee
  • Integrating different pair-wise alignment
    techniques (NW, SW, ..)
  • Combining different multiple alignment methods
    (consensus multiple alignment)
  • Combining sequence alignment methods with
    structural alignment techniques
  • Plug in user knowledge
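One way to picture how alignments from different methods can be pooled is as a library of weighted residue pairs. The sketch below is an illustration under assumptions: weighting each pairwise alignment by its percent identity and summing the weights of pairs seen more than once is modelled loosely on the published T-Coffee scheme, and every function and variable name here is hypothetical.

```python
from collections import defaultdict


def aligned_pairs(row_a, row_b):
    """Return (i, j) residue-index pairs that share a column in a gapped pairwise alignment."""
    i = j = 0
    pairs = []
    for x, y in zip(row_a, row_b):
        if x != "-" and y != "-":
            pairs.append((i, j))
        if x != "-":
            i += 1
        if y != "-":
            j += 1
    return pairs


def percent_identity(row_a, row_b):
    """Percent identical residues over the columns where both rows have a residue."""
    cols = [(x, y) for x, y in zip(row_a, row_b) if x != "-" and y != "-"]
    if not cols:
        return 0.0
    return 100.0 * sum(x == y for x, y in cols) / len(cols)


def build_library(sources):
    """sources: (name_a, name_b, row_a, row_b) tuples, sequence names in a consistent order."""
    library = defaultdict(float)
    for name_a, name_b, row_a, row_b in sources:
        weight = percent_identity(row_a, row_b)
        for i, j in aligned_pairs(row_a, row_b):
            library[((name_a, i), (name_b, j))] += weight  # duplicates reinforce each other
    return library


# A global (NW-like) and a local (SW-like) alignment of the same two toy sequences.
library = build_library([
    ("A", "B", "ACD-FG", "ACDEFG"),
    ("A", "B", "ACD", "ACD"),
])
```

Residue pairs supported by both alignments end up with a higher weight than pairs found by only one of them, which is how agreement between methods is rewarded.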

47
Matrix extension
  • T-Coffee
  • Tree-based Consistency Objective Function For
    alignmEnt Evaluation
  • Cedric Notredame
  • Des Higgins
  • Jaap Heringa
  • J. Mol. Biol., 302, 205-217 (2000)
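The matrix (library) extension step can be sketched as follows. This is a simplified illustration, assuming the triplet rule described in the T-Coffee paper: the extended weight of a residue pair is its direct weight plus, for every residue of a third sequence aligned to both, the smaller of the two connecting weights. The data layout and names are hypothetical.

```python
from collections import defaultdict


def extend_library(library):
    """library maps ((seq_x, i), (seq_y, j)) residue pairs to weights; returns extended weights."""
    neighbours = defaultdict(dict)
    for (u, v), weight in library.items():
        neighbours[u][v] = weight
        neighbours[v][u] = weight

    extended = defaultdict(float)
    for (u, v), weight in library.items():
        extended[frozenset((u, v))] += weight  # direct evidence

    # Every residue z acts as an intermediate between each pair of its partners.
    for z, partners in neighbours.items():
        items = list(partners.items())
        for a in range(len(items)):
            for b in range(a + 1, len(items)):
                (x, w_xz), (y, w_zy) = items[a], items[b]
                if x[0] == y[0]:
                    continue  # residues of the same sequence are never aligned to each other
                extended[frozenset((x, y))] += min(w_xz, w_zy)
    return dict(extended)


toy_library = {
    (("A", 0), ("B", 0)): 88.0,
    (("A", 0), ("C", 0)): 77.0,
    (("C", 0), ("B", 0)): 100.0,
    (("A", 1), ("C", 1)): 77.0,
    (("C", 1), ("B", 2)): 100.0,
}
extended = extend_library(toy_library)
```

In this toy library the pair A0-B0 rises from 88 to 165 because sequence C supports it, and the pair A1-B2 gains a weight of 77 although no pairwise alignment placed those residues together directly.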

48
Using different sources of alignment information
Sources: structure alignments, Clustal, Dialign, Lalign, manual alignments
Combined by: T-Coffee