Title: SuperPose: A Web Server for Automated Protein Structure Superposition
1SuperPose: A Web Server for Automated Protein
Structure Superposition
- Gary Van Domselaar
- gvd_at_redpoll.pharmacy.ualberta.ca
- October 08, 2004
2Introduction
- Who Cares?
- Review of Superposition
- Identifying Corresponding Points Between Structures
- Multiple Structure Superposition
- RMSD Calculation
- The SuperPose Web Site
6Principles of Superposition
- How do we superimpose these two cubes?
1MYK
7Principles of Superposition
- Identify corresponding points.
1MYK
8Principles of Superposition
- Identify the common center and the principal axes
for each structure.
1MYK
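The step above can be sketched in a few lines. This is a minimal illustration, not the SuperPose implementation: it assumes equally weighted points, uses NumPy, and takes the eigenvectors of the coordinate covariance matrix as the principal axes. The function name `center_and_principal_axes` and the box-shaped test object are hypothetical.

```python
import numpy as np

def center_and_principal_axes(coords):
    """Return the centroid, the variance along each principal axis,
    and the principal axes (one per row), longest axis first."""
    coords = np.asarray(coords, dtype=float)
    center = coords.mean(axis=0)
    centered = coords - center
    # 3x3 covariance matrix of the centered coordinates
    cov = centered.T @ centered / len(coords)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending
    variances, vectors = np.linalg.eigh(cov)
    order = np.argsort(variances)[::-1]
    return center, variances[order], vectors[:, order].T

# Hypothetical test object: corners of a 4 x 2 x 1 box centered at (10, 0, 0)
box = np.array([[10 + sx * 2.0, sy * 1.0, sz * 0.5]
                for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
center, variances, axes = center_and_principal_axes(box)
print("center:", center)
print("longest axis:", axes[0])
```

For a perfect cube the three variances are equal, so the axes are not unique; corresponding points are then needed to fix the orientation.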
10Principles of Superposition
- Rotate the two structures so the average distance
between corresponding points is minimized, and
their principal axes overlap.
12Principles of Superposition
- A faster way is to use quaternion-based
superposition to both rotate and minimize the sum
of residuals
- S.K. Kearsley, On the orthogonal transformation
used for structural comparisons, Acta Cryst. A45,
208 (1989)
- http://www-structure.llnl.gov/xray/comp/suptext.htm
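A quaternion-based fit can be sketched as follows. This is an illustration under stated assumptions, not the code behind SuperPose: it uses the Horn-style 4x4 matrix, whose largest-eigenvalue eigenvector is the optimal unit quaternion, rather than the exact matrix written out in Kearsley's paper. The two points sets are assumed to be already matched point for point, and the function name `quaternion_superpose` and the random test coordinates are hypothetical.

```python
import numpy as np

def quaternion_superpose(moving, reference):
    """Fit `moving` onto `reference` (both N x 3, same point order).
    Returns the fitted coordinates, the rotation matrix and the RMSD."""
    moving = np.asarray(moving, dtype=float)
    reference = np.asarray(reference, dtype=float)
    ref_center = reference.mean(axis=0)
    P = moving - moving.mean(axis=0)      # centered moving points
    Q = reference - ref_center            # centered reference points
    S = P.T @ Q                           # 3x3 correlation matrix
    Sxx, Sxy, Sxz = S[0]
    Syx, Syy, Syz = S[1]
    Szx, Szy, Szz = S[2]
    N = np.array([
        [Sxx + Syy + Szz, Syz - Szy,        Szx - Sxz,        Sxy - Syx],
        [Syz - Szy,       Sxx - Syy - Szz,  Sxy + Syx,        Szx + Sxz],
        [Szx - Sxz,       Sxy + Syx,       -Sxx + Syy - Szz,  Syz + Szy],
        [Sxy - Syx,       Szx + Sxz,        Syz + Szy,       -Sxx - Syy + Szz],
    ])
    # eigh returns ascending eigenvalues; the last column is the best quaternion
    _, vectors = np.linalg.eigh(N)
    q0, q1, q2, q3 = vectors[:, -1]
    R = np.array([
        [q0*q0 + q1*q1 - q2*q2 - q3*q3, 2*(q1*q2 - q0*q3),             2*(q1*q3 + q0*q2)],
        [2*(q1*q2 + q0*q3),             q0*q0 - q1*q1 + q2*q2 - q3*q3, 2*(q2*q3 - q0*q1)],
        [2*(q1*q3 - q0*q2),             2*(q2*q3 + q0*q1),             q0*q0 - q1*q1 - q2*q2 + q3*q3],
    ])
    fitted = P @ R.T + ref_center
    rmsd = np.sqrt(((fitted - reference) ** 2).sum(axis=1).mean())
    return fitted, R, rmsd

# Hypothetical check: rotate and shift a random point set, then fit it back
rng = np.random.default_rng(0)
reference = rng.normal(size=(10, 3))
theta = 0.7
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
moving = reference @ rot_z.T + np.array([3.0, -2.0, 5.0])
fitted, rotation, rmsd = quaternion_superpose(moving, reference)
print("RMSD after fit:", rmsd)
```

Because the rotation comes from a unit quaternion, it is always a proper rotation; no separate check for reflections is needed.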
13Identifying Corresponding Points Between Protein
Structures
PDB_Entry_A   1 SDKIIHLTDDSFDTDVLKA--DGAILVDFWAEWCGPCKMIAPILDEIADE  48
PDB_Entry_B   1   MVKQIESKTAFQEALDAAGDKLVVVDFSATWCGPCKMIKPFFHSLSEK  48

PDB_Entry_A  49 YQGKLTVAKLNIDQNPGTAPKYGIRGIPTLLLFKNGEVAATKVGALSKGQ  98
PDB_Entry_B  49 YSNVIFL-EVDVDDCQDVASECEVKCTPTFQFFKKGQ----KVGEFS-GA  92

PDB_Entry_A  99 LKEFLDANLA  108
PDB_Entry_B  93 NKEKLEATINELV  105
2TRXA - 3TRXA
14Identifying Corresponding Points Between Protein
Structures
3TRX - 3GRX1
Length: 163   Identity: 11/163 (6.7%)   Similarity: 14/163 (8.6%)   Gaps: 139/163 (85.3%)   Score: 16.0

3TRX_model_de   1 MVKQIESK  8
3GRX_model_1_   1 ANVEIYTKETCPYSHRAKALLSSKGVSFQELPIDGNAAKREEMIKRSGRT  50

3TRX_model_de   9 TAFQ--------------EALDAAG--DKLVVVDFSATWCGPCKMIKPFF  42
3GRX_model_1_  51 TVPQIFIDAQHIGGYDDLYALDARGGLDPLLK  82

3TRX_model_de  43 HSLSEKYSNVIFLEVDVDDCQDVASECEVKCTPTFQFFKKGQKVGEFSGA  92
3GRX_model_1_  83  82

3TRX_model_de  93 NKEKLEATINELV  105
3GRX_model_1_  83  82
15Identifying Corresponding Points Between Protein
Structures
- Solution: Secondary Structural Alignment
3TRX - 3GRX1
Sequence1: 3TRX_model_default_chain_default
Sequence2: 3GRX_model_1_chain_default
Score....: 600   Test Stat: 5.31   Matches..: 64

Sequence1 CEEEECCHHHHHHHHHHHCCEEEEEEEEECCCHHHHHCCCCCCHHHHHCC
Sequence2 CEEEEEEECCCHHHHHHHH HHHHHCC
Structure CBBBBBBBCCCHHHHHHHH HHHHHCC

Sequence1 CEEEEEEEECCCHHHHHHHCCCCEEEEEEEECCCCCEEECCCCHHHHHHH
Sequence2 CEEEEEECCCCHHHHHHHHHCCCCCCEEEEECCCCC CHHHHHHHH
Structure CBBBBBBCCCCHHHHHHHHHCCCCCCBBBBBCCCCC CHHHHHHHH

Sequence1 HHHCC
Sequence2 HHHCCCCCCCC
Structure HHHCCCCCCCC
16Identifying Corresponding Points Between Protein
Structures
- Problem: Multiple Structural Forms
Length: 145   Identity: 143/145 (98.6%)   Similarity: 143/145 (98.6%)   Gaps: 2/145 (1.4%)   Score: 730.0

1A29_model_de    1 QLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQDMI   50
1CLL_model_de    1  LTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQDMI   49

1A29_model_de   51 NEVDADGNGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISA  100
1CLL_model_de   50 NEVDADGNGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISA   99

1A29_model_de  101 AELRHVMTNLGEKLTDEEVDEMIREADIDGDGQVNYEEFVQMMT   144
1CLL_model_de  100 AELRHVMTNLGEKLTDEEVDEMIREADIDGDGQVNYEEFVQMMTA  144
1A29 - 1CLL
17Identifying Corresponding Points Between Protein
Structures
- Solution: Subdomain Alignment
1A29 - 1CLL
18Identifying Corresponding Points Between Protein
Structures
- The Difference Distance Matrix
- Make a Distance Matrix for each structure
First structure:
      1     2     3     4
1     0     0.9   2.0   1.2
2     0.9   0
3     2.0         0
4     1.2               0

Second structure:
      1     2     3     4
1     0     0.9   2.0   2.3
2     0.9   0
3     2.0         0
4     2.3               0
19Identifying Corresponding Points Between Protein
Structures
- The Difference Distance Matrix
- Subtract the two distance matrices to make a difference distance (DD) matrix
- Plot the magnitude of each distance difference as a color
shade
      1     2     3     4
1     0     0     0     1.1
2     0     0
3     0           0
4     1.1               0
20Identifying Corresponding Points Between Protein
Structures
- Analyze the difference distance matrix for
similar subdomains.
- The DD Matrix will have regions that are similar,
and regions that are different.
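The difference distance matrix described on the last three slides can be sketched with the standard library alone. The four-point coordinates below are hypothetical and are not the ones behind the slide's numbers; the function names are also assumptions. The second point set is the first one translated as a rigid body, with only point 4 displaced, to show that the DD matrix ignores the overall placement of each structure.

```python
import math

def distance_matrix(coords):
    """All-against-all distances within one structure."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def difference_distance_matrix(coords_a, coords_b):
    """Absolute difference between the two intra-structure distance matrices."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of points")
    dm_a = distance_matrix(coords_a)
    dm_b = distance_matrix(coords_b)
    n = len(dm_a)
    return [[abs(dm_a[i][j] - dm_b[i][j]) for j in range(n)] for i in range(n)]

# Hypothetical four-point structures
structure_a = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
# Same shape shifted by (5, 5, 5), except that point 4 has moved
structure_b = [(5, 5, 5), (6, 5, 5), (6, 6, 5), (5, 7, 5)]

dd = difference_distance_matrix(structure_a, structure_b)
for row in dd:
    print("  ".join(f"{value:4.2f}" for value in row))
```

Entries among points 1 to 3 come out as zero (a conserved region), while the row and column for point 4 are nonzero (the region that differs). No superposition is needed to build the matrix, which is why it is convenient for picking the residues to superpose.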
21Identifying Corresponding Points Between Protein
Structures
Superposition restricted to residues 5-74
22Identifying Corresponding Points Between Protein
Structures
Superposition restricted to residues 5-74
23Identifying Corresponding Points Between Protein
Structures
Length: 145   Identity: 143/145 (98.6%)   Similarity: 143/145 (98.6%)   Gaps: 2/145 (1.4%)   Score: 730.0

1A29_model_de    1 QLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQDMI   50
1CLL_model_de    1  LTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQDMI   49

1A29_model_de   51 NEVDADGNGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISA  100
1CLL_model_de   50 NEVDADGNGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISA   99

1A29_model_de  101 AELRHVMTNLGEKLTDEEVDEMIREADIDGDGQVNYEEFVQMMT   144
1CLL_model_de  100 AELRHVMTNLGEKLTDEEVDEMIREADIDGDGQVNYEEFVQMMTA  144
24Multiple Structure Superposition
- How do you optimally superimpose more than 2
structures?
25Multiple Structure Superposition
- Superimpose to an average structure
[Figure panels: Initial 2-Structure Superposition; Average Structure; Structure 3; 3-Structure Superposition]
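Computing the average structure itself is simple once the structures are superposed. The sketch below assumes the structures are already in a common frame and list the same atoms in the same order; the function name `average_structure` and the two-atom example are hypothetical. How the average is then used (for instance, fitting the next structure onto it and recomputing the average) is an assumption about the procedure, which the slide shows only as a figure.

```python
def average_structure(structures):
    """Average the coordinates of superposed structures, atom by atom.
    `structures` is a list of coordinate lists with matching atom order."""
    n_atoms = len(structures[0])
    if any(len(s) != n_atoms for s in structures):
        raise ValueError("all structures must contain the same number of atoms")
    count = len(structures)
    return [
        tuple(sum(s[i][k] for s in structures) / count for k in range(3))
        for i in range(n_atoms)
    ]

# Hypothetical example: two superposed two-atom structures
structure_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_2 = [(0.0, 0.2, 0.0), (1.0, -0.2, 0.0)]
average = average_structure([structure_1, structure_2])
print(average)
```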
26Multiple Structure Superposition
- Superposition ordering is important
- Structures should be superposed in order of their
pairwise structural similarity.
- An 'all-against-all' DD Matrix analysis can be
used to quickly determine overall relative
similarity between every pair of structures
Avg RMSD for 3TRX chains A, B: 1.5 Å
Avg RMSD for 3TRX chains A, C: 1.75 Å
27Multiple Structure Superposition
- A structure 'pileup' is created from the DD
Matrix analysis to determine the superposition
order.
3TRX_A,D  0.5 Å
3TRX_A,B  0.6 Å
3TRX_B,D  0.7 Å
3TRX_B,C  0.8 Å
3TRX_A,C  0.9 Å
3TRX_C,D  1.0 Å

3TRX_A,D  0.5 Å
3TRX_A,B  0.6 Å
3TRX_B,C  0.8 Å
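One rule that reproduces the three-entry pileup from the six pairwise values on this slide is a greedy buildup: start with the most similar pair, then repeatedly take the most similar pair that links a new structure to one already placed. The slide does not state the selection rule, so this rule is an inference, and the function name `superposition_order` is hypothetical.

```python
def superposition_order(pair_rmsd):
    """Greedy pileup: `pair_rmsd` maps (name_1, name_2) to a pairwise RMSD.
    Returns the chosen pairs, in the order they would be superposed."""
    remaining = dict(pair_rmsd)
    first = min(remaining, key=remaining.get)
    order = [(first, remaining.pop(first))]
    placed = set(first)
    while True:
        # keep only pairs that connect exactly one new structure to the pile
        candidates = [p for p in remaining if (p[0] in placed) != (p[1] in placed)]
        if not candidates:
            break
        best = min(candidates, key=remaining.get)
        order.append((best, remaining.pop(best)))
        placed.update(best)
    return order

# Pairwise values as listed on the slide (in angstroms)
pairwise = {
    ("A", "D"): 0.5, ("A", "B"): 0.6, ("B", "D"): 0.7,
    ("B", "C"): 0.8, ("A", "C"): 0.9, ("C", "D"): 1.0,
}
order = superposition_order(pairwise)
for (first, second), value in order:
    print(f"3TRX_{first},{second}  {value} A")
```

The pair B,D is skipped because both structures are already in the pile by the time it is considered, which matches the shortened list on the slide.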
28Multiple Structure Superposition
- Average structures can be sensibly generated only
from a collection of structures with identical
sequences
- How do you superimpose a collection of structures
with non-identical sequences?
- Progressive pairwise buildup using the pileup as
a guide.
3TRX_A,D 0.5 Å   3TRX_A,B 0.6 Å   3TRX_B,C 0.8 Å
Superpose Structures A and D
'Anchor' Structure A, translate/rotate B, add B to A,D
'Anchor' Structure B, translate/rotate C, add C to A,B,D
29Multiple Structure Superposition
CLUSTAL W (1.83) multiple sequence alignment

2TRX_model_default_chain_A      SDKIIHLTDDSFDTDVLKA--DGAILVDFWAEWCGPCKMIAPILDEIADE
2TRX_model_default_chain_B      SDKIIHLTDDSFDTDVLKA--DGAILVDFWAEWCGPCKMIAPILDEIADE
3TRX_model_default_chain_defau  --MVKQIESKTAFQEALDAAGDKLVVVDFSATWCGPCKMIKPFFHSLSEK

2TRX_model_default_chain_A      YQGKLTVAKLNIDQNPGTAPKYGIRGIPTLLLFKNGEVAATKVGALSKGQ
2TRX_model_default_chain_B      YQGKLTVAKLNIDQNPGTAPKYGIRGIPTLLLFKNGEVAATKVGALSKGQ
3TRX_model_default_chain_defau  YSNVIFL-EVDVDDCQDVASECEVKCTPTFQFFKKGQ----KVGEFS-GA

2TRX_model_default_chain_A      LKEFLDANLA---
2TRX_model_default_chain_B      LKEFLDANLA---
3TRX_model_default_chain_defau  NKEKLEATINELV
30RMSD Calculation
- The degree of similarity between two or more
structures is described by their average root mean
square deviation (RMSD)
[Figure: corresponding points x1 to x5 and y1 to y5 on the two structures]
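For two structures with N corresponding points x_i and y_i, the usual definition is RMSD = sqrt((1/N) * sum of |x_i - y_i|^2). A minimal sketch follows; the function name `rmsd` and the five-point example are hypothetical, and the coordinates are compared as given, so any superposition has to be done beforehand.

```python
import math

def rmsd(coords_x, coords_y):
    """Root mean square deviation between corresponding points."""
    if not coords_x or len(coords_x) != len(coords_y):
        raise ValueError("need two non-empty point lists of equal length")
    total = sum(math.dist(x, y) ** 2 for x, y in zip(coords_x, coords_y))
    return math.sqrt(total / len(coords_x))

# Hypothetical five-point example: every y_i lies 0.5 units from its x_i
points_x = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
points_y = [(x + 0.3, y + 0.4, z) for x, y, z in points_x]
value = rmsd(points_x, points_y)
print(f"RMSD = {value:.3f}")
```

For more than two structures, one option is to average the pairwise values, which appears to be what the slide calls the average RMSD.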
31SuperPose
- http://wishart.biology.ualberta.ca/SuperPose/
- Superposition for 2 chains and for multiple
chains
- Subdomain superposition
- Superposition of structures with low sequence
identity