Protein Structure Alignment - PowerPoint PPT Presentation

About This Presentation
Title:

Protein Structure Alignment

Slides: 22
Provided by: maxs9
Transcript and Presenter's Notes

1
Protein Structure Alignment
Human Hemoglobin alpha-chain (PDB 1jebA)
Human Myoglobin (PDB 2mm1)
Another example: G-proteins 1c1yA and 1kk1A (residues 6-200).
Sequence identity: 18%. Structural identity: 72%.
2
Transformations
  • Translation
  • Translation and Rotation
  • Rigid Motion (Euclidean Transformation)
  • Translation, Rotation + Scaling

3
Inexact Alignment. Simple case: two closely
related proteins with the same number of
amino acids.
Question: how do we measure the alignment error?
4
Distance Functions
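  • Two point sets: $A = \{a_i\},\ i = 1, \dots, n$
  • $B = \{b_j\},\ j = 1, \dots, m$
  • Pairwise correspondence:
  • $(a_{k_1}, b_{t_1}), (a_{k_2}, b_{t_2}), \dots, (a_{k_N}, b_{t_N})$

(1) Exact matching: $\|a_{k_i} - b_{t_i}\| = 0$
(2) Bottleneck: $\max_i \|a_{k_i} - b_{t_i}\|$
(3) RMSD (Root Mean Square Distance): $\sqrt{\sum_i \|a_{k_i} - b_{t_i}\|^2 / N}$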
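The three distance functions above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the tuple-based point representation, and the index-pair encoding of the correspondence are assumptions, not part of the slides.

```python
import math

def pair_distances(A, B, pairs):
    """Euclidean distance for every corresponding pair (k, t): A[k] versus B[t]."""
    return [math.dist(A[k], B[t]) for k, t in pairs]

def is_exact_match(A, B, pairs, tol=0.0):
    """(1) Exact matching: every paired distance is zero (or within tol)."""
    return all(d <= tol for d in pair_distances(A, B, pairs))

def bottleneck(A, B, pairs):
    """(2) Bottleneck: the largest paired distance."""
    return max(pair_distances(A, B, pairs))

def rmsd(A, B, pairs):
    """(3) RMSD: square root of the mean squared paired distance."""
    d = pair_distances(A, B, pairs)
    return math.sqrt(sum(x * x for x in d) / len(d))

# Toy data: three corresponding pairs with distances 1, 0 and 2.
A = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
B = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 2.0, 2.0), (9.0, 9.0, 9.0)]
pairs = [(0, 0), (1, 1), (2, 2)]

print(bottleneck(A, B, pairs))   # largest of the distances 1, 0, 2
print(rmsd(A, B, pairs))         # sqrt((1 + 0 + 4) / 3)
```

Note that the bottleneck measure is dominated by the single worst pair, while RMSD averages over all N pairs.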
5
Superposition - best least squares (RMSD, Root Mean Square Deviation)
Given two sets of 3-D points $P = \{p_i\}$, $Q = \{q_i\}$, $i = 1, \dots, n$:
$\mathrm{rmsd}(P, Q) = \sqrt{\sum_i \|p_i - q_i\|^2 / n}$
Find a 3-D rigid transformation $T^*$ such that
$\mathrm{rmsd}(T^*(P), Q) = \min_T \sqrt{\sum_i \|T(p_i) - q_i\|^2 / n}$
A closed-form solution exists for this task. It
can be computed in O(n) time.
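The slide does not say which closed form is meant. One standard construction, given here as an assumption, is the SVD-based (Kabsch) solution, where $R$ is a rotation matrix and $t$ a translation vector:

```latex
\begin{align*}
\bar p &= \frac{1}{n}\sum_{i=1}^{n} p_i, \qquad \bar q = \frac{1}{n}\sum_{i=1}^{n} q_i \\
H &= \sum_{i=1}^{n} (p_i - \bar p)(q_i - \bar q)^{\mathsf T} = U \Sigma V^{\mathsf T} \\
d &= \operatorname{sign}\det\!\left(V U^{\mathsf T}\right) \\
R &= V \operatorname{diag}(1, 1, d)\, U^{\mathsf T}, \qquad t = \bar q - R\,\bar p \\
T(p) &= R\,p + t
\end{align*}
```

The centroids and the $3 \times 3$ matrix $H$ each take one pass over the $n$ points, and the SVD of a $3 \times 3$ matrix costs constant time, which is consistent with the O(n) bound stated above. The factor $d$ keeps $R$ a proper rotation rather than a reflection.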
6
Correspondence is Unknown
Given two configurations of points in
three-dimensional space,
find the rotations and translations of one
of the point sets that produce large
superimpositions of corresponding 3-D
points.
7
A 3-D reference frame can be uniquely defined by
the ordered vertices of a non-degenerate triangle.
[Figure: triangle with vertices p1, p2, p3]
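One possible way to build such a frame is sketched below. The particular axis convention (origin at the first vertex, first axis along the first edge, third axis normal to the triangle) and all function names are assumptions chosen for illustration; the slides do not fix a convention.

```python
import math

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit(u):
    n = math.sqrt(dot(u, u))
    if n < 1e-12:
        raise ValueError("degenerate triangle")
    return tuple(a / n for a in u)

def frame_from_triangle(p1, p2, p3):
    """Right-handed orthonormal frame from the ordered vertices p1, p2, p3."""
    e1 = unit(sub(p2, p1))              # first axis: along the edge p1 -> p2
    e3 = unit(cross(e1, sub(p3, p1)))   # third axis: triangle normal (fails if collinear)
    e2 = cross(e3, e1)                  # second axis completes the right-handed frame
    return p1, (e1, e2, e3)

def to_local(point, frame):
    """Coordinates of a point expressed in the given frame."""
    origin, axes = frame
    d = sub(point, origin)
    return tuple(dot(d, e) for e in axes)

frame = frame_from_triangle((0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 3.0, 0.0))
print(to_local((1.0, 1.0, 1.0), frame))
```

Coordinates expressed in such a frame do not change when the whole structure is rotated and translated, which is what makes triangle-based frames useful for matching.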
8
Sequence Based Structure Alignment
  • Run pairwise sequence alignment.
  • Based on sequence correspondence compute 3D
    transformation (least square fit can be applied).
  • Iteratively improve structural superposition.

Not a good approach: the sequence alignment can be
incorrect.
9
Structure Alignment (Straightforward Algorithm)
  • For each pair of triplets, one from each molecule
    which define almost congruent triangles compute
    the rigid transformation that superimposes them.
  • Count the number of aligned point pairs and sort
    the hypotheses by this number.

10
  • For the highest ranking hypotheses improve the
    transformation by replacing it by the best RMSD
    transformation for all the matching pairs.
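  • Complexity: $O(n^3 m^3) \times O(nm)$ (number of
    triplet pairs times the cost of counting aligned
    point pairs for each).
  • Applying a 3D grid gives, in practice,
    $O(n^3 m^3) \times O(n)$.
  • If one exploits protein backbone geometry plus a
    3D grid: $O(nm) \times O(n)$.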
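The effect of the 3D grid on the counting step can be sketched as follows. This is a minimal illustration under stated assumptions: the names `build_grid` and `count_matches`, the cubic cell of side `eps`, and the choice to count every transformed point that has at least one neighbour (rather than enforcing a one-to-one matching) are not taken from the slides.

```python
import math
from collections import defaultdict
from itertools import product

def cell_of(p, size):
    """Integer grid cell containing point p."""
    return tuple(math.floor(c / size) for c in p)

def build_grid(points, size):
    """Hash point indices by grid cell."""
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        grid[cell_of(p, size)].append(idx)
    return grid

def count_matches(A_transformed, B, eps):
    """Count points of A (already transformed) with some point of B within eps.

    With cells of side eps, any point of B within eps of a lies in one of
    the 27 cells around a, so each query inspects a bounded neighbourhood.
    """
    grid = build_grid(B, eps)
    offsets = list(product((-1, 0, 1), repeat=3))
    count = 0
    for a in A_transformed:
        key = cell_of(a, eps)
        found = False
        for off in offsets:
            neighbour = tuple(k + d for k, d in zip(key, off))
            if any(math.dist(a, B[j]) <= eps for j in grid.get(neighbour, ())):
                found = True
                break
        if found:
            count += 1
    return count

B = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 5.0, 0.0)]
A_t = [(0.3, 0.0, 0.0), (5.0, 0.4, 0.0), (10.0, 10.0, 10.0)]
print(count_matches(A_t, B, 0.5))
```

Because atoms in a protein cannot be packed arbitrarily densely, each cell holds a bounded number of points, so verifying one hypothesis costs roughly O(n) instead of O(nm).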

11
Structural Alignment Approaches
Two interrelated problems: 3D transformation and
point correspondence (matching, alignment).
Some methods:
  1. Generate a set of 3D transformations.
  2. Cluster similar transformations.
  3. Compute 3D alignment for each cluster
    representative.
Other methods:
  1. Generate a set of 3D transformations.
  2. Compute 3D alignment for each transformation.

Geometric Hashing: combines transformation and
correspondence detection in one scheme.
12
Accuracy improvement during detection of the 3D
transformation.
Instead of 3 points, use more. How many?
Align every possible pair of fragments, $F_{ij}(k)$.
13
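Accept $F_{ij}(k)$ if $\mathrm{rmsd}(F_{ij}(k)) \le \varepsilon$, where
$\varepsilon$ is a chosen threshold. Complexity:
$O(n^3 \cdot n) \times O(n)$ (assume $n \approx m$; for each $F_{ij}(k)$
we need to compute its rmsd). This can be reduced to
$O(n^3) \times O(n)$.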
14
Improvement: the BLAST idea - detect short similar
fragments, then extend them as much as possible.
[Figure: fragment pair around $a_{i-1}, a_i, a_{i+1}$ and $b_{j-1}, b_j, b_{j+1}$]
Extend while $\mathrm{rmsd}(F_{ij}(k)) \le \varepsilon$.
Complexity: $O(n^2) \times O(n)$
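The seed-and-extend loop can be sketched as below. Several things here are assumptions made for illustration: the function name `extend_seed`, the convention that a fragment pair is given by start positions `i`, `j` and length `k`, and the caller-supplied `frag_rmsd(i, j, k)` callback. In a real aligner that callback would return the RMSD after optimal superposition of the two fragments; the toy callback in the usage example skips superposition entirely and compares 1-D values directly.

```python
import math

def extend_seed(frag_rmsd, i, j, k, n, m, eps):
    """Grow the fragment pair (i, j, k) in both directions while its RMSD stays <= eps.

    frag_rmsd(i, j, k) is a hypothetical callback scoring the fragments
    A[i:i+k] and B[j:j+k]; n and m are the lengths of the two structures.
    Returns the extended (i, j, k), or None if the seed itself is rejected.
    """
    if frag_rmsd(i, j, k) > eps:
        return None
    grown = True
    while grown:
        grown = False
        # Try to add one residue pair on the right.
        if i + k < n and j + k < m and frag_rmsd(i, j, k + 1) <= eps:
            k += 1
            grown = True
        # Try to add one residue pair on the left.
        if i > 0 and j > 0 and frag_rmsd(i - 1, j - 1, k + 1) <= eps:
            i, j, k = i - 1, j - 1, k + 1
            grown = True
    return i, j, k

# Toy 1-D "structures": A[0:5] matches B[1:6] exactly, the ends differ.
A = [0.0, 1.0, 2.0, 3.0, 4.0, 50.0]
B = [9.0, 0.0, 1.0, 2.0, 3.0, 4.0, 7.0]

def toy_rmsd(i, j, k):
    # Stand-in score: no superposition, just the direct coordinate difference.
    return math.sqrt(sum((A[i + t] - B[j + t]) ** 2 for t in range(k)) / k)

print(extend_seed(toy_rmsd, 2, 3, 2, len(A), len(B), 0.5))
```

Starting from the length-2 seed at positions (2, 3), the loop grows the pair until it covers the whole matching stretch and stops where the next residue on either side would push the score above the threshold.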
15
  • Sequence-order Independent Alignment

[Figure: structures P and Q]
16
4-helix bundle
2cblA
1f4nA
1rhgA
1b3q
17
Sequence Order Independent Alignment
18
Sequence Order Independent Alignment
[Figure: sequence-order-independent alignment of 2cblA, 1f4n, 1rhgA and 1b3q, with aligned residue ranges labeled on chains A and B]
19
The C2 domain calcium-binding motif
E. A. Nalefski and J. J. Falke. The C2 domain
calcium-binding motif: Structural and functional
diversity. Protein Sci. 1996; 5: 2375-2390.
20
TRAF-Immunoglobulin Ensemble
E- strand
  • Ensemble 8 proteins from 2 folds.
  • Core sandwich of 6 strands
  • Runtime 21 seconds

α-helices, β-strands
21
Some Links
  • RasMol - Molecular Visualization
  • SCOP - Structural Classification of Proteins
  • MultiProt - Protein Structural (pairwise/multiple)
    Alignment
  • MASS - Secondary Structure Based
    (pairwise/multiple) Alignment