Title: Alignment of Flexible
1Alignment of Flexible Molecular Structures
2Motivation
- Proteins are flexible. One would like to align
proteins modulo the flexibility.
- Hinge and shear protein domain motions
(Gerstein, Lesk, Chothia).
- Conformational flexibility in drugs.
4Motivation
5Flexible protein alignment without prior hinge
knowledge
- FlexProt algorithm
- automatically detects flexible regions
- exploits amino acid sequence order
6Examples
7Experimental Results
8Task: find the largest flexible alignment by decomposing
the two molecules into a minimal number of rigid
fragment pairs having similar 3-D structure.
9FlexProt Main Steps
Detection of Congruent Rigid Fragment Pairs
Joining Rigid Fragment Pairs
Rigid Structural Comparison
Clustering (removing ins/dels)
10Structural Similarity Matrix
11Detection of Congruent Rigid Fragment Pairs
[Figure: consecutive residues v_{i-1}, v_i, v_{i+1} of one molecule matched with w_{j-1}, w_j, w_{j+1} of the other]
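A minimal sketch of how congruent fragment pairs might be enumerated, assuming each molecule is given as a list of C-alpha coordinates `(x, y, z)`. It swaps the usual RMSD-under-superposition test for a simpler comparison of internal distances (which cannot tell a fragment from its mirror image). The names `max_dist_deviation` and `find_fragment_pairs` and the parameters `eps` and `min_len` are illustrative assumptions, not part of FlexProt.

```python
import math

def max_dist_deviation(a, b):
    """Largest difference between corresponding internal distances of a and b."""
    worst = 0.0
    for p in range(len(a)):
        for q in range(p + 1, len(a)):
            diff = abs(math.dist(a[p], a[q]) - math.dist(b[p], b[q]))
            worst = max(worst, diff)
    return worst

def find_fragment_pairs(v, w, eps=1.0, min_len=3):
    """Return (i, j, length) for maximal contiguous fragment pairs
    v[i:i+length], w[j:j+length] whose internal distances agree within eps."""
    pairs = []
    for i in range(len(v) - min_len + 1):
        for j in range(len(w) - min_len + 1):
            length = min_len
            if max_dist_deviation(v[i:i + length], w[j:j + length]) > eps:
                continue
            # Extend the pair to the right while it stays congruent.
            while (i + length < len(v) and j + length < len(w) and
                   max_dist_deviation(v[i:i + length + 1],
                                      w[j:j + length + 1]) <= eps):
                length += 1
            # Skip pairs that could also be extended to the left:
            # they are contained in a longer pair found earlier.
            if (i > 0 and j > 0 and
                    max_dist_deviation(v[i - 1:i + length],
                                       w[j - 1:j + length]) <= eps):
                continue
            pairs.append((i, j, length))
    return pairs
```

This brute-force version recomputes distances for every extension and is only meant to show the idea of growing matched fragments along the two chains.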
13How to Join Rigid Fragment Pairs?
14Graph Representation
Graph Node
Graph Edge
15Graph Representation
- The fragments are in ascending order.
- The gaps (ins/dels) are limited.
- Allow some overlapping.
Edge weight W (edge from node a to node b) combines:
- Size of the rigid fragment pair (node b)
- Penalties: gaps (ins/dels), overlapping
16Graph Representation
- DAG (directed acyclic graph)
17[Figure: DAG with edge weights W_i, W_k, W_m, W_n, W_t]
- Single-source shortest paths
- O(|E| + |V|)
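The shortest-path step on a DAG can be sketched as follows. This is an illustrative sketch, not FlexProt's code: it assumes the nodes (fragment pairs) are already numbered in topological order, which the ascending-order constraint on fragments suggests, and the names `dag_shortest_paths` and `path_to` are hypothetical. Each node and edge is visited once, giving O(|E| + |V|) time.

```python
import math

def dag_shortest_paths(num_nodes, edges, source):
    """Single-source shortest paths on a DAG.

    edges: dict mapping node -> list of (successor, weight).
    Nodes 0..num_nodes-1 are assumed to be in topological order,
    i.e. every edge goes from a lower to a higher index.
    """
    dist = [math.inf] * num_nodes
    pred = [None] * num_nodes
    dist[source] = 0.0
    for u in range(num_nodes):            # visit nodes in topological order
        if dist[u] == math.inf:
            continue                      # not reachable from the source
        for v, weight in edges.get(u, []):
            if dist[u] + weight < dist[v]:    # relax edge u -> v
                dist[v] = dist[u] + weight
                pred[v] = u
    return dist, pred

def path_to(pred, target):
    """Recover the node sequence ending at target from the predecessor list."""
    path = []
    while target is not None:
        path.append(target)
        target = pred[target]
    return path[::-1]
```

Because a DAG has no cycles, negative weights are safe here, so a highest-scoring chain of fragment pairs could be found by negating the scores before calling the function.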
19Clustering (removing ins/dels)
If joining two fragment pairs gives a small RMSD
(their transformations agree, T1 ≈ T2), then put them into one cluster.
21Rigid Structural Comparison
22Multiple Structural Alignment
23Multiple Structural Alignment Schemes
- Linear progressive. Starts with one object and
successively compares the other objects to the
results.
- Tree progressive. The alignment is created
according to a similarity tree. The alignment
direction is from the leaves to the tree root.
- Gerstein and Levitt 1998.
- Orengo and Taylor 1994. SSAPm method.
- Sali and Blundell 1990
- Russell and Barton 1992
- Ding et al. 1994
24Multiple Structural Alignment Schemes
- Pivot. Uses one object as the pivot and compares
it to all other objects. The results are then
analyzed to find the common similarities.
- Leibowitz, Fligelman, Nussinov, and Wolfson 1999.
Geometric Hashing technique.
- Escalier, Pothier, Soldano, Viari 1998. Exploits
all common substructures.
25Multiple Structural Alignment Schemes
- Optimization Techniques.
- Guda, Scheeff, Bourne, Shindyalov. Monte Carlo
optimization.
26Previous Work: Multiple Structural Alignment
- Disadvantages
- Most methods do not detect partial solutions.
- The methods which detect partial solutions are
not efficient for a large number of molecules.
27Partial Solutions
[Figure: molecules sharing substructures A and B]
- Detection of local similarities.
- Detection of a subset of molecules that share some
local structural pattern.
B is harder to detect than A.
28Largest Common Point Set (LCP)
Given two point sets, detect the largest common
subset under exact congruence or ε-congruence.
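One common way to formalize this, given here as an assumption because the slides do not state which variant is used: with a bottleneck-style error bound, the task for point sets A and B is to find the largest subsets A' and B' of equal size, a one-to-one matching π between them, and a rigid transformation T that brings every matched pair within ε.

```latex
\max_{A' \subseteq A,\; B' \subseteq B,\; |A'| = |B'|} |A'|
\quad \text{such that} \quad
\exists\, T \in SE(3),\ \exists\ \text{bijection } \pi : A' \to B' :\quad
\max_{a \in A'} \lVert T(a) - \pi(a) \rVert \le \varepsilon
```

Exact congruence is the special case ε = 0.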
29Solution Space
- The number of solutions that satisfy the minimal
criteria could be exponential.
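[Figure: three molecules with repeated substructures labeled a-1, a-2, a-3 (3, 2 and 3 copies); annotation garbled in extraction: "323 kM"]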
30Partial Multiple-LCP
Detect the t largest alignments between exactly k
molecules. We are interested in such solutions
for each k, 2 ≤ k ≤ m.
31MultiProt
/home/silly6/mol/demos/MultiProt/
- Non-predefined Pattern detection.
- Partial Solutions.
- Time Efficient
- 5 proteins in 14 seconds
- 20 proteins (500 a.a.) in 10 minutes
- 50 proteins (200 a.a.) in 19 minutes
- Pentium II, 500 MHz, 512 MB memory
33Algorithm Features
- Assumption: any multiple alignment of proteins
should align at least short contiguous
fragments (minimum 3 points) of input points.
- Reduction of solution space: the aligned
contiguous fragments are of maximal length.
- All (almost, because of ε-congruence) possible
solutions (transformations) are detected (optimal
solutions are hard to select).
34Multiple Alignment with Pivot
Input: pivot molecule Mp (participates in all solutions),
set of molecules S' = S \ {Mp}, error threshold ε.
- Detect all possibly aligned fragments of maximal
length between the input molecules (chance to
detect subtle similarities).
- Select solutions that give high scoring global
structural similarity.
- Iterate over all possible pivots, Mp = M1, ..., Mm.
35Bio-Core Detection
- Geometric and biological constraints
- Classification
- hydrophobic (Ala, Val, Ile, Leu, Met, Cys)
- polar/charged (Ser, Thr, Pro, Asn, Gln, Lys,
Arg, His, Asp, Glu)
- aromatic (Phe, Tyr, Trp)
- glycine (Gly)
Or any other scoring matrix!
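The classification above can be written as a lookup table. The sketch below is illustrative: it assumes an alignment is given as a list of columns, each a tuple of three-letter residue names (one per molecule), and keeps only the columns whose residues all fall in the same class. The names `RESIDUE_CLASS`, `same_class` and `bio_core` are hypothetical.

```python
# Residue classes exactly as listed on the slide.
_CLASSES = {
    "hydrophobic": "Ala Val Ile Leu Met Cys",
    "polar/charged": "Ser Thr Pro Asn Gln Lys Arg His Asp Glu",
    "aromatic": "Phe Tyr Trp",
    "glycine": "Gly",
}

# Map each upper-cased three-letter code to its class name.
RESIDUE_CLASS = {
    res.upper(): cls
    for cls, residues in _CLASSES.items()
    for res in residues.split()
}

def same_class(residues):
    """True if all residue names in the column belong to one class."""
    return len({RESIDUE_CLASS[r.upper()] for r in residues}) == 1

def bio_core(columns):
    """Keep only aligned columns whose residues share a class."""
    return [col for col in columns if same_class(col)]
```

Replacing `same_class` with a threshold on a substitution-matrix score would give the "any other scoring matrix" variant.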
36Experimental Results
37Superhelix, 5 molecules.
38Concanavalin, 6 molecules.
39Partial Solution Detection
Proteins: 1adj, 1hc7, 1qf6, 1ati
Task: detect A and B
[Figure: four proteins sharing domains A and B; other regions labeled x, y, z]
40- Domain A ranked first (142 matched atoms)
- Domain B ranked eighth (85 matched atoms)
41Four proteins aligned based on detected domain A
42Multiple Alignment of domain A
43Multiple Alignment of domain A (enlarged)
44Four proteins aligned based on domain B
45Multiple Alignment of domain B
46Multiple Alignment of domain B (enlarged)
47Application to G proteins
49Substrate-assisted catalysis: application to G
proteins
Mickey Kosloff and Zvi Selinger, "Substrate assisted catalysis: application to G
proteins", Trends in Biochemical Sciences, Vol. 26, No. 3, March 2001, p. 161.
50Aspects of Structural Comparison
- A large number of structures (hundreds):
Molecular Dynamics.
- Structural flexibility: proteins are not rigid
structures.
- Structure representation
- C-alpha atoms are suitable for comparisons of
folds.
- Detection of similar function requires a different
representation. This brings another problem:
side chain flexibility.
- Sequence order in structural alignment.
- Detection of active sites might require a different
approach. Proteins with different folds might
provide the same function.
- Statistical Significance
- Measure of geometrical similarity (RMSD,
bottleneck, ...), biological scoring function.
51Molecular Surface Representation
52Motivation
- Prediction of biomolecular recognition.
- Detection of drug binding cavities.
- Molecular Graphics.
53Rasmol Spacefill display
54 1. Solvent Accessible Surface (SAS)
2. Connolly Surface
55Connolly's MS algorithm
- A water probe ball (1.4-1.8 Å radius) is
rolled over the van der Waals surface.
- Smooths the surface and bridges narrow
inaccessible crevices.
56Connolly's MS algorithm - cont.
- Convex, concave and saddle patches according to
the no. of contact points between the surface
atoms and the probe ball.
- Outputs points + normals according to the
required sampling density (e.g. 10 pts/Å²).
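To make the "probe rolled over the atoms, output points + normals" idea concrete, here is a small sketch. It is not Connolly's analytic construction: it uses Shrake-Rupley-style numerical sampling of the solvent accessible surface (the surface traced by the probe center), which is simpler to write down. Atoms are assumed to be `((x, y, z), vdw_radius)` pairs, and the function names and the parameters `probe` and `n` are illustrative.

```python
import math

def sphere_points(n):
    """n roughly uniform unit vectors (golden-spiral lattice)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for k in range(n):
        z = 1.0 - 2.0 * (k + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        pts.append((r * math.cos(golden * k), r * math.sin(golden * k), z))
    return pts

def accessible_surface(atoms, probe=1.4, n=200):
    """Return (point, normal) pairs on the solvent accessible surface.

    atoms: list of ((x, y, z), vdw_radius).
    A sample on an atom's probe-expanded sphere is kept only if it is
    not inside the probe-expanded sphere of any other atom.
    """
    unit = sphere_points(n)
    surface = []
    for a, (ca, ra) in enumerate(atoms):
        big_r = ra + probe
        for u in unit:
            p = (ca[0] + big_r * u[0],
                 ca[1] + big_r * u[1],
                 ca[2] + big_r * u[2])
            buried = any(
                math.dist(p, cb) < rb + probe
                for b, (cb, rb) in enumerate(atoms) if b != a
            )
            if not buried:
                surface.append((p, u))   # u is the outward unit normal
    return surface
```

The number of samples per atom `n` plays the role of the sampling density; a real implementation would derive it from the requested points per Å² and use a neighbor grid instead of the all-pairs test.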
57Example - the surface of crambin
58Critical points based on Connolly rep. (Lin,
Wolfson, Nussinov)
- Define a single point + normal for each patch.
- Convex: caps; concave: pits; saddle: belts.
59Critical point definition
60Connolly -> Shou Lin
61Solid Angle local extrema
hole
knob
62Chymotrypsin surface colored by solid angle
(yellow-convex, blue-concave)
63Protein-protein and Protein-ligand Docking
64Shape Complementarity
65Geometric Docking Algorithms
- Based on the assumption of shape complementarity
between the participating molecules.
- Molecular surface complementarity:
protein-protein, protein-ligand (protein-drug).
- Hydrogen donor/acceptor complementarity:
protein-drug.
- Remark: usually "protein" here can be replaced
by DNA or RNA as well.
66Issues to be examined when evaluating docking
methods
- Rigid docking vs. flexible docking
- If the method allows flexibility:
- Is flexibility allowed for the ligand only, the receptor
only, or both?
- No. of flexible bonds allowed and the cost of
adding additional flexibility.
- Does the method require prior knowledge of the
active site?
- Performance in unbound docking experiments.
- Speed: ability to explore large libraries.
67General Algorithm outline
- Calculate the molecular surface of the receptor
and the ligands and their interest points (+
normals).
- Match the interest points and recover candidate
transformations.
- Check for inter-molecule and intra-molecule
penetrations and score the amount of contact.
- Rank by geom-score/energies.
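The penetration check and contact scoring in the outline above could look roughly like this. It is a sketch under simple assumptions: the receptor is a list of `((x, y, z), radius)` spheres, each candidate pose is the list of ligand surface points after its transformation has been applied, and only inter-molecule penetration is checked. The function names and the thresholds `contact_dist`, `max_penetration` and `max_clashes` are hypothetical.

```python
import math

def score_pose(receptor_atoms, ligand_points,
               contact_dist=1.0, max_penetration=0.5):
    """Count ligand points in contact with, or penetrating, the receptor."""
    contacts = penetrations = 0
    for p in ligand_points:
        # Signed gap to the nearest receptor sphere (negative = inside).
        gap = min(math.dist(p, c) - r for c, r in receptor_atoms)
        if gap < -max_penetration:
            penetrations += 1
        elif gap <= contact_dist:
            contacts += 1
    return contacts, penetrations

def rank_poses(receptor_atoms, poses, max_clashes=0):
    """Discard poses with too many penetrations; rank the rest by contacts.

    poses: dict mapping a pose name to its transformed ligand points.
    """
    scored = []
    for name, points in poses.items():
        contacts, clashes = score_pose(receptor_atoms, points)
        if clashes <= max_clashes:
            scored.append((contacts, name))
    return [name for _, name in sorted(scored, reverse=True)]
```

A purely geometric score like this would normally be only a first filter, with energies used to re-rank the surviving poses.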
68Shape feature and signature (Norel et al.)
69Unbound docking examples
70GGH based flexible docking
Applies either to flexible ligands or to flexible
receptors.
71Flexible Docking: Calmodulin with M13 ligand
72Flexible Docking: HIV Protease Inhibitor