Title: Dynamic Maintenance of Molecular Surfaces under Conformational Changes
1Dynamic Maintenance of Molecular Surfaces under
Conformational Changes
- Eran Eyal and Dan Halperin
- Tel-Aviv University
2Molecular Simulations
- Molecular simulations help to understand the structure (and function) of protein molecules
- Monte Carlo Simulation (MCS)
- Molecular Dynamics Simulation (MDS)
3Solvent Models
- Explicit Solvent Models
- using solvent molecules
- Implicit Solvent Models
- all the effects of the solvent molecules are included in an effective potential $W = W_{elec} + W_{np}$
- $W_{np} = \sum_i \gamma_i A_i(X)$
- $A_i(X)$: the area of atom $i$ accessible to solvent for a given conformation $X$
4Molecular Surfaces
- van der Waals surface
- Solvent Accessible surface
- Smooth molecular surface (solvent excluded)
Taken from http://www.netsci.org/Science/Compchem/feature14.html (Connolly)
5Related Work
- Lee and Richards, 1971: Solvent accessible surface
- Richards, 1977: Smooth molecular surface
- Connolly, 1983: First computation of smooth molecular surface
- Edelsbrunner, 1995: Computing the molecular surface using Alpha Shapes
- Sanner and Olson, 1997: Dynamic reconstruction of the molecular surface when a small number of atoms move
- Edelsbrunner et al, 2001: algorithm to maintain an approximating triangulation of a deforming 3D surface
- Bajaj et al, 2003: dynamically maintain molecular surfaces as the solvent radius changes
6Our Results
- a fast method to maintain a highly accurate surface area of a molecule dynamically during conformation changes
- robust while using floating point
- efficiently accounting for topological changes: theory and practice
7Initial Construction of the Surface
- Finding all pairs of intersecting atoms
- Construction of spherical arrangements
- Controlled Perturbation
- Combining the spherical arrangements
- Constructing the boundary and calculating its
surface area
8Finding the Intersecting Atoms
- Using a grid-based solution introduced by Halperin and Overmars
- Theorem: Given $S = \{S_1, \dots, S_n\}$ spheres with radii $r_1, \dots, r_n$ such that
- $r_{max}/r_{min} < c$ for some constant $c$
- There is a constant $\rho$ such that for each sphere $S_i$, the concentric sphere with radius $\rho r_i$ does not contain the center of any other sphere
- Then
- (1) The maximum number of spheres that intersect any given sphere in $S$ is bounded by a constant
- (2) The maximum complexity of the boundary of the union of the spheres is O(n)
9The Grid Algorithm
- Subdivide space into cubes of side length $2 r_{max}$
- For each sphere compute the cubes it intersects (up to 8 cubes)
- For each sphere check intersection with the spheres located in its cubes
- Constructed in O(n) time with O(n) space
- Finding all pairs of intersecting spheres takes O(n) time
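The grid steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `intersecting_pairs` and the `(x, y, z, r)` tuple layout are assumptions made for the example, and each sphere is registered in every cube its bounding box touches, which is a slight over-approximation of "the cubes it intersects".

```python
from collections import defaultdict
from itertools import product
from math import floor


def intersecting_pairs(spheres):
    """Return the set of index pairs (i, j), i < j, of intersecting spheres.

    spheres is a list of (x, y, z, r) tuples (a hypothetical layout).
    """
    if not spheres:
        return set()
    r_max = max(s[3] for s in spheres)
    cell = 2.0 * r_max  # cube side length, as on the slide
    grid = defaultdict(list)
    for i, (x, y, z, r) in enumerate(spheres):
        lo = [floor((c - r) / cell) for c in (x, y, z)]
        hi = [floor((c + r) / cell) for c in (x, y, z)]
        # A sphere's diameter is at most one cube side, so it spans at most
        # two cubes per axis, i.e. at most 8 cubes in total.
        for key in product(*(range(a, b + 1) for a, b in zip(lo, hi))):
            grid[key].append(i)
    pairs = set()
    for members in grid.values():
        # Indices were appended in increasing order, so i < j here.
        for a in range(len(members)):
            for b in range(a + 1, len(members)):
                i, j = members[a], members[b]
                xi, yi, zi, ri = spheres[i]
                xj, yj, zj, rj = spheres[j]
                d2 = (xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2
                if d2 < (ri + rj) ** 2:
                    pairs.add((i, j))
    return pairs
```

Two intersecting spheres share a common point, and both are registered in the cube containing that point, so no pair is missed. The linear running time relies on the theorem's conditions, which bound the number of spheres per cube by a constant.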
10Construction of Spherical Arrangements
Full trapezoidal decomposition
Spherical Arrangement
Partial trapezoidal decomposition
11Controlled Perturbation
- A method of robust computation while using floating point arithmetic
- Handles two types of degeneracies
- Type I: intrinsic degeneracies of the spherical arrangement
- Type II: degeneracies induced by the trapezoidal decomposition
12Type I Degeneracies
- We wish to ensure the following conditions
- 1. No Inner or outer tangency of two atoms
- 2. No three atoms intersecting in a single point
- 3. No four atoms intersecting in a common point
- We achieve these conditions by randomly perturbing the center of each atom that induces a degeneracy by at most δ (the perturbation parameter). δ is a function of ε (the resolution parameter), m (the maximum number of atoms that intersect any given atom) and R (the maximum atom radius)
- $\delta = 2m\,\varepsilon^{1/3}R^{2/3}$ ensures elimination of all Type I degeneracies in expected O(n) time
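A small sketch of the perturbation step, under stated assumptions: the formula for the perturbation parameter is the one given above, while the near-tangency test (condition 1 only, using the resolution parameter as tolerance) and the rejection sampling of a point in a ball are illustrative choices, not necessarily the authors'. All function names are hypothetical.

```python
import math
import random


def perturbation_radius(m, eps, R):
    """delta = 2 m eps^(1/3) R^(2/3), as stated on the slide."""
    return 2.0 * m * eps ** (1.0 / 3.0) * R ** (2.0 / 3.0)


def is_near_tangent(c1, r1, c2, r2, eps):
    """Assumed test for condition 1: inner or outer tangency within eps."""
    d = math.dist(c1, c2)
    return abs(d - (r1 + r2)) < eps or abs(d - abs(r1 - r2)) < eps


def perturb(center, delta, rng):
    """Return a point chosen uniformly in the ball of radius delta around center."""
    while True:  # rejection sampling from the enclosing cube
        v = [rng.uniform(-delta, delta) for _ in range(3)]
        if sum(x * x for x in v) <= delta * delta:
            return tuple(c + x for c, x in zip(center, v))
```

In a full scheme the perturbed atom would be re-tested against its neighbors and perturbed again if a degeneracy remains; that loop is omitted here.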
13Type II Degeneracies
- Happens when two arcs added by the trapezoidal decomposition are too close (the angle between them is less than a certain threshold ω)
- These degeneracies are prevented by randomly choosing a direction for the north pole of an atom that induces no degeneracies
- $\sin\omega < 1/(2m(m-1))$ ensures finding a good pole direction in expected O(n) time
14Combining the Spherical Arrangements
- For each atom, the arc of each intersection circle points to the same arc on the intersection circle of the second atom.
- Now we have a subset of the arrangement of the spheres (contains all features of the arrangement except the 3-dimensional cells)
15Building the Boundary of the Molecule
- Start with the lowest region (2D face) of the bottommost atom
- Traverse the outer boundary of the 3D arrangement. Whenever an arc of an intersection circle is reached, we jump to the opposite region on the other atom that shares this arc
- During the traversal, the area of each encountered region is calculated, and summed up
16Finding the voids
- Find for each atom the exposed regions (regions not covered by other atoms)
- Find the difference between the set of exposed regions on all atoms and the outer boundary
- Traverse the difference to construct the boundary of the voids
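The outer-boundary traversal and the void search can be sketched as a flood fill over exposed regions. This is a simplified illustration: regions are opaque keys, `adjacency` (which regions share a boundary arc) and `area` (the precomputed area of each exposed region) are assumed inputs, and the geometric work of computing them is not shown.

```python
from collections import deque


def component_area(start, adjacency, area, seen):
    """Sum the areas of all regions reachable from start, marking them seen."""
    total = 0.0
    queue = deque([start])
    seen.add(start)
    while queue:
        region = queue.popleft()
        total += area[region]
        for neighbor in adjacency.get(region, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return total


def boundary_areas(start, adjacency, area):
    """Return (outer boundary area, list of void areas).

    start is a region known to lie on the outer boundary, e.g. the lowest
    region of the bottommost atom.
    """
    seen = set()
    outer = component_area(start, adjacency, area, seen)
    voids = []
    for region in area:  # exposed regions not reached from the outside
        if region not in seen:
            voids.append(component_area(region, adjacency, area, seen))
    return outer, voids
```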
17Screenshot
18Dynamic Maintenance of the Surface
- We wish to maintain the boundary of the protein molecule and its area as the molecule undergoes conformational changes
- The grid algorithm requires reconstruction of the entire structure from scratch on each step, taking O(n) time where n is the number of atoms. This is asymptotically optimal in the worst case, but slow for large molecules
19The Problem
- We perform a simulation where each time several DOFs of the backbone change (φ and ψ angles)
- A simulation step is accepted when it causes no self collisions
- After a step is accepted, we wish to quickly update the boundary of the molecule and its surface area
20A Step of the Simulation
- Perform a k-DOF change
- Check if the change incurs self collisions
- If not
- Find all the pairs of intersecting atoms affected by the change
- Modify the spherical arrangements
- Modify the boundary of the molecule and its surface area, accounting for topological changes
21Attaching Frames to the Backbone
- The backbone of a protein with the reference frames of each link
- For each atom center we calculate its coordinates within its frame
22Detecting Self Collision
- We use the ChainTree introduced by Lotan et al
Courtesy of Itay Lotan
23ChainTree Performance
- Update Algorithm: modifies the ChainTree after a k-DOF change in $O(k \log(n/k))$ time
- Testing Algorithm: finds self collision in $O(n^{4/3})$ time
24Finding intersecting atom pairs
- After a DOF change is accepted, we use the ChainTree to find all the pairs of intersecting atoms affected by the change
- Deleted pairs
- Inserted pairs
- Updated pairs
25The IntersectionsTree
- A tree used for efficient retrieval of modified intersections
- Updated in a similar way to the testing algorithm of the ChainTree
- Worst case running time $O(n^{4/3})$ (in practice very efficient)
26The Modified Intersections List
- During the update of the IntersectionsTree we store in a separate list all the changes done in the IntersectionsTree
- Deleted intersecting atom pairs
- Inserted intersecting atom pairs
- Updated intersecting atom pairs
- The Modified Intersections List is used to update the spherical arrangements
27Updating the Spherical Arrangements
- For each pair of inserted intersecting atoms: add their intersection circle to the spherical arrangements of both atoms
- For each pair of updated intersecting atoms: remove their old intersection circle from the two spherical arrangements and add their new intersection circle
- For each pair of deleted intersecting atoms: remove their old intersection circle from the two spherical arrangements
- The cost: O(p), where p is the number of atoms whose spherical arrangements were modified
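The three update rules can be sketched as below. This is a simplified stand-in: each "arrangement" is reduced to a dictionary of intersection circles keyed by the partner atom, whereas the real structure also maintains the trapezoidal decomposition. The names `intersection_circle` and `apply_modified_list` and the `atoms[i] = (center, radius)` layout are assumptions for the example.

```python
import math


def intersection_circle(c1, r1, c2, r2):
    """Return (center, radius, unit normal) of the circle where two spheres
    meet, or None if they do not intersect in a circle."""
    d = math.dist(c1, c2)
    if d == 0.0 or d >= r1 + r2 or d <= abs(r1 - r2):
        return None
    # Signed distance from c1 to the plane of the circle along c1 -> c2.
    a = (d * d + r1 * r1 - r2 * r2) / (2.0 * d)
    radius = math.sqrt(r1 * r1 - a * a)
    normal = tuple((q - p) / d for p, q in zip(c1, c2))
    center = tuple(p + a * u for p, u in zip(c1, normal))
    return center, radius, normal


def apply_modified_list(circles, atoms, deleted, inserted, updated):
    """Apply the Modified Intersections List; return the set of touched atoms.

    circles[i][j] holds the circle that atom j cuts on atom i.
    """
    touched = set()
    for i, j in list(deleted) + list(updated):  # remove old circles
        circles[i].pop(j, None)
        circles[j].pop(i, None)
        touched.update((i, j))
    for i, j in list(inserted) + list(updated):  # add new circles
        (ci, ri), (cj, rj) = atoms[i], atoms[j]
        circle = intersection_circle(ci, ri, cj, rj)
        if circle is not None:
            circles[i][j] = circle
            circles[j][i] = circle
        touched.update((i, j))
    return touched
```

The size of the returned set corresponds to p in the cost bound above.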
28Example
- Backbone of 4PTI
- A single 180° change of one backbone torsion angle of the 13th amino acid
- Affected atoms: 14 out of 454 (p out of n)
- Modified intersection circles: 13
29Example - Continued
- (Hemi)spherical arrangement of one of the
affected atoms (the N atom of the 14th amino
acid) of 4PTI before (left) and after (right) the
mentioned DOF change
30Dynamic Controlled Perturbation
- Goals
- Perturb as few atoms as possible
- For efficiency
- To reduce errors
- Avoid cascading errors caused by
- Perturbing an atom several times in different simulation steps
- Changing a torsion angle several times
31Type I Degeneracies
- Extend the Modified Intersections List to include also pairs of atoms that almost intersect
- Check all atoms in the Modified Intersections List that belong to inserted and updated pairs and the atoms that belong to near-intersecting pairs
- Each of these atoms is checked against the atoms that intersect it or almost intersect it
- The center of an atom that causes a degeneracy is perturbed within a sphere of radius δ (the perturbation parameter) around the original center of the atom within its reference frame
- The spherical arrangement of a perturbed atom must be re-computed from scratch
32Avoiding Errors in the Transformations
- In each DOF, accumulate the sum of the angle changes, and calculate a single rotation matrix (instead of combining several rotations)
- Use exact arithmetic with arbitrary-precision rational numbers to compute the sines and cosines of the rotations (turned off in the current experiments because it is too slow)
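The first point can be illustrated with a short sketch. It assumes, purely for the example, that each torsion DOF rotates about the local z-axis of its link frame; the class name `TorsionDOF` is hypothetical. The idea is that the matrix is always rebuilt from the accumulated angle, so rounding errors from repeated matrix products never build up.

```python
import math


def rotation_z(theta):
    """3x3 rotation matrix about the z-axis, as nested tuples."""
    c, s = math.cos(theta), math.sin(theta)
    return ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))


class TorsionDOF:
    """Keeps the accumulated angle of one DOF (assumed rotation about local z)."""

    def __init__(self):
        self.total = 0.0

    def change(self, delta_angle):
        # Only the scalar sum is updated per simulation step.
        self.total += delta_angle

    def matrix(self):
        # One rotation built from the summed angle, instead of a product
        # of many small rotation matrices.
        return rotation_z(self.total)
```

A usage example: after `dof.change(0.3)` followed by `dof.change(-0.3)`, `dof.matrix()` is exactly the identity, which a product of two floating-point rotation matrices would not guarantee.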
33Type II Degeneracies
- The same set of atoms is tested
- For perturbed atoms we re-calculate their
spherical arrangements from scratch
34Running Time
- The expected update time of the spherical
arrangements including the perturbation time is
O(p)
35Modify the Boundary and Surface Area
- Naïve method
- The same method used for the initial construction: traverse the outer boundary, and then traverse the voids
- Some savings
- No need to recalculate the surface area of regions that weren't updated
- No need to recalculate the exposed regions of atoms that weren't updated
- The cost: O(n)
36Dynamic Graph Connectivity
- We use a Dynamic Graph Connectivity algorithm introduced by Holm, De Lichtenberg and Thorup (2001)
- We define the boundary graph
- Each exposed region of the spherical arrangements is a vertex of the graph
- Two vertices of the graph are connected by an edge if their respective regions are adjacent on the boundary of the molecule
- A connected component of the graph corresponds to a connected component of the boundary of the molecule (outer boundary or voids)
37Boundary Graph Illustration
38Updating the Boundary Graph
- After the spherical arrangements are modified (in an accepted DOF change)
- Remove all the vertices corresponding to modified or deleted regions (with their incident edges)
- Add new vertices corresponding to modified or new regions
- Add new edges connecting the new vertices to each other and to the rest of the graph
39HDT Graph Connectivity Algorithm
- A poly-logarithmic deterministic fully-dynamic algorithm for graph connectivity
- Maintains a spanning forest of a graph
- Answers connectivity queries in $O(\log n)$ time in the worst case
- Uses $O(\log^2 n)$ amortized time per insertion or deletion of an edge
- n, the number of vertices of the graph, is fixed as edges are added and removed
40The General Idea of the Algorithm
- A spanning forest F of the input graph G is maintained
- Each tree in each spanning forest is represented by a data structure called ET-tree, which allows for $O(\log n)$ time splits and merges
41ET-tree
A Spanning Tree
Euler Tour
ET-Tree
42ET-tree properties
- Merging two ET-trees or splitting an ET-tree takes $O(\log n)$ time while maintaining the balance of the trees
- Each vertex of the original tree may appear several times in the ET-tree. One occurrence is chosen arbitrarily as representative
- Each internal node of the ET-tree represents all the representative leaves in its sub-tree, and may hold data that represent these leaves
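To make the multiple occurrences concrete, here is a minimal sketch that produces the Euler tour sequence an ET-tree stores at its leaves. The adjacency-dict input format and the choice of the first occurrence as representative are assumptions for the example (the slide says the representative is arbitrary).

```python
def euler_tour(tree, root):
    """Return the Euler tour of a tree given as {vertex: [neighbors]}.

    A vertex is recorded on first entry and again each time the walk
    returns to it from a child, giving 2n - 1 occurrences for n vertices.
    """
    tour = [root]

    def visit(v, parent):
        for w in tree[v]:
            if w != parent:
                tour.append(w)
                visit(w, v)
                tour.append(v)

    visit(root, None)
    return tour


def representatives(tour):
    """Pick the first occurrence of each vertex as its representative."""
    first = {}
    for position, v in enumerate(tour):
        first.setdefault(v, position)
    return first
```

The recursion is fine for a sketch; a very deep tree would need an explicit stack.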
43Spanning Forests Hierarchy
- The edges of the graph are split into $l_{max} = \lfloor \log_2 n \rfloor$ levels
- A hierarchy $F = F_0 \supseteq F_1 \supseteq \dots \supseteq F_{l_{max}}$ of spanning forests is maintained, where $F_i$ is the sub-forest of $F$ induced by the edges of level $\ge i$
- Invariants
- If (v,w) is a non-tree edge, v and w are connected in $F_{l(v,w)}$
- The maximal number of nodes in a tree (component) of $F_i$ is $\lfloor n/2^i \rfloor$
44Updating the Graph
- Insert an edge: added to level 0. If it connects two components, it becomes a tree edge (the components are merged)
- Remove a non-tree edge: trivial
- Remove a tree edge: more difficult. We must search for an edge that replaces the removed edge on the relevant spanning tree
45Removing a Tree Edge
- The removal of a tree edge e = (v,w) splits its tree into $T_v$ and $T_w$ ($T_v$ is the smaller one)
- The replacement edge can be found only on levels $\le l(e)$
- On each level $\le l(e)$ (starting with $l(e)$)
- Promote the edges of $T_v$ to the next level
- Each non-tree edge incident to vertices of $T_v$ is tested
- If it reconnects the split component, we are done
- If not, we promote it to the next level
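The core of the replacement search can be sketched without the level hierarchy. The code below is deliberately a simplification, not the Holm, De Lichtenberg and Thorup structure: it keeps a single level, finds components by plain traversal, and never promotes edges, so it has none of the amortized guarantees. The class name `SimpleForest` is hypothetical.

```python
from collections import defaultdict, deque


class SimpleForest:
    """Spanning forest plus non-tree edges, single level, no promotion."""

    def __init__(self):
        self.tree = defaultdict(set)      # tree-edge adjacency
        self.non_tree = defaultdict(set)  # non-tree-edge adjacency

    def _component(self, v):
        seen = {v}
        queue = deque([v])
        while queue:
            u = queue.popleft()
            for w in self.tree[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        return seen

    def connected(self, u, v):
        return v in self._component(u)

    def insert(self, u, v):
        # An edge joining two components becomes a tree edge.
        target = self.non_tree if self.connected(u, v) else self.tree
        target[u].add(v)
        target[v].add(u)

    def delete(self, u, v):
        if v in self.non_tree[u]:  # removing a non-tree edge is trivial
            self.non_tree[u].discard(v)
            self.non_tree[v].discard(u)
            return
        self.tree[u].discard(v)
        self.tree[v].discard(u)
        tu, tv = self._component(u), self._component(v)
        small = tu if len(tu) <= len(tv) else tv  # search the smaller side
        for x in small:
            for y in list(self.non_tree[x]):
                if y not in small:  # this edge reconnects the two parts
                    self.non_tree[x].discard(y)
                    self.non_tree[y].discard(x)
                    self.tree[x].add(y)
                    self.tree[y].add(x)
                    return
```

This is close in spirit to the "simple search" used by the truncating-levels heuristic for small trees.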
46Amortization Argument
- The amortization argument of the algorithm is based on increasing the levels of the edges (the level of each edge can be increased at most $l_{max}$ times)
47Illustration of the Algorithm
48Our Extensions
- We allow vertices of the graph to be inserted and removed. This has no effect on the amortized running time, because throughout the simulation the number of vertices remains O(n)
- In each representative occurrence of each ET-tree we store the area of the relevant region
- Each internal node of each ET-tree holds the sum of the areas of the representative leaves in its sub-tree
- Maintaining the area information takes $O(\log n)$ time per split or merge of the ET-trees
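One way to see how area sums survive splits and merges is the sketch below. It swaps in a randomized treap over the Euler tour sequence (every node is an occurrence and stores the sum of its sub-tree) in place of the balanced leaf-based ET-tree described above, so the bounds are expected rather than worst case. Non-representative occurrences carry area 0. All names are hypothetical.

```python
import random


class Node:
    def __init__(self, vertex, area, rng):
        self.vertex, self.area = vertex, area
        self.prio = rng.random()
        self.left = self.right = None
        self.size, self.total = 1, area


def _pull(t):
    """Recompute size and area sum of t from its children."""
    t.size, t.total = 1, t.area
    for child in (t.left, t.right):
        if child is not None:
            t.size += child.size
            t.total += child.total
    return t


def merge(a, b):
    """Concatenate sequences a and b."""
    if a is None or b is None:
        return a if b is None else b
    if a.prio > b.prio:
        a.right = merge(a.right, b)
        return _pull(a)
    b.left = merge(a, b.left)
    return _pull(b)


def split(t, k):
    """Split into (first k occurrences, the rest)."""
    if t is None:
        return None, None
    left_size = t.left.size if t.left else 0
    if k <= left_size:
        a, b = split(t.left, k)
        t.left = b
        return a, _pull(t)
    a, b = split(t.right, k - left_size - 1)
    t.right = a
    return _pull(t), b


def build(tour, area, rng):
    """Build the sequence; the first occurrence of a vertex carries its area."""
    seen, root = set(), None
    for v in tour:
        root = merge(root, Node(v, 0.0 if v in seen else area[v], rng))
        seen.add(v)
    return root
```

The `total` field at the root gives the area of the whole boundary component. The `split` here cuts the sequence at a position only; a real edge removal would also re-root and splice the tour.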
49ET-tree with Areas
50The Running Time
- Maintaining the area information for the spanning forest F takes $O(\log^2 n)$ amortized time for each insertion or deletion of an edge
- Finding the connected component of a given region of the boundary takes $O(\log n)$ time
- The amortized cost of recalculating the surface area of the outer boundary and voids of the molecule is $O(p \log^2 n)$
- The cost of computing the contribution of a given atom to the boundary and all the voids is $O(\log n)$
51Implementation Details
- Order of edge deletion
- Recycling of deleted vertices
- Heuristics
52Heuristics
- Sampling: search for a replacement edge within the first s non-tree edges, without promotion
- Truncating levels: perform simple search (no promotion) for trees with fewer than b nodes
53Complexity Summary
Operation | Complexity
--- | ---
Initial construction of the arrangements and boundary (including perturbation) | O(n)
Updating the ChainTree | $O(k \log(n/k))$
Testing for self collision | $\Theta(n^{4/3})$
Updating the IntersectionsTree | $\Theta(n^{4/3})$
Updating the arrangements (including perturbation) | O(p)
Updating the boundary | O(n) or $O(p \log^2 n)$
54Breakdown of Running Time
55Experimental Results Inputs
Input File | # of Atoms | # of Amino Acids | # of Links | Max m | Mean m | Graph Size (V, E)
--- | --- | --- | --- | --- | --- | ---
4PTI | 454 | 58 | 117 | 10 | 5.79 | 3405, 10553
1BZM | 2034 | 260 | 521 | 10 | 5.74 | 15254, 47266
2GLS | 3636 | 468 | 937 | 13 | 6.33 | 29385, 90820
1JKY | 5614 | 748 | 1497 | 13 | 6.24 | 45558, 138818
1KEE | 8181 | 1058 | 2117 | 13 | 5.87 | 62308, 191317
1EA0 | 11180 | 1452 | 2905 | 13 | 6.14 | 84536, 260096
56The Experiments
- Executed on a 1 GHz Pentium III machine with 2 GB of RAM
- Only one chain is read from each PDB file
- 1000 simulation steps
- Each step: k DOFs are chosen uniformly at random
- For each chosen DOF a uniform random change is chosen between -1° and 1°
- The results reflect the average running times of accepted simulation steps (usually several hundreds)
57Average Number of Modified Atoms and Circles
58Modification Times for Accepted Steps
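Each DOF column gives the modification time followed, in parentheses, by that time as a percentage of the initial construction time.

Input File | Atoms | Initial Construct. | 1-DOF | 5-DOFs | 20-DOFs | 50-DOFs
--- | --- | --- | --- | --- | --- | ---
4PTI | 454 | 1.95 | 0.11 (5.5) | 0.48 (24.4) | 0.83 (42.6) | 1.32 (67.5)
1BZM | 2034 | 8.79 | 0.61 (7) | 1.49 (16.9) | 2.24 (25.5) | 2.79 (31.7)
2GLS | 3636 | 18.25 | 0.57 (3.1) | 1.45 (7.9) | 2.65 (14.5) | 4.3 (23.5)
1JKY | 5614 | 27.31 | 0.61 (2.3) | 1.43 (5.2) | 2.81 (10.3) | 4.15 (15.2)
1KEE | 8181 | 36.48 | 1.1 (3) | 2.29 (6.3) | 3.51 (9.6) | 4.92 (13.5)
1EA0 | 11180 | 53.53 | 1.29 (2.4) | 2.91 (5.4) | 4.79 (8.9) | 6.25 (11.7)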
59Observations
- Strong connection between the number of simultaneous DOF changes and the number of modified atoms
- The algorithm is more effective for larger molecules
- Faster update times for a small number of simultaneous DOF changes
- The implementation runs in time proportional to p
60Dynamic Connectivity Implementation
- Using the implementation by Iyer, Karger, Rahul and Thorup of the dynamic graph connectivity algorithm of Holm, De Lichtenberg and Thorup
- Improved performance for a small number of simultaneous DOF changes
61Naive vs. Dynamic connectivity
Input File | Atoms | Naïve algorithm (1-DOF) | Dynamic connectivity (1-DOF) | Improvement (%)
--- | --- | --- | --- | ---
4PTI | 454 | 0.11 | 0.09 | 11
1BZM | 2034 | 0.61 | 0.56 | 9
2GLS | 3636 | 0.57 | 0.37 | 36
1JKY | 5614 | 0.61 | 0.27 | 55
1KEE | 8181 | 1.1 | 0.65 | 41
1EA0 | 11180 | 1.29 | 0.64 | 50
62Naive vs. Dynamic connectivity
Input File | Atoms | Naïve algorithm (5-DOF) | Dynamic connectivity (5-DOF) | Improvement (%)
--- | --- | --- | --- | ---
4PTI | 454 | 0.48 | 0.51 | -7
1BZM | 2034 | 1.49 | 1.57 | -6
2GLS | 3636 | 1.45 | 1.39 | 4
1JKY | 5614 | 1.43 | 1.18 | 18
1KEE | 8181 | 2.29 | 2.03 | 11
1EA0 | 11180 | 2.91 | 2.54 | 13
63Breakdown of Running Time Naïve vs. Dynamic
Connectivity
Naïve Connectivity
Dynamic Connectivity
64Heuristics
1-DOF
20-DOFs
65Future Work
- Allow DOFs in side chains of the protein
- Extend the work to volume calculations
- Extend the implementation to smooth molecular surfaces
- Speed up the implementation
66References
- The material presented in class is mainly based on the following papers
- Eyal and Halperin 05, Dynamic maintenance of molecular surfaces under conformational changes, To appear in Proceedings of the 21st ACM Symposium on Computational Geometry (SoCG05)
- http://www.cs.tau.ac.il/eyaleran/dynamic_surfaces.pdf
- Eyal and Halperin 05, Improved maintenance of molecular surfaces using dynamic graph connectivity, Manuscript
- http://www.cs.tau.ac.il/eyaleran/dynamic_connectivity.pdf
67Additional References
- Our work combines and extends the following previous work
- Halperin and Overmars 98, Spheres, molecules and hidden surface removal, Computational Geometry: Theory and Applications, Vol. 11(2), pp. 83-102
- Halperin and Shelton 98, A perturbation scheme for spherical arrangements with application to molecular modeling, Computational Geometry: Theory and Applications, Vol. 10, pp. 273-287
- Lotan et al 04, Algorithm and data structures for efficient energy maintenance during Monte Carlo simulation of proteins, Journal of Computational Biology, Vol. 11(5), pp. 902-932
68Some More References
- The dynamic graph connectivity we use is based on the following paper
- Holm, De Lichtenberg and Thorup 01, Poly-logarithmic deterministic fully-dynamic algorithms for connectivity, Journal of the ACM, Vol. 48(4), pp. 723-760
- and its implementation
- Iyer, Karger, Rahul and Thorup 01, An experimental study of poly-logarithmic, fully dynamic, connectivity algorithms, J. Exp. Algorithmics, Vol. 6, pp. 4-