Title: Application of Robotics
1 Application of Robotics & Computational Geometry Techniques to Proteins and other Molecules
- Nancy M. Amato
- Parasol Lab, Texas A&M University
- and
- IBM TJ Watson Research Center
2 Goal of This Talk
- Mention some techniques in robotics, computational geometry, etc. that may have application to biomolecules
- Computational geometry & mathematical techniques for studying folding problems (Streinu, Whitesides, Snoeyink)
- CG & robotics techniques for docking and shape analysis (Bereg, Amato)
- Dimensionality reduction techniques (Forbes Burkowski, Snoeyink)
- Combinatorial rigidity (Mantler, Snoeyink, Streinu, Amato, Thomas)
- Robotics techniques for articulated systems, including systems with closed loops (Amato, Thomas, Brock, Ros, Whitesides)
3 Geometry for Folding and Layout (Sue Whitesides)
- Algorithms for moving linkages and hinged objects
- See animations of folding to polyhedra and knotted surfaces made by folding flat patterns (implemented by Francois Labelle) at www.cs.mcgill.ca/sqrt/unfold/unfolding.html
- Applications to micromanufacturing
- Graph layout problems: given an abstract graph, determine if it has a layout in space satisfying geometric constraints, e.g., distance constraints
- Potential application for visualization
- Graph Drawing (GD 2004) in New York: layout results using semi-rigid structures as a proof technique
4 Single-Vertex Origami Folds and their induced simple motions
Ileana Streinu and Walter Whiteley
Same family of infinitesimal motions
Single-vertex origami corresponds to a spherical polygon, which projects to a planar polygon
5 Single-vertex origami exploration tool
- Audrey Lee and Ileana Streinu
- Implemented using
- CGAL (planar map data structure)
- PETSc (non-linear solver)
- Qt (GUI) and OpenGL
- Extensible to explore other motions
6 Protein/Protein Interactions (Sergey Bereg)
- Docking of Rigid Proteins
- Examples where docking can be predicted using only shape complementarity
- Approach based on motion; requires fast collision detection (need precise location of collisions)
- Sampling the space of rotations can reduce the search space
- Current focus is extending to Flexible Proteins
- Challenge: modeling the dynamics and how to exploit the previous technique for rigid proteins
7 Nonlinear Dimensionality Reduction (Forbes Burkowski & Shirley Hui)
- Dimensionality reduction is a technique that may be used to organize high dimensional data by discovering a more compact representation
- used in computer vision to capture changes in a scene.
- Dimensionality reduction for molecules
- We are currently investigating techniques to represent conformational changes
- Working with protein flexibility
- Working with flexibility of ligands
8 Proteins: Combinatorial Rigidity & Allostery (Mantler & many others)
- Work in progress (Mantler & many others)
- Can rigidity explain allostery?
- Examining glycogen phosphorylase (GP)
7GPB Relaxed
8GPB Tense
9 Banana Spiders (Mantler & Snoeyink, CCCG 2004)
- Graphs satisfying Laman's condition in 3D can be highly connected yet flexible.
- Andrea and Jack hoped that this answered one of Bill's questions, but
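A small sketch of the counting condition, for illustration only. It assumes the usual 3D generalisation of Laman's count (|E| = 3|V| - 6, and at most 3k - 6 edges on every subset of k >= 3 vertices) and checks it by brute force on the classic "double banana" graph, which passes the count even though it can flex about the line through its two shared vertices. The function names and vertex numbering are hypothetical, and this is the textbook example rather than the banana-spider construction itself.

```python
from itertools import combinations

def satisfies_3d_laman_count(num_vertices, edges):
    """Brute-force check of the assumed 3D Laman-style counting condition."""
    if len(edges) != 3 * num_vertices - 6:
        return False
    for k in range(3, num_vertices + 1):
        for subset in combinations(range(num_vertices), k):
            chosen = set(subset)
            induced = sum(1 for u, v in edges if u in chosen and v in chosen)
            if induced > 3 * k - 6:
                return False
    return True

def banana(apex_a, apex_b, triangle):
    """Triangular bipyramid: a triangle plus two apexes joined to its corners."""
    t0, t1, t2 = triangle
    edges = [(t0, t1), (t1, t2), (t0, t2)]
    edges += [(apex, t) for apex in (apex_a, apex_b) for t in triangle]
    return edges

# Two bananas glued at their apexes 0 and 1 (the flexible hinge).
double_banana = banana(0, 1, (2, 3, 4)) + banana(0, 1, (5, 6, 7))
print(len(double_banana), satisfies_3d_laman_count(8, double_banana))
```

The check is exponential in the number of vertices, so it is only meant for toy graphs like this one.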
10 Robotics Techniques for Articulated Systems (Amato, Brock, Ros, Thomas, Whitesides)
- Many methods have been developed in robotics for motion planning
- Models developed for articulated robots can be adapted to model molecules
- Randomized motion planning methods have been successful in searching high-dimensional spaces
- Applied in animation, CAD/CAM, and now molecules
- Applied to problems in Computational Biology
- Protein/Ligand Binding
- Protein structure prediction
- Protein Folding
- RNA Folding
11 Motion Planning
[Figure panels: The Piano Movers' Problem, Box Folding, Alpha Puzzle]
12 Protein Structure Prediction (TJ Brunette & Oliver Brock)
- New Method for searching conformation space
- family of search techniques for high-dimensional spaces derived from robotics motion planning
- Application to protein structure prediction
- Use Rosetta's energy function
- Outperforms Rosetta in cases studied (next slide)
- Future Directions
- Dimensionality reduction by exploiting rigidity
- Incorporation of domain knowledge to reduce search space
13 New Conformation Space Search: Prediction Results (Brock)
Protein 2PTL: 60 (78) amino acids / 120 degrees of freedom
Protein 1O0U: 414 amino acids / 828 degrees of freedom
[Figure panels: Native Structure, Rosetta, Our Method]
14 Configuration Space (C-Space)
- robot maps to a point in a higher-dimensional space
- one parameter for each degree of freedom (dof) of the robot
- C-space: set of all robot placements
- C-obstacle: infeasible robot placements
3D C-space (x, y, z)
6D C-space (x, y, z, pitch, roll, yaw)
2n-D C-space (φ1, ψ1, φ2, ψ2, . . . , φn, ψn)
15 Motion Planning in C-space
Simple workspace obstacle transformed into complicated C-obstacle!
[Figure: a robot among obstacles in the workspace (axes x, y), shown beside the corresponding C-obstacles in C-space]
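To make the workspace-to-C-space mapping concrete, here is a minimal sketch under stated assumptions: a hypothetical two-link planar arm anchored at the origin and a single circular workspace obstacle. Each grid cell is a pair of joint angles; a cell belongs to the C-obstacle if either link touches the disc. The link lengths, obstacle placement and grid resolution are made up for illustration.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b in the plane."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        t = 0.0
    else:
        t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
        t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))

def arm_collides(theta1, theta2, centre, radius, l1=1.0, l2=1.0):
    """True if either link of the two-link arm touches the circular obstacle."""
    base = (0.0, 0.0)
    elbow = (l1 * math.cos(theta1), l1 * math.sin(theta1))
    tip = (elbow[0] + l2 * math.cos(theta1 + theta2),
           elbow[1] + l2 * math.sin(theta1 + theta2))
    return (point_segment_distance(centre, base, elbow) <= radius
            or point_segment_distance(centre, elbow, tip) <= radius)

def c_obstacle_grid(centre, radius, resolution=36):
    """Rasterise the C-obstacle over (theta1, theta2) in [0, 2*pi)."""
    step = 2.0 * math.pi / resolution
    return [[arm_collides(i * step, j * step, centre, radius)
             for j in range(resolution)]
            for i in range(resolution)]

grid = c_obstacle_grid(centre=(1.5, 0.0), radius=0.2)
for row in grid[::3]:  # print every third row to keep the picture small
    print("".join("#" if cell else "." for cell in row))
```

Even for a single disc in the workspace, the marked region in joint-angle space is no longer a disc, which is the point the slide makes.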
16 Probabilistic Roadmap Methods (PRMs) [Kavraki, Svestka, Latombe, Overmars 1996]
Roadmap Construction (Pre-processing)
[Figure: roadmap built among C-obstacles in C-space]
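The roadmap construction step can be sketched as follows. This is a toy version, not the planners used in the work cited here: it assumes a 2D unit-square C-space, one hypothetical circular C-obstacle, connection of all node pairs within a fixed radius, a straight-line local planner checked at discrete steps, and Dijkstra's algorithm for the query. All names and parameter values are illustrative.

```python
import heapq
import math
import random

OBSTACLES = [((0.5, 0.5), 0.2)]  # hypothetical circular C-obstacles: (centre, radius)

def is_free(q):
    return all(math.dist(q, centre) > radius for centre, radius in OBSTACLES)

def edge_is_free(a, b, steps=20):
    """Straight-line local planner: test evenly spaced points between a and b."""
    return all(is_free((a[0] + (b[0] - a[0]) * t / steps,
                        a[1] + (b[1] - a[1]) * t / steps))
               for t in range(steps + 1))

def build_roadmap(num_nodes=200, radius=0.25, seed=0, extra_nodes=()):
    """Sample free configurations and connect nearby pairs with free edges."""
    rng = random.Random(seed)
    nodes = [q for q in extra_nodes if is_free(q)]
    while len(nodes) < num_nodes:
        q = (rng.random(), rng.random())
        if is_free(q):
            nodes.append(q)
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            d = math.dist(nodes[i], nodes[j])
            if d <= radius and edge_is_free(nodes[i], nodes[j]):
                edges[i].append((j, d))
                edges[j].append((i, d))
    return nodes, edges

def shortest_path(edges, start, goal):
    """Dijkstra query on the roadmap; returns a list of node indices or None."""
    dist, prev, heap = {start: 0.0}, {}, [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            path = [u]
            while path[-1] != start:
                path.append(prev[path[-1]])
            return path[::-1]
        if d > dist.get(u, math.inf):
            continue
        for v, w in edges[u]:
            if d + w < dist.get(v, math.inf):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    return None

# Start and goal are added as roadmap nodes 0 and 1.
nodes, edges = build_roadmap(extra_nodes=[(0.1, 0.1), (0.9, 0.9)])
path = shortest_path(edges, 0, 1)
print([nodes[i] for i in path] if path else "no path found")
```

The roadmap is built once as pre-processing; further start/goal queries only need to be connected to it and searched.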
17 Applications of PRM-based Motion Planning (Amato et al.)
18 Applications of PRM-based Motion Planning (Amato et al.)
19 Protein Folding via Motion Planning (Amato, Thomas, Song and K. Dill & M. Scholtz)
[Figure panels: Protein L, Protein G]
20 Protein Folding
- We are interested in the folding process
- how the protein folds to its native structure
21 Why Study Folding Pathways?
- Importance of Studying Pathways
- insight into protein interactions & function
- may lead to better structure prediction algorithms
- Diseases such as Alzheimer's & Mad Cow are related to misfolded proteins
- Computational Techniques Critical
- Hard to study experimentally (happens too fast)
- Can study folding for thousands of already solved structures
- Help guide/design future experiments
22 Folding Landscapes
- Each conformation has a potential energy
- Native state is global minimum
- Set of all conformations forms landscape
- Shape of landscape reflects folding behavior
[Figure label: Native state]
Different proteins → different landscapes → different folding behaviors
23 Using Motion Planning to Map Folding Landscapes [RECOMB 01, 02, 04; PSB 03]
- Use Probabilistic Roadmap (PRM) method from motion planning to build roadmap
- Roadmap approximates the folding landscape
- Characterizes the main features of landscape
- Can extract multiple folding pathways from roadmap
- Compute population kinetics for roadmap
[Figure label: Native state]
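One way to picture population kinetics on a roadmap is as a Markov chain over its nodes. The sketch below is an assumption-laden illustration, not the method of the cited papers: it uses Metropolis-style transition probabilities built from node energies, a hypothetical four-node chain whose last node plays the role of the native state, and a made-up temperature factor kT.

```python
import math

def transition_matrix(energies, edges, kT=1.0):
    """Row-stochastic matrix: uphill moves are damped by exp(-dE / kT)."""
    n = len(energies)
    neighbours = {i: set() for i in range(n)}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    max_degree = max(len(nb) for nb in neighbours.values())
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in neighbours[i]:
            uphill = max(0.0, energies[j] - energies[i])
            matrix[i][j] = math.exp(-uphill / kT) / max_degree
        matrix[i][i] = 1.0 - sum(matrix[i])  # remaining probability: stay put
    return matrix

def propagate(population, matrix, steps):
    """Advance the population vector by the given number of time steps."""
    n = len(population)
    for _ in range(steps):
        population = [sum(population[i] * matrix[i][j] for i in range(n))
                      for j in range(n)]
    return population

# Hypothetical roadmap: a chain 0-1-2-3 with node 3 as the lowest-energy state.
energies = [3.0, 2.0, 1.0, 0.0]
edges = [(0, 1), (1, 2), (2, 3)]
matrix = transition_matrix(energies, edges)
for steps in (0, 5, 20, 200):
    print(steps, [round(p, 3) for p in propagate([1.0, 0.0, 0.0, 0.0], matrix, steps)])
```

Starting with the whole population in the unfolded node, the printout shows it draining toward the low-energy end of the chain as the number of steps grows.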
24 Related Work
- Other PRM-based approaches for studying molecular motions
- Other work on protein folding (Apaydin et al., ICRA01, RECOMB02)
- Ligand binding (Singh, Latombe, Brutlag, ISMB99; Bayazit, Song, Amato, ICRA01)
- RNA folding (Tang, Kirkpatrick, Thomas, Song, Amato, RECOMB 04)
25 Modeling Proteins
[Figure: one amino acid]
26 Roadmap Construction: Node Generation
- Sample using known native state
- sample around it, gradually grow out
- generate conformations by randomly selecting phi/psi angles
- Criterion for accepting a node
- Compute potential energy E of each node and retain it with probability
[Figure: denser distribution of nodes around the native state N]
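The acceptance formula itself is not reproduced in this text, so the sketch below assumes one simple rule of this kind: always keep a node below a threshold e_min, never keep one above e_max, and ramp linearly in between. It also simplifies "sample around it, gradually grow out" to Gaussian perturbations of the native phi/psi angles with increasing standard deviation. The thresholds, standard deviations and toy energy function are all hypothetical.

```python
import random

def acceptance_probability(energy, e_min, e_max):
    """Assumed rule: 1 below e_min, 0 above e_max, linear in between."""
    if energy <= e_min:
        return 1.0
    if energy >= e_max:
        return 0.0
    return (e_max - energy) / (e_max - e_min)

def wrap_degrees(angle):
    """Map an angle in degrees to the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def sample_nodes(native_angles, energy_fn, e_min, e_max,
                 std_devs=(5.0, 10.0, 20.0, 40.0), per_round=50, seed=0):
    """Perturb the native angles with growing spread; keep nodes by energy."""
    rng = random.Random(seed)
    nodes = [list(native_angles)]
    for sigma in std_devs:
        for _ in range(per_round):
            candidate = [wrap_degrees(a + rng.gauss(0.0, sigma))
                         for a in native_angles]
            p_keep = acceptance_probability(energy_fn(candidate), e_min, e_max)
            if rng.random() < p_keep:
                nodes.append(candidate)
    return nodes

# Toy stand-in for a potential: squared angular deviation from the native state.
native = [-60.0, -45.0, -120.0, 130.0]

def toy_energy(angles):
    return sum(wrap_degrees(a - b) ** 2 for a, b in zip(angles, native)) / 100.0

nodes = sample_nodes(native, toy_energy, e_min=5.0, e_max=50.0)
print(len(nodes), "nodes kept")
```

Because low-energy candidates are kept more often and small perturbations are tried first, the kept nodes end up denser near the native state, as in the figure.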
27 Ramachandran Plots for Different Sampling Techniques
[Figure panels: Uniform sampling, Gaussian sampling, Iterative Gaussian sampling]
28 Distributions for different types: Potential Energy vs. RMSD for roadmap nodes
[Figure panels: all alpha, alpha + beta, all beta]
29 Roadmap Construction: Node Connection
Edge weight w(u,v) = f(E(C1), E(C2), ..., E(Cn))
[Figure label: Native state]
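The slide leaves f unspecified. As one plausible choice, assumed here purely for illustration, the sketch below interpolates intermediate conformations C1..Cn between the two endpoint nodes, assigns each step a Metropolis-style probability P = exp(-dE / kT) for uphill moves (1 otherwise), and sums -log(P) so that energetically easy edges get small weights. The interpolation is plain linear interpolation that ignores angle wrap-around; all names and parameters are hypothetical.

```python
def interpolate(conf_u, conf_v, num_intermediate):
    """Conformations from u to v inclusive, with evenly spaced intermediates."""
    steps = num_intermediate + 1
    return [[a + (b - a) * i / steps for a, b in zip(conf_u, conf_v)]
            for i in range(steps + 1)]

def edge_weight(conf_u, conf_v, energy_fn, num_intermediate=5, kT=1.0):
    """Sum of -log(P_i) over steps, with P_i = exp(-dE/kT) if dE > 0 else 1."""
    energies = [energy_fn(c) for c in interpolate(conf_u, conf_v, num_intermediate)]
    weight = 0.0
    for e_prev, e_next in zip(energies, energies[1:]):
        delta = e_next - e_prev
        if delta > 0:
            weight += delta / kT  # equals -log(exp(-delta / kT))
    return weight

def toy_energy(conf):
    """Toy stand-in for a potential energy function."""
    return sum(x * x for x in conf)

print(edge_weight([0.0], [2.0], toy_energy, num_intermediate=1))  # uphill
print(edge_weight([2.0], [0.0], toy_energy, num_intermediate=1))  # downhill
```

With weights of this form the edge is directed: climbing in energy costs something, descending costs nothing, so shortest paths in the roadmap favour energetically feasible routes.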
30 PRMs for Protein Folding: Key Issues
- Energy Functions
- The degree to which the roadmap accurately reflects folding landscape depends on the quality of energy calculation.
- We use our own coarse potential (fast) and a well-known all-atom potential (slow)
- Validation
- In [ICRA01, RECOMB 01, JCB 02], results validated with experimental results [Li & Woodward 1999].
31 One Folding Path of Protein A
A nice movie. But so what?
[Figure panels: Ribbon Model, Space-fill Model]
- B domain of staphylococcal protein A
32 Roadmap Analysis: Secondary Structure Formation Order [RECOMB01, JCB02, RECOMB02, JCB03, PSB03]
- Order in which secondary structure forms during folding
[Figure labels: hairpin 1,2; helix]
Q: Which forms first?
33 Formation Time Calculation
- Secondary structure has formed when x% of the native contacts are present
- native contact: less than 7 Å between Cα atoms in native state
If we pick x as 60%, then at time step 30, three contacts present, structure considered formed
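The formation-time rule can be sketched in a few lines. The example data are hypothetical: a secondary structure element with five native contacts (residue index pairs) and a path recorded at four time steps, chosen so that three contacts are present at step 30 as in the slide's example.

```python
def formation_time(native_contacts, contacts_by_step, threshold_percent):
    """First time step at which at least threshold_percent of the native
    contacts of one secondary structure element are present, else None."""
    total = len(native_contacts)
    for step, present in contacts_by_step:
        formed = len(native_contacts & present)
        if 100 * formed >= threshold_percent * total:
            return step
    return None

# Hypothetical element with five native contacts.
native = {(1, 5), (2, 6), (3, 7), (4, 8), (5, 9)}
path = [
    (10, {(1, 5)}),
    (20, {(1, 5), (2, 6)}),
    (30, {(1, 5), (2, 6), (3, 7)}),
    (40, set(native)),
]
print(formation_time(native, path, 60))   # 30
print(formation_time(native, path, 100))  # 40
```

Comparing the formation times of different elements along one path gives the secondary structure formation order for that path.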
34 Contact Map
- A contact map is a triangular matrix which identifies all the native contacts among residues
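A minimal sketch of building such a map from Cα coordinates, using the 7 Å cutoff mentioned for native contacts. The function name, the min_separation parameter (which lets sequence neighbours be skipped) and the toy coordinates are assumptions for illustration.

```python
import math

def contact_map(ca_coords, cutoff=7.0, min_separation=1):
    """Upper-triangular contact map as a set of residue index pairs (i, j),
    i < j, whose C-alpha atoms are closer than the cutoff (in angstroms)."""
    contacts = set()
    for i in range(len(ca_coords)):
        for j in range(i + min_separation, len(ca_coords)):
            if math.dist(ca_coords[i], ca_coords[j]) < cutoff:
                contacts.add((i, j))
    return contacts

# Toy coordinates: four residues on a straight line, 3.8 angstroms apart.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
print(sorted(contact_map(coords)))
print(sorted(contact_map(coords, min_separation=2)))
```

Only pairs with i < j are stored, which is why the map is triangular.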
35 Contact Maps
36 Secondary Structure Formation Order: Timed Contact Map of a Path [JCB02]
[Figure: timed contact map, residue vs. residue]
Formation order: α, β3-4, β1-2, β1-4
Average T = 142
37 Secondary Structure Formation Order: Validation
Sample Summary
38 Detailed Study of Proteins G & L [PSB03]
[Figure panels: Protein L, Protein G]
- Protein G & Protein L
- Similar structure (1 helix, 2 beta hairpins), but 15% sequence identity
- Fold differently
- Protein G: helix, beta 3-4, beta 1-2, beta 1-4 [Kuszewski et al. 1994, Orban et al. 1995]
- Protein L: helix, beta 1-2, beta 3-4, beta 1-4 [Yi & Baker 1996, Yi et al. 1997]
- Can our approach detect the difference? Yes!
- 75% of Protein G paths & 80% of Protein L paths have the right order
- Increases to 90% & 100%, resp., when using the all-atom potential
39 Helix and Beta Strands: Coarse Potential [PSB03]
[Figure panel: (β3-β4 forms first), over 2k paths analyzed; strands labeled β1, β2, β3, β4]
[Figure panel: (β1-β2 forms first), over 2k paths; strands labeled β1, β2, β3, β4]
40 Helix and Beta Strands: All-atom Potential
(β3-β4 forms first)
Analyze First x% Contacts

Contacts      SS Formation Order         20   40   60   80   100
all           α, β3-β4, β1-β2, β1-β4     79   79   74   82   90
all           α, β1-β2, β3-β4, β1-β4     21   21   26   18   10
hydrophobic   α, β3-β4, β1-β2, β1-β4     77   74   71   77   81
hydrophobic   α, β1-β2, β3-β4, β1-β4     23   26   29   23   19

(β1-β2 forms first)
[Figure: strands labeled β1, β2, β3, β4]
41 Summary: PRM-Based Protein Folding
- PRM roadmaps approximate energy landscapes
- Efficiently produce multiple folding pathways
- Secondary structure formation order (e.g. G and L)
- More efficient than trajectory-based simulation methods, such as Monte Carlo, molecular dynamics
- Provide a good way to study folding kinetics
- multiple folding kinetics in same landscape (roadmap)
- more realistic than statistical models (e.g. lattice models, Baker's model [PNAS99], Munoz's model [PNAS99])
- Current & Future Directions
- Using rigidity to bias sampling: better, fewer samples
- Studying pathways connecting specific conformations, e.g., allostery, folded/misfolded states, etc.
- Doing it in parallel using STAPL on BlueGene/L
42 Announcing our Protein Folding Server
- http://parasol.tamu.edu/foldingserver/
- You can submit proteins and we will build a roadmap and analyze it and show you results
- Ramachandran plots (all conformations in roadmap)
- RMSD vs. potential energy plots (all conformations in roadmap)
- Secondary structure formation order statistics for roadmap pathways
- Energy profiles and timed contact maps for particular pathways
- pathway to native (best from most common ss formation order group)
- between two specified conformations
- You can choose to have your protein added to a public database, or we can keep it private just for you
43 RNA Folding Results
X. Tang, B. Kirkpatrick, S. Thomas, G. Song [RECOMB04]
- RNA energy landscape can be completely described by huge roadmaps.
- Heuristics are used to approximate energy landscape using small roadmaps.
- Our roadmaps contain many folding pathways.
[Figure: energy profile vs. folding steps]
- Population kinetics analysis on the roadmaps shows that heuristic 1 can efficiently describe the energy landscape using a small subset of nodes
[Figure panels, population vs. folding steps: Map1 (Complete) 142 Nodes; Map2 (Heuristic 1) 15 Nodes; Map3 (Heuristic 2) 33 Nodes]
44 Ligand Binding [IEEE ICRA01]
Given a description of a ligand molecule (robot) and a protein (obstacle).
Find a configuration of the ligand near the protein where geometric, electrostatic and chemical constraints are satisfied.
[Figure labels: ligand, protein]
45 Ligand Binding [IEEE ICRA01]
- Docking: find a configuration of the ligand near the protein that satisfies geometric, electrostatic and chemical constraints
- PRM Approach (Singh, Latombe, Brutlag, 1999)
- rapidly explores high dimensional space
- We use OBPRM: better suited for generating conformations in binding site (near protein surface)
- Haptic User Interaction
- haptics (sense of touch) helps user understand molecular interaction
- User assists planner by suggesting promising regions, and planner will post-process and improve
46 Contact Information
- For more information, check out our website
- http://parasol.tamu.edu/amato/
- Credits
- My students: Guang Song (now Postdoc with Jernigan at Iowa State), Shawna Thomas, Xinyu Tang
- Ken Dill (UCSF) and Marty Scholtz (Texas A&M)