Title: Nucleic Acids
1Nucleic Acids
Chuck Duarte, Jim Havranek
2Interface Mode
Tanja Kortemme
3INTERFACE MODE -interface
Binding energy (-interface -s <pdbname> -ddg_bind_only)
ΔG_bind = ΔG(complex) - ΔG(partnerA) - ΔG(partnerB)
CONTRIBUTIONS TO THE BINDING ENERGY FOR THE WT COMPLEX
---------------------------------------------------------------------------------------------------------------------------
             Eatr    Erep    Esol     Eaa    Edun  Eintra   Ehbnd   Epair    Eref    Eh2o  Eh2ohb Eh2osol    Eres  pdb name
---------------------------------------------------------------------------------------------------------------------------
DG_BIND     -13.8     0.3     6.7     0.0     0.0     0.0    -1.1     0.0     0.0     0.0     0.0     0.0    -9.4  1be9
---------------------------------------------------------------------------------------------------------------------------
4INTERFACE MODE -interface
Changes in binding energy upon mutation (-mutlist)
ΔΔG_bind = ΔG_bind(mutant) - ΔG_bind(WT)
BINDING ENERGY FOR MUTANT COMPLEXES
CONTRIBUTIONS TO THE BINDING ENERGY FOR THE MUT COMPLEXES
             Eatr    Erep    Esol     Eaa    Edun  Eintra   Ehbnd   Epair    Eref    Eh2o  Eh2ohb Eh2osol    Eres  ROS  PDB   CH   WT  MUT
BIND_CON    -13.8     0.3     6.7     0.0     0.0     0.0    -1.1     0.0     0.0     0.0     0.0     0.0    -9.4   WT
BIND_CON    -13.1     0.3     6.6     0.0     0.0     0.0    -1.1     0.0     0.0     0.0     0.0     0.0    -8.8   27  327    A    I    A
BIND_CON    -13.3     0.3     6.0     0.0     0.0     0.0    -0.8     0.0     0.0     0.0     0.0     0.0    -9.2  118    7    B    T    A
BIND_CON    -12.1     0.3     6.8     0.0     0.0     0.0    -1.1     0.0     0.0     0.0     0.0     0.0    -7.6  120    9    B    V    A
CHANGES IN BINDING ENERGY FOR MUTANT COMPLEXES MUT-WT
CHANGES IN CONTRIBUTIONS TO THE BINDING ENERGY FOR THE MUT COMPLEXES
             Eatr    Erep    Esol     Eaa    Edun  Eintra   Ehbnd   Epair    Eref    Eh2o  Eh2ohb Eh2osol    Eres  ROS  PDB   CH   WT  MUT  Nei
DDG_BIND      0.7     0.0    -0.1     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.6   27  327    A    I    A   27
DDG_BIND      0.5     0.0    -0.7     0.0     0.0     0.0     0.3     0.0     0.0     0.0     0.0     0.0     0.2  118    7    B    T    A   19
DDG_BIND      1.7     0.0     0.1     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     0.0     1.8  120    9    B    V    A   24
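The MUT-WT rows above are the term-by-term difference between each mutant BIND_CON row and the WT BIND_CON row. A minimal Python sketch of that subtraction, using the values from this slide (the mutant labels such as "A:I327A" are a naming convention made up here, not Rosetta output):

```python
# Energy-term column names as printed by interface mode on this slide.
TERMS = ["Eatr", "Erep", "Esol", "Eaa", "Edun", "Eintra", "Ehbnd",
         "Epair", "Eref", "Eh2o", "Eh2ohb", "Eh2osol", "Eres"]

# BIND_CON rows copied from the slide (WT complex and three alanine mutants).
WT = [-13.8, 0.3, 6.7, 0.0, 0.0, 0.0, -1.1, 0.0, 0.0, 0.0, 0.0, 0.0, -9.4]
MUTANTS = {
    "A:I327A": [-13.1, 0.3, 6.6, 0.0, 0.0, 0.0, -1.1, 0.0, 0.0, 0.0, 0.0, 0.0, -8.8],
    "B:T7A":   [-13.3, 0.3, 6.0, 0.0, 0.0, 0.0, -0.8, 0.0, 0.0, 0.0, 0.0, 0.0, -9.2],
    "B:V9A":   [-12.1, 0.3, 6.8, 0.0, 0.0, 0.0, -1.1, 0.0, 0.0, 0.0, 0.0, 0.0, -7.6],
}


def ddg_bind(mutant_row, wt_row):
    """Per-term change in binding energy: dG_bind(mutant) - dG_bind(WT)."""
    return [round(m - w, 1) for m, w in zip(mutant_row, wt_row)]


ddg = {name: dict(zip(TERMS, ddg_bind(row, WT))) for name, row in MUTANTS.items()}

for name, terms in ddg.items():
    print(name, "DDG_BIND Eres =", terms["Eres"])
```

A positive Eres here means the mutation weakens binding relative to WT, which is the signal used in alanine scanning.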
5INTERFACE MODE - PARAMETERIZATION
- SIDE-CHAIN MODELING
- structural relaxation upon mutation (repack neighbors)
- repacking of partners
- no significant improvement for ala scanning
- ENERGY FUNCTION
- int: parameterized on ΔΔG_bind values in interfaces (Lin)
- sc-sc HB term environment-dependent (linear - Lin)
- pack: parameterized on aa identities (packer weights)
- HB now default env-dep
- default should be pack for pack_rotamers and int for score; currently int for both
- -Wint_only: int for pack and score
- -Wpack_only: pack for pack and score
6INTERFACE MODE - WEIGHTS
Ala!
Ala!
> needs parameterization/benchmarking
7INTERFACE MODE - ENERGY FUNCTION
restrictive sc-sc HB (< 2.3 Å) works better
specificity prediction
8Alter Specificity
Brian Kuhlman
9- Alter_spec mode: version of the second-site suppressor strategy recently used by Tanja to create orthogonal binding interactions.
- Goal: redesign two interacting proteins so that
- ΔG(Mut-Mut), ΔG(WT-WT) << ΔG(WT-Mut), ΔG(Mut-WT)
- Protocol
- Loop through all possible point mutants at the interface and identify mutations that weaken binding. Neighboring residues are allowed to repack to accommodate the point mutations.
- For each destabilizing mutation, redesign the surrounding residues to better accommodate the mutation. Output a file (mutlist) that contains all the point mutations and the compensating mutations.
- Calculate the binding energies of WT-WT, WT-Mut, Mut-WT and Mut-Mut for each redesign using interface mode with the alter_spec flag. The results are sorted so the most promising redesigns are at the top.
Deanne Sammond
10Sample command lines (requires two steps)
1) rosetta.gcc -s 1c4z.pdb -design -alter_spec -pmut pmut63 -alter_spec_mutlist mut63c -fix fix63n
-alter_spec_mutlist file name: output file with list of point mutants and redesigns
-fix file name: input file specifying residues that aren't allowed to change identity
-pmut file name: input file specifying which residues to try point mutants at (the default is all interface residues)
2) rosetta.gcc -s 1c4z.pdb -interface -repack_neighbors -alter_spec_format -mutlist mut63c -intout E63c
-mutlist file name: input file generated by the previous step
11Output
point_mut_number 4   structure_number 10 , 11 , 12
                      Eatr   Erep   Esol    Eaa   Edun   Eres
Gmut-Gwt Partner1      5.4   -2.4    0.0   -0.7   -2.2   -1.1
Gmut-Gwt Partner2      1.4   -0.1   -0.2   -0.4    0.3   -1.0
ddGbind MUTMUT        -4.8   -4.7    2.8    0.0    4.0   -5.3
ddGbind MUTWT         -5.0    4.7    1.3    0.0    2.0    4.9
ddGbind WTMUT          1.4   -0.4    2.5    0.0    0.7    1.6
Mutations Partner 1: S638A I655W F690Y Y694A
Mutations Partner 2: F63E
point_mut_number 3   structure_number 7 , 8 , 9
                      Eatr   Erep   Esol    Eaa   Edun   Eres
Gmut-Gwt Partner1      1.9   -2.0    3.9   -0.6    1.1    1.9
Gmut-Gwt Partner2      1.1   -0.1    0.3   -0.4   -0.1   -0.9
ddGbind MUTMUT        -1.3   -6.4    2.2    0.0    0.9   -9.0
ddGbind MUTWT         -4.7   -3.2    1.6    0.0    0.5   -3.6
ddGbind WTMUT          1.5   -0.6    2.0    0.0   -0.5    0.9
Mutations Partner 1: S638A L642R I655W Y694A
Mutations Partner 2: F63D
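One way to read the two blocks above against the alter_spec goal is to compare the Mut-Mut binding change with the two mismatched pairs. The Python sketch below does this with the Eres values from this slide. The ranking function is a hypothetical illustration, not the sort criterion Rosetta actually uses (the slides do not state it), and it assumes the ddGbind values are all measured relative to the WT-WT complex.

```python
# Eres column of the ddGbind rows, copied from the two output blocks above.
designs = [
    {"point_mut_number": 4, "partner2": "F63E",
     "ddg_bind": {"MUTMUT": -5.3, "MUTWT": 4.9, "WTMUT": 1.6}},
    {"point_mut_number": 3, "partner2": "F63D",
     "ddg_bind": {"MUTMUT": -9.0, "MUTWT": -3.6, "WTMUT": 0.9}},
]


def specificity_gap(design):
    """Hypothetical score: weakest mismatched pair minus the redesigned pair.

    Larger is better: the Mut-Mut complex binds much more tightly than
    either WT-Mut or Mut-WT.
    """
    e = design["ddg_bind"]
    return round(min(e["MUTWT"], e["WTMUT"]) - e["MUTMUT"], 1)


def destabilizes_both(design):
    """True if both mismatched complexes bind worse than WT-WT (ddG > 0)."""
    e = design["ddg_bind"]
    return e["MUTWT"] > 0 and e["WTMUT"] > 0


ranked = sorted(designs, key=specificity_gap, reverse=True)
for d in ranked:
    print(d["point_mut_number"], d["partner2"],
          specificity_gap(d), destabilizes_both(d))
```

Under these assumptions only the F63E design weakens both mismatched pairs; the F63D design strengthens Mut-WT.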
12- Observations
- often redesigns residues that don't interact directly with the point mutation
- the protocol does not explicitly optimize affinity, so sometimes the redesign just enhances interactions within each chain.
- having trouble finding designs that destabilize both WT-Mut and Mut-WT
13Symmetry Mode
Ora Furman
14Symmetric docking mode
15INTERFACE MODE - ENERGY FUNCTION
restrictive sc-sc HB (< 2.3 Å) works better
monomeric proteins
more stringent hydrogen bonding
16Symmetric docking of Homo-multimers: basics
Regular docking: docking frame is varied, monomer 1 is fixed
Symmetry docking: docking frame is fixed, monomer 1 is varied
- Create symmetric multimer
- Rotate monomer 1
- other monomers (contact point on x-axis, angle = 360°/n)
- Optimize distance T
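The construction above can be sketched in a few lines of Python. This is a minimal illustration under assumptions not stated on the slide: the symmetry axis is taken to be the z-axis through the origin, monomer 1 is centered on the origin and then translated by T along x, and the function names are invented here.

```python
import math


def rotate_z(point, angle_deg):
    """Rotate an (x, y, z) point about the z-axis by angle_deg degrees."""
    a = math.radians(angle_deg)
    x, y, z = point
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)


def build_multimer(monomer, n_monomers, t_size):
    """Return n_monomers copies of a monomer arranged with C_n symmetry.

    monomer: list of (x, y, z) coordinates centered on the origin.
    t_size:  distance T from the symmetry axis to the monomer center.
    """
    shifted = [(x + t_size, y, z) for x, y, z in monomer]
    step = 360.0 / n_monomers  # angle between neighboring monomers
    return [[rotate_z(p, k * step) for p in shifted]
            for k in range(n_monomers)]


# A single-atom "monomer" placed 10 A from the axis, built into a trimer.
trimer = build_multimer([(0.0, 0.0, 0.0)], n_monomers=3, t_size=10.0)
for copy in trimer:
    print(copy[0])
```

Because only T and the orientation of monomer 1 vary, every copy stays symmetric by construction; optimizing T then amounts to a one-dimensional search.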
17Symmetric docking outline
read in monomer1, center on origin of
coordinates. Read in docking_T_size. OR read in
multimer, define docking_T_size, the translation
and the rotation axes
18Symmetric docking command line options
- -symmetry
- invokes symmetry mode; sets docking_symmetry T
- -n_monomers
- number of monomers in complex (default 2, current max 3)
- -init_T_size
- initial distance of center of monomer from contact point center of multimer complex; defines initial docking_T_size
- -multimer
- indicates that multimer is read in; sets dock_sym_multimer_start T
- Used to derive symmetry frame.
19Symmetric docking notes
- Currently assumes C symmetries (planar)
- Interface residues are evaluated between monomers 1-2 only and assumed to be the same for all monomer x - x+1 pairs
- Backbones are symmetric, side chains are monomer-specific
- Side chain conformations are stored and optimized for each monomer separately
- Elongation
- The degree of elongation (i.e. planarity) can be monitored (external script) to enrich for wanted conformations
- RMSD calculation
- Since monomer1 changes (instead of the docking frame), the structure first has to be moved so that monomer1 is superimposed onto native monomer1
21Flexible Docking Calculations
22Backbone Conformational Change: CAPRI T01, HPr - HPr Kinase (Round 1, Sep 2001)
Terminal helix swings upon docking, nuzzling HPr
in a pocket
No energy funnel for binding the unbound
components
23Torsion Angle Perturbation
C-terminal helix α4
Torsion angle movement in residues 290-292 would
allow the correct conformation to be observed.
Res 290-292
Kinase I
24Low-Resolution Search With Flexible Backbone
Initial helix perturb
Initial rigid body perturb
Rigid-body move
Helix Minimization
Helix MC perturb
Monte Carlo Accept?
Small helix perturb
10x50
Low resolution output decoy
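The "Monte Carlo Accept?" box in the flowchart above is presumably a Metropolis test. Below is a minimal Python sketch of such a loop. It assumes "10x50" means 10 outer cycles of 50 inner moves, and it uses a toy score and toy perturbation in place of Rosetta's low-resolution score, rigid-body moves and helix torsion moves; all names are hypothetical.

```python
import math
import random


def metropolis_accept(old_score, new_score, kT, rng):
    """Metropolis criterion: always accept downhill, sometimes uphill."""
    if new_score <= old_score:
        return True
    return rng.random() < math.exp(-(new_score - old_score) / kT)


def low_res_search(start, score, perturb, outer=10, inner=50, kT=1.0, seed=0):
    """Run outer x inner perturb/accept cycles and return the best state."""
    rng = random.Random(seed)
    current = best = start
    for _ in range(outer):
        for _ in range(inner):
            trial = perturb(current, rng)
            if metropolis_accept(score(current), score(trial), kT, rng):
                current = trial
                if score(current) < score(best):
                    best = current
    return best


# Toy stand-ins: state = (rigid-body offset, helix torsion angle).
def toy_score(state):
    rigid_body, helix_angle = state
    return rigid_body ** 2 + (helix_angle - 30.0) ** 2 / 100.0


def toy_perturb(state, rng):
    rigid_body, helix_angle = state
    return (rigid_body + rng.gauss(0.0, 0.5),   # rigid-body move
            helix_angle + rng.gauss(0.0, 3.0))  # small helix perturbation


start = (5.0, 0.0)
best = low_res_search(start, toy_score, toy_perturb)
print(toy_score(start), toy_score(best))
```

In the real protocol the helix minimization step would sit between the perturbation and the acceptance test; it is omitted here to keep the sketch short.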
25Flexible Docking Results With torsion angle
perturbations and explicit minimizations
score
HPr rmsd
score
18/36 contacts, translation 1.8Å, rotation 18º
helix rmsd
26Loop Flexibility
- Currently exploring ways of moving loops during
protein-protein docking to simulate an induced
fit binding mechanism
Rohl, CA et al 2004 to appear
27Docking Algorithm Overview
Random Start Position
Build/Optimize Loops
Low-Resolution Monte Carlo Search
High-Resolution Refinement
10^5
Clustering
Predictions
28Implementation
- Call loop mode to build/optimize loops before low-res docking search
- (Future work: integrate loop moves during low-res docking search, alternating with rigid-body moves)
29Surfaces
Dave Masica
30Redesigning Proteins for Optimal Interactions
with Inorganic Crystals
David Masica from Dr. Jeff Gray's Lab
31Generation of a Crystalline solid from the Unit
Cell
5.630 Å
Cl
Na1
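Building the solid from the unit cell amounts to replicating the basis atoms along the three lattice vectors. A minimal Python sketch, assuming 5.630 Å is the cubic lattice constant and using the standard rock-salt basis (four Na and four Cl per conventional cell); the function name is made up for illustration:

```python
import math

LATTICE_CONSTANT = 5.630  # Angstrom, value shown on the slide

# Fractional coordinates of the rock-salt (NaCl) conventional unit cell.
NA_FRAC = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5)]
CL_FRAC = [(0.5, 0.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.5), (0.5, 0.5, 0.5)]


def build_crystal(nx, ny, nz, a=LATTICE_CONSTANT):
    """Replicate the unit cell nx x ny x nz times; return (element, x, y, z)."""
    atoms = []
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                for element, basis in (("Na", NA_FRAC), ("Cl", CL_FRAC)):
                    for fx, fy, fz in basis:
                        atoms.append((element,
                                      (i + fx) * a, (j + fy) * a, (k + fz) * a))
    return atoms


crystal = build_crystal(2, 2, 2)
sodium = [atom[1:] for atom in crystal if atom[0] == "Na"]
chloride = [atom[1:] for atom in crystal if atom[0] == "Cl"]
nearest = min(math.dist(p, q) for p in sodium for q in chloride)
print(len(crystal), "atoms; nearest Na-Cl distance", round(nearest, 3))
```

The nearest Na-Cl distance comes out as half the lattice constant, which is a quick sanity check on the generated slab before docking a protein against it.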
32Overhead and Side Views of Pre-docked Ferritin
and NaCl Crystal
33Current Working Flowchart
Start Position
500
Best Scoring Decoy
Define Contact Residues and Design (Design
Ligand)
Best Scoring Decoy
10
Post-Process (Based on score and homology)
34Ferritin NaCl Complex after many Dock and Design Attempts
Res   Original  Mutant
4     ILE       ASP
53    GLU       ILE
60    GLU       ARG
64    ARG       LEU
67    LYS       ILE
71    GLN       LEU
124   HIS       ASP
127   ASP       ILE
128   PHE       TYR
130   GLU       ILE
131   SER       LEU
132   HIS       TYR
135   ASP       VAL
136   GLU       VAL
168   ARG       LEU
169   LEU       THR
170   THR       VAL
172   LYS       ALA
173   HIS       ARG
174   ASP       LEU
88.5% identical to native ferritin
Magenta: Mutated Residues
35Questions and Challenges to Come
- Formation of covalent bonds between cysteines and inorganic hetero atoms
- Is a new approach necessary to define heavy atom types in Rosetta?
- Modeling inorganic crystal interactions with water
- Modeling many proteins interacting with a single crystal simultaneously
- Benchmark?
- Modeling the loose electrons in metal compounds
36Jumping Code
Phil Bradley
37Docking with more flexibility
- Chu Wang
- 2nd Rosetta Meeting
- August, 2004
38What's New?
- Rosetta Docking
- Include unbound native rotamers.
- Rotamer trials minimization.
- Refolding with native bond length/angle.
- Docking with BB/SC minimization.
- Super predictions in CAPRI.
- Chu Wang
- A married man.
- A PhD candidate.
39Side-chain modeling in docking: too left or too right?
- All side-chains used to be removed and repacked with rotamers.
- Use rotamer trials minimization to sample off-rotamer space.
- Include native unbound rotamers to increase side-chain conservatism.
- Combine them to improve docking performance.
40Repacking of interface side-chains
41Improving docking perturbations
42Refold with native bonds
- Without idealization first, the minimized native structure usually blows up when template bond parameters are used.
- But for docking and homology modeling, we do have input bond information, and it should be used as much as possible.
- With the new_refold algorithm, native bond information is stored and recalled for folding later.
44A good reason to explore flexible-backbone
docking
45Rigid-body Docking does not work all the time
46In order to fit, they have to BEND!!!
Therefore, we know the importance of backbone
flexibility
471ACB Chymotrypsin / Eglin C
Chain E I
Chain A B
48Backbone minimization of this loop
49How best can we do?
50How well did we do?
51Lessons and Plans
- Rigid-body movement seems able to compensate for this loop deviation between bound and unbound.
- Backbone minimization works only when the rigid-body position is right, which is hard to sample.
- Collect more test cases to apply BB/SC minimization with limited RG search.
52Incorporating Side Chain Entropy into an Energy
Function for Protein Design
53Rationale for calculating side chain entropy
- Currently it is being ignored
- Expect that it will change the relative favorability for burying different amino acids
- A MET will lose more entropy when buried than a VAL will
54How to calculate entropy
Repack the protein; while repacking, keep track of how often each rotamer is observed and calculate the average energy for each residue.
R = gas constant; P_i = the probability of the side chain being in rotamer class i
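The slide's equation did not survive in this transcript; given the definitions of R and P_i, it is presumably the usual form S = -R Σ_i P_i ln P_i. A minimal Python sketch under that assumption, estimating P_i from how often each rotamer class was observed during repacking (the function name and the example labels are invented here):

```python
import math
from collections import Counter

R = 1.987e-3  # gas constant in kcal/(mol K)


def rotamer_entropy(observed_rotamers):
    """Side-chain entropy S = -R * sum(P_i * ln P_i) from rotamer counts.

    observed_rotamers: sequence of rotamer-class labels, one per repacking
    snapshot of a single residue.
    """
    counts = Counter(observed_rotamers)
    total = sum(counts.values())
    return -R * sum((c / total) * math.log(c / total) for c in counts.values())


# A buried residue locked in one rotamer vs. an exposed one sampling four.
buried = ["g+"] * 100
exposed = ["g+", "g-", "t", "tt"] * 25

T = 300.0  # Kelvin
print("TS buried :", T * rotamer_entropy(buried))
print("TS exposed:", T * rotamer_entropy(exposed))
```

With four equally populated classes TS is about 0.83 kcal/mol at 300 K, which gives a feel for the size of the term being added to the energy function.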
55Result of entropy calculation (simulation is
based on 2661 proteins, 622,366 residues)
56Design with explicit side chain entropy
1. Pick a random sequence
2. Repack the sequence to calculate <U> and TS
3. Force an amino acid mutation and repack the new sequence to determine the new <U> and TS
4. Evaluate move by comparing new <U> - TS with old <U> - TS
5. Goto (3)
57Modified 2-layer approach
58Thank you !
591et9
LYS
SER
61Computational design of artificial catalytic
protein
- Lin Jiang
- David Baker's lab
- 08/09/04
62Outline
- Inversive Rotamer Tree approach
- (1) Transition State (TS)
- (2) Building inversive rotamer tree around TS
- (3) Generating backbone coordinates for all non-clashing sidechain rotamer combinations
- (4) Identify sites in a set of possible protein scaffolds matching each of the backbone coordinates in step (2)
63Outline
- Inversive Rotamer Tree approach
- (5) For each identified site, computationally designing the amino acid packing around TS while retaining catalytic functional group geometry
- (6) Optimizing placement of TS model in binding site by rigid body docking together with protein design.
- (7) Select lowest energy design with best catalytic site geometry. QM evaluation of designed site and refine it
64(Figure labels)
1. TS Ligand Coordinate
2. Inversive Rotamer Ensemble
3. Polyalanine Chain
4. Placed Vrotamer Ensemble
5. Ranked Vrotamer Ensemble
6. Pocket Sidechain Design (PSD)
Protein Coordinate
65Recover the active site of native enzyme TIM
66More match TIM
Dwyer et al., Science. 2004; 304(5679): 1967-71.
67Recovery of native aldolase
68Design of Unnatural Aldolase
69Acknowledgement
David Baker (PI)
Jens Meiler
Gong Cheng
Brian Kuhlman
Jeffrey J. Gray
Tanja Kortemme
Alexandre Zanghellini
Baker lab members
BMSD program and Dept. of Biochemistry
71Disulfides
Bill Schief
72Pi-pi and aliphatic orientation-dependent scoring
Kira Misura
73Plane orientation score: Pi-pi (F/Y/W), cation-pi (F/Y/W,H/R), proline, hydrophobic
- Cation-pi: partial negative charge on benzene ring, positive charge on lys/arg
- Pi-pi: quadrupole moment, strongest in T-shaped or offset stacked conformation; electrostatics dominate?
- Proline?
- Hydrophobic: capturing something about packing that LJ is missing? Minimization of this term might improve rotamer selection, packing.
74- Get statistics from PDB for native proteins
- Xtal structures, >2.5 Å resolution
- Contact definition: at least 2 heavy atoms within 4.2 Å.
- Decoy set: 70 proteins, 500 decoys each
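The contact definition above can be expressed as a short distance test. The Python sketch below reads "at least 2 heavy atoms within 4.2 Å" as "at least two heavy-atom pairs closer than the cutoff", which is one possible interpretation and not confirmed by the slide; the function name and coordinates are illustrative only.

```python
import math

CONTACT_CUTOFF = 4.2  # Angstrom
MIN_CLOSE_PAIRS = 2   # assumed reading of "at least 2 heavy atoms"


def in_contact(heavy_atoms_a, heavy_atoms_b,
               cutoff=CONTACT_CUTOFF, min_pairs=MIN_CLOSE_PAIRS):
    """True if enough heavy-atom pairs between two side chains are close."""
    close_pairs = sum(1
                      for a in heavy_atoms_a
                      for b in heavy_atoms_b
                      if math.dist(a, b) <= cutoff)
    return close_pairs >= min_pairs


# Two made-up two-atom fragments stacked 3.5 A apart, and one far away.
ring_a = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0)]
ring_b = [(0.0, 0.0, 3.5), (1.4, 0.0, 3.5)]
far_away = [(0.0, 0.0, 10.0), (1.4, 0.0, 10.0)]

print(in_contact(ring_a, ring_b))    # stacked pair
print(in_contact(ring_a, far_away))  # separated pair
```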
75Define distances and orientation between planes in sidechains
(Figure labels: center_center, cross1, cross2, θ, vertical, horizontal)
76FF - offset stacked
FF T-shaped
WP, stacked
WH, stacked
77Compare natives to decoys: 500 decoys from 70 proteins, centroid relax, fullatom relax
FF decoy
FR decoy
FR native
FF native
vertical
horizontal
Blue: 0-30 deg. Green: 30-60 deg. Red: 60-90 deg.
78FH decoy
FP decoy
FP native
FH native
vertical
horizontal
Blue: 0-30 deg. Green: 30-60 deg. Red: 60-90 deg.
79Class R RF,RY,RW,RR
vertical
horizontal
80vertical
horizontal
Class P FP,YP,RP,HP,WP
81Native ranks (% of proteins where native receives score in the top 15%)
Energy gap change between native and decoys: comparison with LJ energy and LJ energy + plane score energy
82Generalized Born Electrostatics
Jim Havranek
83Generalized Born Electrostatics
Implicit water: continuum dielectric.
We want an environment-dependent ES model, but the irregular geometry of the protein-solvent interface makes this hard.
GB is one way to encapsulate the environment of an atom.
84Generalized Born Radii are measures of burial
Born radius for ion
GB model: associate with each atom a concentric sphere that gives an equivalent degree of burial as in the complex protein geometry. (Invert the equation above.)
85Charged interactions
Coulombic
Polarization
where
Limiting behavior: at d = 0, reverts to Born equation; as d approaches ∞, reverts to Coulomb's law
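The equations on this slide were images and are missing from the transcript. The limiting behavior described matches the widely used generalized Born form of Still and co-workers, so the Python sketch below assumes that form; the dielectric constants, the unit conversion factor and the function names are choices made here, not values taken from Rosetta.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal * Angstrom / (mol * e^2)


def f_gb(d, born_i, born_j):
    """Still-type effective distance between atoms i and j.

    d: interatomic distance; born_i, born_j: generalized Born radii.
    """
    rr = born_i * born_j
    return math.sqrt(d * d + rr * math.exp(-d * d / (4.0 * rr)))


def gb_term(q_i, q_j, d, born_i, born_j, eps_in=1.0, eps_out=80.0):
    """One (i, j) element of the GB polarization double sum.

    The total polarization energy is the sum of this term over all ordered
    pairs, including i == j (the self term, evaluated at d = 0).
    """
    tau = 1.0 / eps_in - 1.0 / eps_out
    return -0.5 * COULOMB_CONSTANT * tau * q_i * q_j / f_gb(d, born_i, born_j)


# d = 0 with i == j: f_gb equals the Born radius (Born equation).
print(f_gb(0.0, 2.0, 2.0))
# Large d: f_gb approaches d (Coulomb's law with a screening prefactor).
print(f_gb(100.0, 2.0, 2.0))
```

This also illustrates why the term is not directly pairwise additive: the Born radii fed to f_gb depend on every other atom, so they must be computed in a first pass.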
86GB in Rosetta
Using the method of Onufriev, Bashford, and Case to get generalized Born radii.
Not directly pairwise additive: one pairwise pass gets you radii, another gets you energies.
Design mode: positions with uncertain conformations/identities are given a placeholder sphere for radii calculation.
Code doesn't execute if Wgb_elec = 0.0
87Sequence recovery with GB electrostatics
Good: Asp, Glu, Lys, Arg
Split: Asn good, Gln neutral
Bad: His
Overall: 1.5% better
88Sequence recovery in protein-DNA interfaces
Good: Glu, Lys, Arg
Bad: His, Trp
Overall: 3.0% better
89Summary
GB is working in Rosetta: improves sequence recovery and decoy discrimination; slower than other terms
Under construction: derivatives, weights with ligands
Should be useful for enzyme design / pKa prediction
90FULL ATOM SCORE FUNCTION
SCORE 12 (current fa scorefxn)
For protein type xx:                   hb_lrbb 1.0 (Long range HB)  hb_srbb 0.5 (Short range HB)
For protein type a (alpha):            hb_lrbb 1.0  hb_srbb 0.5
For all others (b and ab):             hb_lrbb 2.0  hb_srbb 0.5
hb_sc      1.0  (Sidechain HB)
fa_atr     1.0  (LJ attractive)
fa_rep     1.0  (LJ repulsive)
fa_dun     1.0  (Dunbrack term)
fa_pair    1.0  (Pair term)
fa_sol     1.0  (LK Solvation term)
fa_prob    0.5  (akin to Rama term)
rama       0.2  (Ramachandran term)
SCORE 13 (NEW fa scorefxn)
For protein type xx:                   hb_lrbb 0.7 (Long range HB)  hb_srbb 0.1 (Short range HB)  hb_sc 0.8 (Side chain HB)
For protein type a (alpha):            hb_lrbb 0.1  hb_srbb 0.4  hb_sc 2.5
For protein type b (beta):             hb_lrbb 1.1  hb_srbb 0.1  hb_sc 0.3
For protein type ab (mixed alpha-beta): hb_lrbb 0.7  hb_srbb 0.1  hb_sc 0.7
fa_atr     1.0  (LJ attractive)
fa_rep     1.0  (LJ repulsive)
fa_dun     0.5  (Dunbrack term)
fa_prob    2.0  (akin to Rama term)
fa_gb_elec 1.3  (Gen Born Electrostatics term)
sasa       0.05 (Surface area)
New solvation term
91CENTROID SCORE FUNCTION
SCORE 6 (previous centroid scorefxn)
For protein type xx:         hb_lrbb 0.5 (Long range HB)  hb_srbb 0.25 (Short range HB)
For protein type a (alpha):  hb_lrbb 0.5   hb_srbb 0.25
For all others (b and ab):   hb_lrbb 0.25  hb_srbb 1.0
vdw     1.0  (VDW attractive)
env     1.0  (Solvation term)
pair    1.0  (Repulsion term)
cb      1.0  (C beta term)
sheet   0.5  (Beta sheet term)
ss      0.5  (sheet-sheet term)
hs      1.0  (helix-sheet term)
rsigma  0.5
rg      1.0  (Radius of gyration)
rama    0.1  (Ramachandran term)
SCORE 6 7 (NEW centroid scorefxn)
For protein type xx:         hb_lrbb 1.3 (Long range HB)  hb_srbb 0.4 (Short range HB)
For protein type a (alpha):  hb_lrbb 1.3  hb_srbb 0.7
For all others (b and ab):   hb_lrbb 1.3  hb_srbb 0.1
vdw     1.0
env     1.0
pair    0.4
cb      1.5
sheet   0.0
ss      0.3
hs      0.0
rsigma  0.0
rg      1.0  (Radius of gyration)
rama    0.1  (Ramachandran term)
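Each score function above is a weighted sum of raw energy terms, with the backbone hydrogen-bond weights chosen by protein type. A minimal Python sketch using the SCORE 12 full-atom weights from the previous slide; the dictionary layout, the function name and the raw term values are illustrative assumptions, not Rosetta data structures.

```python
# SCORE 12 weights shared by all protein types (from the full-atom slide).
SCORE12_WEIGHTS = {
    "hb_sc": 1.0, "fa_atr": 1.0, "fa_rep": 1.0, "fa_dun": 1.0,
    "fa_pair": 1.0, "fa_sol": 1.0, "fa_prob": 0.5, "rama": 0.2,
}

# Backbone hydrogen-bond weights depend on the protein type.
SCORE12_HB_BB = {
    "xx": {"hb_lrbb": 1.0, "hb_srbb": 0.5},
    "a":  {"hb_lrbb": 1.0, "hb_srbb": 0.5},
    "b":  {"hb_lrbb": 2.0, "hb_srbb": 0.5},
    "ab": {"hb_lrbb": 2.0, "hb_srbb": 0.5},
}


def total_score(raw_terms, protein_type="xx"):
    """Weighted sum of raw energy terms for the given protein type."""
    weights = {**SCORE12_WEIGHTS, **SCORE12_HB_BB[protein_type]}
    return sum(weights[name] * value for name, value in raw_terms.items())


# Made-up raw term values for a single structure.
raw = {"hb_lrbb": -2.0, "fa_atr": -10.0, "rama": 5.0}
print(total_score(raw, "xx"))  # long-range HB weighted 1.0
print(total_score(raw, "b"))   # long-range HB weighted 2.0
```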
92Mixed alpha-beta
93All Beta
94All alpha
95Constraint Handling
Carol Rohl
96Restraints
- Types
- NOEs: Distance restraints on atom pairs
- RDCs: Orientational restraints on atom pairs
- Checked for automatically in all Rosetta modes.
- Default filenames: 1pdb_.cst, 1pdb_.dpl
- Command line flags -cst -dpl change the default extensions
- Mode-specific behavior
- Ab initio folding uses modified protocols in the presence of restraints.
- refine mode is a centroid-based refinement primarily for optimizing distance restraints.
97Distance Restraints Format
- Reference: NMRformats.README in the rosetta_documentation archive
- Specify residue number and atom name for two atoms, and lower and upper distance bounds.
- Any IUPAC atom name acceptable and/or CEN for centroid-based constraints.
- Degenerate hydrogens
- H atoms with a shared heavy atom (methyl, methylene, amino)
- Hydrogen number replaced with .
98Evaluation of Restraints
- Side-chain constraints evaluated only in full-atom scoring schemes.
- If NO centroid constraints are provided
- weak centroid constraints are automatically generated and evaluated in centroid scoring schemes
- 15Å upper bound
- Parameters (constraints_ns.cc)
- MAX_CONSTRAINTS: max number of distance restraints
- DEGENERATE_PAD: upper bound pad for degenerate restraints
- GLOBAL_PAD: upper bound pad added to all restraints
- WARNING: PACKER IS RESTRAINT-UNAWARE!!
99Potentials
- Four scoring schemes
- 1. Linear penalty for exceeding upper bound, truncated at 10Å
- 2. Linear penalty without truncation
- 3. Quadratic for small violations, linear for large (CNS potential)
- 4. Quadratic penalty for exceeding upper bound
- Select with score_set_cst_mode(int mode)
- Lower bound matters only for CNS-type potential (mode 3).
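The four schemes can be compared side by side with a small Python sketch. Several details are assumptions made for illustration rather than Rosetta's actual code: "truncated at 10Å" is read as capping the penalty at a 10 Å violation, the quadratic-to-linear switch point of 0.5 Å in mode 3 is a made-up value, and the function name is hypothetical.

```python
def restraint_penalty(distance, upper, mode, lower=0.0,
                      cap=10.0, switch=0.5):
    """Penalty for one distance restraint under scoring modes 1-4.

    cap:    assumed truncation of the mode 1 penalty (Angstrom).
    switch: assumed violation at which mode 3 turns from quadratic to linear.
    """
    over = max(0.0, distance - upper)  # violation of the upper bound
    if mode == 1:
        return min(over, cap)          # linear, truncated
    if mode == 2:
        return over                    # linear, no truncation
    if mode == 3:
        # Only this mode also penalizes falling below the lower bound.
        violation = over if over > 0.0 else max(0.0, lower - distance)
        if violation <= switch:
            return violation ** 2
        # Linear continuation with matching value and slope at the switch.
        return switch ** 2 + 2.0 * switch * (violation - switch)
    if mode == 4:
        return over ** 2               # quadratic
    raise ValueError("mode must be 1, 2, 3 or 4")


for mode in (1, 2, 3, 4):
    print(mode, restraint_penalty(25.0, upper=5.0, mode=mode, lower=2.0))
```

For a 20 Å violation the quadratic mode 4 penalty is far larger than the others, which is the usual reason to prefer a capped or linear form early in folding.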
100Extras
- main_frag_cst_trial
- Fragment insertions with prescreening for fragments that will improve the constraint score
- constraint stage
- Selective evaluation of restraint subsets according to sequence separation.
- Pairs with sequence separation < stage are evaluated.
- Set with score_set_cst_stage(int stage)
- Used primarily in de novo folding.
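The constraint stage acts as a filter on sequence separation. A minimal Python sketch of that selection, with a hypothetical tuple layout (residue i, residue j, lower bound, upper bound) that is not the Rosetta restraint file format:

```python
def active_restraints(restraints, stage):
    """Keep restraints whose residue pair has sequence separation < stage."""
    return [r for r in restraints if abs(r[1] - r[0]) < stage]


# (residue_i, residue_j, lower_bound, upper_bound), made-up values.
restraints = [
    (3, 7, 1.8, 5.0),    # separation 4: local
    (10, 25, 1.8, 6.0),  # separation 15: medium range
    (5, 80, 1.8, 6.0),   # separation 75: long range
]

for stage in (5, 20, 100):
    print(stage, len(active_restraints(restraints, stage)))
```

Raising the stage during a de novo folding run brings in longer-range restraints gradually, after local structure has started to form.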