Title: Protein Folding: Interrelation between Secondary and Tertiary Structure Determination
1. Protein Folding: Interrelation between Secondary and Tertiary Structure Determination
Karl F. Freed
James Franck Institute and Department of Chemistry, University of Chicago
KITPC, Beijing, China, July 29, 2009
2. Human, dog, rat, and worm genome projects obtain genes which code for protein sequences
[Figure: www.ncbi.nlm.nih.gov/Genbank/genbankgrowth.jpg]
[Figure: www.genomesonline.org/images/gold_s1.gif]
3. Proteins: the primary functional biomolecules
- insulin: hormone, glucose levels
- cytochrome c: heme group, electron transfer
- ribonuclease: cleaves RNA
- lysozyme: cleaves carbohydrates
- myoglobin: oxygen storage
- hemoglobin: oxygen transport
- glutamine synthetase: synthesizes glutamine
- antibody: recognizes/targets foreign bodies
Our goal is to determine the folded structures.
Function follows from structure.
4. Protein Folding Problem
Sequence
Folding: 0.00001 to 10 sec
Native state: responsible for function
5. Why haven't we solved the Protein Folding Problem after 40 years?
Mother Folding
- Two aspects
- Predict pathways
- Predict structure
Sequence determines structure
6. Give me an aa sequence and I'll produce a pathway and the final structure. Why so difficult?
- We're not smart enough
- It's a very complex system
7. Give me an aa sequence and I'll produce a pathway and the final structure. Why so difficult?
- Complex problem
- Too many atoms (not enough computing power)
- Force fields inaccurate (pairwise interactions inadequate)
- Complex interplay between secondary and tertiary structure formation (local vs. long-range structure)
- High degree of folding cooperativity
- Averaging doesn't work (no mean field models)
- H2O solvent is difficult to treat
- Don't know all the rules?
- Not enough information?
- Reductionist models (e.g. H/P) often too simple
8. What are the fundamental principles needed to predict pathways and structures?
9. Two aspects of the Protein Folding Problem
- Mechanistic studies: how does it get from the U-state to the N-state?
- Structure prediction: successful when homology is available, but side-steps the question: what are the principles?
10. Why do they fold into specific structures?
11. What level of representation is needed?
- Cα
- Cβ
- Beads-on-a-string
12. Monomer structure and miscibility of polyolefins
K. F. Freed and J. Dudowicz, Adv. Polym. Sci. 183, 63-126 (2005).
13. Major themes and challenges in protein folding
1) Polymer bends in certain ways
2) Satisfy main-chain hydrogen bonds and form secondary structure
3) Bury hydrophobic residues and pack the atoms
Must satisfy 1, 2, and 3 simultaneously
14. Vast Conformational Search: Levinthal Paradox
How does a protein find the time to fold?
Polypeptide backbone is flexible, adopting specific conformations
[Figure: Ramachandran map (φ, ψ axes) showing preferences for the poly-proline II basin, the beta basin, and the helical/turn basin]
And we have to search too!
15. What info is needed to fold proteins? All-atom? Dihedral angles?
All-atom protein + solvent simulation would take decades
16. Reduced representation: side chains are only Cβ
Retains the 3 themes
Big challenge: retain sequence information lost with removal of side chains
17. 1) Proteins bend in certain ways
Sampling in dihedral space: φ-ψ angles and Ramachandran basins
φ and ψ show very strong preferences for certain regions of the φ-ψ map, called Ramachandran basins (due to steric and electrostatic interactions).
[Figure: Ramachandran map with basin labels: ?-basin, PPII-basin (PP2), β, extended, helical, αL, ε]
Where to get this information?
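To make the idea of basins concrete, a (φ, ψ) pair can be mapped to a coarse basin label. The sketch below is only an illustration. The boundary values are rough, assumed cutoffs chosen for readability, and the function name and labels are hypothetical rather than the definitions used in the talk.

```python
def ramachandran_basin(phi, psi):
    """Assign a (phi, psi) pair, in degrees, to a coarse basin label.

    The boundaries are rough assumed values for illustration only.
    """
    if phi >= 0:
        # Positive-phi region; not subdivided in this sketch.
        return "alphaL_or_epsilon"
    if -100 <= psi < 50:
        return "helical"
    if phi < -100:
        return "beta"
    return "PPII"


if __name__ == "__main__":
    for angles in [(-60, -45), (-120, 130), (-70, 145), (60, 45)]:
        print(angles, ramachandran_basin(*angles))
```

Reducing each residue to one of a few basin labels is what turns the continuous dihedral space into a discrete search.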
18. From computer simulations?
But force fields can vary widely (Zaman et al., JMB 2003)
19. Data mine the Protein Data Bank (PDB) of crystal structures
Extract the distributions for each type of amino acid
[Figure: Ramachandran maps (φ, ψ axes) for THR and ALA]
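The data-mining step amounts to building a normalized (φ, ψ) histogram for each residue type. A minimal sketch follows, assuming the dihedral angles have already been extracted from structures into `(residue_type, phi, psi)` tuples. The function names, the 30-degree bin width, and the toy observations are assumptions for illustration.

```python
from collections import Counter, defaultdict


def bin_index(angle, bin_width=30):
    """Map an angle in degrees, in [-180, 180), to an integer bin index."""
    return int((angle + 180) // bin_width)


def ramachandran_distributions(observations, bin_width=30):
    """Build per-residue-type (phi, psi) probability tables.

    observations: iterable of (residue_type, phi, psi) tuples.
    Returns {residue_type: {(phi_bin, psi_bin): probability}}.
    """
    counts = defaultdict(Counter)
    for res, phi, psi in observations:
        key = (bin_index(phi, bin_width), bin_index(psi, bin_width))
        counts[res][key] += 1
    distributions = {}
    for res, counter in counts.items():
        total = sum(counter.values())
        distributions[res] = {b: n / total for b, n in counter.items()}
    return distributions


if __name__ == "__main__":
    # Toy observations, not real PDB data.
    toy = [("ALA", -70, -45), ("ALA", -65, -40), ("ALA", -120, 130),
           ("THR", -120, 130)]
    print(ramachandran_distributions(toy))
```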
20. Ramachandran map of ALA with neighbors ALA, ASP
[Figure panels: ALA-ASP, ALA-ALA]
Map depends on neighbor type and conformation. Strong correlations in sequence.
21. Move-set uses highly selected trimers from PDB
Specifying side chain type and backbone geometry implicitly includes all-atom side chain information
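A trimer-based move could look roughly like the sketch below: pick an interior residue and replace its (φ, ψ) with a pair observed for the central residue of the same three-residue sequence. This is a simplified illustration, not the move-set used in the talk. It conditions only on neighbor identity (not neighbor conformation), and the `library` dictionary and function name are hypothetical.

```python
import random


def trimer_move(sequence, angles, library, rng=random):
    """Return a copy of `angles` with one interior residue resampled.

    sequence: string of one-letter residue codes.
    angles:   list of (phi, psi) tuples, one per residue.
    library:  {trimer string: list of (phi, psi) pairs observed for the
              central residue of that trimer}; a hypothetical structure.
    """
    i = rng.randrange(1, len(sequence) - 1)  # interior positions only
    trimer = sequence[i - 1:i + 2]
    candidates = library.get(trimer)
    new_angles = list(angles)
    if candidates:  # leave the chain unchanged if the trimer is not in the library
        new_angles[i] = rng.choice(candidates)
    return new_angles


if __name__ == "__main__":
    toy_library = {"AAD": [(-60, -45), (-70, 140)]}  # toy values, not PDB data
    start = [(-120, 130)] * 3
    print(trimer_move("AAD", start, toy_library, random.Random(0)))
```

Because the candidate angles come from real trimers with the same side chain types, each move carries sequence-specific backbone preferences even though the model itself has no explicit side chains beyond Cβ.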
22. Knowledge-based energy function: assign interaction energies according to the observed distances in the PDB
$P_{\mathrm{PDB}}(r_{ij})$: probability of finding 2 atoms some distance apart, e.g. Dist(Cα(Ala), Cα(Val)).
$E_{\mathrm{PDB}}(r_{ij}) = -\ln P_{\mathrm{PDB}}(r_{ij})$
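The negative-log rule above can be illustrated with a few lines of code that bin observed distances and convert bin probabilities into energies. This is a minimal sketch. The 1 Å bin width, the function name, and the toy distances are assumptions, and like the slide's formula it applies no reference-state normalization, which practical statistical potentials usually include.

```python
import math
from collections import Counter


def statistical_potential(distances, bin_width=1.0):
    """Energy per distance bin, E = -ln(P), from observed distances in Å.

    Only bins that contain at least one observation appear in the result.
    """
    counts = Counter(int(d // bin_width) for d in distances)
    total = sum(counts.values())
    return {b: -math.log(n / total) for b, n in counts.items()}


if __name__ == "__main__":
    # Toy Cα-Cα distances for one residue-type pair, not real PDB data.
    toy_distances = [5.2, 5.7, 5.9, 8.1]
    for b, energy in sorted(statistical_potential(toy_distances).items()):
        print(f"{b}-{b + 1} Å: E = {energy:.3f}")
```

Frequently observed distances get low energies and rare distances get high energies, so the potential rewards geometries that resemble those seen in crystal structures.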
23. Secondary structure prediction methods
SASA,
24. What prevents accuracy of secondary structure prediction from reaching 90%?
Secondary structure often depends on long-range interactions, i.e. tertiary structure.
This is supported by the following studies:
- The same fragment from different parts of protein G forms varying secondary structures: Minor and Kim (1996)
- Secondary structure prediction accuracy decreases with increasing contact order: Kihara (2005)
- The same sequence fragment can be found in multiple native secondary structure types: Pan et al. (1999), Jacobini et al. (2000), Zhou et al. (2000), Ikeda and Higo (2006)
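Contact order measures how far apart in sequence the contacting residues are, so it is a natural proxy for how much a structure depends on long-range interactions. The sketch below uses the commonly used definition of relative contact order (mean sequence separation of contacts divided by chain length). The function name and the toy contact list are assumptions for illustration.

```python
def relative_contact_order(contacts, n_residues):
    """Relative contact order: mean |i - j| over contacts, divided by chain length.

    contacts:   list of (i, j) residue-index pairs that are in contact.
    n_residues: total number of residues in the chain.
    """
    if not contacts:
        return 0.0
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (len(contacts) * n_residues)


if __name__ == "__main__":
    toy_contacts = [(1, 5), (2, 10), (3, 4)]  # toy data, not from a real structure
    print(relative_contact_order(toy_contacts, n_residues=10))
```

A helix-rich protein with mostly local contacts gives a low value, while a β-rich protein with strands pairing across the chain gives a higher one.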
25. What we do differently
Couple secondary and tertiary structure during the folding process
Restrict possible secondary structure as the chain folds
"Eliminate all other factors, and the one which remains must be the truth."
A. C. Doyle, The Sign of the Four (1890)
26. Trimer library: full PDB
27. Iterations mimic steps in folding pathway
[Figure, panel B: major pathway from the unfolded state; elements labeled β1, β2, helix, β3, 3₁₀, β4, β5; frequency axis 0 to 1 for Round 0 and Round 1]
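The iteration idea, rounds of folding that progressively restrict the secondary structure each residue may adopt, could be sketched as the loop below. This is a schematic illustration under stated assumptions, not the algorithm from the talk. The `run_round` callable stands in for an ensemble of folding simulations and is hypothetical, as are the 0.05 elimination threshold, the H/E/C labels, and the function names.

```python
def iterative_fixing(allowed, run_round, min_freq=0.05, max_rounds=10):
    """Iteratively eliminate rarely observed secondary-structure options.

    allowed:   list of sets, the labels each residue may currently adopt.
    run_round: hypothetical callable; given `allowed`, it returns one dict per
               residue mapping label -> frequency observed in that round.
    """
    allowed = [set(options) for options in allowed]  # work on a copy
    for _ in range(max_rounds):
        frequencies = run_round(allowed)
        changed = False
        for options, freq in zip(allowed, frequencies):
            rare = {s for s in options if freq.get(s, 0.0) < min_freq}
            if rare and len(rare) < len(options):  # never remove the last option
                options -= rare
                changed = True
        if not changed:  # converged: nothing left to eliminate
            break
    return allowed


if __name__ == "__main__":
    def toy_round(allowed):
        # Toy frequencies: residue 0 is strongly helical, residue 1 undecided.
        return [{"H": 0.97, "E": 0.01, "C": 0.02},
                {"H": 0.40, "E": 0.30, "C": 0.30}]

    print(iterative_fixing([{"H", "E", "C"}, {"H", "E", "C"}], toy_round))
```

Each round shrinks the search space for the next one, which is the sense in which successive rounds can mimic successive steps along a folding pathway.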
28. [Figure: panels A, B, C plotting energy versus Cα RMSD for 1af7, 1r69, and 1ubq]
29. [Figure: secondary structure frequency (0 to 1) versus residue index for 1af7, 1di2, 1r69, and 1b72A, shown for successive rounds from Round 0 up to Round 8]
31. 1AF7: 3.4 Å RMSD
1TIF: 5.4 Å RMSD
1SAP: 7.8 Å RMSD
32. Novel Aspects
- Predict 2° and 3° structure without using homology
- Use principles of protein structure and folding
- Couple 2° and 3° structure formation
- Sequential stabilization
- Iterative fixing to reduce the search
- Use a Cβ representation
- Potential function: orientational and 2° structure dependence in a Cβ model
- Q8-level 2° structure and (φ, ψ) prediction can outperform PSIPRED
- Outputs pathway information
33. Conclusions
34. Acknowledgements
- Prof. Tobin Sosnick, Joe DeBartolo: ab initio folding, secondary structure prediction
- Dr. Andres Colubri: folding simulations, software
- Dr. Abhishek Jha (MIT): coil library, unfolded state, α, β propensities, structure refinement, electrostatics
- James Fitzgerald (Stanford): statistical potentials, torsional dynamics
- Prof. M. Zaman (UT Austin): simulations on peptides
- All-atom statistical potential: Dr. Min-yi Shen, Prof. A. Sali (UCSF)
- Funding: NIH, NSF, Burroughs Wellcome Fund