Title: Building 3D Models of Proteins
1Building 3D models of proteins: fold
recognition / threading
2Why make a structural model for your protein?
- The structure can provide clues to the function
through structural similarity with other proteins
- With a structure it is easier to guess the
location of active sites
- With a structure we can plan more precise
experiments in the lab
- We can apply docking algorithms to the structures
(both with other proteins and with small
molecules)
3Protein Modeling Methods
- Ab initio methods: solution of a protein folding problem; search in conformational space
- Energy-based methods: energy minimization; molecular simulation
- Knowledge-based methods: homology modeling; fold recognition / threading
4Why do we need Ab Initio Methods? (data taken from
PDB: http://www.rcsb.org/pdb/holdings.html)
For new folds and for sequences with very little
sequence homology (< 15%)
5Protein Modeling Methods
- Ab initio methods: solution of a protein folding problem; search in conformational space
- Energy-based methods: energy minimization; molecular simulation
- Knowledge-based methods: homology modeling; fold recognition
6Predicting Protein Structure: Threading / Fold
Recognition
Fold recognition is essentially finding the best
fit of a sequence to a set of candidate folds
7The Threading Problem
- Find the best way to mount the residue sequence
of one protein on a known structure taken from
another protein
8Why is it called threading?
- Threading a specific sequence through all known folds
- For each fold, estimate the probability that the sequence can have that fold
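The idea of threading one sequence through every fold in a library and ranking the folds can be sketched as follows. This is a minimal illustration, not a real threading program: the buried/exposed environment strings, the `HYDROPHOBIC` set, the gapless placement, and the softmax used to turn scores into probability-like weights are all assumptions made for the example.

```python
import math

# Assumed toy classification of hydrophobic residues (one-letter codes).
HYDROPHOBIC = set("AVILMFWC")


def position_score(residue, environment):
    """+1 if the residue type suits the environment ('B' buried, 'E' exposed), else -1."""
    buried = environment == "B"
    return 1 if (residue in HYDROPHOBIC) == buried else -1


def thread_score(sequence, fold_env):
    """Best gapless placement of the sequence along the fold; None if the fold is too short."""
    n, m = len(sequence), len(fold_env)
    if n > m:
        return None
    return max(
        sum(position_score(res, fold_env[offset + i]) for i, res in enumerate(sequence))
        for offset in range(m - n + 1)
    )


def rank_folds(sequence, library):
    """Return (fold name, score, weight) tuples, best fold first.

    The weights are a softmax over the scores, a hypothetical stand-in
    for 'the probability that the sequence can have that fold'.
    """
    scores = {}
    for name, env in library.items():
        score = thread_score(sequence, env)
        if score is not None:
            scores[name] = score
    top = max(scores.values())
    weights = {name: math.exp(score - top) for name, score in scores.items()}
    total = sum(weights.values())
    ranked = [(name, scores[name], weights[name] / total) for name in scores]
    return sorted(ranked, key=lambda item: -item[1])


if __name__ == "__main__":
    library = {"fold_a": "BBEEBB", "fold_b": "EEEEEE"}  # hypothetical folds
    for name, score, weight in rank_folds("VLKDAI", library):
        print(name, score, round(weight, 3))
```

Real fold-recognition methods use much richer scoring terms and allow gaps; the sketch only shows the loop over all folds and the per-fold estimate.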
9Threading: Basic Strategy
[Figure: a query sequence, shown as the placeholder string dhgakdflsdfjaslfkjsdlfjsdfjasd]
10Protein Threading
[Figure: Protein A and Protein B share the conserved core segments I, J, K and L]
12Input/Output of Protein Threading
[Figure: inputs to the THREADING step]
- Core segments C1..Cm
- Amino acid sequence a1..an
- Pairwise amino acid scoring function g()
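The three inputs listed above can be combined in a small brute-force sketch: place the core segments on the sequence in order, without overlap, and keep the placement with the best total pairwise score. Everything concrete here is an assumption for illustration: the toy `g`, the `contacts` list describing which core positions touch in the template, and the convention that a higher score is better. Exhaustive enumeration is only feasible for tiny inputs.

```python
# Assumed toy classification of hydrophobic residues (one-letter codes).
HYDROPHOBIC = set("AVILMFWC")


def g(a, b):
    """Hypothetical pairwise score: +1 hydrophobic pair, -1 mixed pair, 0 otherwise."""
    ha, hb = a in HYDROPHOBIC, b in HYDROPHOBIC
    if ha and hb:
        return 1
    if ha != hb:
        return -1
    return 0


def threadings(seq_len, core_lengths, idx=0, min_start=0):
    """Yield every tuple of start positions that places the cores in order without overlap."""
    if idx == len(core_lengths):
        yield ()
        return
    remaining = sum(core_lengths[idx:])  # residues still needed by this and later cores
    for start in range(min_start, seq_len - remaining + 1):
        next_min = start + core_lengths[idx]
        for rest in threadings(seq_len, core_lengths, idx + 1, next_min):
            yield (start,) + rest


def threading_score(sequence, starts, contacts, score=g):
    """Sum the pairwise score over template contacts ((core, offset), (core, offset))."""
    return sum(
        score(sequence[starts[ci] + oi], sequence[starts[cj] + oj])
        for (ci, oi), (cj, oj) in contacts
    )


def best_threading(sequence, core_lengths, contacts, score=g):
    """Return the start positions with the highest total pairwise score."""
    return max(
        threadings(len(sequence), core_lengths),
        key=lambda starts: threading_score(sequence, starts, contacts, score),
    )


if __name__ == "__main__":
    sequence = "KVLDKAIE"
    core_lengths = [2, 2]                              # two hypothetical core segments
    contacts = [((0, 0), (1, 1)), ((0, 1), (1, 0))]    # assumed template contacts
    starts = best_threading(sequence, core_lengths, contacts)
    print(starts, threading_score(sequence, starts, contacts))
```

In this toy case the two cores land on the two hydrophobic pairs of the sequence (start positions 1 and 5), where both contacts score +1.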
13Fold recognition (Threading)
The sequence + known protein folds -> structural model
14Input
- Sequence
[Legend: H-bond donor, H-bond acceptor, glycine, hydrophobic]
- Library of folds of known proteins
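15[Figure: the sequence placed on folds from the library. Legend: H-bond donor, H-bond acceptor, glycine, hydrophobic. Score labels: S = -2, S = 20, S = 5. Z labels: Z = 5, Z = -1, Z = 1.5]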
16[Figure: axes are amino acid type and position on sequence]
17Fold recognition / Threading
- Disadvantages
- Threading methods seldom lead to the alignment quality that is needed for homology modeling.
- Less than 30% of the predicted first hits are true remote homologues (PredictProtein).
18Threading resources
- TOPITS: heuristic threader, part of a larger structure prediction system
- 3D-PSSM: integrated system, does its own MSA and secondary structure predictions and then threading
- GenThreader: similar to 3D-PSSM
19Side chain construction
In homology modelling, the side chains are built
from the template structures when there is high
similarity between the protein being modelled and
the templates.
Without such similarity, the side chains can be
built using rotamer libraries.
The score is a compromise between the probability
of the rotamer and how well it fits at the specific
position. Comparing the scores of all the rotamers
for a given amino acid determines the preferred
rotamer.
Despite the huge size of the problem (each side
chain influences its neighbours), there are quite
successful algorithms for it.
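The trade-off between rotamer probability and local fit can be sketched for a single residue as below. The scoring form (negative log of the library probability plus a weighted clash penalty), the `weight` parameter, and the example numbers are assumptions for illustration, not values from any real rotamer library. The sketch also ignores the coupling between neighbouring side chains, which is what makes the full problem hard.

```python
import math


def rotamer_score(probability, clash_energy, weight=1.0):
    """Lower is better: -log(library probability) plus a weighted fit penalty."""
    return -math.log(probability) + weight * clash_energy


def preferred_rotamer(rotamers, weight=1.0):
    """Pick the rotamer with the lowest combined score for one position."""
    return min(
        rotamers,
        key=lambda r: rotamer_score(r["probability"], r["clash_energy"], weight),
    )


if __name__ == "__main__":
    # Hypothetical candidates for one residue: library probability and
    # an assumed steric penalty at this position (arbitrary units).
    candidates = [
        {"name": "mt", "probability": 0.6, "clash_energy": 3.0},
        {"name": "tp", "probability": 0.3, "clash_energy": 0.2},
        {"name": "pp", "probability": 0.1, "clash_energy": 0.0},
    ]
    print(preferred_rotamer(candidates)["name"])
```

With these made-up numbers the most probable rotamer loses because it clashes, and setting `weight=0.0` falls back to the most probable rotamer regardless of fit.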
22Ab initio
The sequence -> structural model
23Ab initio methods for modelling
This field is of great theoretical interest but,
so far, of very little practical application.
Here there is no use of sequence alignments and
no direct use of known structures.
The basic idea is to build an empirical function
that simulates real physical forces and the
potentials of chemical contacts.
If we had a perfect function and were able to
scan all the possible conformations, we would be
able to detect the correct fold.
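The "score every conformation and keep the best" idea can be made concrete only for a drastically simplified model. The sketch below assumes a two-letter hydrophobic/polar (HP) alphabet and a chain on a 2D square lattice, with an energy of -1 per non-bonded H-H contact. These simplifications are chosen here to make exhaustive enumeration possible; they are not a realistic energy function.

```python
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def self_avoiding_walks(n):
    """Yield every self-avoiding walk of n lattice points starting at the origin."""
    def extend(path, occupied):
        if len(path) == n:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                yield from extend(path, occupied)
                path.pop()
                occupied.remove(nxt)

    yield from extend([(0, 0)], {(0, 0)})


def hp_energy(sequence, path):
    """-1 for each pair of H residues adjacent on the lattice but not along the chain."""
    index = {pos: i for i, pos in enumerate(path)}
    energy = 0
    for i, (x, y) in enumerate(path):
        if sequence[i] != "H":
            continue
        for dx, dy in ((1, 0), (0, 1)):  # look right and up so each contact counts once
            j = index.get((x + dx, y + dy))
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                energy -= 1
    return energy


def best_fold(sequence):
    """Scan all conformations and return one with the lowest energy."""
    return min(
        self_avoiding_walks(len(sequence)),
        key=lambda path: hp_energy(sequence, path),
    )


if __name__ == "__main__":
    seq = "HPPH"
    fold = best_fold(seq)
    print(fold, hp_energy(seq, fold))
```

The number of conformations grows exponentially with chain length, so this brute-force scan stops being practical after a few dozen residues even on a lattice.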
24Predicting Protein Structure: Ab Initio Methods
[Figure: flowchart with the labels Sequence, Prediction, Secondary structure, Low energy structures, Predicted structure, Validation, Energy minimization, Mean field potentials]
25Ab initio Methods
- Simplified models: simplified alphabet (HP); simplified representation (lattice)
- Build-up techniques
- Deterministic methods: quantum mechanics; diffusion equations
- Stochastic searches: Monte Carlo; genetic algorithms
26Rosetta approach
- Rosetta (David Baker): consistently outstanding performer in the last two CASPs
- Integrated method
- I-Sites: much finer-grained substructures than secondary structures. A library of all the structures in which each 9-residue amino acid segment (9-mer) is found (taken from PDB)
- Heuristic global energy function to estimate the quality of folds
- Monte Carlo search through assignments of I-Sites to minimize the energy function
- Also HMMSTR, an HMM-driven method for assigning I-Sites
27Rosetta prediction method
- Define a global scoring function that estimates the probability of a structure given a sequence
- Generate a version of I-Sites with fixed-length subsequences (9 amino acids)
- Calculate P(I-Site | sequence) for all sequences and I-Sites
- Generate structures by Monte Carlo sampling of assignments of fixed-size I-Sites to subsequences
- End up with an ensemble of plausible structures
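The Monte Carlo sampling step can be illustrated with a toy fragment-insertion search. This is not Rosetta's scoring function or its I-Sites library: the three-state conformation string (H, E, L), the `PREFERRED` table, `toy_energy`, the 3-residue fragments and the Metropolis acceptance rule are all assumptions made to keep the sketch short.

```python
import math
import random

# Hypothetical per-residue state preferences (H helix, E strand, L loop).
PREFERRED = {"A": "H", "L": "H", "E": "H", "V": "E", "I": "E", "T": "E", "G": "L", "P": "L"}


def toy_energy(sequence, conformation):
    """Assumed score: 1 per residue in a non-preferred state, 0.5 per state change."""
    mismatches = sum(
        1 for res, state in zip(sequence, conformation)
        if PREFERRED.get(res, state) != state
    )
    switches = sum(1 for a, b in zip(conformation, conformation[1:]) if a != b)
    return mismatches + 0.5 * switches


def fragment_monte_carlo(sequence, fragments, energy, steps=2000, temperature=1.0, seed=0):
    """Metropolis search over fragment insertions; returns (best conformation, its energy).

    fragments[i] lists the candidate fragments for the window starting at residue i.
    """
    rng = random.Random(seed)
    frag_len = len(fragments[0][0])
    current = ["L"] * len(sequence)  # start from an all-loop chain
    current_e = energy(sequence, current)
    best, best_e = list(current), current_e
    for _ in range(steps):
        start = rng.randrange(len(sequence) - frag_len + 1)
        trial = list(current)
        trial[start:start + frag_len] = rng.choice(fragments[start])
        trial_e = energy(sequence, trial)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if trial_e <= current_e or rng.random() < math.exp((current_e - trial_e) / temperature):
            current, current_e = trial, trial_e
            if current_e < best_e:
                best, best_e = list(current), current_e
    return "".join(best), best_e


if __name__ == "__main__":
    seq = "ALEVITGP"
    library = [["HHH", "EEE", "LLL"] for _ in range(len(seq) - 2)]  # hypothetical fragments
    # Independent runs with different seeds give a small ensemble of candidates.
    ensemble = [fragment_monte_carlo(seq, library, toy_energy, seed=s) for s in range(5)]
    for conformation, e in ensemble:
        print(conformation, e)
```

Running several independent trajectories, as in the usage lines, mirrors the last bullet above: the result is an ensemble of plausible candidates, not a single answer.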
28Rosetta is way ahead
- CASP 4 results.
- CASP 5 similar, but not as dramatic.
29Fully automated predictions
- CAFASP-2
- Meta-servers work best
- Integrate predictions from several other servers
- Significantly better predictions than any individual approach
- Several public meta-servers available
- http://bioinfo.pl/Meta/ is best all-around