Title: Protein Structure
1Protein Structure
- Nimrod Rubinstein
- Bioinformatics Seminar
2Protein Synthesis
- Attachment of the correct amino acids (AAs) to their corresponding tRNAs.
- Initiation: forming the initiation complex.
- Elongation: sequentially forming peptide bonds.
- Termination: synthesis is terminated and the polypeptide is released.
3From Sequence to Structure
- Structure Hierarchies
- Primary structure: the sequence of AAs covalently bound along the backbone of the polypeptide chain.
Figure: backbone of a Gly-Ala-Cys tripeptide (N, Cα, C and O atoms) annotated with its backbone dihedral angles φ and ψ; -180° ≤ φ ≤ 180°, -180° ≤ ψ ≤ 180°.
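The backbone dihedral angles can be computed directly from atomic coordinates. The following is a minimal sketch, assuming coordinates are given as (x, y, z) tuples in Å; the function name `dihedral` and its helpers are illustrative, not taken from the slides. For residue i, φ is the dihedral over C(i-1), N(i), Cα(i), C(i), and ψ is the dihedral over N(i), Cα(i), C(i), N(i+1).

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond, in [-180, 180]."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

Calling `dihedral` with the four atoms listed above for φ or ψ gives the angle in the -180° to 180° range shown in the figure.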
4From Sequence to Structure
- Structure Hierarchies
- Secondary structure: the local conformation of some part of the polypeptide.
- α helix
- β sheet (parallel or antiparallel)
5From Sequence to Structure
- Structure Hierarchies
- Tertiary structure: the overall 3-dimensional arrangement of all the atoms in the protein.
6From Sequence to Structure
- Structure Hierarchies
- Quaternary structure: some proteins contain two or more separate polypeptide chains, which may be identical or different.
Figure panels: globular protein, fibrous protein.
7From Sequence to Structure
- Additional Parameters
- Surface accessibility
- The surface area of the molecule that is exposed to the solvent, derived from the complete structure.
- VDW (van der Waals) surface: the surface area of an atom.
- Connolly surface: the interface between the molecule and the solvent sphere (conventionally with r = 1.4 Å).
- Solvent accessible surface (SAS): the path of the center of the solvent sphere rolled over the VDW surface.
- Relative accessibility = SAS / maxSAS
- maxSAS = SAS(Gly-X-Gly), the SAS of residue X in a Gly-X-Gly tripeptide.
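The rolling-sphere definition of the SAS can be approximated numerically. The sketch below uses a Shrake-Rupley-style point test, a common technique that is not named in the slides: each atom's sphere is expanded by the probe radius, test points are placed on it, and only the points not buried inside a neighbor's expanded sphere are counted. The function names, the number of test points, and the atom format `(x, y, z, vdw_radius)` are assumptions made for illustration.

```python
import math


def sphere_points(n):
    """Roughly uniform points on the unit sphere (golden-spiral construction)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        pts.append((r * math.cos(golden * i), r * math.sin(golden * i), z))
    return pts


def solvent_accessible_area(atoms, probe=1.4, n_points=200):
    """Per-atom SAS in square Angstroms; atoms are (x, y, z, vdw_radius) tuples."""
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + probe  # radius traced by the probe center
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big_r * ux, yi + big_r * uy, zi + big_r * uz
            buried = False
            for j, (xj, yj, zj, rj) in enumerate(atoms):
                if j == i:
                    continue
                d2 = (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2
                if d2 < (rj + probe) ** 2:
                    buried = True
                    break
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * big_r * big_r * exposed / n_points)
    return areas


def relative_accessibility(sas, max_sas):
    """SAS / maxSAS, where max_sas is the SAS of the residue in Gly-X-Gly."""
    if max_sas <= 0:
        raise ValueError("max_sas must be positive")
    return sas / max_sas
```

A per-residue SAS is the sum of the per-atom areas of that residue. The maxSAS reference values have to come from a published Gly-X-Gly table, which is not reproduced here.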
8From Sequence to Structure
- Additional Parameters
- Coordination number
- The number of structure-stabilizing contacts each residue in the structure makes.
- Computation: encapsulating an AA with a sphere, centered at the residue's center of mass, and counting the number of residues falling inside this sphere.
- Usually done with different cutoff radii.
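The counting procedure above can be sketched in a few lines. This assumes each residue is represented by the coordinates of its center of mass and that a neighbor counts as inside the sphere when its own center of mass lies within the cutoff; the function names and the example radii are illustrative.

```python
import math


def coordination_numbers(centers, cutoff):
    """For each residue center, count the other centers within `cutoff` Angstroms."""
    counts = []
    for i, ci in enumerate(centers):
        n = 0
        for j, cj in enumerate(centers):
            if i != j and math.dist(ci, cj) <= cutoff:
                n += 1
        counts.append(n)
    return counts


def coordination_profile(centers, radii):
    """Coordination numbers for several cutoff radii, keyed by radius."""
    return {r: coordination_numbers(centers, r) for r in radii}
```

For example, `coordination_profile(centers, [4.0, 7.0])` returns one list of counts per cutoff radius.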
9From Sequence to Structure
The Levinthal paradox (Levinthal C., J. Chim. Phys., 1968)
Assume a protein is composed of 100 AAs. Assume each AA's backbone can take up 10 different conformations, defined by its φ and ψ values. Altogether we get 10^100 conformations.
If each conformation were sampled in the shortest possible time (the time of a molecular vibration, about 10^-13 s), it would take an astronomical amount of time (about 10^87 s, on the order of 10^79 years) to sample all possible conformations in order to find the Native State.
The problem is NP-complete (NPC) even in the 2D case.
Luckily, nature copes with these sorts of numbers, and the correct conformation of a protein is reached within seconds.
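The arithmetic behind the paradox is easy to reproduce. The sketch below uses the slide's figures as defaults (100 residues, 10 conformations per residue, 10^-13 s per sample). The function name and the use of a Julian year for the conversion are choices made here.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year


def levinthal_years(n_residues=100, conformations_per_residue=10,
                    seconds_per_sample=1e-13):
    """Years needed to sample every backbone conformation exhaustively."""
    total_conformations = conformations_per_residue ** n_residues
    total_seconds = total_conformations * seconds_per_sample
    return total_seconds / SECONDS_PER_YEAR
```

With the default values, the search takes 10^87 s, which is between 10^79 and 10^80 years.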
10From Sequence to Structure
- Folding Models
- The Backbone-Centric view
- Sequence-order-dependent interactions (φ/ψ propensities and H-bonds) produce local secondary structure elements (SSEs).
- Local SSEs later undergo longer-range interactions to form supersecondary structures.
- Supersecondary structures of ever-increasing complexity thus grow, ultimately into the native conformation.
11From Sequence to Structure
- Folding Models
- The Sidechain-Centric view
- Hydrophobic sidechain interactions are the strongest for AAs in a water solution.
- A few key hydrophobic residues are responsible for a hydrophobic collapse to the molten globule state.
- The molten globule might not include SSEs, yet the remainder of the polypeptide chain condenses around this structure.
- The conformation space is viewed as funnel shaped.
Figure label: molten globule states.
12From Sequence to Structure
- Folding Models
- The Sidechain-Centric view: larger proteins
- Intermediate states exist, which are highly populated.
- These states may assist in finding the Native Structure or may serve as traps that inhibit the folding process.
- Structurally aligning intermediate states against the SCOP database found the corresponding Native Structures to have the highest scores.
- But many native features were missing or poorly formed:
- Well-defined SSEs.
- A well-formed hydrophobic core.
- RMSDs to the Native Structure were high (7-10 Å).
- Dobson C. M., TRENDS in Biochemical Sciences, Jan 2005
13From Sequence to Structure
- Folding Models
- Post-translational vs. co-translational folding
14Determining the Structure
- Crystallization
- Assembling a solution of protein molecules into a periodic lattice.
- X-Ray Diffraction
- The crystal is bombarded with X-ray beams.
- The scattering of the beams by the electrons creates a diffraction pattern.
- The diffraction pattern is transformed into an electron density map of the protein, from which the 3D locations of the atoms can be deduced.
15Determining the Structure
- Nuclear Magnetic Resonance (NMR)
- A solution of the protein is placed in a magnetic field.
- Nuclear spins align parallel or anti-parallel to the field.
- Radio-frequency (RF) pulses of electromagnetic energy shift the spins from their alignment.
- When the radiation stops, the spins re-align while emitting the energy they absorbed.
- The emission spectrum contains information about the identity of the nuclei and their immediate environment.
- The result is an ensemble of models rather than a single structure.
16Structure Similarity
- Protein Families
- Structures seem to be conserved much more than sequences, which is readily explained by neutral mutations.
Figure: rigid Cα alignment, RMSD 1.26 Å
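The RMSD quoted for a rigid alignment is the root-mean-square distance between corresponding Cα atoms. The sketch below assumes the two coordinate lists are already paired atom by atom and already superimposed; finding the optimal rigid superposition is a separate step that is not shown. The function name is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```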
17Structure Similarity
- Protein Families
- Structures seem to be conserved much more than sequences, which is readily explained by neutral mutations.
- Structural biologists claim that there is a limited number of ways in which protein domains fold. There may be as few as 2000 different folds (differing by their backbone topology).
- Nearly 1000 different folds have already been resolved.
18Structure Prediction
- Homology (Comparative) Modeling
- Guideline: at least 30% sequence identity is needed between probe and template.
- Template Assignment: creating a robust probe-template alignment (PWA/MSA).
- Model Construction
- Generation of coordinates for conserved segments: superimposing/averaging/restraint-based.
- Generation of coordinates for variable segments: DB scanning/ab initio/restraint-based.
- Generation of coordinates for sidechain atoms: superimposing/rotamer libraries/restraint-based.
- Model Evaluation
- Assessment of the ability to functionally identify the active site of the model.
- Assessment of the physico-chemical or structural environment, based on statistical analyses of DBs for characteristics such as:
- Intramolecular packing.
- Bond geometry.
- Solvent accessibility.
Peitsch et al. (1999)
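The sequence-identity guideline can be checked on an existing probe-template alignment. The sketch below assumes two equal-length aligned strings with "-" as the gap character, and it measures identity over the columns where both sequences have a residue. Other denominators are also in use, so this is one convention among several, and the function names are illustrative.

```python
def percent_identity(aligned_probe, aligned_template, gap="-"):
    """Percent of identical residues over the columns aligned without gaps."""
    if len(aligned_probe) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    matches = 0
    for a, b in zip(aligned_probe, aligned_template):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        aligned += 1
        if a == b:
            matches += 1
    return 100.0 * matches / aligned if aligned else 0.0


def meets_identity_guideline(aligned_probe, aligned_template, threshold=30.0):
    """True if the alignment reaches the identity threshold (default 30%)."""
    return percent_identity(aligned_probe, aligned_template) >= threshold
```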
19Structure Prediction
- Threading (Sequence-Structure Alignment)
- Identifying evolutionarily unrelated proteins that have converged to similar folds.
- Scoring Scheme: describes the propensity of each AA for its structural/physico-chemical environment (SS type, solvent accessibility, coordination number, etc.).
- Profile construction: encoding the structural features of the template's AAs into a 1D profile and predicting such a profile for the probe.
- Threading Algorithm: aligning the 1D profiles of the template and the probe using dynamic programming (DP) and the defined scoring scheme.
- But: no adjustments to the template profile can be made, thus substantial rearrangements are ignored.
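The DP step of threading can be illustrated with a global alignment of two 1D profiles. In this sketch each profile is a string of per-residue environment labels (for example H, E and C for secondary-structure type), and the scoring function and gap penalty are placeholders. A real threading score would be derived from AA propensities for each environment, which the slides do not specify.

```python
def align_profiles(probe, template, score, gap=-2.0):
    """Global DP alignment of two 1D profiles; returns (score, probe_aln, template_aln)."""
    n, m = len(probe), len(template)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(probe[i - 1], template[j - 1]),
                dp[i - 1][j] + gap,   # gap in the template
                dp[i][j - 1] + gap,   # gap in the probe
            )
    # Trace back one optimal path from the bottom-right corner.
    i, j = n, m
    a, b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(probe[i - 1], template[j - 1]):
            a.append(probe[i - 1]); b.append(template[j - 1]); i -= 1; j -= 1
        elif i > 0 and (j == 0 or dp[i][j] == dp[i - 1][j] + gap):
            a.append(probe[i - 1]); b.append("-"); i -= 1
        else:
            a.append("-"); b.append(template[j - 1]); j -= 1
    return dp[n][m], "".join(reversed(a)), "".join(reversed(b))


def simple_score(x, y):
    """Placeholder scoring scheme: +1 for matching labels, -1 otherwise."""
    return 1.0 if x == y else -1.0
```

For example, `align_profiles("HHEE", "HHCEE", simple_score)` places a single gap in the probe opposite the template's C position.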
20Structure Prediction
- Ab Initio Techniques
- Simulating the folding process
- Simplifying the energy landscape
- Reducing the number of degrees of freedom
- Representing a group of atoms by a single atom.
- Reducing the number of atom interactions.
- Sampling the conformation space
- Monte Carlo sampling.
- Genetic Algorithm.
- Simulated Annealing.
- Hierarchical folding simulation.
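Sampling the conformation space with Monte Carlo moves and simulated annealing can be sketched on a toy problem. The code below applies the Metropolis acceptance rule with a geometrically decreasing temperature. The energy function is a made-up stand-in that rewards dihedral angles near arbitrary target values; it is not a physical force field, and all names and parameter values are assumptions for illustration.

```python
import math
import random


def simulated_annealing(energy, state, propose, t_start=2.0, t_end=0.01,
                        steps=5000, rng=None):
    """Metropolis sampling with geometric cooling; returns (best_state, best_energy)."""
    rng = rng or random.Random(0)
    current, e_cur = state, energy(state)
    best, e_best = current, e_cur
    for k in range(steps):
        # Temperature decays geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (k / max(steps - 1, 1))
        cand = propose(current, rng)
        e_cand = energy(cand)
        delta = e_cand - e_cur
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, e_cur = cand, e_cand
            if e_cur < e_best:
                best, e_best = current, e_cur
    return best, e_best


TARGET = (-60.0, -45.0, -60.0, -45.0)  # hypothetical target angles, degrees


def toy_energy(angles):
    """Toy energy: zero when every angle equals its target, at most 2 per angle."""
    return sum(1.0 - math.cos(math.radians(a - t)) for a, t in zip(angles, TARGET))


def perturb_one_angle(angles, rng):
    """Monte Carlo move: nudge one randomly chosen angle, wrapped to [-180, 180)."""
    i = rng.randrange(len(angles))
    new = list(angles)
    new[i] = (new[i] + rng.gauss(0.0, 20.0) + 180.0) % 360.0 - 180.0
    return tuple(new)
```

Starting from `(120.0, 135.0, 120.0, 135.0)`, the highest-energy point of the toy landscape, the search moves toward the target angles. Reducing the degrees of freedom, as listed above, corresponds here to describing the chain by a handful of angles instead of all atomic coordinates.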
21Blind Prediction
- Critical Assessment of Protein Structure Prediction (CASP)
- Goal: to obtain an in-depth and objective assessment of our current abilities and inabilities in the area of protein structure prediction.
- Groups use their tools to model proteins whose structures have not yet been published.
- The predictions are then evaluated against the subsequently determined structures.
- CASP6 (2004) shows limited improvements compared to CASP5 (2002).