The Anatomy and Taxonomy of Protein Structure

Provided by: mattc5
1
The Anatomy and Taxonomy of Protein Structure
  • First few lectures
  • how do we look at protein structures?
  • how do we classify and compare them?
  • Today, a little about the protein backbone or
    main chain.

2
Backbone geometry in proteins
Ramachandran plot (axes: φ and ψ)
Angle ω is almost always close to 180°: the
peptide bond is planar and trans. φ and ψ may
vary but are limited to certain combinations, as
shown in the Ramachandran plot.
Yellow and blue delineate sterically allowed
conformations. Red shows residues in helical
secondary structure, cyan in beta-sheet, and
black other. Squares indicate glycines.
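The backbone angles on this slide are ordinary dihedral angles, so they can be computed from atomic coordinates. The sketch below is a minimal illustration, not part of the original lecture: the function names `dihedral` and `backbone_angles` are made up here, and it assumes the usual definitions (φ from C(i-1), N(i), CA(i), C(i); ψ from N(i), CA(i), C(i), N(i+1); ω from CA(i), C(i), N(i+1), CA(i+1)).

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points (x, y, z)."""
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    # Three consecutive bond vectors
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    # Normals to the two planes that share the central bond
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def backbone_angles(c_prev, n, ca, c, n_next, ca_next):
    """Return (phi, psi, omega) for one residue; argument names are illustrative."""
    phi = dihedral(c_prev, n, ca, c)
    psi = dihedral(n, ca, c, n_next)
    omega = dihedral(ca, c, n_next, ca_next)
    return phi, psi, omega
```

A trans arrangement of the four points gives ±180° and a cis arrangement gives 0°, which matches the statement that ω sits close to 180° for a planar trans peptide bond.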
3
Hydrogen bond geometry
  • Hydrogen bond not really a covalent bond--not
    much orbital overlap.
  • Model as an electrostatic interaction between two
    dipoles consisting of the H-N bond and the O sp2
    lone pair. In electrostatic theory, the optimal
    orientation of two such dipoles is head-to-tail.
    The energy of such an arrangement should decrease
    as the head and tail are brought together as long
    as atomic van der Waals radii are not violated
    (then repulsive forces quickly take over).
  • Ideal hydrogen bond in this model would have
    r = 3.0 Å, p = 180°, b = 0° and g = 60°. Convince
    yourself of this.
  • In small molecule crystals, this is approximately
    what is observed, though there is a lot of
    variation in the angles b and g. Thus the
    precise C=O···H angle parameters are not critical.
  • Main chain-main chain hydrogen bonds found in
    proteins will show various deviations from this
    geometry, partly due to the topological
    constraints imposed by forming secondary
    structures.

4
Criteria for identifying hydrogen bonds in
protein structures
  • What is a reasonable hydrogen bond? Criteria for
    identifying hydrogen bonds are somewhat arbitrary
    and many have been used. Here are a couple of
    examples.
  • Geometric criteria: often H-bonds are just
    identified by two parameters, the O···N
    (acceptor-donor) distance r, and an O···H-N angle p.
    The angles describing the C=O···H geometry are
    ignored. Typical cutoffs are p > 120° and an upper limit on r in Å (Baker & Hubbard, 1984).
  • Electrostatic criteria: one of the most commonly
    used criteria is a potential function based on a
    pure electrostatic model (Kabsch & Sander, 1983).
    Place partial charges on
    the C,O (+q_1, -q_1) and N,H (-q_2, +q_2) atoms and
    compute a binding energy as the sum of repulsive
    and attractive interactions between these four
    atoms:
  • $E = q_1 q_2 \left[ \frac{1}{r(ON)} + \frac{1}{r(CH)} - \frac{1}{r(OH)} - \frac{1}{r(CN)} \right] f$
  • where q_1 = 0.42e and q_2 = 0.20e, f is a dimensional
    factor (332) to convert E to kcal/mol, and r(AB)
    is the interatomic distance in Å between atoms A and
    B.
  • A hydrogen bond is then identified by a binding
    energy less than some arbitrary cutoff, e.g. E < -0.5 kcal/mol.
  • Note that the criteria defined above are only
    applicable when hydrogen atom positions are
    available. Crystal structures do not have
    hydrogens--however, their positions can be
    computed in many cases.
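The electrostatic criterion above is easy to evaluate directly. The following is a minimal sketch using the constants quoted on the slide (q1 = 0.42e, q2 = 0.20e, f = 332, cutoff -0.5 kcal/mol); the function names and the coordinate-tuple input format are assumptions made for illustration, and distances are taken to be in Å.

```python
import math

Q1 = 0.42      # partial charge on C (+) and O (-), in units of e
Q2 = 0.20      # partial charge on H (+) and N (-), in units of e
F = 332.0      # dimensional factor giving kcal/mol for distances in Å

def hbond_energy(c, o, n, h):
    """Electrostatic H-bond energy (kcal/mol) between a C=O and an N-H group.

    Each argument is an (x, y, z) coordinate tuple in Å.
    """
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1 * Q2 * (1.0 / r_on + 1.0 / r_ch - 1.0 / r_oh - 1.0 / r_cn) * F

def is_hbond(c, o, n, h, cutoff=-0.5):
    """True if the binding energy is below the (arbitrary) cutoff."""
    return hbond_energy(c, o, n, h) < cutoff

# Collinear, head-to-tail geometry with an O···N distance of 3.0 Å
c, o, h, n = (0.0, 0.0, 0.0), (1.23, 0.0, 0.0), (3.23, 0.0, 0.0), (4.23, 0.0, 0.0)
energy = hbond_energy(c, o, n, h)
```

For the collinear geometry in the example, the energy comes out well below the cutoff, as expected for the head-to-tail dipole arrangement described on the previous slide.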

5
Secondary Structure Identification
  • Next week we'll learn about predicting the
    locations of secondary structures along the amino
    acid sequence of a protein from the sequence
    information alone. To evaluate whether such a
    prediction is correct, one has to be able to
    identify secondary structures from an
    experimentally determined set of protein
    coordinates, i.e. how do you define where a
    secondary structure element begins and ends?
  • A trivial but difficult problem (Richardson,
    1981)
  • There is no single and correct algorithm for
    assigning secondary structure type.
  • Most commonly used criteria are backbone
    conformation (phi,psi) and hydrogen bonding
    pattern.
  • DSSP (Kabsch & Sander, 1983) and STRIDE (Frishman
    & Argos, 1995) are two of the more common
    programs, though there are many ways of defining
    secondary structure boundaries.

6
DSSP turn and helix definitions
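  • n-turn: an H-bond from the C=O of residue i to the N-H of residue i+n
  • 3-turn: H-bond from C=O(i) to N-H(i+3)
  • 4-turn: H-bond from C=O(i) to N-H(i+4)
  • 5-turn: just an elaboration of 3- and 4-turn, with the H-bond from C=O(i) to N-H(i+5)
  • [diagrams: backbone -N-C-C--N-C-C--N-C-C--N-C-C- with the residues spanned by each turn marked 3 or 4]
  • A minimal helix is two consecutive n-turns;
    for example, a minimal 4-helix from residue i to i+3 has 4-turns starting at residues i-1 and i.
  • [diagram: two overlapping 4-turns, each marked 444, offset by one residue]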
7
DSSP bridge, ladder and sheet definitions
  • parallel bridge (lowercase x notation)
  • [diagram: two parallel strands, -N-C-C--N-C-C--N-C-C- above -N-C-C--N-C-C--N-C-C-, with H-bonds between them drawn as \ and / or as dots]
  • antiparallel bridge (uppercase X notation)
  • [diagram: two antiparallel strands, -N-C-C--N-C-C--N-C-C- above -C-C-N--C-C-N--C-C-N-, with H-bonds between them drawn as ! or as dots]
  • ladder: set of one or more consecutive bridges of identical type
  • sheet: set of one or more ladders connected by shared residues
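Bridges can be detected from a hydrogen bond list in the same spirit as turns. The sketch below assumes the bridge definitions given by Kabsch & Sander (1983), written out in the code comments, because the slide conveys them only through diagrams. The (i, j) pair format, meaning "C=O of residue i bonded to N-H of residue j", and the function name are assumptions made for illustration.

```python
def find_bridges(hbonds, n_residues):
    """Return (i, j, kind) for every bridge between residues i < j.

    Assumed definitions (Kabsch & Sander, 1983), with Hbond(a, b)
    meaning C=O(a) -> N-H(b):
      parallel(i, j):     [Hbond(i-1, j) and Hbond(j, i+1)] or
                          [Hbond(j-1, i) and Hbond(i, j+1)]
      antiparallel(i, j): [Hbond(i, j) and Hbond(j, i)] or
                          [Hbond(i-1, j+1) and Hbond(j-1, i+1)]
    """
    hb = set(hbonds)
    found = []
    # Skip chain ends (neighbours i-1 and j+1 must exist) and close pairs
    for i in range(1, n_residues - 1):
        for j in range(i + 3, n_residues - 1):
            parallel = (((i - 1, j) in hb and (j, i + 1) in hb) or
                        ((j - 1, i) in hb and (i, j + 1) in hb))
            antiparallel = (((i, j) in hb and (j, i) in hb) or
                            ((i - 1, j + 1) in hb and (j - 1, i + 1) in hb))
            if parallel:
                found.append((i, j, "parallel"))
            if antiparallel:
                found.append((i, j, "antiparallel"))
    return found
```

Grouping consecutive bridges of the same kind into ladders, and ladders that share residues into sheets, would follow the two definitions at the end of the slide.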
8
STRIDE (secondary STRucture IDEntification)
  • Uses what is known as a knowledge-based
    potential--we as a community of scientists know
    intuitively how to define secondary structures,
    we just can't put our finger on it!
  • So how do we quantify what we already know?
  • Set of qualitative criteria--most common criteria
    used by crystallographers are backbone
    conformation and hydrogen bonding.
  • Standard of truth--collective wisdom of
    crystallographers--secondary structure assignments
    made by crystallographers when they submitted
    structures to the Protein Data Bank.
  • STRIDE makes potential energy functions for
    H-bonding and backbone conformation but leaves
    floating parameters which are adjusted to best
    reproduce crystallographers' assignments.

9
Boundaries of a helix
[figure: Ramachandran plot (phi vs. psi) with residues 2, 10, 11 and 12 labelled]
Is 10 in the helix? How about 11? How about 2?
10
Side chain conformation
Side chains differ in their number of
degrees of conformational freedom (some don't
have any), but side chains of very different
size can have the same number of chi angles.
11
Names of canonical side chain conformations
[figure: names of conformations]
t = trans, g = gauche
IUPAC nomenclature: http://www.chem.qmw.ac.uk/iupac/misc/biop.html
12
Rotamers
  • a particular combination of angles χ1, χ2, etc.
    for a particular residue is known as a rotamer.
  • for example, for aspartate, if one considers only
    the canonical staggered forms, there are nine
    (3^2) possible rotamers: g+g-, g+g+, g-g-, g-g+,
    tg+, g+t, tg-, g-t, tt
  • not all rotamers are equally likely.
  • for example, valine prefers its t rotamer.
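The count of canonical rotamers follows from combining three staggered states per chi angle. As a small illustration (the function name and state labels are chosen here, not taken from any library), the combinations can be enumerated directly:

```python
from itertools import product

STAGGERED_STATES = ("g+", "g-", "t")

def canonical_rotamers(n_chi):
    """All combinations of staggered states for a side chain with n_chi chi angles."""
    return ["".join(combo) for combo in product(STAGGERED_STATES, repeat=n_chi)]

asp_rotamers = canonical_rotamers(2)   # aspartate: chi1 and chi2
```

The count grows as 3 to the power of the number of chi angles, so a side chain with four chi angles already has 81 canonical combinations.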

[figure: distribution of valine χ1 rotamers in protein structures, χ1 axis from 0° to 360° (from Ponder & Richards, 1987)]
13
Rotamer libraries
  • one of the problems in designing and
    modelling/predicting protein structures is how to
    construct an appropriate group of rotamers to
    represent the possible side chain conformations
    observed in proteins without using so many as to
    make the problem computationally intractable.
  • such groups of rotamers are known as rotamer
    libraries (Ponder & Richards, 1987).
  • the probability of finding a particular rotamer
    is affected by what the backbone angles for that
    residue are (phi, psi). For instance, the g+
    conformation is very rarely found in a helix.
    Thus, backbone-dependent rotamer libraries are
    also sometimes used.
  • We'll delve into this in more depth in about a
    week when we do homology modelling.

14
Side chain rotamers are not limited to canonical
staggered forms--there are many subtly different
rotamers (from Xiang & Honig, 2001).
An x degree rotamer in this figure means that
at least one side chain angle differs by x
degrees.
15
Surface and interior of proteins
  • do proteins have a lot of holes/empty space
    inside?
  • how much of a protein's molecular surface is in
    contact with the surrounding solvent (water in
    the case of globular, soluble proteins)?
  • are certain residues more likely to be in contact
    with solvent than others?

16
Calculating Solvent Accessible Surface Area
  • Lee & Richards, 1971; Shrake & Rupley, 1973
  • First, represent atoms as spheres with
    appropriate van der Waals radii
  • eliminate overlapping parts of spheres
  • This gives a space-filling model similar to the
    picture at right

17
Now roll a sphere of a given radius all around
the van der Waals surface. The sphere will not
make contact with the entire van der Waals
surface. Its center will trace out a continuous
surface as it rolls.
18
Now look at a cross-section. Inner surfaces here
are van der Waals. Outer surface is that traced
out by the center of the sphere as it rolls
around the van der Waals surface. If any part
of the arc around a given atom is traced out,
that atom is accessible to solvent. The solvent
accessible surface of the atom is defined as the
sum of the arcs traced around an atom.
There's not much solvent accessible surface in
the middle.
[figure labels: van der Waals surface; solvent accessible surface; arc traced around atom (from Lee & Richards, 1971)]
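The rolling-sphere construction can be approximated numerically by sampling points, in the style of Shrake & Rupley (1973). This is a sketch under stated assumptions, not the original Lee & Richards procedure. The 1.4 Å probe radius, the number of test points, the golden-spiral point placement and the function names are all choices made here for illustration.

```python
import math

def sphere_points(n):
    """Roughly evenly spaced points on a unit sphere (golden-spiral placement)."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n
        r = math.sqrt(1.0 - z * z)
        points.append((r * math.cos(golden_angle * k),
                       r * math.sin(golden_angle * k), z))
    return points

def accessible_areas(atoms, probe=1.4, n_points=200):
    """Per-atom solvent accessible area in Å^2.

    atoms: list of (x, y, z, vdw_radius) tuples in Å.
    A test point on the expanded sphere (radius + probe) counts as
    accessible if it lies outside every other atom's expanded sphere.
    """
    unit = sphere_points(n_points)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + probe
        exposed = 0
        for ux, uy, uz in unit:
            p = (xi + big_r * ux, yi + big_r * uy, zi + big_r * uz)
            buried = any(
                math.dist(p, (xj, yj, zj)) < rj + probe
                for j, (xj, yj, zj, rj) in enumerate(atoms) if j != i
            )
            if not buried:
                exposed += 1
        areas.append(4.0 * math.pi * big_r ** 2 * exposed / n_points)
    return areas
```

An isolated atom keeps its full expanded-sphere area, while two overlapping atoms each lose the part of the surface that the probe center cannot reach. This double loop is quadratic in the number of atoms, so practical implementations restrict the inner check to nearby atoms.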
19
Fractional accessibility
  • calculate total solvent accessible surface of
    protein structure (also can calculate solvent
    accessible surface for individual
    residues/sidechains within the protein)
  • can also model the accessible surface area in an
    unfolded protein using accessible surface area
    calculations on model tripeptides such as
    Ala-X-Ala or Gly-X-Gly.
  • from these we can calculate what fraction of the
    surface is buried (inaccessible to solvent) by
    virtue of being within the folded, native
    structure of the protein.
  • this is done by dividing the accessible surface
    area in the native protein structure by the
    accessible surface in the modelled unfolded
    protein. That's the fractional accessibility.
    The residue fractional accessibility and side
    chain fractional accessibility refer to the same
    quantity calculated for individual
    residues/sidechains within the structure.

20
Accessible surface area in protein structures
  • accessible surface area A_s in native states of
    proteins is a non-linear function of molecular
    weight (Miller, Janin, Lesk & Chothia, 1987):
  • $A_s = 6.3\,M_r^{0.73}$
  • where M_r is molecular weight and A_s is in Å²

This is an empirical correlation, but it
comes close to the expected two-thirds power law
relating surface area to volume or mass. Why is
the exponent a little larger?
21
How much surface area is buried when a protein
folds?
  • estimate accessible surface area in unfolded
    proteins using the accessible surface areas in
    Gly-X-Gly or Ala-X-Ala models. This is a linear
    function of molecular weight:
  • $A_t = 1.48\,M_r + 21$
  • the total fractional accessibility is A_s/A_t, and
    the fraction of surface area buried is 1 - A_s/A_t
  • what fraction of surface area is typically buried
    for a protein of molecular weight 5000 daltons?
    30,000 daltons?
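The two empirical relations can be combined into a short calculation. The sketch below simply evaluates the native-state power law (6.3 times molecular weight to the power 0.73) and the unfolded-state linear estimate (1.48 times molecular weight plus 21) quoted on these slides; the function names are illustrative and areas are assumed to be in Å².

```python
def native_area(mr):
    """Accessible surface area of the folded protein (power-law fit)."""
    return 6.3 * mr ** 0.73

def unfolded_area(mr):
    """Accessible surface area of the modelled unfolded chain (linear fit)."""
    return 1.48 * mr + 21.0

def fraction_buried(mr):
    """Fraction of surface area buried on folding: 1 - native / unfolded."""
    return 1.0 - native_area(mr) / unfolded_area(mr)

small = fraction_buried(5000)
large = fraction_buried(30000)
```

With these fits the buried fraction comes out at roughly 0.57 for a 5000 dalton protein and roughly 0.74 at 30,000 daltons, so larger proteins bury a larger share of their surface.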

22
Distribution of residue fractional accessibilities
[figure from Miller et al., 1987]
Note that a sizable group are completely
buried (hatched) or nearly completely buried.
Note the broad distribution among non-buried
residues, and mean accessibility for non-buried
residues of around 0.5.
Note that few residues are completely exposed to
solvent, but that fractional accessibility of 1
is possible.
23
Buried residues in proteins
  • the fraction of buried residues (defined by 0% or
    5% ASA cutoffs) increases as a function of
    molecular weight--for your average protein
    around 25% of the residues will be buried. These
    form the core.

size class   mean Mr   fraction of buried residues
                       0% ASA    5% ASA
small         8000     0.070     0.154
medium       16000     0.107     0.240
large        25000     0.139     0.309
XL           34000     0.155     0.324
all                    0.118     0.257
24
Core of 434 cro
8% accessibility cutoff
25
Residue fractional accessibility correlates with
free energies of transfer for amino acids between
water and organic solvents
  • (Miller, Janin, Lesk & Chothia, 1987)
  • (Fauchere & Pliska, 1983)
  • the interior of a protein is akin to a
    nonpolar solvent in which the nonpolar
    sidechains are buried. Polar sidechains,
    on the other hand, are usually on the surface.

26
Hydropathy scales
  • the correlation between a residue being polar or
    nonpolar and its tendency to be buried is a
    sequence-structure relationship-- a number of
    such relationships can be seen from examining
    protein structures. As we will see next week,
    such relationships are useful in trying to
    predict protein structure from amino acid
    sequence.
  • many scientists have tried to develop
    hydrophobicity or hydropathy scales to quantify
    the tendency of residues to be buried. Most
    such scales are based on partitioning of the
    amino acid between water and some nonpolar
    solvent, or between the surface and interior of
    proteins.

27
Kyte-Doolittle Hydropathy
[figure: Kyte-Doolittle hydropathy scale, with residues grouped as nonpolar, on the bubble, and polar/charged (Kyte & Doolittle, 1982)]
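A hydropathy scale is typically applied as a sliding-window average along the sequence. The sketch below uses the published Kyte-Doolittle values; the window length of 9 and the function name are assumptions made for illustration, and other window sizes are equally legitimate.

```python
# Kyte-Doolittle hydropathy values, one-letter amino acid codes
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}

def hydropathy_profile(sequence, window=9):
    """Mean hydropathy of each full window along the sequence.

    Returns one value per window position; positive values indicate
    nonpolar stretches, negative values polar or charged stretches.
    """
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]
```

Stretches with a high average are candidates for buried or membrane-spanning segments, which is the sequence-structure relationship the hydropathy scales try to capture.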
28
Buried polar residues in proteins
  • while most of the protein interior is made up of
    nonpolar side chains, the average protein will
    have a few buried polar residues, even ones which
    are capable of carrying a formal charge, e.g.
    Lys, Arg, Glu, Asp.
  • charged residues are almost always paired with
    other charged residues to make salt bridges, or
    hydrogen bonded to other polar groups.
  • in general, a key rule of protein structure
    anatomy is that you rarely see buried hydrogen
    bond donors/acceptors not paired to other
    acceptors/donors.

[figure labels: Arg10, Glu35, Arg5; buried salt bridge; hydrogen bond to main chain]
29
Cavities in proteins
  • protein interiors generally have high packing
    densities such that not much void space is
    present.
  • nonetheless, proteins do sometimes have interior
    cavities big enough to fit water molecules.