Title: Chap.3 Protein Structure
1. Chap. 3 Protein Structure and Function
Topics
- Hierarchical Structure of Proteins
- Protein Folding
- Examples of Protein Function: Ligand-binding
  Proteins and Enzymes
- Regulating Protein Function by Protein
  Degradation
- Regulating Protein Function by Noncovalent and
  Covalent Modifications
Goals: Learn the basic structure and properties of
proteins and enzymes, which carry out most of the
work in cells (Fig. 3.1).
2. Overview of Protein Structure Hierarchy
The four levels of protein structure are
illustrated in Fig. 3.2. A detailed discussion of
each of these levels is presented in the next few
slides. Experiments have shown that the final 3D
tertiary structure of a protein ultimately is
determined by the primary structure (amino acid
sequence). The 3D fold (shape) of the protein
determines its function.
3. Primary Structure
The primary structure of a protein refers to its
amino acid sequence. Amino acids in peptides (<30
aas) and proteins (typically 200 to 1,000 aas)
are joined together by peptide bonds (amide
bonds) between the carboxyl and amino groups of
adjacent amino acids (Fig. 3.3). The backbone of
all proteins consists of a repeating
-N-Cα(R)-C(=O)- unit. Only the R-group side chains
vary. By convention, protein sequences are
written from left to right, from the protein's N-
to C-terminus. The average yeast protein contains
466 amino acids. Because the average molecular
weight of an amino acid is 113 daltons (Da), the
average molecular weight of a yeast protein is
about 52,700 Da (quoted as 52,728 Da; the product
466 x 113 is 52,658 Da). Note that 1 Da = 1 a.m.u.
(approximately the mass of one proton).
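The size estimate above is simple arithmetic, and a short sketch makes it easy to repeat for other proteins. The function names below are illustrative (not from the slides), and the 113 Da average is the only value taken from the text.

```python
# Average mass of one amino acid residue in a protein, as quoted in the text.
AVERAGE_RESIDUE_MASS_DA = 113


def estimate_protein_mass(num_residues, residue_mass=AVERAGE_RESIDUE_MASS_DA):
    """Approximate mass (Da) of a polypeptide with num_residues amino acids."""
    return num_residues * residue_mass


def estimate_residue_count(mass_da, residue_mass=AVERAGE_RESIDUE_MASS_DA):
    """Approximate number of residues in a protein of the given mass (Da)."""
    return round(mass_da / residue_mass)


print(estimate_protein_mass(466))     # 52658
print(estimate_residue_count(52728))  # 467
```

The estimate is only as good as the average: a protein rich in glycine or tryptophan will deviate from it.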
4. Secondary Structure: α Helix
Secondary structure refers to short-range,
periodic folding elements that are common in
proteins. These include the α helix, the β sheet,
and turns. In the α helix (Fig. 3.4), the
backbone adopts a cylindrical spiral structure in
which there are 3.6 aas per turn. The R-groups
point out from the helix, and mediate contacts to
other structure elements in the folded protein.
The α helix is stabilized by H-bonds between
backbone carbonyl oxygen and amide nitrogen atoms
that are oriented parallel to the helix axis.
H-bonds occur between residues located in the n
and n + 4 positions relative to one another.
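The n to n + 4 pairing rule and the 3.6 residues per turn can be written out as a small sketch. The function names and the 1-based residue numbering are assumptions made for illustration; the sketch only enumerates which residue pairs are positioned to H-bond and does not model real geometry.

```python
RESIDUES_PER_TURN = 3.6  # value quoted in the text


def helix_hbond_pairs(first_residue, last_residue):
    """List (n, n + 4) residue pairs within a helix spanning
    first_residue..last_residue (inclusive, 1-based numbering)."""
    return [(n, n + 4) for n in range(first_residue, last_residue - 3)]


def helix_turns(num_residues):
    """Approximate number of helical turns formed by num_residues."""
    return num_residues / RESIDUES_PER_TURN


print(helix_hbond_pairs(1, 8))  # [(1, 5), (2, 6), (3, 7), (4, 8)]
print(round(helix_turns(18), 1))  # 5.0
```

Note that the first four amide nitrogens and the last four carbonyl oxygens of a helix have no partner inside the helix under this rule.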
5. Secondary Structure: β Sheets and Turns
In β sheets (a.k.a. pleated sheets), each β
strand adopts an extended conformation (Fig.
3.5). β strands tend to occur in pairs or
multiple copies in β sheets that interact with
one another via H-bonds directed perpendicular to
the axis of each strand. Carbonyl oxygens and
amide nitrogens in the strands form the H-bonds.
Strands can orient antiparallel (Fig. 3.5a) or
parallel (not shown) to one another in β sheets.
R-groups of every other amino acid point up or
down relative to the sheet (Fig. 3.5b). Most β
strands in proteins are 5 to 8 aas long. β turns
consist of 3-4 amino acids that form tight bends
(Fig. 3.6). Glycine and proline are common in
turns. Longer connecting segments between β
strands are called loops.
6. Tertiary Structure
Tertiary structure refers to the folded 3D
structure of a protein. It is also known as the
native structure or active conformation. Tertiary
structure mostly is stabilized by noncovalent
interactions between secondary structure elements
and other internal sequence regions that cannot
be classified as a particular type of secondary
structure. The folding of proteins is thought to
be driven by the need to place the most
hydrophobic regions in the interior out of
contact with water (Fig. 3.7). The structures of
hundreds of proteins have been determined by
techniques such as x-ray crystallography and NMR.
Different methods of representing structures are
shown in Fig. 3.8.
Keep in mind that most proteins are somewhat
flexible and undergo subtle conformational
changes while carrying out their functions.
7. Secondary Structure Motifs
Secondary structure motifs are evolutionarily
conserved collections of secondary structure
elements which have a defined conformation. They
also have a consensus sequence because the aa
sequence ultimately determines structure. A given
motif can occur in a number of proteins where it
carries out the same or similar functions. Some
well known examples such as the coiled-coil, EF
hand/helix-loop-helix, and zinc-finger motifs are
illustrated in Fig. 3.9. These motifs typically
mediate protein-protein association, calcium/DNA
binding, and DNA or RNA binding, respectively.
8. Quaternary Structure
Multisubunit (multimeric) proteins have another
level of structural organization known as
quaternary structure. Quaternary structure refers
to the number of subunits, their relative
positions, and contacts between the individual
monomers in a multimeric protein. The quaternary
structure of the trimeric hemagglutinin surface
protein of influenza virus is shown in Fig.
3.10b. The tertiary structure of a hemagglutinin
monomer is shown in Fig. 3.10a.
9. Modular Domain Structure of Proteins
Domains are independently folding and
functionally specialized tertiary structure units
within a protein. The respective globular and
fibrous structural domains of the hemagglutinin
monomer (which happen to be individual
polypeptide chains) are illustrated above in Fig.
3.10a. Domains (such as the epidermal growth
factor (EGF) domain) also may be encoded within a
single polypeptide chain, as illustrated in Fig.
3.11. Domains still perform their standard
functions although fused together in a longer
polypeptide (e.g., DNA binding and ATPase domains
of a transcription factor). The modular domain
structure of many proteins has resulted from the
shuffling and splicing together of their coding
sequences within longer genes.
10. Supramolecular Structure
In many cases, multimeric proteins achieve
extremely large sizes, e.g., 10s-100s of
subunits. Such complexes exhibit the highest
level of structural organization known as
supramolecular structure. Examples include mRNA
transcription preinitiation complexes (Fig.
3.12), ribosomes, proteasomes, and spliceosomes.
Typically, supramolecular complexes function as
"macromolecular machines," in reference to the
fact that the activities of individual subunits
are coordinated in the performance of some
overall task (e.g., protein synthesis by the
ribosome).
11. Evolution of Protein Families
Through genome sequencing and classical gene
cloning approaches, the sequences of an enormous
number of proteins have been compiled. Comparison
of sequences shows that most proteins belong to
larger families that have evolved over time from
a common ancestor protein, as illustrated for the
globin family of O2 binding proteins (Fig. 3.13).
Proteins that have a common ancestor are called
homologs. The members of a protein family often
show >30% sequence identity, have a common 3D
fold, and usually perform closely related functions.
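Percent sequence identity is computed from an alignment. The sketch below assumes the two sequences are already aligned (equal length, gaps written as "-") and counts identical residues over the gap-free columns; this is one common convention among several, and the example sequences are made up for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free alignment columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical aligned fragments: 5 identical residues in 7 gap-free columns.
print(f"{percent_identity('MKVLA-GH', 'MRVLSAGH'):.1f}%")  # 71.4%
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so reported identities are only comparable when the same definition is used.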
12. Structure of the Globin Proteins
These globular proteins are composed of mostly
α-helical secondary structure. The similar folds of
the globins can be readily seen by comparing the
structures of the β chain of hemoglobin,
myoglobin, and leghemoglobin (Fig. 3.13). The
closely similar structures of mammalian myoglobin
and the hemoglobin β subunit might be expected,
but the resemblance of the distantly related
plant leghemoglobin is striking. Comparison of the
sequences of the
members of protein families has brought to light
the fact that amino acids within a given class
exhibit a large degree of functional redundancy.
In this regard, the 3 proteins discussed here
exhibit less than 20% identity in their
sequences, yet have the same structure. Lastly,
in hemoglobin, 2 different globin chains have
combined to form a multisubunit protein.
13. Overview of Protein Folding
Many experiments have shown that proteins can
spontaneously fold from an unfolded state to
their folded native state. This proves that the
amino acid sequence contains enough information
to specify tertiary structure. Bonds within the
peptide backbone seek out different possible
conformations as the final tertiary structure is
achieved (Fig. 3.14). Folding tends to occur via
successive conformational changes leading to
secondary and then tertiary structure elements
(Fig. 3.15). The native conformation of a protein
typically is its lowest-free-energy, and
therefore most stable, structure. The unfolded
(denatured) conformation of a protein can be
generated by heating or treatment with certain
organic solvents.
14. Chaperone-assisted Protein Folding
The folding of many proteins, particularly large
ones, is kinetically slow and is assisted in vivo
by folding agents known as chaperones. These
proteins are found in all organisms and even in
different organelles of eukaryotic cells.
Chaperones assist in 1) folding of nascent
polypeptides made by translation, and 2)
re-folding of proteins denatured by environmental
damage, such as heat shock. Molecular chaperones
bind to unfolded nascent
polypeptide chains as they emerge from the
ribosome, and prevent aggregation, misfolding,
and degradation (Fig. 3.16a). The hydrolysis of
ATP by the chaperone drives conformational
changes that prevent aggregation and help drive
protein folding. Accessory proteins participate
in the process. Eukaryotic molecular chaperones
such as Hsp70 (cytosol and mitochondrial matrix)
and BiP (ER) are related to the bacterial protein DnaK.
15. Chaperonins
Eukaryotic chaperonins such as the TRiC complex
are large multimeric complexes related to the
bacterial GroEL and GroES proteins. These
complexes take up unfolded proteins into an
internal chamber for folding (Fig. 3.17). ATP
hydrolysis drives folding.
16. Neurodegenerative Diseases
In neurodegenerative diseases such as Alzheimer's
disease and transmissible spongiform
encephalopathy (mad cow), insoluble misfolded
proteins accumulate in the brain in pathological
lesions known as plaques, resulting in
neurodegeneration (Fig. 3.18). In Alzheimer's
disease, the protein known as amyloid precursor
protein is cleaved into a peptide product
(β-amyloid) that aggregates and precipitates in
amyloid filaments. The misfolding of β-amyloid,
which involves a transition from an α-helical to a
β-sheet conformation, leads to filament formation.
In mad cow disease, prion proteins precipitate
causing lesions.
18. Ligand-binding Proteins
The term ligand refers to any molecule that can
be bound by a protein. Ligands may be hormones,
metabolites, or even other proteins. Ligand
binding requires molecular complementarity. The
greater the degree of complementarity, the higher
the specificity and affinity of the interaction.
Affinity is quantified by the dissociation
constant (Kd) for binding: the lower the Kd, the
higher the affinity.
Protein-ligand binding is illustrated here for
antibodies (Fig. 3.19a). The
complementarity-determining regions (CDRs) of the
antibody make highly specific contacts with
epitopes in the antigen (Fig. 3.19b).
19. Overview of Enzyme Catalysis I
Enzymes are proteins (a few are RNAs called
ribozymes) that catalyze chemical reactions
within living organisms. Enzyme-catalyzed
reactions typically are highly specific, and rate
enhancements of 10⁶ to 10¹² are common. In an
enzyme-catalyzed reaction, the reactant (the
substrate) is converted into the product. Like
all catalysts, enzymes are not consumed in a
reaction. Further, they do not change the ΔG°′ or
Keq for the reaction, only its rate.
Rate enhancement is achieved due to the fact that
enzymes are most complementary to the transition
state structure formed in the reaction. This
results in stabilization of the transition state
and lowering of the activation energy barrier
(ΔG‡) for the reaction (Fig. 3.20).
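To get a feel for how much barrier lowering corresponds to the quoted rate enhancements, the sketch below assumes the usual transition-state relation in which the rate constant is proportional to exp(-ΔG‡/RT). That relation, the 298.15 K temperature, and the function names are assumptions added for illustration; they are not stated on the slide.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def barrier_reduction_kcal(fold_enhancement, temperature_k=298.15):
    """Drop in activation energy (kcal/mol) that gives the stated fold
    rate enhancement, assuming rate is proportional to exp(-dG/RT)."""
    return R_KCAL_PER_MOL_K * temperature_k * math.log(fold_enhancement)


def fold_enhancement(barrier_reduction, temperature_k=298.15):
    """Inverse relation: fold rate enhancement for a given barrier drop."""
    return math.exp(barrier_reduction / (R_KCAL_PER_MOL_K * temperature_k))


for fold in (1e6, 1e12):
    print(f"{fold:.0e}-fold: {barrier_reduction_kcal(fold):.1f} kcal/mol")
```

Under these assumptions a 10⁶-fold enhancement needs roughly 8 kcal/mol of transition-state stabilization and 10¹²-fold about twice that, which is on the order of a few hydrogen bonds or ionic contacts.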
20. Overview of Enzyme Catalysis II
The transformation of a substrate to the product
occurs in the active site of an enzyme. The
active site can be subdivided into a catalytic
site wherein amino acids that catalyze the
reaction reside, and a binding pocket that
recognizes a specific feature of the substrate,
conferring specificity to the enzyme-substrate
interaction. A schematic model for an
enzyme-catalyzed reaction is shown in Fig. 3.23.
The kinetic scheme describing the reaction is
$E + S \rightleftharpoons ES \rightarrow E + P$.
A reaction coordinate diagram showing the binding
and catalytic steps of an enzyme-catalyzed
reaction is shown in Fig. 3.24.
21. Enzyme Kinetics: Enzyme Concentration
The velocity of an enzyme-catalyzed reaction
reaches a maximal rate (Vmax) at high
concentrations of substrate (Fig. 3.22a). Vmax is
achieved when all enzyme molecules have bound the
substrate and are engaged in catalysis
(saturation). The biochemists Leonor Michaelis
and Maud Menten developed a kinetic equation to
explain the behavior of most enzymes. They showed
that the maximal rate of an enzyme-catalyzed
reaction (Vmax) depends on the concentration of
enzyme (Fig. 3.22a) and the rate constant for the
rate-limiting step of the reaction.
MM equation: $V_0 = \frac{V_{max}[S]}{[S] + K_M}$
[Plot: V0 on the vertical axis, with ticks at 0.5 and 1.0]
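The dependence of Vmax on enzyme concentration can be sketched numerically. The code assumes the standard relation Vmax = kcat x [E]total, where kcat is the rate constant of the rate-limiting step, and combines it with the Michaelis-Menten equation; the kcat, KM, and enzyme concentrations used are hypothetical values chosen only for illustration.

```python
def vmax(kcat_per_s, total_enzyme_molar):
    """Maximal velocity (M/s), assuming Vmax = kcat * [E]total."""
    return kcat_per_s * total_enzyme_molar


def initial_velocity(substrate_molar, km_molar, vmax_molar_per_s):
    """Michaelis-Menten equation: V0 = Vmax * [S] / ([S] + KM)."""
    return vmax_molar_per_s * substrate_molar / (substrate_molar + km_molar)


KCAT = 100.0  # s^-1, hypothetical
KM = 1e-4     # M, hypothetical

for enzyme in (1e-6, 2e-6):  # doubling the enzyme concentration
    v_max = vmax(KCAT, enzyme)
    v_sat = initial_velocity(1e-2, KM, v_max)  # [S] = 100 x KM
    print(f"[E] = {enzyme:.0e} M: Vmax = {v_max:.1e} M/s, "
          f"V0 at saturating [S] = {v_sat:.2e} M/s")
```

Doubling the enzyme doubles Vmax, while KM is unchanged because it does not depend on how much enzyme is present.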
22. Enzyme Kinetics: Substrate Affinity
Michaelis and Menten also derived a kinetic
constant, the Michaelis constant (KM), that is
indicative of the affinity of most enzymes for
their substrates. The lower the KM, the higher the
affinity of the enzyme for the substrate (Fig.
3.22b). The KM equals the concentration of
substrate at which the reaction rate is
half-maximal. The concentrations of cellular
metabolites usually are set near the KMs of the
enzymes that carry out their metabolism. This
allows cells to respond to changes in substrate
concentration.
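Why operating near KM keeps an enzyme responsive can be seen by rearranging the Michaelis-Menten equation into the fraction of Vmax, [S] / ([S] + KM). The sketch below is illustrative only: the function names are made up, and concentrations are expressed in arbitrary units with KM set to 1.

```python
def fraction_of_vmax(substrate, km):
    """V0 / Vmax from the Michaelis-Menten equation."""
    return substrate / (substrate + km)


def substrate_for_fraction(fraction, km):
    """Substrate concentration needed to reach a given fraction of Vmax."""
    if not 0 <= fraction < 1:
        raise ValueError("fraction must be in [0, 1)")
    return km * fraction / (1 - fraction)


def response_to_doubling(substrate, km):
    """Factor by which the rate rises when [S] is doubled."""
    return fraction_of_vmax(2 * substrate, km) / fraction_of_vmax(substrate, km)


KM = 1.0  # arbitrary units
print(substrate_for_fraction(0.5, KM))               # 1.0, i.e. [S] = KM
print(f"{response_to_doubling(KM, KM):.2f}")         # 1.33
print(f"{response_to_doubling(100 * KM, KM):.3f}")   # 1.005
```

At [S] = KM a doubling of substrate raises the rate by about a third, whereas at 100 x KM the enzyme is already saturated and the same doubling changes the rate by well under 1%.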
23. Mechanism of Serine Proteases I
Proteases are enzymes that cleave peptide bonds
in other proteins. The serine proteases, which
are important for digestion and blood
coagulation, contain reactive serine residues in
their catalytic sites. Also present are aspartate
and histidine residues that together with serine
make up what is called the catalytic triad. The
active sites of serine proteases also contain
binding pockets that confer specificity by
positioning the peptide bond that is to be
cleaved next to the reactive serine (Fig. 3.25a,
trypsin). The digestive proteases trypsin,
chymotrypsin, and elastase select cleavage sites
based on the features of their binding pockets
(Fig. 3.25b).
Specificity: trypsin, basic aas; chymotrypsin,
aromatic aas; elastase, small side-chain aas
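The specificity rules can be turned into a toy digestion routine. The residue sets below are a simplifying assumption (lysine and arginine for "basic", phenylalanine, tyrosine, and tryptophan for "aromatic", glycine and alanine for "small"), cleavage is taken to occur on the C-terminal side of the recognized residue, and the peptide is invented for the example. Real proteases have additional sequence preferences that this sketch ignores.

```python
# Residues (one-letter codes) each protease is assumed to cleave after.
SPECIFICITY = {
    "trypsin": set("KR"),        # basic side chains
    "chymotrypsin": set("FYW"),  # aromatic side chains
    "elastase": set("GA"),       # small side chains
}


def cleavage_sites(sequence, protease):
    """1-based positions of residues after which the protease cuts."""
    targets = SPECIFICITY[protease]
    # The last residue has no peptide bond on its C-terminal side.
    return [i + 1 for i, aa in enumerate(sequence[:-1]) if aa in targets]


def digest(sequence, protease):
    """Split the sequence into the fragments left after complete cleavage."""
    fragments = []
    start = 0
    for site in cleavage_sites(sequence, protease):
        fragments.append(sequence[start:site])
        start = site
    fragments.append(sequence[start:])
    return fragments


peptide = "MAKWGRFE"  # hypothetical peptide
print(digest(peptide, "trypsin"))       # ['MAK', 'WGR', 'FE']
print(digest(peptide, "chymotrypsin"))  # ['MAKW', 'GRF', 'E']
```

The same substrate yields different fragments with each enzyme, which is the practical consequence of the different binding pockets.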
24. Mechanism of Serine Proteases II
In the serine protease reaction mechanism, an
acyl enzyme intermediate is formed transiently
after peptide bond cleavage by serine (Fig.
3.26). The acyl group is hydrolyzed off the
serine later in the reaction. Both
acid-base catalysis (Steps a, c, d, and f) and
transition state stabilization (Steps b and e)
occur during the reaction. The reaction mechanism
is inhibited at low pH due to protonation of
His-57 (inset). The pH optimum of serine protease
reactions therefore occurs at or slightly above
neutrality.
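The pH dependence follows from the fraction of His-57 that is deprotonated and therefore able to act as a base. A minimal sketch using the Henderson-Hasselbalch relation is given below; the pKa of 6.8 is an assumed, illustrative value for the active-site histidine and is not taken from the slides.

```python
def fraction_deprotonated(ph, pka):
    """Fraction of a titratable group in its deprotonated form
    (Henderson-Hasselbalch relation)."""
    return 1.0 / (1.0 + 10 ** (pka - ph))


ASSUMED_HIS57_PKA = 6.8  # hypothetical value for illustration

for ph in (5.0, 6.8, 8.0):
    active = fraction_deprotonated(ph, ASSUMED_HIS57_PKA)
    print(f"pH {ph}: {active:.2f} of His-57 deprotonated")
```

With this assumed pKa only a few percent of the histidine is deprotonated at pH 5, half at pH 6.8, and over 90% at pH 8, consistent with an optimum at or slightly above neutrality.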
25. Multifunctional Enzymes
Most metabolic pathways occur via multiple
enzyme-catalyzed steps. As illustrated in Fig.
3.28, the rates of pathway reactions can be
increased if the substrates and products of each
step are channeled to the next enzyme in the
pathway. Channeling is enhanced in multisubunit
enzyme complexes and by attachment of enzymes to
scaffolds (Fig. 3.28b), or even by fusion of
encoded enzymes into a single polypeptide chain
(Fig. 3.28c).
26. Regulating Protein Function by Degradation
The proteolytic degradation (turnover) of
proteins is important for regulatory processes,
cell renewal, and disposal of denatured and
damaged proteins. Lysosomes carry out degradation
of endocytosed proteins and retired organelles.
Cytoplasmic protein degradation is performed
largely by the molecular machine called the
proteasome. Proteasomes recognize and degrade
ubiquitinated proteins (Fig. 3.29). Ubiquitin is a
76-amino-acid protein that, after conjugation to
a target protein, directs it to the proteasome. In
ATP-dependent steps, the C-terminus of ubiquitin
is covalently attached to a lysine residue in the
protein. Polyubiquitination then takes place. The
proteasome degrades the protein to peptides, and
released ubiquitin molecules are recycled.
27. Regulating Function by Ligand Binding
The binding of a ligand to a protein typically
triggers an allosteric ("other shape")
conformational change resulting in the
modification of its activity. An overview of
regulation via allosteric transitions is
presented here in the context of the tetrameric
O2 binding protein, hemoglobin (Hb). As shown in
Fig. 3.30, the O2 binding curve for Hb does not
show the simple hyperbolic shape exhibited by
proteins that bind a ligand with
the same affinity regardless of ligand
concentration. Instead, the Hb O2-binding curve
is sigmoidal, which indicates that the affinity
for O2 molecules increases after the first 1 or 2
have bound. In this case, binding displays
positive cooperativity. Negative cooperativity is
observed with other protein-ligand systems. The
reduced O2 binding affinity of Hb at low O2
tensions favors release of O2 to peripheral
tissues.
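The difference between a hyperbolic and a sigmoidal binding curve can be sketched with the Hill equation, an empirical description of cooperative binding. The Hill coefficient of 2.8, the half-saturation pressure of 26 torr, and the lung and tissue O2 tensions of 100 and 20 torr are commonly used illustrative values and are assumptions here, not figures from the slides.

```python
def fractional_saturation(po2, p50, hill_n=1.0):
    """Hill equation: fraction of binding sites occupied at pressure po2.
    hill_n = 1 gives a simple hyperbolic curve; hill_n > 1 is sigmoidal."""
    return po2 ** hill_n / (p50 ** hill_n + po2 ** hill_n)


def o2_delivered(lung_po2, tissue_po2, p50, hill_n):
    """Fraction of sites that release O2 between lungs and tissue."""
    return (fractional_saturation(lung_po2, p50, hill_n)
            - fractional_saturation(tissue_po2, p50, hill_n))


P50 = 26.0                 # torr, assumed
LUNG, TISSUE = 100.0, 20.0  # torr, assumed

print(f"cooperative (n = 2.8): {o2_delivered(LUNG, TISSUE, P50, 2.8):.2f}")
print(f"hyperbolic  (n = 1.0): {o2_delivered(LUNG, TISSUE, P50, 1.0):.2f}")
```

With the same half-saturation point, the cooperative curve loads more fully in the lungs and unloads more completely in the tissues, so a larger fraction of the bound O2 is delivered.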
28. Calmodulin-mediated Switching
Many proteins play switching functions in cell
signaling. Calcium ion (Ca²⁺) is a very important
messenger in cell signaling. Cells maintain
cytoplasmic calcium concentration at about 10⁻⁷
M. When calcium concentration rises above this
level due to hormone-receptor signaling
processes, etc., it binds to a protein known as
calmodulin (Kd ≈ 10⁻⁶ M), triggering
conformational changes that result in its
activation. Calmodulin contains 4
helix-loop-helix motifs (EF hands) each of which
can bind calcium (Fig. 3.31). Calcium binding
causes a major allosteric transition in
calmodulin. In its alternate conformation,
calmodulin binds to target proteins, changing
their activity.
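The switch-like behavior follows from where the resting calcium level sits relative to the Kd. The sketch below treats each calcium site as a simple, independent binding site with the Kd quoted above, which is a deliberate simplification (the four sites of calmodulin are not identical and bind cooperatively); the function name is illustrative.

```python
def site_occupancy(calcium_molar, kd_molar=1e-6):
    """Fraction of a single binding site occupied: [Ca] / ([Ca] + Kd)."""
    return calcium_molar / (calcium_molar + kd_molar)


for label, calcium in (("resting", 1e-7), ("at Kd", 1e-6), ("stimulated", 1e-5)):
    print(f"{label:>10}: [Ca2+] = {calcium:.0e} M, "
          f"occupancy = {site_occupancy(calcium):.2f}")
```

At the resting concentration of 10⁻⁷ M under a tenth of the sites are occupied, while a rise to 10⁻⁵ M fills about nine-tenths of them, so a modest change in calcium moves calmodulin from mostly off to mostly on.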
29. GTPase-mediated Switching
Proteins belonging to the GTPase superfamily,
such as Ras and G proteins, serve as guanine
nucleotide-dependent regulatory switches that
control the activity of specific target
proteins (Fig. 3.32). When bound to GTP, these
proteins adopt an active conformation that
modulates target protein function. When bound to
GDP, their activity is turned off. The time-frame
of activation depends on the intrinsic GTPase
activity (the timer function) of these proteins.
In addition, GTP and GDP binding (and thus
activity) may be regulated by other factors.
Examples of such regulation will be covered
later.
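The timer function can be pictured as first-order decay of the GTP-bound (active) form. The sketch below assumes that hydrolysis is the only process turning the switch off and uses a made-up rate constant; real GTPases are also influenced by the regulatory factors mentioned above, which the sketch leaves out.

```python
import math


def active_fraction(time_s, k_hydrolysis_per_s):
    """Fraction of molecules still GTP-bound after time_s, assuming
    first-order GTP hydrolysis with rate constant k_hydrolysis_per_s."""
    return math.exp(-k_hydrolysis_per_s * time_s)


def half_life_s(k_hydrolysis_per_s):
    """Time for half of the active molecules to hydrolyze their GTP."""
    return math.log(2) / k_hydrolysis_per_s


K_HYDROLYSIS = 0.02  # s^-1, hypothetical intrinsic GTPase rate

print(f"half-life: {half_life_s(K_HYDROLYSIS):.1f} s")
for t in (0, 30, 120):
    print(f"t = {t:>3} s: {active_fraction(t, K_HYDROLYSIS):.2f} active")
```

A slower intrinsic GTPase rate gives a longer-lived signal, and a factor that accelerates hydrolysis shortens the time the switch stays on.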
30. Regulation by Kinase/Phosphatase Switching
Protein function also can be regulated by
allosteric transitions caused by covalent
modification via phosphorylation (Fig. 3.33).
Phosphorylation typically occurs on serine,
threonine, and tyrosine residues. Enzymes known
as kinases carry out phosphorylation. Their
activity is opposed by phosphatases, which
hydrolyze phosphates off of the modified amino
acid. Some proteins are turned on by
phosphorylation; others are turned off.