Title: Protein structure
1Protein structure
Anne Mølgaard, Center for Biological Sequence
Analysis
2Could the search for ultimate truth
really have revealed so hideous and
visceral-looking an object?
Max Perutz, 1964 on protein structure
John Kendrew, 1959 with myoglobin model
3Holdings of the Protein Data Bank (PDB)
              Sep. 2001   Feb. 2005
X-ray             13116       25350
NMR                2451        4383
theoretical         338           0
total             15905       29733
4Methods for structure determination
- X-ray crystallography
- Nuclear Magnetic Resonance (NMR)
- Modelling techniques
5X-ray crystallography
- No size limitation
- Protein molecules are stuck in a crystal
lattice
- Some proteins seem to be uncrystallizable
- Slow
6X-rays
Fourier transform
7NMR spectroscopy
- Upper limit for structure determination currently
50 kDa
- Protein molecules are in solution
- Dynamics, protein folding
- Slow
8Modeling
- Need the structure of a homolog with more than
  30% sequence identity
- Only applicable to about 50% of sequences
- Fast
- Accuracy is poor at low sequence identity
- There is still need for experimental structure
determination!
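A minimal sketch of how the sequence-identity criterion for modelling might be checked, assuming the target and template have already been aligned. The function names, the gap character "-", and the choice to count only columns without gaps are assumptions made for illustration; the 30% cutoff is the figure quoted on this slide.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gap columns are ignored
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def template_usable(aligned_target, aligned_template, threshold=30.0):
    """True if the template exceeds the identity threshold (default 30%)."""
    return percent_identity(aligned_target, aligned_template) > threshold


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    target = "MKTAALAPLF"
    template = "MKSAAL-PIF"
    print(round(percent_identity(target, template), 1))
    print(template_usable(target, template))
```

Other definitions of percent identity (for example, dividing by the length of the shorter sequence) give different numbers, so the denominator should always be stated alongside the value.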
9Amino acids
- Amino group and acid group
- Side chain at Cα
- Chiral; only one enantiomer (L-amino acids) is
  found in proteins
Cα
C
N
O
10http://www.ch.cam.ac.uk/magnus/molecules/amino/
11Amino acids
A - Ala   C - Cys   D - Asp   E - Glu   F - Phe
G - Gly   H - His   I - Ile   K - Lys   L - Leu
M - Met   N - Asn   P - Pro   Q - Gln   R - Arg
S - Ser   T - Thr   V - Val   W - Trp   Y - Tyr
Livingstone & Barton, CABIOS, 9, 745-756, 1993
12Levels of protein structure
- Primary
- Secondary
- Tertiary
- Quaternary
13Primary structure
MKTAALAPLFFLPSALATTVYLA GDSTMAKNGGGSGTNGWGEYL AS
YLSATVVNDAVAGRSAR(etc)
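A small sketch of how a primary structure written in one-letter codes can be expanded to three-letter codes and summarized by composition. The mapping is the standard 20-residue table from the amino acid slide; the function names are hypothetical, and whitespace in the input is simply skipped.

```python
from collections import Counter

# Standard one-letter to three-letter amino acid codes.
ONE_TO_THREE = {
    "A": "Ala", "C": "Cys", "D": "Asp", "E": "Glu", "F": "Phe",
    "G": "Gly", "H": "His", "I": "Ile", "K": "Lys", "L": "Leu",
    "M": "Met", "N": "Asn", "P": "Pro", "Q": "Gln", "R": "Arg",
    "S": "Ser", "T": "Thr", "V": "Val", "W": "Trp", "Y": "Tyr",
}


def to_three_letter(sequence):
    """Return the sequence as a list of three-letter codes, skipping whitespace."""
    residues = []
    for symbol in sequence:
        if symbol.isspace():
            continue
        code = ONE_TO_THREE.get(symbol.upper())
        if code is None:
            raise ValueError("not a standard one-letter code: %r" % symbol)
        residues.append(code)
    return residues


def residue_composition(sequence):
    """Count how often each residue type occurs in the sequence."""
    return Counter(to_three_letter(sequence))


if __name__ == "__main__":
    fragment = "MKTAALAPLFFLPSALATTVYLA"  # start of the sequence on this slide
    print("-".join(to_three_letter(fragment)))
    print(residue_composition(fragment).most_common(3))
```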
14Ramachandran plot
left-handed α-helix
β-sheet
α-helix
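A Ramachandran plot places each residue at its backbone dihedral angles (φ, ψ). As a hedged sketch of how those angles can be computed from atomic coordinates: φ is the dihedral C(i-1)-N(i)-Cα(i)-C(i) and ψ is N(i)-Cα(i)-C(i)-N(i+1). The helper names and the plain-tuple coordinate format are assumptions for illustration; no structure file parsing is shown.

```python
import math


def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond, in (-180, 180]."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    """Backbone phi: C(i-1), N(i), CA(i), C(i)."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """Backbone psi: N(i), CA(i), C(i), N(i+1)."""
    return dihedral(n, ca, c, n_next)
```

Collecting (phi, psi) for every residue with both neighbours present gives the points of the plot; the first and last residues of a chain lack one of the two angles.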
15Hydrophobic core
- Hydrophobic side chains go into the core of the
molecule but the main chain is highly polar
- The polar groups (CO and NH) are neutralized
through formation of H-bonds
16Secondary structure
α-helix: CO(n)···HN(n+4)
β-sheet
(anti-parallel)
17... and all the rest
- 3₁₀ helices (CO(n)···HN(n+3)), π-helices
  (CO(n)···HN(n+5))
- β-turns and loops (in old textbooks sometimes
  referred to as random coil)
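The helix types differ in how far ahead the backbone hydrogen bond reaches: the C=O of residue n bonds to the N-H of residue n+3, n+4, or n+5. A minimal sketch that lists the expected bonding pairs for an ideal helix of a given length; the dictionary keys and function name are hypothetical, and residues are numbered from 1.

```python
# Offset k in the pattern CO(n)...HN(n+k) for each helix type.
HBOND_OFFSET = {
    "3-10 helix": 3,
    "alpha helix": 4,
    "pi helix": 5,
}


def helix_hbond_pairs(n_residues, helix_type):
    """List (acceptor, donor) residue numbers for an ideal helix.

    The acceptor is the residue whose C=O takes part in the bond;
    the donor is the residue whose N-H takes part in it.
    """
    offset = HBOND_OFFSET[helix_type]
    return [(n, n + offset) for n in range(1, n_residues - offset + 1)]


if __name__ == "__main__":
    for name in HBOND_OFFSET:
        print(name, helix_hbond_pairs(8, name))
```

For the same number of residues, a larger offset leaves more N-H groups at the start and more C=O groups at the end without an intrahelical partner.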
18The α-helix has a dipole moment
19Two types of β-sheet
parallel
anti-parallel
20Tertiary structure (domains, modules)
Rhamnogalacturonan acetylesterase (1k7c)
Rhamnogalacturonan lyase (1nkg)
21Quaternary structure
B. caldolyticus UPRTase (1i5e)
B. subtilis PRPP synthase (1dkr)
22Protein structure and water
A. aculeatus RG acetylesterase
23Classification schemes
- SCOP
- Manual classification (A. Murzin)
- CATH
- Semi manual classification (C. Orengo)
- FSSP
- Automatic classification (L. Holm)
24Levels in SCOP
Class                                Folds   Superfamilies   Families
All alpha proteins                     202             342        550
All beta proteins                      141             280        529
Alpha and beta proteins (a/b)          130             213        593
Alpha and beta proteins (a+b)          260             386        650
Multi-domain proteins                   40              40         55
Membrane and cell surface proteins      42              82         91
Small proteins                          72             104        162
Total                                  887            1447       2630
http://scop.berkeley.edu/count.html#scop-1.67
25Major classes in SCOP
- Classes
- All alpha proteins
- All beta proteins
- Alpha and beta proteins (a/b)
- Alpha and beta proteins (a+b)
- Multi-domain proteins
- Membrane and cell surface proteins
- Small proteins
26All α: Hemoglobin (1bab)
27All β: Immunoglobulin (8fab)
28α/β: Triose phosphate isomerase (1hti)
29α+β: Lysozyme (1jsf)
30Folds
- Proteins which have more than 50% of their
  secondary structure elements arranged in the
  same order in the protein chain and in three
  dimensions are classified as having the same fold
- No evolutionary relationship between the proteins
  is implied
- Confusingly also called fold classes
31Superfamilies
- Proteins which are (remotely) evolutionarily
  related
- Sequence similarity low
- Share function
- Share special structural features
- Relationships between members of a superfamily
may not be readily recognizable from the sequence
alone
32Families
- Proteins whose evolutionary relationship is
  readily recognizable from the sequence (more
  than 25% sequence identity)
- Families are further subdivided into Proteins
- Proteins are divided into Species
- The same protein may be found in several species
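The hierarchy described on the last few slides (Class, Fold, Superfamily, Family, Protein, Species) can be sketched as a simple record type, with a helper that reports the deepest level two entries share. This is an illustrative data structure only, not SCOP's own file format or API; the field names and the placeholder labels in the example are hypothetical.

```python
from dataclasses import dataclass

# Levels from most general to most specific, as listed on the slides.
LEVELS = ("scop_class", "fold", "superfamily", "family", "protein", "species")


@dataclass(frozen=True)
class ScopEntry:
    scop_class: str
    fold: str
    superfamily: str
    family: str
    protein: str
    species: str


def deepest_shared_level(a, b):
    """Return the most specific level at which two entries still agree.

    Returns None if the entries already differ at the class level.
    """
    shared = None
    for level in LEVELS:
        if getattr(a, level) != getattr(b, level):
            break
        shared = level
    return shared


if __name__ == "__main__":
    # Placeholder labels, not real SCOP assignments.
    x = ScopEntry("class-1", "fold-1", "sf-1", "fam-1", "prot-1", "species-1")
    y = ScopEntry("class-1", "fold-1", "sf-1", "fam-2", "prot-2", "species-2")
    print(deepest_shared_level(x, y))
```

Two entries that share a superfamily but not a family correspond to the remote evolutionary relationships that may not be recognizable from sequence alone.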
33Links
- PDB (protein structure database)
- www.rcsb.org/pdb/
- SCOP (protein classification database)
- scop.berkeley.edu
- CATH (protein classification database)
- www.biochem.ucl.ac.uk/bsm/cath
- FSSP (protein classification database)
- www.ebi.ac.uk/dali/fssp/fssp.html
34Why are protein structures so interesting?
They provide a detailed picture of interesting
biological features, such as active site,
substrate
specificity, allosteric regulation etc.
They aid in rational drug design and protein
engineering
They can elucidate evolutionary relationships
undetectable by sequence comparisons
35Inferring biological features from the structure
1deo
Topological switchpoint
36Inferring biological features from the structure
Active site
Triose phosphate isomerase (1ag1)
(Verlinde et al. (1991) Eur.J.Biochem. 198, 53)
37Engineering thermostability in serpins
- Overpacking
- Buried polar groups
- Cavities
Im, Ryu & Yu (2004) Engineering thermostability
in serine protease inhibitors
PEDS, 17, 325-331.
38Evolution...
Structure is conserved longer than
both sequence and function
39Rhamnogalacturonan acetylesterase (A. aculeatus)
(1k7c)
Platelet activating factor acetylhydrolase
(Bos taurus) (1wab)
Serine esterase (S. scabies) (1esc)
40Rhamnogalacturonan acetylesterase
Serine esterase
Platelet activating factor acetylhydrolase
Mølgaard, Kauppinen & Larsen (2000) Structure, 8,
373-383.
41- "We wish to suggest a structure for the salt of
  deoxyribose nucleic acid (D.N.A.). This structure
  has novel features which are of considerable
  biological interest.
- It has not escaped our notice that the specific
  pairing we have postulated immediately suggests a
  possible copying mechanism for the genetic
  material."
- J.D. Watson & F.H.C. Crick (1953) Nature, 171,
  737.