Title: Protein Structure: X-ray Crystallography
1Protein Structure X-ray Crystallography
- Microbiology 343
- David Wishart
- david.wishart_at_ualberta.ca
2Objectives
- To review the basics of protein structure (primary, secondary, supersecondary, tertiary)
- Understand the basic principles and steps in protein crystallization and protein crystallography
- Become aware of databases (PDB) and tools to help visualize proteins
3Much Ado About Structure
Structure → Function
Structure → Mechanism
Structure → Origins/Evolution
Structure-based Drug Design
Solving the Protein Folding Problem
4Protein Structures Are Complex
5Protein Structure (A Review)
6Ramachandran Plot
7Secondary Structure
8Beta Sheet
9Alpha Helix
10Reverse Turn
11Supersecondary Structure
12Supersecondary Structure
13Tertiary Structure
14Proteins are Complex
- Average residue contains 8 heavy atoms
- Average protein contains 300 amino acids
- Average structure contains 2400 atoms
- First structure (sperm whale myoglobin) took 5 years to solve, with a team of 15 key punch operators working around the clock
- Most structures still take 1 year to solve
15Solving Protein Structures
- Only 2 kinds of techniques allow one to get atomic-resolution pictures of macromolecules
- X-ray Crystallography (first applied in 1961 - Kendrew & Perutz)
- NMR Spectroscopy (first applied in 1983 - Ernst & Wüthrich)
16X-ray Crystallography
17X-ray Crystallography
- Crystallization
- Diffraction Apparatus
- Diffraction Principles
- Conversion of Diffraction Data to Electron Density
- Resolution
- Chain Tracing
18Crystallization
Protein Crystal
19Crystallization
20Crystallization
- Start with a solution of the protein at a fairly high concentration (2-50 mg/ml)
- Add reagents (e.g. PEG) that reduce the solubility to just short of spontaneous precipitation
- Concentrate further, slowly, until small crystals start to grow
- Often 100s to 1000s of different conditions have to be tried to succeed
- Crystals should be a few tenths of a mm in each direction to be useful
21Hanging Drop Method
- A few µL of protein solution are mixed with a roughly equal volume of reservoir solution containing the precipitants
- A drop of this mixture is put on a glass slide which covers the reservoir
- The protein/precipitant mixture in the drop is less concentrated than the reservoir solution, so water evaporates from the drop into the reservoir
- The concentration of both protein and precipitant in the drop slowly increases, leading to crystal formation
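The concentration change in the drop can be sketched with simple arithmetic. This is a rough illustration only, assuming an idealized drop: the protein stock contains no precipitant, only water leaves the drop, the reservoir is large enough that its concentration stays constant, and equilibrium is reached when the precipitant concentration in the drop matches the reservoir. The function name, parameter names and numbers are hypothetical.

```python
def hanging_drop_equilibrium(protein_mg_ml, precipitant_pct,
                             protein_ul=2.0, reservoir_ul=2.0):
    """Idealized hanging-drop arithmetic (illustrative names and defaults)."""
    drop_ul = protein_ul + reservoir_ul
    # Mixing dilutes both the protein and the precipitant in the fresh drop.
    protein_start = protein_mg_ml * protein_ul / drop_ul
    precipitant_start = precipitant_pct * reservoir_ul / drop_ul
    # Assumption: water evaporates until the drop's precipitant
    # concentration equals that of the (unchanging) reservoir.
    shrink = precipitant_start / precipitant_pct
    return {
        "start_volume_ul": drop_ul,
        "final_volume_ul": drop_ul * shrink,
        "protein_start_mg_ml": protein_start,
        "protein_final_mg_ml": protein_start / shrink,
        "precipitant_start_pct": precipitant_start,
        "precipitant_final_pct": precipitant_pct,
    }


result = hanging_drop_equilibrium(protein_mg_ml=10.0, precipitant_pct=20.0)
for key, value in result.items():
    print(f"{key}: {value:.1f}")
```

With a 1:1 mix the drop starts at half the stock concentrations and, under these assumptions, shrinks to half its volume, which is what drives the slow concentration increase described above.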
22Diffraction Apparatus
23Diffraction Apparatus
24A Bigger Diffraction Apparatus
Synchrotron Light Source
25The Canadian Light Source
26Diffraction Principles (Bragg's Law)
$n\lambda = 2d\sin\theta$
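Bragg's law (n·λ = 2·d·sin θ) relates the spacing d between lattice planes to the angle θ at which a reflection of order n appears for wavelength λ. A minimal sketch of the calculation follows; the function names are illustrative, and the default wavelength of 1.5418 Å (a copper K-alpha laboratory source) is an assumption, not something stated on the slide.

```python
import math


def bragg_angle_deg(d_angstrom, wavelength_angstrom=1.5418, order=1):
    """Angle theta (degrees) at which planes of spacing d diffract."""
    s = order * wavelength_angstrom / (2.0 * d_angstrom)
    if not 0.0 < s <= 1.0:
        raise ValueError("no diffraction possible: n*lambda exceeds 2d")
    return math.degrees(math.asin(s))


def d_spacing(theta_deg, wavelength_angstrom=1.5418, order=1):
    """Plane spacing d (angstroms) for a reflection seen at angle theta."""
    return order * wavelength_angstrom / (2.0 * math.sin(math.radians(theta_deg)))


# Smaller spacings (finer detail) diffract to larger angles.
for d in (3.0, 2.0, 1.2):
    print(f"d = {d} A -> theta = {bragg_angle_deg(d):.2f} degrees")
```

The loop shows why high-resolution data sit at the outer edge of the diffraction pattern: as d shrinks, θ grows.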
27Diffraction Principles
Corresponding Diffraction Pattern
A string of atoms
28Protein Crystal Diffraction
Diffraction Pattern
29Diffraction Apparatus
30Converting Diffraction Data to Electron Density
FT (Fourier transform)
31Fourier Transformation
$F(x,y,z) = \int f(hkl)\, e^{i(xyz)\cdot(hkl)}\, d(hkl)$
Converts from units of inverse (reciprocal) space to Cartesian coordinates
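A discrete, one-dimensional analogue may make the transform more concrete. The sketch below is a toy, not a crystallographic program: it places three Gaussian "atoms" on a 32-point grid (all values hypothetical), computes complex structure factors with a plain discrete Fourier transform, and transforms back to recover the density. The sign and scaling conventions are one common choice and may differ from a given textbook.

```python
import cmath
import math


def structure_factors(density):
    """Forward transform: real-space density -> complex structure factors."""
    n = len(density)
    return [sum(rho * cmath.exp(2j * math.pi * h * x / n)
                for x, rho in enumerate(density))
            for h in range(n)]


def electron_density(factors):
    """Inverse transform: complex structure factors -> real-space density."""
    n = len(factors)
    return [(sum(f * cmath.exp(-2j * math.pi * h * x / n)
                 for h, f in enumerate(factors)) / n).real
            for x in range(n)]


# Toy 1D "unit cell": (grid position, weight) for three Gaussian atoms.
atoms = [(6, 6.0), (15, 8.0), (24, 7.0)]
density = [sum(z * math.exp(-((i - c) ** 2) / 2.0) for c, z in atoms)
           for i in range(32)]

factors = structure_factors(density)
recovered = electron_density(factors)
print(max(abs(a - b) for a, b in zip(density, recovered)))  # tiny rounding error
```

The round trip only works because the complex factors carry both amplitude and phase.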
32Diffracting a Cat
Diffraction data with phase information
Real Diffraction Data
33Reconstructing a Cat
FT of diffraction data with phase information: Easy
FT of real diffraction data (no phases): Hard
34The Phase Problem
- Diffraction data record only intensity, not phase information (half the information is missing)
- To reconstruct the image properly you need to have the phases (even approximately)
- Guess the phases (molecular replacement)
- Search phase space (direct methods)
- Bootstrap phases (isomorphous replacement)
- Use differing wavelengths (anomalous dispersion)
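The effect of losing the phases can be shown with a small 1D toy. This is a sketch under stated assumptions: two point "atoms" on a 16-point grid (hypothetical values) and a plain discrete Fourier transform. With the true phases the density is recovered; with the same amplitudes and every phase set to zero the map comes out symmetric about the origin and no longer resembles the original.

```python
import cmath
import math


def dft(values, sign):
    """Plain discrete Fourier transform; sign selects forward (+1) or back (-1)."""
    n = len(values)
    return [sum(v * cmath.exp(sign * 2j * math.pi * k * j / n)
                for j, v in enumerate(values))
            for k in range(n)]


def density_from(amplitudes, phases):
    """Rebuild a real-space map from amplitudes and phases (radians)."""
    n = len(amplitudes)
    factors = [a * cmath.exp(1j * p) for a, p in zip(amplitudes, phases)]
    return [(c / n).real for c in dft(factors, -1)]


# Toy asymmetric "structure": two point atoms of different weight.
density = [0.0] * 16
density[3] = 6.0
density[10] = 8.0

factors = dft(density, +1)
amplitudes = [abs(f) for f in factors]      # what the detector records
phases = [cmath.phase(f) for f in factors]  # what the detector loses

with_phases = density_from(amplitudes, phases)
without_phases = density_from(amplitudes, [0.0] * 16)

print("error with phases:   ",
      max(abs(a - b) for a, b in zip(density, with_phases)))
print("error without phases:",
      max(abs(a - b) for a, b in zip(density, without_phases)))
```

Each phasing method listed above is a way to obtain usable estimates for the `phases` list, which the experiment does not supply directly.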
35MAD X-ray Crystallography
- MAD (Multiwavelength Anomalous Dispersion)
- Requires synchrotron beam lines (CLS!)
- Requires protein with multiple scattering centres (selenomethionine labeled)
- Allows rapid phasing
- Proteins can now be solved in just 1-2 days
36Resolution
1.2 Å
2 Å
3 Å
37Chain Tracing
Electron Density → Chain Trace → Final Model
38Refinement
39Refinement
[Plot: R versus refinement iterations]
$R = \sum |F_o - F_c| \,/\, \sum F_o$
$F_c$ = calculated structure factor (amplitude)
$F_o$ = observed structure factor (amplitude)
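The R factor is the summed absolute difference between observed and calculated structure-factor amplitudes, divided by the summed observed amplitudes. A minimal sketch follows; the function name and the example amplitudes are made up for illustration, and real refinement programs add scaling and other details not shown here.

```python
def r_factor(f_obs, f_calc):
    """R = sum(|Fo - Fc|) / sum(Fo) over matching reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists must match in length")
    total_obs = sum(f_obs)
    if total_obs == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc)) / total_obs


# Hypothetical amplitudes for three reflections.
observed = [100.0, 50.0, 50.0]
early_model = [70.0, 65.0, 35.0]   # poor agreement early in refinement
late_model = [90.0, 55.0, 45.0]    # better agreement after more cycles

print(f"early R = {r_factor(observed, early_model):.2f}")
print(f"late  R = {r_factor(observed, late_model):.2f}")
```

R falls as the model improves; a model that reproduced the data perfectly would give R = 0.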
40The Final Result
ORIGX2      0.000000  1.000000  0.000000        0.00000                 2TRX 147
ORIGX3      0.000000  0.000000  1.000000        0.00000                 2TRX 148
SCALE1      0.011173  0.000000  0.004858        0.00000                 2TRX 149
SCALE2      0.000000  0.019585  0.000000        0.00000                 2TRX 150
SCALE3      0.000000  0.000000  0.018039        0.00000                 2TRX 151
ATOM      1  N   SER A   1      21.389  25.406  -4.628  1.00 23.22      2TRX 152
ATOM      2  CA  SER A   1      21.628  26.691  -3.983  1.00 24.42      2TRX 153
ATOM      3  C   SER A   1      20.937  26.944  -2.679  1.00 24.21      2TRX 154
ATOM      4  O   SER A   1      21.072  28.079  -2.093  1.00 24.97      2TRX 155
ATOM      5  CB  SER A   1      21.117  27.770  -5.002  1.00 28.27      2TRX 156
ATOM      6  OG  SER A   1      22.276  27.925  -5.861  1.00 32.61      2TRX 157
ATOM      7  N   ASP A   2      20.173  26.028  -2.163  1.00 21.39      2TRX 158
ATOM      8  CA  ASP A   2      19.395  26.125  -0.949  1.00 21.57      2TRX 159
ATOM      9  C   ASP A   2      20.264  26.214   0.297  1.00 20.89      2TRX 160
ATOM     10  O   ASP A   2      19.760  26.575   1.371  1.00 21.49      2TRX 161
ATOM     11  CB  ASP A   2      18.439  24.914  -0.856  1.00 22.14      2TRX 162
A PDB coordinate file
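ATOM records in a PDB file are fixed-width, so fields are read by column position rather than by splitting on whitespace. The sketch below parses two of the 2TRX records by slicing at the standard ATOM column positions. The function and key names are illustrative, and only a subset of the fields is extracted.

```python
def parse_atom_record(line):
    """Parse one fixed-width ATOM/HETATM line into a dict, else return None."""
    if not line.startswith(("ATOM  ", "HETATM")):
        return None
    return {
        "serial": int(line[6:11]),         # atom serial number
        "name": line[12:16].strip(),       # atom name, e.g. CA
        "res_name": line[17:20].strip(),   # residue name, e.g. SER
        "chain": line[21],                 # chain identifier
        "res_seq": int(line[22:26]),       # residue sequence number
        "x": float(line[30:38]),           # coordinates in angstroms
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),    # temperature factor
    }


sample_lines = [
    "SCALE3      0.000000  0.000000  0.018039        0.00000",
    "ATOM      1  N   SER A   1      21.389  25.406  -4.628  1.00 23.22",
    "ATOM      2  CA  SER A   1      21.628  26.691  -3.983  1.00 24.42",
]

atoms = [rec for rec in map(parse_atom_record, sample_lines) if rec is not None]
for atom in atoms:
    print(atom["serial"], atom["name"], atom["res_name"], atom["chain"],
          atom["res_seq"], atom["x"], atom["y"], atom["z"])
```

Slicing by column keeps the parser correct even when adjacent numeric fields touch, as happens with large or negative coordinates.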
41The PDB
- PDB - Protein Data Bank
- Established in 1971 at Brookhaven National Lab (7 structures)
- Primary archive for macromolecular structures (proteins, nucleic acids, carbohydrates)
- Moved from BNL to RCSB (Research Collaboratory for Structural Bioinformatics) in 1998
42The PDB
http://www.rcsb.org/pdb/
43The PDB
- Contains coordinate data (primarily) from X-ray, NMR and modelling
- Contains files in 2 formats
- PDB format
- mmCIF (macromolecular Crystallographic Information File format)
- Contains 35,000 entries
- Currently growing exponentially
44PDB Growth
45Viewing 3D Structures
46Protein Rendering
Cylinder
Ribbon (N-C gradient)
47Protein Rendering
Ribbon (2° structure)
Stick
48Protein Rendering
Space Filling
Wire Frame (Vector)
49Protein Explorer (Chime)
50Protein Explorer
- http://www.umass.edu/microbio/chime/explorer/
- Uses Chime & RasMol for its back-end
- Very flexible, user friendly, well documented; offers morphing, sequence-structure interface, comparisons, context-dependent help, smart zooming, off-line
- Browser plug-in (like a PDF reader)
- Compatible with Netscape (Mac & Win)
51QuickPDB
52Quick PDB
- http://www.sdsc.edu/pb/Software.html
- Very simple viewing program with limited manipulation and very limited rendering capacity; very fast
- Java applet (source code available)
- Compatible with most browsers and computer platforms
53Rasmol
54Rasmol
- http://www.umass.edu/microbio/rasmol/
- Very simple viewing program with limited manipulation capacity, easy to use!
- Grand-daddy of all visual freeware
- Runs as installed stand-alone program
- Source code available
- Runs on Mac, Windows, Linux, SGI and most other UNIX platforms
55Conclusion
- X-ray crystallography is the primary method used to determine protein structures (3/4 of all structures in PDB)
- Has allowed structures as large as viruses and ribosomes to be determined
- X-ray methods are fast and now depend primarily on computers and robots
- X-ray structures are generally more accurate than NMR structures, but reveal the structure in the solid state rather than the liquid state