Scaling, Phasing, Anomalous, Density modification, Model building, and Refinement

Slides: 61
Provided by: chrisby
1
Scaling, Phasing, Anomalous, Density
modification, Model building, and Refinement
2
What we've learned so far
  • Electrons scatter X-rays.
  • Scattering is a Fourier transformation.
  • Inverting the Fourier transform gives the image
    of the electron density.
  • Waves have amplitude and phase. And we can't
    measure phase.
  • Inverting the Fourier transform without the
    phases gives the Patterson map, which is the map
    of all inter-atom vectors.
  • Space groups are groups of symmetry operations in
    3D.
  • The Patterson plus symmetry gives us heavy atom
    positions.
  • Heavy atom positions plus amplitudes give us
    phases.

3
What we can do.
  • Sum waves.
  • Calculate the phase given atom position and
    scattering vector.
  • Index spots on an X-ray photograph.
  • Draw Bragg planes.
  • Invert the Fourier transform using Bragg planes.
  • Calculate cell dimensions from an X-ray
    photograph.
  • Describe the symmetry of a crystal or periodic
    pattern.
  • Convert a simple Patterson to heavy atom
    positions.
  • Calculate heavy atom vectors given heavy atom
    positions.
  • Solve for phases given amplitudes and heavy atom
    vectors.
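
The second item in the list above (the phase from an atom position and a scattering vector) can be sketched in a few lines of Python. This is only an illustration, assuming fractional coordinates, constant point-atom scattering factors, and the convention F(hkl) = sum of f_j exp(2πi(h x_j + k y_j + l z_j)). The function name structure_factor and the two-atom cell are hypothetical.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """Sum the scattered waves from all atoms for one reflection.

    hkl   -- (h, k, l) Miller indices
    atoms -- list of (f, x, y, z): scattering factor and fractional coordinates
    """
    h, k, l = hkl
    total = 0j
    for f, x, y, z in atoms:
        # Each atom contributes a wave of amplitude f and phase 2*pi*(hx + ky + lz)
        total += f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
    return total

# Hypothetical two-atom cell, for illustration only
atoms = [(6.0, 0.0, 0.0, 0.0), (8.0, 0.25, 0.0, 0.0)]
F = structure_factor((1, 0, 0), atoms)
amplitude = abs(F)
phase_deg = math.degrees(cmath.phase(F))
print(f"|F| = {amplitude:.2f}, phase = {phase_deg:.1f} degrees")
```

Here the two waves are 90° apart, so the sum is the vector 6 + 8i: amplitude 10, phase about 53°.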

4
From data to model
Collect native data Fp; collect heavy atom data Fph
-> Estimate phases
-> Calculate ρ
-> Is the map traceable?
     no:  density modification?
     yes: Trace the map -> Refine
5
From data to phases
heavy atom data Fph
native data Fp
Calculate difference Patterson
Find heavy atom peaks on Harker sections
Solve for heavy atom positions using symmetry
Calculate heavy atom vectors
Estimate phases
6
From data to Patterson map
heavy atom data Fph
native data Fp
Find the best scale factor, k
Calculate Fdiff = kFph - Fp
Calculate difference Patterson
7
From crystal to data
Indexed film
I is relative:
  Bigger crystal, higher I
  Better crystal, higher I
  Longer exposure, higher I
  More intense X-rays, higher I
Internal scaling
Intensity, Ip(hkl) ∝ F²
native data Fp
Because there is no absolute scale, Fp and Fph
are on different scales.
8
What happens to the phase calculation if the
scaling is off?
Radii are Fp and kFph
9
Scaling two datasets
h  k  l    Fp      σ
1  0  0    3233.   100.
2  0  0    2028.   98.
3  0  0    2179    88.
4  0  0    ....    ...

h  k  l    Fph     σ
1  0  0    1122.   50.
2  0  0    1014.   49.
3  0  0    1081.   44.
4  0  0    ....    ...

1st approximation: The intrinsic average
amplitude of scattering is constant for different
crystals. A simple scale factor k corrects for
crystal differences.
⟨Fp⟩ = ⟨Fph⟩k, therefore k = ⟨Fp⟩/⟨Fph⟩
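
As a worked example of the simple scale factor, the sketch below uses the three amplitudes that are actually listed for each data set on this slide (reflections 1 0 0, 2 0 0 and 3 0 0). It assumes k is the ratio of the mean amplitudes, and that the difference amplitude is k·Fph minus Fp, as on the "From data to Patterson map" slide. A real data set would use thousands of reflections; the helper names are hypothetical.

```python
# Amplitudes for reflections (1 0 0), (2 0 0), (3 0 0), taken from the slide
fp = [3233.0, 2028.0, 2179.0]    # native
fph = [1122.0, 1014.0, 1081.0]   # heavy atom derivative

def mean(values):
    return sum(values) / len(values)

def simple_scale(fp, fph):
    """Single scale factor k such that <Fp> = k <Fph>."""
    return mean(fp) / mean(fph)

k = simple_scale(fp, fph)
# Difference amplitudes for the difference Patterson: Fdiff = k*Fph - Fp
fdiff = [k * d - n for n, d in zip(fp, fph)]
print(f"k = {k:.3f}")
print("Fdiff =", [round(x, 1) for x in fdiff])
```

By construction the scaled differences sum to zero, which is just another way of stating the assumption that both crystals scatter equally on average.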
10
Basic assumption
when scaling two crystals.
The total number of electrons in the unit cell is
the same for each (isomorphous) crystal.
Note: isomorphous means same space group, same
cell dimensions.
11
Better scaling: Wilson B-factor
[Figure: Wilson plot of ⟨I⟩ = ⟨F²⟩, averaged in resolution bins, for two
data sets scaled separately. X-axis: two sine-theta over lambda = 1/d.
Labeled regions: low-resolution features (ignored in slope calc); water
peak (ignored in slope calc); region of linear dependence of amplitude
with resolution, where slope W = Wilson B-factor. Axis label: "d 20Å".]
Two sets of Fs might have different overall
B-factors, because the crystals may have
different degrees of mosaicity. So Wilson scaling
is better than simple scaling.
⟨Fp⟩ = ⟨Fph⟩k, k = W(1/d) + C
12
How good is scaling?
After solving the structure, we can go back and
see how good the scaling was. Typically, error in
scaling < 10%. In best cases < 2%. Scaling error
is worse if (1) crystals are non-isomorphous, or (2)
too many heavy atoms are present (basic assumption
is wrong).
13
Heavy atom difference Fourier
Fph = Fp + Fh (vector addition)
  • The amplitude of Fh is only approximately |Fph| - |Fp|.
  • The true difference |Fph| - |Fp| depends on the
    phase of Fh relative to Fp.
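
A small numerical check of the two points above: the sketch adds a fixed heavy atom vector Fh to a native vector Fp of fixed amplitude and varying phase, then prints the amplitude difference. The values (|Fp| = 5, Fh = 1 - 2i) and the function name are hypothetical, chosen only for illustration.

```python
import cmath
import math

def amplitude_difference(fp_amp, fp_phase_deg, fh):
    """Return |Fph| - |Fp| where Fph = Fp + Fh (complex vector addition)."""
    fp = cmath.rect(fp_amp, math.radians(fp_phase_deg))
    return abs(fp + fh) - fp_amp

fh = complex(1.0, -2.0)   # hypothetical heavy atom contribution
fh_phase = math.degrees(cmath.phase(fh))
for phase in (fh_phase, 0.0, 90.0, 180.0, 270.0):
    diff = amplitude_difference(5.0, phase, fh)
    print(f"Fp phase {phase:7.1f}: |Fph| - |Fp| = {diff:+.2f}   (|Fh| = {abs(fh):.2f})")
```

Only when Fp and Fh are parallel (the first row) does the amplitude difference equal |Fh|; for any other relative phase it is smaller in magnitude.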

14
Centrosymmetric reflections
  • If the crystal has a center of symmetry, all
    reflections are centrosymmetric. Phase = 0° or
    180°.
  • If the crystal has 2-fold, 4-fold or 6-fold
    rotational symmetry, then the reflections in the
    0-plane are centrosymmetric. (Because the
    projection of the density is centrosymmetric.)
  • For centrosymmetric reflections:
    |Fph| = |Fp| ± |Fh|

This means the amplitude Fh is exact for
centrosymmetric reflections,
assuming perfect scaling.
15
0-plane
Draw any set of Bragg planes parallel to the
2-fold. The projected density is centrosymmetric.
Therefore, phase is 0° or 180°.
16
Initial phases
The most probable phase is not necessarily the
best for computing the first e-density map.
weighted average, best phase
Shaded regions are possible Fp and Fph solutions.
17
Figure of merit
Figure of merit m is a measure of how good the
phases are.
C is the center of mass of a ring of phase
probabilities (that's the mass). The radius of
the ring is 1, and m is the distance of C from the
center of the ring. So m = 1 only if the probabilities
are sharply distributed. If they are distributed
widely, m is small.
Fbest(hkl) = |F(hkl)| m e^(-iαbest)
18
In class exercise: phase error
FP = 5.00, σ = 0.5
FPH1 = 5.50, σ = 0.8, FH1 = 2.23, αH1 = -63.4°
FPH2 = 4.50, σ = 0.9, FH2 = 0.50, αH2 = -164°
(1) Draw three circles separated by vectors FH1 and
FH2.
(2) Draw circular error bars of width 2σ.
(3) Draw circle plot of Fp phase probabilities.
(4) Estimate the centroid c of probability.
(5) What is the Figure of Merit, m?
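
The graphical exercise can also be approximated numerically. The sketch below scores every trial phase of Fp by its lack of closure against each derivative, then takes the centroid of the resulting probabilities on the unit circle. It rests on assumptions not stated on the slide: a Gaussian error model, the derivative σ used as the Gaussian width (the σ of FP is ignored), and 5° phase steps. The function names are hypothetical.

```python
import cmath
import math

def phase_probabilities(fp, derivatives, step_deg=5):
    """Unnormalised probability of each trial native phase.

    derivatives -- list of (fph_amp, sigma, fh_complex)
    For every trial phase the lack of closure |Fp + Fh| - |Fph| is scored
    with a Gaussian of width sigma (an assumed error model).
    """
    probs = []
    for deg in range(0, 360, step_deg):
        trial = cmath.rect(fp, math.radians(deg))
        p = 1.0
        for fph_amp, sigma, fh in derivatives:
            closure = abs(trial + fh) - fph_amp
            p *= math.exp(-closure ** 2 / (2 * sigma ** 2))
        probs.append((deg, p))
    return probs

def figure_of_merit(probs):
    """Centroid of the probabilities on the unit circle: (m, best phase in degrees)."""
    total = sum(p for _, p in probs)
    centroid = sum(p * cmath.rect(1.0, math.radians(deg)) for deg, p in probs) / total
    return abs(centroid), math.degrees(cmath.phase(centroid))

# Values from the exercise
fh1 = cmath.rect(2.23, math.radians(-63.4))
fh2 = cmath.rect(0.50, math.radians(-164.0))
probs = phase_probabilities(5.00, [(5.50, 0.8, fh1), (4.50, 0.9, fh2)])
m, alpha_best = figure_of_merit(probs)
print(f"m = {m:.2f}, best phase = {alpha_best:.0f} degrees")
```

A single sharp peak gives m close to 1, and a flat distribution gives m close to 0, which matches the center-of-mass picture on the previous slide.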
19
Anomalous dispersion
Inner electrons scatter with a time delay. This
is a phase shift that is always counter-clockwise
relative to the phase of the free electrons.
bound electrons
free electrons
Heavy atom
20
Anomalous dispersion
21
Anomalous dispersion
SIR: single isomorphous replacement.
Advantages: only one derivative crystal is needed
(fewer scaling problems). Anomalous dispersion has a
greater effect at higher resolution (because the
inner electrons are more like a point source).
22
Is the initial map good enough?
(1) The map is calculated using αbest.
(2) The map is contoured and displayed using InsightII,
MIDAS, XtalView, FRODO, O, ...
(3) A trace is attempted.
23
Model building
e- density cages (1 σ contours) displayed using
InsightII
24
Information used to build the first model
Sequence and Stereochemistry ...plus assorted
disulfide and ligand information.
Models are built initially by identifying
characteristic sidechains (by their shape) then
tracing forward and backward along the backbone
density until all amino acids are in
place. Alpha-carbons can be placed by hand, and
numbered, then an automated program will add the
other atoms (MaxSprout).
25
Tracing an electron density map
Class exercise
sequence AGDLLEHEIFGMPPAGGA
Can you locate the density above in the sequence?
26
R-factor: How good is the model?
Calculate Fcalcs based on the model. Compute
R-factor.
Depending on the space group, an R-factor of 55%
would be attainable by scaled random data. The
R-factor must be < 50%. Note: It is possible to
get a high R-factor for a correct model. What
kind of mistake would do this?
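
The formula itself did not survive in this transcript. The sketch below uses the conventional definition, R = Σ| |Fobs| - k|Fcalc| | / Σ|Fobs|, with a single scale factor k taken as the ratio of the summed amplitudes (an assumed scaling scheme). The amplitudes are hypothetical.

```python
def r_factor(f_obs, f_calc, scale=None):
    """R = sum(| |Fobs| - k|Fcalc| |) / sum(|Fobs|)."""
    if scale is None:
        # Put Fcalc on the scale of Fobs with one overall factor (assumption)
        scale = sum(f_obs) / sum(f_calc)
    numerator = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)

# Hypothetical amplitudes, for illustration only
f_obs = [100.0, 50.0, 80.0, 20.0]
f_calc = [90.0, 60.0, 70.0, 30.0]
print(f"R = {r_factor(f_obs, f_calc):.3f}")
```

For these numbers the scale factor is 1, each reflection is off by 10, and R = 40/250 = 0.16, or 16%.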
27
What can you do if the phases are not good enough?
1. Collect more heavy atom derivative data.
2. Try density modification techniques.
[Diagram: density modification cycle. Labels: initial phases; Fo's and
(new) phases; Map; Density modification; Modified map; Fc's and new
phases.]
28
Density modification techniques
Solvent flattening: Make the water part of the
map flat.
(1) Draw envelope around protein part.
(2) Set solvent ρ to ⟨ρ⟩ and back transform.
29
Solvent flattening
Requires that the protein part can be
distinguished from the solvent part.
B.C. Wang's method: Smooth the map using a 10 Å Gaussian. Then
take the top X% of the map, where X is calculated
from the crystal density.
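
A toy version of this envelope procedure on a 1D periodic map is sketched below. It is not Wang's actual program: the Gaussian width is given in grid points rather than Å, the protein fraction is passed in directly instead of being derived from the crystal density, and the back transform to new phases is left out. All names and numbers are hypothetical.

```python
import math

def gaussian_smooth(density, sigma_pts):
    """Smooth a periodic 1D map with a Gaussian of width sigma_pts grid points."""
    n = len(density)
    reach = int(3 * sigma_pts)
    offsets = range(-reach, reach + 1)
    weights = [math.exp(-0.5 * (k / sigma_pts) ** 2) for k in offsets]
    norm = sum(weights)
    return [
        sum(w * density[(i + k) % n] for w, k in zip(weights, offsets)) / norm
        for i in range(n)
    ]

def solvent_flatten(density, protein_fraction, sigma_pts=2.0):
    """Keep the top protein_fraction of the smoothed map as protein; flatten the rest."""
    smooth = gaussian_smooth(density, sigma_pts)
    n_protein = round(protein_fraction * len(density))
    ranked = sorted(range(len(density)), key=lambda i: smooth[i], reverse=True)
    protein = set(ranked[:n_protein])
    solvent = [i for i in range(len(density)) if i not in protein]
    mean_solvent = sum(density[i] for i in solvent) / len(solvent)
    # Protein points keep their density; solvent points are set to the solvent mean
    return [density[i] if i in protein else mean_solvent for i in range(len(density))]

# Hypothetical noisy 1D "map": a protein blob in the middle, noisy solvent elsewhere
rho = [0.1, -0.2, 0.3, 0.0, 1.5, 2.0, 1.8, 1.6, 0.2, -0.1, 0.1, -0.3]
flat = solvent_flatten(rho, protein_fraction=1 / 3)
print([round(x, 4) for x in flat])
```

Smoothing before thresholding is what makes the envelope follow the blob as a whole rather than individual noisy peaks.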
30
Skeletonization
(1) Calculate map. (2) Skeletonize it (draw ridge
lines) (3) Prune skeleton so that it is
protein-like (4) Back transform the skeleton to
get new phases.
Protein-like means (a) no cycles, (b) no islands
31
Non-crystallographic symmetry
If there are two molecules in the ASU, there is a
matrix and vector that rotate one to the other:
M r1 + v = r2
(1) Using Patterson Correlation Function, find M and v.
(2) Calculate initial map.
(3) Set ρ(r1) and ρ(r2) to (ρ(r1) + ρ(r2))/2.
(4) Back transform to get new phases.
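
Step (3) can be sketched as follows, assuming M and v are already known and that every rotated point lands exactly on a grid point. A real map would need interpolation, and the back transform is omitted. The 2D grid, the matrix, and the density values are hypothetical.

```python
def apply_ncs(M, v, r):
    """Map a point of molecule 1 onto molecule 2: r2 = M r1 + v."""
    dim = len(r)
    return tuple(sum(M[i][j] * r[j] for j in range(dim)) + v[i] for i in range(dim))

def ncs_average(rho, M, v, molecule1_points):
    """Replace rho(r1) and rho(r2) by their mean for every NCS-related pair."""
    averaged = dict(rho)
    for r1 in molecule1_points:
        r2 = apply_ncs(M, v, r1)
        mean = (rho[r1] + rho[r2]) / 2
        averaged[r1] = mean
        averaged[r2] = mean
    return averaged

# Hypothetical 2D example on integer grid points: a 90-degree rotation plus a shift
M = [[0, -1], [1, 0]]
v = (5, 0)
rho = {(1, 0): 2.0, (2, 0): 1.0, (5, 1): 1.6, (5, 2): 1.4}
avg = ncs_average(rho, M, v, molecule1_points=[(1, 0), (2, 0)])
print(avg)
```

Averaging helps because the true density is the same in both copies while the phase errors are not, so the noise partly cancels.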
32
What does a good map look like?
plexiglass stack
brass parts model
Before computers, maps were contoured on stacked
pieces of plexiglass. A Richards box was used
to build the model.
half-silvered mirror
33
Low-resolution
At 4-6Å resolution, alpha helices look like
sausages.
34
Medium resolution
3Å data is good enough to see the backbone with
space in between.
35
The program BONES traces the density
automatically, if the phases are good.
36
BONES models need to be manually connected and
sidechains attached. MaxSPROUT converts a fully
connected trace to an all-atom model.
37
Errors in the phases make some connections
ambiguous.
38
Contouring at two density cutoffs sometimes helps
39
Holes in rings are a good thing
Seeing a hole in a tyrosine or phenylalanine ring
is universally accepted as proof of good phases.
You need at least 2Å data.
40
Can you see in stereo?
Try this at home. In 3D, the density is much
easier to trace.
41
New rendering programs
CONSCRIPT A program for generating electron
density isosurfaces for presentation in protein
crystallography. M. C. Lawrence, P. D. Bourke
42
Great map: holes in rings
43
Superior map: Atomicity
Rarely is the data this good. 2 holes in Trp. All
atoms separated.
44
Only small molecule structures are this good
Atoms are separated down to several contours.
Proteins are never this well-ordered. But this is
what the density really looks like.
45
Refinement
  • The gradient of the R-factor with respect to
    each atomic position may be calculated.
  • Each atom is moved down-hill along the gradient.
  • Restraints may be imposed.


46
What is a restraint?
A restraint is a function of the coordinates that
is lowest when the coordinates are ideal, and
which increases as the coordinates become less
ideal.
Stereochemical restraints
also... planar groups Bs
bond lengths
bond angles
torsion angles
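
One function that fits this definition is a harmonic penalty on a bond length. The slide does not say which functional form is used, so the quadratic form, the weight, and the ideal C-C length of 1.53 Å below are assumptions for illustration.

```python
import math

def bond_length_restraint(atom_a, atom_b, ideal, weight=1.0):
    """Harmonic restraint: zero at the ideal bond length, growing as the bond deviates."""
    distance = math.dist(atom_a, atom_b)
    return weight * (distance - ideal) ** 2

# Hypothetical C-C bond with an assumed ideal length of 1.53 Angstrom
ideal_bond = bond_length_restraint((0.0, 0.0, 0.0), (1.53, 0.0, 0.0), ideal=1.53)
stretched_bond = bond_length_restraint((0.0, 0.0, 0.0), (1.70, 0.0, 0.0), ideal=1.53)
print(ideal_bond, stretched_bond)
```

The same pattern applies to bond angles, torsion angles and planar groups: measure the quantity from the coordinates and penalize its deviation from the ideal value.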
47
Calculated phases, observed amplitudes = hybrid
F's
  • Fcalcs are calculated from the atomic
    coordinates
  • A new electron density map calculated from the
    Fcalc's would only reproduce the model. (of
    course!)
  • Instead we use the observed amplitudes Fobs,
    and the model phases, αcalc.

Hybrid back transform
Hybrid maps show places where the current model
is wrong and needs to be changed.
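
For a single reflection, building a hybrid F amounts to keeping the phase of the model structure factor and replacing its amplitude with the observed one. A minimal sketch, with a hypothetical function name and hypothetical values:

```python
import cmath

def hybrid_coefficient(f_obs_amp, f_calc):
    """Combine the observed amplitude with the phase of the model structure factor."""
    alpha_calc = cmath.phase(f_calc)
    return cmath.rect(f_obs_amp, alpha_calc)

# Hypothetical reflection: observed amplitude 12.0, model gives Fcalc = 3 + 4i
f_calc = complex(3.0, 4.0)
f_hybrid = hybrid_coefficient(12.0, f_calc)
print(f_hybrid)
```

The back transform of these coefficients over all reflections gives the hybrid map.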
48
The free R-factor: cross-validation
The free R-factor is the test set residual,
calculated the same as the R-factor, but on the
test set. Free R-factor asks: how well does
your model predict the data it hasn't seen?
49
Why cross-validate?
If you have three points, you can fit them to a
quadratic equation (3 parameters) with zero
residual, but is it right?
observed
calculated
R-factor 0.000!!
50
Fitting and overfitting
Fit is correct if additional data, not used in
fitting the curve, fall on the curve. Low
residual in the test set justifies the fit.
residual?0
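
The three-point example can be made concrete. The sketch below fits both a quadratic (3 parameters) and a straight line (2 parameters) to three hypothetical noisy observations, then checks each against a fourth point that was held back. The data values are invented for illustration.

```python
def fit_quadratic(points):
    """Exact quadratic through three points (Lagrange form); returns a callable."""
    (x0, y0), (x1, y1), (x2, y2) = points
    def f(x):
        return (y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
                + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
                + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1)))
    return f

def fit_line(points):
    """Least-squares straight line; returns a callable."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    slope = (sum((x - mx) * (y - my) for x, y in points)
             / sum((x - mx) ** 2 for x, _ in points))
    return lambda x: my + slope * (x - mx)

def residual(model, points):
    """Sum of absolute differences between observed and calculated values."""
    return sum(abs(y - model(x)) for x, y in points)

working_set = [(0.0, 0.0), (1.0, 2.0), (2.0, 2.0)]   # hypothetical noisy observations
test_set = [(3.0, 3.0)]                               # held back, never used in fitting

quad = fit_quadratic(working_set)
line = fit_line(working_set)
for name, model in (("quadratic", quad), ("line", line)):
    print(name, round(residual(model, working_set), 3), round(residual(model, test_set), 3))
```

The quadratic has zero residual on the working set but misses the test point by 3, while the line has a non-zero working residual and misses the test point by only 1/3.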
51
cross-validation
Measuring the residual on data (the test set)
that were not used to create the model.
The residual on test data is likely to be small
if the ratio of data to parameters is large.
a line has 2 parameters
52
parameters versus data
Example from Drenth, Ch. 13: The papain crystal
structure has 25,000 reflections. Papain has 2000
non-H atoms times 4 parameters each (x, y, z,
B) equals 8000 parameters.
data/parameters = 25,000/8000 ≈ 3 <-- this is too small!
53
restraints are data
Bond lengths, angles, etc. are measurements
that must be fit by the model. The true
residual should include deviations from ideal
bond lengths, angles, etc. In practice, residual
in restraints (e.g. deviations from ideal bond
lengths, angles) is very low. This means that
restraints are essentially constraints.
bond lengths
bond angles
van der Waals
torsion angles
planar groups
54
constraints reduce the number of parameters
Bond lengths, angles, and planar groups may be
fixed to their ideal values during refinement
(Torsion angle refinement). Using
constraints, Ser has 3 parameters, Phe 4, and Arg
6.
bond lengths
bond angles
There are an average 3.5 torsion angles per
residue. Papain has 700 torsion angle
parameters. ∴ data/parameter = 25,000/700 ≈ 35
planar groups
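
The papain arithmetic, with free atoms and with torsion angle constraints, can be checked directly. The counts are the ones quoted on the slides; the variable names are illustrative.

```python
reflections = 25_000            # papain data set, from the slide
atoms = 2_000                   # non-H atoms
params_free = atoms * 4         # x, y, z, B per atom
params_torsion = 700            # torsion angle parameters, from the slide

ratio_free = reflections / params_free
ratio_torsion = reflections / params_torsion
print(f"free atoms:     data/parameters = {ratio_free:.1f}")
print(f"torsion angles: data/parameters = {ratio_torsion:.1f}")
```

Constraining the geometry raises the ratio from about 3 to about 35, roughly a tenfold gain.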
55
radius of convergence
total residual
parameter space
...How far away from the truth can it be, and
still find the truth?
Radius of convergence depends on data and method.
More data = fewer false (local) minima.
Better method = one that can overcome local minima.
56
Molecular dynamics w/ Xray refinement
MD samples conformational space while maintaining
good geometry (low residual in restraints).
E = (residual of restraints) + (R-factor)
dE/dxi is calculated for each atom i, then we move i
downhill. Random vectors added, proportional to
temperature T.
The simulated annealing MD method:
(1) start the simulation hot
(2) cool slowly, trapping structure in lowest minimum.
X-plor Axel Brünger et al
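
The idea can be illustrated on a toy one-dimensional "total residual" with one false minimum and one true minimum. This is not X-plor's algorithm: the energy function, the exponential cooling schedule, and the kick size scaled by the square root of T (a Langevin-style choice) are assumptions made for this sketch.

```python
import math
import random

def energy(x):
    """Toy 1D residual: false minimum near x = +1, true minimum near x = -1."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def gradient(x):
    """dE/dx for the toy energy."""
    return 4.0 * x * (x * x - 1.0) + 0.3

def anneal(x, t_start=1.0, t_end=0.001, steps=5000, step_size=0.01, seed=0):
    """Move downhill along -dE/dx, add random kicks scaled by temperature, cool slowly."""
    rng = random.Random(seed)
    best = x
    for n in range(steps):
        # Exponential cooling from t_start to t_end (an assumed schedule)
        t = t_start * (t_end / t_start) ** (n / (steps - 1))
        kick = math.sqrt(2.0 * step_size * t) * rng.gauss(0.0, 1.0)
        x += -step_size * gradient(x) + kick
        if energy(x) < energy(best):
            best = x
    return x, best

# Start in the false minimum
final_x, best_x = anneal(x=1.0)
print(f"final x = {final_x:.2f}, lowest-energy x visited = {best_x:.2f}")
```

Plain gradient descent started at x = +1 would stay there. The hot phase is what lets the trajectory cross the barrier, although whether a given run ends in the true minimum depends on the schedule and the random seed.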
57
Phase bias, and how to fix it.
The model biases the phases. Phase bias is
localized. To remove bias, we must remove the
wrong parts of the model. An OMIT MAP is
calculated. The phases for an omit map are
derived from a partial model, where some small
part has been omitted.
58
Omit maps
This residue has been removed before calculating
Fc.
2Fo-Fc density = Fo + (Fo - Fc): the native map
plus the difference map.
59
Two inhibitor peptides bound to thrombin. The
inhibitors were omitted from the Fc calculation.
(stereo images)
Féthière et al., Protein Science (1996), 5, 1174-1183.
60
The final model
Other data commonly reported: total unique
reflections, completeness, free R-factor