Title: Automated Model-Building with TEXTAL
1. Automated Model-Building with TEXTAL
- Thomas R. Ioerger
- Department of Computer Science
- Texas A&M University
2. Overview of TEXTAL
- Automated model-building program
- Can we automate the kind of visual processing of patterns that crystallographers use?
- Intelligent methods to interpret density, despite noise
- Exploit knowledge about typical protein structure
- Focus on medium-resolution maps
- optimized for 2.8 Å (actually, 2.6-3.2 Å is fine)
- typical for MAD data (useful for high-throughput)
- other programs exist for higher-res data (ARP/wARP)
[Diagram] Electron density map (not structure factors) -> TEXTAL -> Protein model (may need refinement)
3. Main Stages of TEXTAL
[Flowchart] electron density map -> CAPRA -> Cα chains -> LOOKUP -> model (initial coordinates) -> Post-processing routines -> model (final coordinates)
- LOOKUP: build in side-chain and main-chain atoms locally around each Cα
- Post-processing routines: example, real-space refinement
- Reciprocal-space refinement/DM
- Human Crystallographer (editing)
4. CAPRA: C-Alpha Pattern-Recognition Algorithm
tracing
Neural network estimates which pseudo-atoms are closest to true Cα's
linking
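The selection and linking steps named on this slide can be illustrated with a deliberately simplified sketch. It is not CAPRA's actual algorithm: the function names `pick_candidates` and `link_chain`, the 1.0 Å cutoff, and the 0.5 Å tolerance are all hypothetical, and the predicted distances stand in for the neural network's output. The only external fact used is that consecutive Cα atoms in a protein are typically about 3.8 Å apart.

```python
import numpy as np

CA_SPACING = 3.8  # Å, typical distance between consecutive Cα atoms


def pick_candidates(coords, predicted_dist, cutoff=1.0):
    """Keep pseudo-atoms whose predicted distance to a true Cα is below cutoff.

    predicted_dist is a stand-in for a neural network's output (assumption).
    """
    return coords[predicted_dist < cutoff]


def link_chain(cands, tol=0.5):
    """Greedily link candidates into one chain, starting from candidate 0.

    At each step, move to the unused candidate whose distance from the
    current chain end is closest to CA_SPACING (within tol).
    """
    if len(cands) == 0:
        return []
    chain, used = [0], {0}
    while True:
        d = np.linalg.norm(cands - cands[chain[-1]], axis=1)
        options = [i for i in range(len(cands))
                   if i not in used and abs(d[i] - CA_SPACING) <= tol]
        if not options:
            break
        nxt = min(options, key=lambda i: abs(d[i] - CA_SPACING))
        chain.append(nxt)
        used.add(nxt)
    return chain
```

A greedy walk like this cannot resolve branches or build several chains; it only shows how predicted Cα likelihood and expected spacing can be combined.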
5. Example of Cα-chains fit by CAPRA
Rat α2 urinary protein (P. Adams)
Data: 2.5 Å
MR map generated at 2.8 Å
Built: 84; chains: 2; lengths: 47, 88; RMSD: 0.82 Å
6. Stage 2: LOOKUP
- LOOKUP is based on Pattern Recognition
- Given a local (5 Å spherical) region of density, have we seen a pattern like this before (in another map)?
- If so, use similar atomic coordinates.
- Use a database of maps with known structures
- 200 proteins from PDB-Select (non-redundant)
- back-transformed (calculated) maps at 2.8 Å (no noise)
- regions centered on 50,000 Cα's
- Use feature extraction to match regions efficiently
- features (e.g. moments) represent local density patterns
- features must be rotation-invariant (independent of 3D orientation)
- use density correlation for more precise evaluation
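The two-step matching described above (a cheap feature comparison, then density correlation on the survivors) can be sketched as follows. This is a minimal illustration, not TEXTAL's code: the names `lookup` and `density_correlation`, the shortlist size `k`, and the use of Euclidean distance between feature vectors are assumptions. It also assumes the density regions are already in a common orientation.

```python
import numpy as np


def density_correlation(a, b):
    """Pearson correlation between two density regions of equal shape."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def lookup(query_feat, query_density, db_feats, db_densities, k=3):
    """Return the index of the best-matching database region.

    Stage 1: shortlist the k regions with the nearest feature vectors.
    Stage 2: rerank the shortlist by density correlation (more costly,
    but more precise).
    """
    dist = np.linalg.norm(db_feats - query_feat, axis=1)
    shortlist = np.argsort(dist)[:k]
    best = max(shortlist,
               key=lambda i: density_correlation(query_density, db_densities[i]))
    return int(best)
```

Once a match is found, the atomic coordinates stored with that database region would be the ones reused for the query region.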
7. Examples of Numeric Density Features
- Distance from center-of-sphere to center-of-mass
- Moments of inertia: relative dispersion along orthogonal axes
- Geometric features like "spoke angles"
- Local variance and other statistics
TEXTAL uses 19 distinct numeric features to represent the pattern of density in a region, each calculated over 4 different radii, for a total of 76 features.
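A few of the features listed above can be sketched on a gridded density region as follows. This is an illustrative subset only (five values per radius, not TEXTAL's 19), and the function names, the grid spacing, and the radii of 3, 4, 5 and 6 Å are assumptions. Each value depends only on distances and on eigenvalues of a second-moment matrix, so it does not change when the region is rotated about its center.

```python
import numpy as np


def region_features(density, center, radius, spacing=1.0):
    """Five rotation-invariant features of a spherical density region.

    density: 3D array; center: sphere center in Å; spacing: grid step in Å.
    """
    grid = np.indices(density.shape).reshape(3, -1).T * spacing
    rel = grid - np.asarray(center, dtype=float)
    inside = np.linalg.norm(rel, axis=1) <= radius
    rel = rel[inside]
    rho = density.reshape(-1)[inside]
    mass = np.clip(rho, 0.0, None)   # treat only positive density as mass
    total = mass.sum()               # assumes some positive density exists
    com = (mass[:, None] * rel).sum(axis=0) / total
    centered = rel - com
    # Density-weighted second moments about the center of mass; the
    # eigenvalues give the dispersion along three orthogonal principal axes.
    second = np.einsum("n,ni,nj->ij", mass, centered, centered) / total
    moments = np.linalg.eigvalsh(second)
    return np.array([np.linalg.norm(com), *moments, rho.var()])


def feature_vector(density, center, radii=(3.0, 4.0, 5.0, 6.0), spacing=1.0):
    """Concatenate the per-radius features into one vector."""
    return np.concatenate(
        [region_features(density, center, r, spacing) for r in radii])
```

Computing the same features at several radii lets the vector describe both the density near the center and the wider surroundings.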
8.
F = <1.72, -0.39, 1.04, 1.55, ...>
F = <1.58, 0.18, 1.09, -0.25, ...>
F = <0.90, 0.65, -1.40, 0.87, ...>
F = <1.79, -0.43, 0.88, 1.52, ...>
9. The LOOKUP Process
Find optimal rotation
Database of known maps
Region in map to be interpreted
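The "find optimal rotation" step can be illustrated with a coarse search that tries only the 24 axis-aligned rotations of a cubic region and keeps the one with the highest density correlation. This is a stand-in under stated assumptions: a real search would sample orientations much more finely, and the names `axis_aligned_rotations` and `best_rotation` are hypothetical.

```python
import numpy as np


def axis_aligned_rotations(region):
    """Yield the 24 axis-aligned rotations of a cubic 3D array."""
    def spins(a, axes):
        for k in range(4):
            yield np.rot90(a, k, axes=axes)

    yield from spins(region, (1, 2))
    yield from spins(np.rot90(region, 2, axes=(0, 2)), (1, 2))
    yield from spins(np.rot90(region, 1, axes=(0, 2)), (0, 1))
    yield from spins(np.rot90(region, -1, axes=(0, 2)), (0, 1))
    yield from spins(np.rot90(region, 1, axes=(0, 1)), (0, 2))
    yield from spins(np.rot90(region, -1, axes=(0, 1)), (0, 2))


def best_rotation(query, reference):
    """Rotate reference to maximize its correlation with query.

    Returns (best correlation, rotated reference array).
    """
    best_cc, best_rot = -np.inf, None
    for rotated in axis_aligned_rotations(reference):
        cc = np.corrcoef(query.ravel(), rotated.ravel())[0, 1]
        if cc > best_cc:
            best_cc, best_rot = cc, rotated
    return best_cc, best_rot
```

The rotation that wins for the density would also be applied to the database region's atomic coordinates before they are placed in the map being interpreted.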
10. Stage 3: Post-Processing
11. Interfaces for Using TEXTAL
- Stand-alone commands and scripts
- capra-scale prot.xplor prot-scaled.xplor
- neotex.sh myprotein > textal.log
- lots of intermediate files and logs
- WINTEX: Tcl/Tk interface
- creates jobs in sub-directories
- Public Release: July 2004
- http://textal.tamu.edu:12321
- Integrated into Phenix
- http://phenix-online.org
- Python module
- model-building tasks in GUI
12. Gallery of Examples
13. Conclusions
- Pattern recognition is a successful technique for macromolecular model-building
- Future directions
- building ligands, co-factors, etc.
- recognizing disulfide bridges
- phase improvement (iterating with refinement)
- loop-building
- further integration with Phenix
- Intelligent Agent-based methods for guiding/automating model-building
- interactive graphics for specialized needs (e.g. fixing chains, editing identities)
14. Acknowledgements
- Funding
- National Institutes of Health
- People
- James C. Sacchettini
- Kevin Childs, Kreshna Gopal, Lalji Kanbi, Erik McKee, Reetal Pai, Tod Romo
- Our association with the PHENIX group
- Paul Adams (Lawrence Berkeley National Lab)
- Randy Read (Cambridge University)
- Tom Terwilliger (Los Alamos National Lab)