TEXTAL™: Artificial Intelligence Techniques for Protein Structure Determination

IAAI 2003, Acapulco, Mexico. 08/01/03. Texas A&M University.

1
TEXTAL™: Artificial Intelligence Techniques for Protein Structure Determination
Kreshna Gopal, Reetal Pai, Thomas R. Ioerger, Tod D. Romo, James C. Sacchettini
Texas A&M University
IAAI 2003, Acapulco, Mexico
2
Overview

CAPRA
LOOKUP
Post processing Routines
Results
Discussion
TEXTAL Availability
Acknowledgements

3
CAPRA: Cα Pattern Recognition Algorithm

Pipeline stages: Map, Scale, Tracer, Calculate features
Scale: the input map is scaled so that density patterns can be compared between different maps
Tracer: the trace gives a connected skeleton of pseudo-atoms through the backbone and the side chains
Calculate features: a feature vector is calculated at each of the tracer pseudo-atoms
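
The scaling step can be illustrated with a minimal sketch. The slides do not say how TEXTAL scales a map, so the version below assumes a plain normalization to zero mean and unit standard deviation; the function name `scale_map` and the toy density values are hypothetical.

```python
from statistics import mean, pstdev


def scale_map(densities):
    """Rescale density values to zero mean and unit standard deviation.

    Assumption: z-score normalization stands in for the scaling step;
    the slides do not specify the actual scheme.
    """
    mu = mean(densities)
    sigma = pstdev(densities)
    if sigma == 0:
        raise ValueError("map has constant density; cannot scale")
    return [(d - mu) / sigma for d in densities]


# Two toy "maps" recorded on different scales become directly comparable.
map_a = [0.0, 1.0, 2.0, 3.0]
map_b = [0.0, 10.0, 20.0, 30.0]
scaled_a = scale_map(map_a)
scaled_b = scale_map(map_b)
```

After scaling, the two toy maps hold the same values up to rounding, which is the property needed before patterns from different maps are compared.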
4
CAPRA: Cα Pattern Recognition Algorithm

The neural network has
one input layer of 38 nodes
one hidden layer of 20 nodes
one output node

Trained network parameters
Neural Network

The neural network associates certain characteristics of the local density with an estimate of proximity to a true Cα
The neural network predicts the distances of pseudo-atoms to true Cα atoms
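
A forward pass through a network of the stated shape (38 inputs, 20 hidden nodes, one output) might look like the sketch below. The layer sizes come from the slide; the sigmoid hidden activation, the linear output, and the random placeholder weights are assumptions, since the slides give neither the activation functions nor the trained parameters.

```python
import math
import random

N_INPUT, N_HIDDEN = 38, 20  # layer sizes stated on the slide


def init_network(seed=0):
    """Build placeholder parameters; a real network would load trained ones."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(N_INPUT)]
                for _ in range(N_HIDDEN)]
    b_hidden = [0.0] * N_HIDDEN
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)]
    return w_hidden, b_hidden, w_out, 0.0


def predict_distance(features, network):
    """Forward pass: 38 inputs -> 20 sigmoid hidden units -> 1 linear output."""
    if len(features) != N_INPUT:
        raise ValueError(f"expected {N_INPUT} features, got {len(features)}")
    w_hidden, b_hidden, w_out, b_out = network
    hidden = [
        1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, features)) + b)))
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


network = init_network()
estimate = predict_distance([0.0] * N_INPUT, network)
```

With untrained weights the output is meaningless as a distance; the sketch only shows how one feature vector per pseudo-atom maps to one scalar estimate.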

5
CAPRA: Cα Pattern Recognition Algorithm

Selection of way-points
Select candidate Cα atoms from all the pseudo-atoms (trace)
Use the distance predictions from the neural network

Build chains
Link all the Cα atoms into linear chains
Integrate intuitive criteria
Identify the linearized sub-structure from the tracer graph

Output: Cα chains
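
One way to picture way-point selection is a greedy pass over the pseudo-atoms, sketched below. This is an assumption, not the published CAPRA procedure: pseudo-atoms are visited in order of increasing predicted distance, and a candidate is kept only if it is not too close to one already kept. The function name `select_waypoints` and the 2.5 Å default (chosen to sit below the roughly 3.8 Å spacing of consecutive Cα atoms) are hypothetical.

```python
import math


def select_waypoints(pseudo_atoms, predicted_dist, min_separation=2.5):
    """Greedily pick pseudo-atoms that are likely to be near a true C-alpha.

    pseudo_atoms   -- list of (x, y, z) coordinates from the trace
    predicted_dist -- predicted distance to a true C-alpha for each pseudo-atom
    min_separation -- hypothetical minimum spacing (in angstroms) between picks
    """
    order = sorted(range(len(pseudo_atoms)), key=lambda i: predicted_dist[i])
    chosen = []
    for i in order:
        far_enough = all(
            math.dist(pseudo_atoms[i], pseudo_atoms[j]) >= min_separation
            for j in chosen
        )
        if far_enough:
            chosen.append(i)
    return sorted(chosen)


# Five pseudo-atoms along a line, 1 angstrom apart.
trace = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0), (4, 0, 0)]
predictions = [0.2, 1.0, 1.5, 0.9, 0.1]
waypoints = select_waypoints(trace, predictions)
```

In this toy trace the two end points have the lowest predicted distances and every other pseudo-atom lies within 2.5 Å of one of them, so only indices 0 and 4 are kept.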
6
LOOKUP: The Core Pattern Matching Routine

Predicts the coordinates of side chain atoms given the location of the Cα atom
Features are used to find regions with similar patterns of density
The features are rotation invariant
Feature values are calculated at different radii
The database consists of feature vectors from regions within previously solved maps
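
To make "rotation-invariant features calculated at different radii" concrete, the sketch below uses the mean density inside a sphere as a stand-in feature. It depends only on the distance of each sample from the centre, so rotating the region leaves it unchanged. The feature choice, the names, and the radii are assumptions; the slides do not list the actual features.

```python
import math


def mean_density(points, center, radius):
    """Average density over the sample points within `radius` of `center`."""
    inside = [rho for xyz, rho in points if math.dist(xyz, center) <= radius]
    return sum(inside) / len(inside) if inside else 0.0


def feature_vector(points, center, radii=(3.0, 4.0, 5.0, 6.0)):
    """One feature value per radius (the radii here are hypothetical)."""
    return [mean_density(points, center, r) for r in radii]


origin = (0, 0, 0)
# Each sample is ((x, y, z), density).
region = [((1, 0, 0), 2.0), ((0, 2, 0), 1.0), ((0, 0, 5), 0.5)]
# The same region rotated 90 degrees about the z axis: (x, y, z) -> (-y, x, z).
rotated = [((-y, x, z), rho) for (x, y, z), rho in region]

original_features = feature_vector(region, origin)
rotated_features = feature_vector(rotated, origin)
```

Both feature vectors are identical, which is what allows a region to be matched against database regions that happen to be oriented differently.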

7
LOOKUP: The Core Pattern Matching Routine
TEXTAL uses a weighted Euclidean feature difference as its measure of similarity
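
A weighted Euclidean difference and a nearest-match search over a small database can be sketched as follows. The exact form used by TEXTAL is not given on the slide; the version here assumes each weight multiplies the squared difference of its feature, and the database entries are made-up labels and vectors.

```python
import math


def weighted_distance(f1, f2, weights):
    """Weighted Euclidean difference between two feature vectors."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(f1, f2, weights)))


def best_match(query, database, weights):
    """Return the (label, features) entry closest to the query."""
    return min(database,
               key=lambda entry: weighted_distance(query, entry[1], weights))


# Hypothetical two-feature database.
database = [("A", [1.0, 5.0]), ("B", [3.0, 0.0])]
query = [1.0, 0.0]

match_first = best_match(query, database, weights=[1.0, 0.0])
match_second = best_match(query, database, weights=[0.0, 1.0])
```

The same query matches entry A when only the first feature counts and entry B when only the second counts, which shows why the choice of weights matters.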
8
LOOKUP: SLIDER

SLIDER incrementally adjusts feature weights to increase matches and decrease mismatches
For each region, a match and a mismatch are found, and these 3-tuples (region, match, mismatch) are used to tune the weights
Weights can be optimized by finding a value between 0 and 1 that produces the most positive crossovers
Not guaranteed to find the globally optimal weight vector, only a local optimum
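
The weight tuning described above can be approximated by a coordinate-wise search, sketched below. This is an interpretation, not the SLIDER implementation: a "positive crossover" is read here as a triple in which the match ends up closer to the region than the mismatch, and each weight in turn is swept over a grid between 0 and 1. The starting value 0.5, the grid size, and the number of sweeps are hypothetical.

```python
def sq_dist(a, b, weights):
    """Weighted squared Euclidean difference between two feature vectors."""
    return sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights))


def count_correct(triples, weights):
    """Count triples whose match is closer to the region than the mismatch."""
    return sum(
        1 for region, match, mismatch in triples
        if sq_dist(region, match, weights) < sq_dist(region, mismatch, weights)
    )


def tune_weights(triples, n_features, steps=10, sweeps=2):
    """Adjust one weight at a time, keeping changes that fix more triples."""
    weights = [0.5] * n_features
    for _ in range(sweeps):
        for k in range(n_features):
            best_value, best_score = weights[k], count_correct(triples, weights)
            for s in range(steps + 1):
                trial = weights[:k] + [s / steps] + weights[k + 1:]
                score = count_correct(triples, trial)
                if score > best_score:
                    best_value, best_score = s / steps, score
            weights[k] = best_value
    return weights


# Toy data: feature 0 is informative, feature 1 is noise.
triples = [
    ((0.0, 0.0), (0.1, 5.0), (2.0, 0.0)),
    ((1.0, 1.0), (1.2, -3.0), (3.0, 1.0)),
]
tuned = tune_weights(triples, n_features=2)
```

On the toy data the search drives the weight of the noisy feature to zero, after which both triples are ranked correctly. Because each weight is adjusted with the others held fixed, the result is only a local optimum, in line with the caveat on the slide.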

9
Post Processing Routines

To correct errors in stereochemistry
Identify residues with backbone atoms in the wrong direction
Identify the correct backbone direction using a voting procedure
Re-invoke LOOKUP

Real Space Refinement
Move atoms slightly to optimize their fit to density
Preserve geometric constraints like bond distances and angles

Sequence Alignment
Corrects the identities of mislabeled amino acids
Errors in LOOKUP output are due to noise perturbing the local density
Errors are corrected if the predicted fragment can be fit into the actual sequence
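
The sequence alignment step can be illustrated with a deliberately simplified sketch. It assumes an ungapped fit scored by the number of identical residues, and it relabels the fragment only when at least half of its residues agree with the best placement. The function names, the 0.5 threshold, and the example sequences are hypothetical, and the alignment actually used by TEXTAL is not described on the slide.

```python
def best_ungapped_fit(predicted, sequence):
    """Return (offset, matches) for the best ungapped placement."""
    if len(predicted) > len(sequence):
        raise ValueError("fragment is longer than the sequence")
    best_offset, best_matches = 0, -1
    for offset in range(len(sequence) - len(predicted) + 1):
        window = sequence[offset:offset + len(predicted)]
        matches = sum(p == s for p, s in zip(predicted, window))
        if matches > best_matches:
            best_offset, best_matches = offset, matches
    return best_offset, best_matches


def correct_identities(predicted, sequence, min_identity=0.5):
    """Relabel a fragment from the sequence if it fits well enough."""
    offset, matches = best_ungapped_fit(predicted, sequence)
    if not predicted or matches / len(predicted) < min_identity:
        return predicted  # no convincing fit: leave the labels unchanged
    return sequence[offset:offset + len(predicted)]


sequence = "MKTAYIAKQR"   # made-up true sequence
predicted = "TAFIAK"      # fragment with one mislabeled residue (F for Y)
corrected = correct_identities(predicted, sequence)
```

Here the fragment fits at offset 2 with five of six residues identical, so the mislabeled residue is replaced by the one from the true sequence.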

10
TEXTAL™ Input
08/01/03
Texas A&M University
11
TEXTAL™ Output

Protein | No. of chains output | Length of longest chain | Mean length of output chains | % of structure built | Cα rms error (Å) | Mean residue density corr. | All-atom rms error (Å) | Side chain structural similarity
A2u-globulin | 2 | 88 | 68.5 | 85 | 0.85 | 0.84 | 0.99 | 48.9
Armadillo | 9 | 217 | 46.7 | 89 | 0.98 | 0.82 | N.A. | 43.7
Cyanase | 6 | 94 | 32.0 | 94 | 1.1 | 0.79 | 1.03 | 42.7
Gere | 2 | 44 | 30.5 | 90 | 0.85 | 0.83 | 1.00 | 30.0
GM-CSF | 4 | 46 | 25.0 | 82 | 0.91 | 0.84 | 0.94 | 28.9
Nsf-d2 | 6 | 79 | 39.5 | 92 | 0.96 | 0.83 | 1.13 | 33.5
Penicillopepsin | 13 | 58 | 25.0 | 91 | 1.13 | 0.78 | 1.09 | 41.9
Psd-95 | 8 | 58 | 31.8 | 94 | 1.00 | 0.82 | 1.04 | 34.7
Rab3a | 8 | 30 | 20.5 | 90 | 0.90 | 0.82 | 1.06 | 30.5
Rh-dehalogenase | 8 | 66 | 36.5 | 97 | 0.92 | 0.83 | 0.99 | 54.6
CzrA | 3 | 57 | 33.3 | 94 | 1.05 | 0.82 | 1.15 | 39.1
MVK | 10 | 58 | 28.7 | 88 | 0.83 | 0.82 | 1.00 | 44.5
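
For readers unfamiliar with the rms error columns, the sketch below computes a root-mean-square deviation between built and reference coordinates. It assumes the two atom lists are already paired one to one and expressed in the same frame; whether TEXTAL's evaluation applied any superposition first is not stated on the slide. The coordinates are toy values, not data from the table.

```python
import math


def rms_error(built, reference):
    """Root-mean-square distance between paired atom coordinates."""
    if len(built) != len(reference) or not built:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(b, r) ** 2 for b, r in zip(built, reference))
    return math.sqrt(total / len(built))


# Two toy atoms, each displaced by exactly 1 angstrom from its reference.
built = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.0), (1.0, 0.0, -1.0)]
error = rms_error(built, reference)
```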
12
CAPRA Output (CzrA)
13
LOOKUP Output (CzrA)
14
Discussion

TEXTAL™ has the potential to reduce one of the major bottlenecks in the way of high-throughput structural genomics

Future Work
Other feature comparison measures
Clustering of the database
Reducing redundancy in the database
Stitching CAPRA chains

15
TEXTAL™ Development