TEXTAL™: Artificial Intelligence Techniques for Protein Structure Determination

R. Ioerger, Tod D. Romo, James C. Sacchettini. IAAI 2003, Acapulco, Mexico. 08/01/03. Texas A&M University.

1
TEXTAL™: Artificial Intelligence Techniques for
Protein Structure Determination
Kreshna Gopal, Reetal Pai, Thomas R. Ioerger,
Tod D. Romo, James C. Sacchettini
Texas A&M University
IAAI 2003, Acapulco, Mexico
2
Overview
  • CAPRA
  • LOOKUP
  • Post processing Routines
  • Results
  • Discussion
  • TEXTAL Availability
  • Acknowledgements

3
CAPRA: Cα Pattern Recognition Algorithm
  • Scale the input map to enable comparison of
    patterns between different maps
  • The trace gives a connected skeleton of
    pseudo-atoms through the backbone and the side
    chains
  • Feature vectors are calculated at each of the
    tracer atoms

Diagram labels: Map, Scale, Tracer, Calculate features
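
The scaling step could be sketched as follows. The slides do not say which scaling is applied, so the z-score normalization used here, the flat-list map layout, and the name `scale_map` are all assumptions made for illustration.

```python
from statistics import fmean, pstdev

def scale_map(density):
    """Rescale density values to zero mean and unit standard deviation.

    `density` is a flat list of grid-point values (hypothetical layout).
    """
    mu = fmean(density)
    sigma = pstdev(density, mu)
    if sigma == 0.0:
        raise ValueError("map is flat; cannot scale")
    return [(rho - mu) / sigma for rho in density]

# Two maps on different absolute scales become directly comparable.
map_a = [0.1, 0.4, 0.9, 0.2]
map_b = [10.0, 40.0, 90.0, 20.0]
scaled_a = scale_map(map_a)
scaled_b = scale_map(map_b)
```

After scaling, the two example maps hold the same values up to rounding, which is the property that lets patterns from different maps be compared.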
4
CAPRA: Cα Pattern Recognition Algorithm
  • The neural network has
      • one input layer of 38 nodes
      • one hidden layer of 20 nodes
      • one output node
  • The neural network associates certain
    characteristics of the local density with an
    estimate of proximity to a true Cα
  • The neural network predicts the distances of
    pseudo-atoms to true Cαs

Diagram labels: Trained network parameters, Neural Network
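
A minimal sketch of a forward pass through a network of this shape is given below. Only the layer sizes (38, 20, 1) come from the slide; the sigmoid hidden units, linear output, bias terms, and random placeholder weights are assumptions, and `make_network` and `predict_distance` are hypothetical names.

```python
import math
import random

N_INPUT, N_HIDDEN = 38, 20  # layer sizes from the slide

def make_network(seed=0):
    """Random placeholder weights; a real network would load trained parameters."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.1, 0.1) for _ in range(N_INPUT + 1)]  # last entry is the bias
              for _ in range(N_HIDDEN)]
    output = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN + 1)]  # last entry is the bias
    return hidden, output

def predict_distance(features, network):
    """Forward pass: 38 features -> 20 sigmoid hidden units -> 1 linear output."""
    if len(features) != N_INPUT:
        raise ValueError(f"expected {N_INPUT} features, got {len(features)}")
    hidden, output = network
    activations = []
    for weights in hidden:
        # zip stops after the 38 feature weights, so the bias is added separately.
        z = weights[-1] + sum(w * x for w, x in zip(weights, features))
        activations.append(1.0 / (1.0 + math.exp(-z)))
    return output[-1] + sum(w * a for w, a in zip(output, activations))

network = make_network()
estimate = predict_distance([0.5] * N_INPUT, network)
```

With untrained weights the estimate is meaningless; the point is only the data flow from one feature vector to a single predicted distance.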

5
CAPRA: Cα Pattern Recognition Algorithm
Selection of way-points
  • Select candidate Cαs from all the pseudo-atoms
    (trace)
  • Use the distance predictions from the neural
    network

Build chains
  • Link all the Cαs into linear chains
  • Integrate intuitive criteria
  • Identify the linearized sub-structure from the
    tracer graph

Diagram output: Cα chains
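
One way to picture the selection of way-points is a greedy pick over the trace, sketched below. The slides do not describe the selection rule, so the greedy strategy, the `min_spacing` value, and the name `select_candidates` are assumptions; chain building is not covered here.

```python
import math

def select_candidates(pseudo_atoms, min_spacing=2.5):
    """Greedy pick: best-predicted atoms first, skipping any too close to a pick.

    `pseudo_atoms` is a list of ((x, y, z), predicted_distance) pairs.
    `min_spacing` (in Å) is an illustrative value, not TEXTAL's setting.
    """
    ranked = sorted(pseudo_atoms, key=lambda atom: atom[1])
    chosen = []
    for position, _ in ranked:
        if all(math.dist(position, other) >= min_spacing for other in chosen):
            chosen.append(position)
    return chosen

trace = [
    ((0.0, 0.0, 0.0), 0.3),
    ((1.0, 0.0, 0.0), 1.2),   # close to the first atom and a worse prediction
    ((3.8, 0.0, 0.0), 0.4),
    ((7.6, 0.0, 0.0), 0.2),
]
candidates = select_candidates(trace)
```

In this toy trace the atom at x = 1.0 is dropped because a better-scoring atom lies 1 Å away, while the three atoms spaced 3.8 Å apart are kept.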
6
LOOKUP: The Core Pattern Matching Routine
  • Predicts coordinates of side-chain atoms given
    the location of a Cα atom
  • Features are used to find regions with similar
    patterns of density
  • Uses rotation-invariant features
  • Feature values are calculated at different radii
  • The database consists of feature vectors from
    regions within previously solved maps
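
To illustrate what a rotation-invariant feature computed at several radii can look like, the sketch below uses the mean and standard deviation of density inside spheres around a center. The slides do not list the actual features, so these two statistics, the radii, and the name `radial_features` are assumptions.

```python
import math
from statistics import fmean, pstdev

def radial_features(center, grid_points, radii=(3.0, 4.0, 5.0, 6.0)):
    """Mean and standard deviation of density inside spheres around `center`.

    `grid_points` is a list of ((x, y, z), density) pairs. Both statistics
    depend only on distance from the center, so rotating the region about
    the center leaves them unchanged. The radii are illustrative values.
    """
    features = []
    for radius in radii:
        inside = [rho for pos, rho in grid_points if math.dist(pos, center) <= radius]
        features.append(fmean(inside) if inside else 0.0)
        features.append(pstdev(inside) if inside else 0.0)
    return features

region = [((1.0, 0.0, 0.0), 0.8), ((0.0, 2.0, 0.0), 0.5), ((0.0, 0.0, 3.5), 0.1)]
# Rotate every point by 90 degrees about the z axis: (x, y, z) -> (-y, x, z).
rotated = [((-y, x, z), rho) for (x, y, z), rho in region]
origin = (0.0, 0.0, 0.0)
original_features = radial_features(origin, region)
rotated_features = radial_features(origin, rotated)
```

The rotated copy of the region yields the same feature vector as the original, which is what allows a region to be matched against database entries regardless of orientation.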

7
LOOKUP: The Core Pattern Matching Routine
TEXTAL uses a weighted Euclidean feature difference
as its measure of similarity
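
A weighted Euclidean difference and a nearest-match search over a small database might be sketched like this. The formula is the standard sqrt(sum of w_i * (a_i - b_i)^2); whether TEXTAL applies the weights in exactly this form is not stated on the slide, and the database layout and function names are hypothetical.

```python
import math

def weighted_distance(u, v, weights):
    """Weighted Euclidean difference between two feature vectors."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, u, v)))

def best_matches(query, database, weights, k=1):
    """Return the k database entries whose features are closest to `query`.

    `database` maps a region label to its feature vector (hypothetical layout).
    """
    ranked = sorted(database, key=lambda name: weighted_distance(query, database[name], weights))
    return ranked[:k]

database = {
    "region_A": [0.9, 0.2, 0.4],
    "region_B": [0.1, 0.8, 0.4],
    "region_C": [0.5, 0.5, 0.9],
}
weights = [1.0, 0.5, 0.0]   # a zero weight switches a feature off
closest = best_matches([0.8, 0.3, 0.0], database, weights)
```

Because the third feature has zero weight, the query's mismatch on that feature does not affect the ranking.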
8
LOOKUP: SLIDER
  • SLIDER incrementally adjusts feature weights to
    increase matches and decrease mismatches
  • For each region, a match and a mismatch are
    found, and these 3-tuples (region, match,
    mismatch) are used to tune the weights
  • Weights can be optimized by finding the value
    between 0 and 1 that produces the most positive
    crossovers
  • Not guaranteed to find the globally optimal weight
    vector, only a local optimum
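
The idea of tuning weights from (region, match, mismatch) tuples can be sketched with a simple coordinate-wise grid search. This is a stand-in, not the crossover-based procedure the slide refers to: it tries each weight on a grid over [0, 1] and keeps the value that ranks the most matches ahead of their mismatches. The function names and the grid resolution are assumptions.

```python
def squared_distance(u, v, weights):
    """Weighted squared Euclidean difference between two feature vectors."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, u, v))

def count_correct(tuples, weights):
    """Number of (region, match, mismatch) tuples ranking the match closer."""
    return sum(
        squared_distance(region, match, weights) < squared_distance(region, mismatch, weights)
        for region, match, mismatch in tuples
    )

def tune_weights(tuples, weights, steps=10):
    """One pass of coordinate-wise tuning over a grid on [0, 1] per weight."""
    weights = list(weights)
    for i in range(len(weights)):
        best_value, best_score = weights[i], count_correct(tuples, weights)
        for step in range(steps + 1):
            weights[i] = step / steps
            score = count_correct(tuples, weights)
            if score > best_score:
                best_value, best_score = weights[i], score
        weights[i] = best_value
    return weights

# Feature 0 separates matches from mismatches; feature 1 is misleading.
tuples = [
    ([0.5, 0.1], [0.5, 0.9], [0.9, 0.1]),
    ([0.2, 0.7], [0.25, 0.0], [0.8, 0.7]),
]
start = [0.5, 0.5]
tuned = tune_weights(tuples, start)
```

Like the procedure described on the slide, a search that adjusts one weight at a time can stop at a local optimum rather than the globally best weight vector.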

9
Post Processing Routines
  • To correct errors in stereochemistry
      • Identify residues with backbone atoms in the
        wrong direction
      • Identify the correct backbone direction using
        a voting procedure
      • Re-invoke LOOKUP
  • Real Space Refinement
      • Move atoms slightly to optimize their fit to
        the density
      • Preserve geometric constraints such as bond
        distances and angles
  • Sequence Alignment
      • Corrects the identities of mislabeled amino
        acids
      • Errors in LOOKUP output arise from noise
        perturbing the local density
      • Errors are corrected if the predicted fragment
        can be fit into the actual sequence
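
The sequence-alignment correction can be illustrated with a deliberately simplified sketch: slide the predicted fragment along the known sequence without gaps, and adopt the true residues when enough positions agree. The slides do not describe the scoring or gap handling TEXTAL uses, so the identity count, the `min_identity` threshold, and both function names are assumptions.

```python
def best_offset(predicted, sequence):
    """Slide `predicted` along `sequence` without gaps; return (offset, identities)
    for the placement with the most identical residues."""
    best = (0, -1)
    for offset in range(len(sequence) - len(predicted) + 1):
        window = sequence[offset:offset + len(predicted)]
        identities = sum(p == s for p, s in zip(predicted, window))
        if identities > best[1]:
            best = (offset, identities)
    return best

def correct_fragment(predicted, sequence, min_identity=0.5):
    """Replace predicted residue identities with the true ones when the fragment
    fits the sequence well enough; otherwise leave the prediction unchanged.
    `min_identity` is an illustrative threshold."""
    if not predicted or len(predicted) > len(sequence):
        return predicted
    offset, identities = best_offset(predicted, sequence)
    if identities / len(predicted) >= min_identity:
        return sequence[offset:offset + len(predicted)]
    return predicted

sequence = "MKTAYIAKQRQISFVKSHFSRQ"   # made-up sequence for the example
predicted = "AKQRNISFL"               # two residues mislabeled
corrected = correct_fragment(predicted, sequence)
```

Here the fragment fits at offset 6 with 7 of 9 residues identical, so the two mislabeled positions are replaced by the residues from the sequence.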

10
TEXTAL™ Input
08/01/03
Texas A&M University
11
TEXTAL™ Output

                  No. of   Length of  Mean length  % of       Cα rms  Mean residue  All-atom   Side chain
                  chains   longest    of output    structure  error   density       rms error  structural
Protein           output   chain      chains       built      (Å)     corr.         (Å)        similarity (%)
A2u-globulin      2        88         68.5         85         0.85    0.84          0.99       48.9
Armadillo         9        217        46.7         89         0.98    0.82          N.A.       43.7
Cyanase           6        94         32.0         94         1.1     0.79          1.03       42.7
Gere              2        44         30.5         90         0.85    0.83          1.00       30.0
GM-CSF            4        46         25.0         82         0.91    0.84          0.94       28.9
Nsf-d2            6        79         39.5         92         0.96    0.83          1.13       33.5
Penicillopepsin   13       58         25.0         91         1.13    0.78          1.09       41.9
Psd-95            8        58         31.8         94         1.00    0.82          1.04       34.7
Rab3a             8        30         20.5         90         0.90    0.82          1.06       30.5
Rh-dehalogenase   8        66         36.5         97         0.92    0.83          0.99       54.6
CzrA              3        57         33.3         94         1.05    0.82          1.15       39.1
MVK               10       58         28.7         88         0.83    0.82          1.00       44.5
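
For readers unfamiliar with the rms error columns, the sketch below computes a root-mean-square distance between paired atom positions. It assumes the built and reference atoms are already paired one to one; how TEXTAL pairs atoms with the reference structure is not stated in the slides, and `rms_error` is a hypothetical name.

```python
import math

def rms_error(built, reference):
    """Root-mean-square distance (in Å) between paired atom positions."""
    if len(built) != len(reference) or not built:
        raise ValueError("need two equal-length, non-empty coordinate lists")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(built, reference))
    return math.sqrt(total / len(built))

# Two atoms, each displaced by 1 Å from its reference position.
built = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
reference = [(0.0, 1.0, 0.0), (3.8, 0.0, 1.0)]
error = rms_error(built, reference)
```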
12
CAPRA Output (CzrA)
13
LOOKUP Output (CzrA)
14
Discussion
  • TEXTAL™ has the potential to reduce one of the
    major bottlenecks in high-throughput structural
    genomics
  • Future Work
      • Other feature comparison measures
      • Clustering of the database
      • Reducing redundancy in the database
      • Stitching CAPRA chains

15
TEXTAL™ Development
  • Access to TEXTAL™ through
    http://textal.tamu.edu:12321
  • PHENIX: Python-based Hierarchical ENvironment
    for Integrated Xtallography
  • PHENIX Members
      • Berkeley Lab
      • University of Cambridge
      • Los Alamos National Laboratory
      • Texas A&M University
  • The alpha release of PHENIX is now available

16
Acknowledgements
  • National Institutes of Health
  • TEXTAL™ PIs
      • Dr. Thomas R. Ioerger
      • Dr. James C. Sacchettini
  • TEXTAL™ Development Team
      • Kevin Childs
      • Kreshna Gopal
      • Reetal Pai
      • Tod D. Romo
      • Vinod Reddy Melapudi