High Throughput Computing and Protein Structure - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
High Throughput Computing and Protein Structure
Stephen E. Hamby
2
Overview
  • Introduction To Protein Structure
  • Dihedral Angles
  • Previous Work
  • Support Vector Regression
  • Optimisation
  • Prediction
  • Results
  • Conclusions

3
Introduction To Protein Structure
  • Molecules with massive biological importance
  • Structure determination gives insight into function, dynamics,
    and potential drug targets
  • Experimental structure determination is expensive, slow, and
    difficult

4
Introduction To Protein Structure
  • Primary structure: order of amino acids
  • Secondary structure: building blocks
  • Tertiary structure: complete 3D structure
5
Introduction To Protein Structure
Secondary structure types
  • α-helix
  • β-sheet
  • Random coil
6
Dihedral Angles
7
Dihedral Angles
8
Dihedral Angles
Finding the secondary structure of a protein is a step towards
finding its complete structure.
Predicting dihedral angles can help us to get the secondary
structure.
How can we predict dihedral angles?
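For reference, a dihedral angle is the signed torsion about the bond joining the two middle atoms of four consecutive atoms. In a protein backbone, phi of residue i is defined by the atoms C(i-1), N(i), CA(i), C(i), and psi by N(i), CA(i), C(i), N(i+1). The sketch below computes such an angle from four 3D coordinates. It is a generic geometric illustration, not code from the talk, and the function name is hypothetical.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_degrees(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond.

    Hypothetical helper for illustration; inputs are (x, y, z) tuples.
    """
    u1 = _sub(p1, p0)
    u2 = _sub(p2, p1)
    u3 = _sub(p3, p2)
    n1 = _cross(u1, u2)  # normal to the plane through p0, p1, p2
    n2 = _cross(u2, u3)  # normal to the plane through p1, p2, p3
    y = math.sqrt(_dot(u2, u2)) * _dot(u1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Four points with the central bond along the z axis.
cis = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
quarter_turn = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

With these inputs the eclipsed (cis) arrangement gives an angle of 0 degrees and the quarter turn gives 90 degrees.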
9
Previous work
Destruct
  • Multiple neural networks
  • Iterative method
  • Predicts secondary structure and dihedral angles
10
Previous work
Real Spine
  • Twin neural networks give a consensus prediction
  • Predicts dihedral angles from various amino acid properties,
    amino acid composition, and predicted structure
11
Support Vector Regression
Kernel machine learning maps the data into a higher-dimensional
space so that a linear relationship can be found.
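A small illustration of this idea, assuming a degree-2 polynomial kernel on 2D inputs (chosen here only because its feature map is easy to write down; it is not taken from the talk): the kernel value computed in the original space equals an ordinary dot product in a 3D feature space, so a linear method in that space never needs the mapped vectors explicitly.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def phi(x):
    """Explicit degree-2 feature map for a 2D input (x1, x2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def poly2_kernel(x, z):
    """Homogeneous polynomial kernel of degree 2: (x . z) ** 2."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
in_input_space = poly2_kernel(x, z)      # computed from the 2D vectors
in_feature_space = dot(phi(x), phi(z))   # computed from the 3D images
```

Both quantities agree up to floating-point rounding, which is the property kernel methods rely on.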
12
Support Vector Regression
Attempts to fit a linear function to the data in a
high-dimensional feature space.
  • Accurate
  • But slow, needs optimisation, and is a black box
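Support vector regression is commonly formulated with an epsilon-insensitive loss: prediction errors smaller than a tolerance epsilon cost nothing, and larger errors are penalised linearly. The slides do not state the loss or the tolerance used, so the sketch below shows only the standard textbook form, with illustrative values.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Standard SVR loss: zero inside the epsilon tube, linear outside."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# Illustrative values only: an angle of 10 degrees with a 5 degree tolerance.
inside_tube = epsilon_insensitive_loss(10.0, 12.0, epsilon=5.0)
outside_tube = epsilon_insensitive_loss(10.0, 20.0, epsilon=5.0)
```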
13
Support Vector Regression
Kernel choice
We tested the kernels available through the PyML package: linear,
polynomial, and Gaussian. We tested them using the CASP4 dataset.
The Gaussian kernel produced the best results.
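The three kernel families can be written out in plain Python as below. This is a sketch of the usual textbook definitions and does not use or reproduce the PyML API; the parameter names and default values (degree, coef0, gamma) are assumptions made for illustration.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def linear_kernel(x, z):
    return dot(x, z)

def polynomial_kernel(x, z, degree=2, coef0=1.0):
    # degree and coef0 are illustrative defaults, not values from the talk
    return (dot(x, z) + coef0) ** degree

def gaussian_kernel(x, z, gamma=0.5):
    # gamma controls the kernel width; 0.5 is an arbitrary example value
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)

x, z = (1.0, 2.0), (3.0, 4.0)
values = {
    "linear": linear_kernel(x, z),
    "polynomial": polynomial_kernel(x, z),
    "gaussian": gaussian_kernel(x, z),
}
```

The Gaussian kernel depends only on the distance between the two inputs and equals 1 when they coincide.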
14
Optimisation
  • Three interdependent parameters
  • Grid-based optimisation on the CASP4 dataset
  • Around 10,000 three-hour jobs, run in blocks of 10 on Jupiter
  • Accuracy assessed using the Pearson correlation coefficient
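A minimal sketch of how such a grid could be enumerated and split into blocks of 10 jobs. The slides do not name the three parameters; for a Gaussian-kernel support vector regression they are often the cost C, the kernel width gamma, and the tolerance epsilon, which is the assumption made here. The grid values are hypothetical, and job submission is left out.

```python
import itertools

# Hypothetical grid values; the actual ranges are not given in the slides.
C_VALUES = [0.1, 1.0, 10.0]
GAMMA_VALUES = [0.01, 0.1, 1.0]
EPSILON_VALUES = [0.1, 0.5]

def parameter_grid(c_values, gamma_values, epsilon_values):
    """Every (C, gamma, epsilon) combination; one training job per point."""
    return list(itertools.product(c_values, gamma_values, epsilon_values))

def blocks(items, size):
    """Split a list of jobs into consecutive blocks of at most `size`."""
    return [items[i:i + size] for i in range(0, len(items), size)]

grid = parameter_grid(C_VALUES, GAMMA_VALUES, EPSILON_VALUES)
job_blocks = blocks(grid, 10)
```

Because every grid point is an independent job, the search parallelises naturally, which is what makes it a fit for high-throughput computing.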
15
Prediction
  • Support vector machine using a Gaussian kernel and optimal
    parameters
  • Trained on the CB513 dataset
  • Tested by 10-fold cross-validation
  • CASP4 used as a test set
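The 10-fold cross-validation protocol can be sketched as follows: the data are shuffled once, divided into ten folds, and each fold is held out in turn while the model is trained on the other nine. Splitting at the level of whole proteins (CB513 contains 513 sequences) and the fixed random seed are assumptions of this sketch, not details given in the slides.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Return k (train_indices, test_indices) pairs covering all samples."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # near-equal fold sizes
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits

# Assumption: one index per CB513 protein.
splits = k_fold_indices(513, k=10)
```

Each sample appears in exactly one test fold, so every prediction scored during cross-validation comes from a model that did not see that sample in training.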
16
Results
Results measured by cross-validation

Method            Pearson correlation coefficient
Destruct          0.42
Real Spine        0.62
SVM prediction    0.57

The CASP4 test set gives a Pearson correlation coefficient of 0.56.
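The Pearson correlation coefficient between predicted and observed values can be computed as in the sketch below. It treats the angles as plain numbers; how the talk handled the periodicity of angles is not stated, so that detail is left out. The function name is hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

perfect = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
inverse = pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

A value of 1 means the predictions rise in exact linear step with the observations, 0 means no linear relationship, and -1 an exact inverse relationship.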
17
Results
Using secondary structure predictions made by cascade-correlation
neural networks:
  • Dihedral prediction assisted by predicted structure gives a
    Pearson correlation coefficient of 0.582
  • Subsequent iterations should lead to better predictions of
    both structure and dihedral angles
18
What Next?
  • Using further iterations to improve accuracy
  • Current method is a black box
  • Can we use a program like Trepan to get some definite rules
    about secondary structure?
19
Conclusions
  • Dihedral Angles define protein secondary
    structure
  • Using Support Vector Machines it is possible to
    predict dihedral angles
  • We (hopefully!) can use predicted dihedral
    angles to improve the accuracy of secondary
    structure prediction.

20
Acknowledgements
  • Jonathan Hirst
  • Hirst group members
  • BBSRC
  • The University of Nottingham