Title: Structural Bioinformatics Workshop
1Structural Bioinformatics Workshop
- Max Shatsky
- Email: maxshats_at_post.tau.ac.il
- Workshop home page: http://bioinfo3d.cs.tau.ac.il/
- (Follow Courses link)
2Schedule
- Introduction
- Introduction to protein structure.
- Introduction to pattern matching.
- Protein structure alignment (comparison).
- Rigid/Flexible case.
- Protein Docking
- Rigid/Flexible case.
- GAMB library.
3Grade Ingredients
- GAMB exercise
- Presentation and Design Review
- Final Project
- Software Engineering
- Efficiency of Solution
- Working Examples and Test Cases
- Documentation
- Knowledge of all project aspects
4Bioinformatics - Computational Genomics
- DNA mapping.
- Protein or DNA sequence comparisons.
- Exploration of huge textual databases.
- In essence, one-dimensional methods and
intuition.
5Structural Bioinformatics - Structural Genomics
- Elucidation of the 3D structures of biomolecules.
- Analysis and comparison of biomolecular
structures.
- Prediction of biomolecular recognition.
- Handles three-dimensional (3-D) structures.
- Geometric Computing (a methodology shared by
Computational Geometry, Computer Vision, Computer
Graphics, Pattern Recognition, etc.).
6Protein Structural Comparison
Pseudoazurin - 1pmy
ApoAmicyanin - 1aaj
7Algorithmic Solution
About 1 sec. Fischer, Nussinov, Wolfson 1990.
8Multiple Structural Comparison: Globins
9FlexProt: Flexible Protein Alignment
10FlexProt: Flexible Protein Alignment
http://bioinfo3d.math.tau.ac.il/FlexProt
11Example: Trypsin/Trypsin inhibitor
Figure from B. Honig's lab web site at Columbia
University.
12Introduction to Protein Structure
13The central dogma
- DNA --> mRNA --> Protein
- Alphabets: A,C,G,T --> A,C,G,U --> A,D,...,Y
- Base pairs: Guanine-Cytosine, Thymine-Adenine (T --> U in mRNA)
- Two 4-letter alphabets (DNA, mRNA) --> one 20-letter
alphabet (protein)
- Sequence of nucleic acids --> sequence of
amino acids
14When genes are expressed, the genetic information
(base sequence) on DNA is first transcribed
(copied) to a molecule of messenger RNA in a
process similar to DNA replication. The mRNA
molecules then leave the cell nucleus and enter
the cytoplasm, where triplets of bases (codons)
forming the genetic code specify the particular
amino acids that make up an individual
protein. This process, called translation, is
accomplished by ribosomes (cellular components
composed of proteins and another class of RNA)
that read the genetic code from the mRNA, and
transfer RNAs (tRNAs) that transport amino
acids to the ribosomes for attachment to the
growing protein. (From
www.ornl.gov/hgmis/publicat/primer/)
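The two steps described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: it assumes the input is the DNA coding strand (so transcription amounts to replacing T with U), that the reading frame starts at the first base, and that translation stops at the first stop codon. The names `transcribe`, `translate` and `CODON_TABLE` are made up for this example; the table itself is the standard genetic code.

```python
# Standard genetic code, listed in the usual U, C, A, G order of codon positions.
# '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def transcribe(coding_strand):
    """Return the mRNA for a DNA coding strand (same letters, T replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Read codons from position 0 until a stop codon or the end of the sequence."""
    protein = []
    for pos in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":  # stop codon: the protein chain ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTGGTAA")
print(mrna, translate(mrna))  # prints: AUGGCCUGGUAA MAW
```

The 4-letter alphabet is mapped to the 20-letter one three bases at a time, which is why the table has 64 entries.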
15Amino acids and the peptide bond
Cβ: first side-chain carbon (except for glycine).
17Wire-frame or ribbons display
19Geometric Representation
3-D curve $\{v_i\},\ i = 1, \dots, n$
21Secondary structure
22β strands and sheets
25The Holy Grail - Protein Folding
- From Sequence to Structure.
- Folding under even relatively primitive computational
models has been proved NP-complete, even in the 2-D
case.
26Determination of protein structures
- X-ray Crystallography
- NMR (Nuclear Magnetic Resonance)
- EM (Electron microscopy)
27An NMR result is an ensemble of models
28The Protein Data Bank (PDB)
- International repository of 3D molecular data.
- Contains x-y-z coordinates of all atoms of the
molecule and additional data.
- http://pdb.tau.ac.il
- http://www.rcsb.org/pdb/
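As a rough illustration of how the x-y-z coordinates are read from a PDB file, the sketch below extracts the Cα atoms from ATOM records, giving the kind of 3-D curve used as a geometric representation of the backbone. It relies on the fixed-column layout of the PDB format (record name in columns 1-6, atom name in columns 13-16, x, y, z in columns 31-38, 39-46, 47-54). The helper `make_atom_line` and the coordinates are made up for the example and are not taken from a real entry.

```python
def make_atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Hypothetical helper: format one fixed-column ATOM record with made-up data."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f" % (
        serial, name, res_name, chain, res_seq, x, y, z)


def parse_ca_trace(pdb_lines):
    """Return the C-alpha coordinates, in file order, as a list of (x, y, z) tuples."""
    trace = []
    for line in pdb_lines:
        # Columns are 1-based in the format description, so column 13 is index 12.
        if line.startswith("ATOM  ") and line[12:16].strip() == "CA":
            trace.append((float(line[30:38]), float(line[38:46]), float(line[46:54])))
    return trace


sample = [
    make_atom_line(1, " N", "MET", "A", 1, 0.0, 1.0, 2.0),
    make_atom_line(2, " CA", "MET", "A", 1, 1.0, 2.0, 3.0),
    make_atom_line(3, " N", "ALA", "A", 2, 3.5, 2.0, 3.0),
    make_atom_line(4, " CA", "ALA", "A", 2, 4.5, 2.0, 3.0),
]
print(parse_ca_trace(sample))  # prints: [(1.0, 2.0, 3.0), (4.5, 2.0, 3.0)]
```

A real file would be read with `open(path)` and passed line by line to `parse_ca_trace`; handling of alternate locations, multiple chains and NMR model ensembles is left out of this sketch.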
31Why bother with structures when we have sequences?
- In evolutionarily related proteins, structure is
much better preserved than sequence.
- Structural motifs may predict similar
biological function.
- Getting insight into protein folding.
- Recovering the limited (?) number of protein
folds.
32Applications
- Classification of protein databases by structure.
- Search of partial and disconnected structural
patterns in large databases.
- Extracting structural information is difficult; we
want to extract new folds.
33Applications (continued)
- Speed up of drug discovery.
- Detection of structural pharmacophores in an
ensemble of drugs (pharmacophore: similar
substructures in drugs acting on a given receptor).
- Comparison and detection of drug receptor active
sites (structurally similar receptor cavities
could bind similar drugs).
34Structural Bioinformatics Lab Goals
Development of state-of-the-art algorithmic
methods to tackle major computational tasks
in protein structure analysis, biomolecular
recognition, and computer-assisted drug
design. Establishment of truly interdisciplinary
collaboration between the life and computer
sciences.
35Object Recognition
36Geometric Task
Given two configurations of points in
three-dimensional space,
find the rotations and translations of one
of the point sets that produce large
superimpositions of corresponding 3-D
points.
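One way to make "large superimposition" concrete is to score a candidate rotation and translation by the number of points it brings close to the other set. The sketch below shows only this scoring step, under assumed conventions (a 3x3 row-major rotation matrix, a tolerance `eps`, and a count that does not enforce one-to-one pairing); the search over candidate transformations is not shown, and the function names are illustrative.

```python
import math


def apply_rigid(point, rotation, translation):
    """Return rotation * point + translation for a 3x3 row-major rotation matrix."""
    return tuple(
        sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )


def count_matches(set_a, set_b, rotation, translation, eps):
    """Count points of set_a that land within eps of some point of set_b."""
    matched = 0
    for a in set_a:
        moved = apply_rigid(a, rotation, translation)
        if any(math.dist(moved, b) <= eps for b in set_b):
            matched += 1
    return matched


set_a = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (9, 9, 9)]
set_b = [(5, 5, 5), (5, 6, 5), (3, 5, 5), (0, 0, 0)]

quarter_turn_z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]  # 90 degrees about the z axis
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(count_matches(set_a, set_b, quarter_turn_z, (5, 5, 5), eps=0.1))  # prints: 3
print(count_matches(set_a, set_b, identity, (0, 0, 0), eps=0.1))        # prints: 1
```

The quarter turn followed by the shift superimposes three of the four points, while leaving the set in place matches only one, so the first transformation is the better candidate.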
37Geometric Task (continued)
- Aspects
- Object representation (points, vectors, segments)
- Object resemblance (distance function)
- Transformation (translations, rotations, scaling)
38Transformations
- Translation
- Translation and Rotation: Rigid Motion (Euclidean Trans.)
- Translation, Rotation and Scaling
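The difference between these classes can be checked numerically: a rigid motion keeps all pairwise distances, while adding a scale factor multiplies them. The following sketch assumes a rotation about the z axis and the form `scale * R * x + t`; the function names and the sample values are illustrative only.

```python
import math


def rotation_z(angle):
    """3x3 row-major matrix for a rotation by `angle` radians about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def transform(point, rotation, translation, scale=1.0):
    """Return scale * (rotation * point) + translation."""
    return tuple(
        scale * sum(rotation[r][c] * point[c] for c in range(3)) + translation[r]
        for r in range(3)
    )


p, q = (1.0, 0.0, 0.0), (0.0, 2.0, 1.0)
rot = rotation_z(math.pi / 3)
shift = (4.0, -1.0, 2.5)

rigid = [transform(x, rot, shift) for x in (p, q)]
scaled = [transform(x, rot, shift, scale=2.0) for x in (p, q)]

# The first two distances agree up to rounding; the third is twice as large.
print(math.dist(p, q), math.dist(*rigid), math.dist(*scaled))
```

For comparing protein structures, the rigid case is the natural one, since scaling would change interatomic distances.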
39Distance Functions
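- Two point sets: $A = \{a_i\},\ i = 1, \dots, n$
and $B = \{b_j\},\ j = 1, \dots, m$
- Pairwise correspondence:
$(a_{k_1}, b_{t_1}), (a_{k_2}, b_{t_2}), \dots, (a_{k_N}, b_{t_N})$
(1) Exact matching: $\|a_{k_i} - b_{t_i}\| = 0$
(2) RMSD (Root Mean Square Distance):
$\sqrt{\sum_{i=1}^{N} \|a_{k_i} - b_{t_i}\|^2 / N} < \varepsilon$
- Hausdorff distance: $h(A,B) = \max_{a \in A} \min_{b \in B} \|a - b\|$,
$H(A,B) = \max(h(A,B), h(B,A))$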
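Both distance functions translate directly into code. The sketch below is a plain, unoptimized version for illustration: `rmsd` takes an already chosen pairwise correspondence, and the Hausdorff functions compare whole point sets in quadratic time. Function names and sample points are made up for the example.

```python
import math


def rmsd(pairs):
    """Root mean square distance over a list of matched (a, b) point pairs."""
    return math.sqrt(sum(math.dist(a, b) ** 2 for a, b in pairs) / len(pairs))


def directed_hausdorff(set_a, set_b):
    """h(A, B): the largest distance from a point of A to its nearest point of B."""
    return max(min(math.dist(a, b) for b in set_b) for a in set_a)


def hausdorff(set_a, set_b):
    """H(A, B) = max(h(A, B), h(B, A))."""
    return max(directed_hausdorff(set_a, set_b), directed_hausdorff(set_b, set_a))


A = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
B = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 3.0, 0.0)]

print(directed_hausdorff(A, B), directed_hausdorff(B, A), hausdorff(A, B))
# prints: 0.0 3.0 3.0

pairs = [((0.0, 0.0, 0.0), (0.0, 0.0, 1.0)), ((1.0, 0.0, 0.0), (1.0, 0.0, 1.0))]
print(rmsd(pairs))  # prints: 1.0
```

The example shows why the directed distance is not symmetric: every point of A has an exact partner in B, but B contains one point that is 3 units away from anything in A.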