Title: Bioinformatics
1Bioinformatics
- Richard Tseng and Ishwar Hosamani
2Outline
- Homology modeling (Ishwar)
- Structural analysis
- Structure prediction
- Structure comparisons
- Cluster analysis
- Partitioning method
- Density-based method
- Phylogenetic analysis
3Structural Analysis
- Overview
- Structure prediction
- Structural alignment
- Similarity
4- Tools for protein structure prediction
- Protein
- Secondary structure prediction: SSEA
- http://protein.cribi.unipd.it/ssea/
- Tertiary structure prediction
- Wurst: http://www.zbh.uni-hamburg.de/wurst/
- LOOPP: http://cbsuapps.tc.cornell.edu/loopp.aspx
5- WURST (Torda et al. (2004) Wurst: a protein threading server with a structural scoring function, sequence profiles and optimized substitution matrices. Nucleic Acids Res., 32, W532-W535)
- Rationale
- Alignment: Sequence-to-structure alignments are done with a Smith-Waterman style alignment and the Gotoh algorithm
- Score function: fragment-based sequence-to-structure compatibility score and a pure sequence-sequence component (substitution score)
- Library: Dali PDB90 (24599 structures)
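The alignment step can be illustrated with a minimal sketch of Smith-Waterman local alignment using Gotoh's affine-gap recurrences. This is not the WURST implementation: the match/mismatch scores and gap penalties below are assumed toy values standing in for WURST's structure-based score function, and the function name `smith_waterman_gotoh` is hypothetical.

```python
def smith_waterman_gotoh(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score of a and b with affine gap penalties.

    A gap of length k costs gap_open + (k - 1) * gap_extend.
    The scoring parameters are toy values, not those used by WURST.
    """
    neg_inf = float("-inf")
    rows, cols = len(a) + 1, len(b) + 1
    # H: best score of an alignment ending at (i, j)
    # E: best score ending at (i, j) with a gap that consumes b[j-1]
    # F: best score ending at (i, j) with a gap that consumes a[i-1]
    H = [[0] * cols for _ in range(rows)]
    E = [[neg_inf] * cols for _ in range(rows)]
    F = [[neg_inf] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            # Either open a new gap or extend an existing one (Gotoh)
            E[i][j] = max(H[i][j - 1] + gap_open, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open, F[i - 1][j] + gap_extend)
            score = match if a[i - 1] == b[j - 1] else mismatch
            # The 0 option restarts the alignment (local alignment)
            H[i][j] = max(0, H[i - 1][j - 1] + score, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


identical = smith_waterman_gotoh("ACGT", "ACGT")
unrelated = smith_waterman_gotoh("AAAA", "CCCC")
gapped = smith_waterman_gotoh("AAAAGGGG", "AAAATTGGGG")
```

In a threading setting the `score` term would come from the sequence-to-structure compatibility function rather than from a character comparison; the recurrences stay the same.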
6- Tools for structure comparison
- Pairwise structure comparison
- TopMatch
- Matras (http://biunit.naist.jp/matras/)
- Multiple structure comparison
- 3D-surfer
- Matras (http://biunit.naist.jp/matras/)
7- TopMatch (Sippl & Wiederstein (2008) A note on difficult structure alignment problems. Bioinformatics 24, 426-427)
- Rationale
- Structure alignment: http://www.cgl.ucsf.edu/home/meng/grpmt/structalign.html
- Similarity measurement
- Input format
- PDB, SCOP and CATH code
- PDB structure directly
- Exercise: http://topmatch.services.came.sbg.ac.at/
8- 3D-surfer (David La et al. (2009) 3D-SURFER: software for high throughput protein surface comparison and analysis. Bioinformatics, in press)
- Rationale
- Define a surface function
- Transform the surface function into a 3D Zernike description function
- Input format
- PDB and CATH code
- PDB structure directly
- Exercise: http://dragon.bio.purdue.edu/3d-surfer/
9Cluster analysis
- Goal
- Grouping the data into classes or clusters, so that objects within a cluster are highly similar to one another but very dissimilar to objects in other clusters.
- Methods
- Partitioning method: k-means
- Density-based method: Ordering Points To Identify the Clustering Structure (OPTICS)
10- k-means
- Rationale: Partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean
- Exercise: http://cgm.cs.ntust.edu.tw/etrex/kMeansClustering/kMeansClustering2.html
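The k-means rationale can be sketched as the usual alternation between assigning each observation to its nearest mean and recomputing the means (Lloyd's iteration). This is a minimal sketch under assumptions: the function name `k_means` is hypothetical, the first k points are used as initial centers to keep the run deterministic (real tools usually use random or k-means++ initialization), and the six 2D points are toy data.

```python
import math


def k_means(points, k, max_iterations=100):
    """Cluster points (tuples of floats) into k groups; returns (labels, centers)."""
    # Assumption: take the first k points as initial centers (deterministic)
    centers = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each observation goes to the nearest mean
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the means are stable
        labels = new_labels
        # Update step: each center becomes the mean of its members
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centers


toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = k_means(toy_points, k=2)
```

The result depends on the initial centers, so in practice the procedure is often repeated with several initializations and the partition with the smallest within-cluster spread is kept.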
11- OPTICS
- Rationale: Partition observations based on the density of similar objects
- Exercise: http://www.dbs.informatik.uni-muenchen.de/Forschung/KDD/Clustering/OPTICS/Demo/
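The density idea behind OPTICS can be sketched as follows: points are visited in an order that keeps dense neighborhoods together, and each point receives a reachability distance; valleys in the reachability plot correspond to clusters. This is a simplified O(n^2) sketch, not the reference implementation. The function name `optics`, the parameter values and the toy points are assumptions, and `min_pts` here counts the point itself.

```python
import math


def optics(points, eps, min_pts):
    """Return (order, reachability) for a list of coordinate tuples."""
    n = len(points)
    reach = [math.inf] * n
    processed = [False] * n
    order = []

    def neighbors(i):
        return [j for j in range(n)
                if j != i and math.dist(points[i], points[j]) <= eps]

    def core_distance(i, nbrs):
        # Undefined (None) unless at least min_pts points, including i,
        # lie within eps of point i
        if min_pts < 2 or len(nbrs) + 1 < min_pts:
            return None
        dists = sorted(math.dist(points[i], points[j]) for j in nbrs)
        return dists[min_pts - 2]

    def update(i, nbrs, core, seeds):
        for j in nbrs:
            if processed[j]:
                continue
            new_reach = max(core, math.dist(points[i], points[j]))
            if new_reach < reach[j]:
                reach[j] = new_reach
                seeds[j] = new_reach

    for start in range(n):
        if processed[start]:
            continue
        processed[start] = True
        order.append(start)
        nbrs = neighbors(start)
        core = core_distance(start, nbrs)
        if core is None:
            continue
        seeds = {}
        update(start, nbrs, core, seeds)
        while seeds:
            # Always expand the point that is currently easiest to reach
            q = min(seeds, key=seeds.get)
            del seeds[q]
            processed[q] = True
            order.append(q)
            q_nbrs = neighbors(q)
            q_core = core_distance(q, q_nbrs)
            if q_core is not None:
                update(q, q_nbrs, q_core, seeds)
    return order, reach


toy_points = [(0, 0), (0, 1), (1, 0), (1, 1),
              (10, 10), (10, 11), (11, 10), (11, 11)]
order, reach = optics(toy_points, eps=2.0, min_pts=3)
```

In the toy data the two groups of four points are visited one after the other, and the first point of each group keeps an infinite reachability distance, which marks the boundary between clusters.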
12- Example: Folding of Trp-cage peptide
13Phylogenetic analysis
- Overview
- Comparisons of more than two sequences
- Analysis of gene families, including functional predictions
- Estimation of evolutionary relationships among organisms
14- Theoretical tree
- Parsimony method
- Distance matrix method
- Maximum likelihood and Bayesian method
- Invariants method
15- Software
- Collections of tools
- http://evolution.genetics.washington.edu/phylip/software.html
- A web server version for tree construction and display
- PHYLIP, http://bioweb2.pasteur.fr/phylogeny/intro-en.html
- Interactive tree of life, http://itol.embl.de/
- Most commonly used stand-alone software
- PHYLIP, a tool for evaluating similarity of nucleotide and amino acid sequences.
- http://evolution.gs.washington.edu/phylip.html
- TreeView, a tool for visualization and manipulation of family trees.
- http://taxonomy.zoology.gla.ac.uk/rod/treeview.html
- Matlab - bioinformatics toolbox
16- Example: Alignment-based phylogenetic tree of Tubulin family
- Searching homologous sequences of Tubulin (PDB code 1JFF) from RCSB Protein Data Bank
- BLAST for pairwise sequence alignment
- ClustalW for comparative sequence alignment
- Evaluating protein distance matrix
- using Protdist of PHYLIP (in particular, the Point Accepted Mutation (PAM) matrix is used)
- Clustering proteins using Neighbor of PHYLIP (the Neighbor-Joining method is used)
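The clustering step, going from a distance matrix to a tree, can be sketched with a minimal neighbor-joining routine. This is an illustration only, not the code of PHYLIP's Neighbor program: the function name `neighbor_joining`, the internal node labels and the five-taxon toy matrix are assumptions.

```python
def neighbor_joining(names, matrix):
    """Build a tree from a symmetric distance matrix by neighbor joining.

    Returns a list of joins (left, right, new_node, left_len, right_len).
    The last entry connects the two remaining nodes with a single branch
    (new_node is None and right_len is 0.0).
    """
    nodes = list(names)
    d = {a: {b: float(matrix[i][j]) for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all others
        r = {a: sum(d[a][b] for b in nodes) for a in nodes}
        # Pick the pair that minimizes the Q criterion
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                q = (n - 2) * d[a][b] - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        u = "node%d" % len(joins)  # hypothetical label for the new internal node
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        joins.append((a, b, u, len_a, len_b))
        # Distances from the new node to every remaining node
        d[u] = {u: 0.0}
        for c in nodes:
            if c not in (a, b):
                d[u][c] = d[c][u] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [u]
    a, b = nodes
    joins.append((a, b, None, d[a][b], 0.0))
    return joins


# Toy distance matrix for five taxa (not data from this presentation)
toy_names = ["a", "b", "c", "d", "e"]
toy_matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
joins = neighbor_joining(toy_names, toy_matrix)
```

The same function could be applied to a Protdist-style matrix by passing the sequence identifiers as `names` and the pairwise distances as `matrix`.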
17- Example: n-distance phylogenetic tree
- Evaluating n-distance matrix
- n-distance method
- Clustering proteins using Neighbor of PHYLIP (the Neighbor-Joining method is used)
- 16S and 18S ribosomal RNA sequences of 35 organisms
18Summary
- Homology modeling
- Tools for structure prediction and comparisons
- Tools for phylogenetic tree construction
Thanks for your attention!!
20
          1Z5V_A    3CB2_A    1JFF_B    1FFX_B    1TUB_B    1Z2B_B
1Z5V_A    0         0.000010  1.349411  1.349411  1.303115  1.345634
3CB2_A    0.000010  0         1.350506  1.350506  1.303115  1.346730
1JFF_B    1.349411  1.350506  0         0.000010  0.000010  0.010729
1FFX_B    1.349411  1.350506  0.000010  0         0.000010  0.010729
1TUB_B    1.303115  1.303115  0.000010  0.000010  0         0.006725
1Z2B_B    1.345634  1.346730  0.010729  0.010729  0.006725  0
22- n-distance method
- Frequency count of n-letter words
- n-distance matrix
- Advantage
- Identify fully conserved words located at nearly the same sites
- Efficient
MREIVHIQAGQCGNQIGAKFWEVISDEHGIDPTGSYHGDSDLQLERINVYYNE
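The two steps of the n-distance method, counting n-letter words and deriving a distance matrix from the counts, can be sketched as below. The slides do not give the distance formula, so this sketch assumes the sum of absolute differences between normalized word frequencies; the function names and the second, mutated sequence are hypothetical.

```python
from collections import Counter


def word_frequencies(sequence, n):
    """Relative frequency of every overlapping n-letter word in a sequence."""
    if len(sequence) < n:
        raise ValueError("sequence is shorter than the word length")
    words = [sequence[i:i + n] for i in range(len(sequence) - n + 1)]
    counts = Counter(words)
    return {word: count / len(words) for word, count in counts.items()}


def n_distance(seq_a, seq_b, n):
    """Assumed distance: L1 difference between the two n-word frequency profiles."""
    freq_a = word_frequencies(seq_a, n)
    freq_b = word_frequencies(seq_b, n)
    return sum(abs(freq_a.get(w, 0.0) - freq_b.get(w, 0.0))
               for w in set(freq_a) | set(freq_b))


def n_distance_matrix(sequences, n):
    """Symmetric matrix of pairwise n-distances."""
    return [[n_distance(a, b, n) for b in sequences] for a in sequences]


# Fragment shown on the slide, plus a hypothetical variant for comparison
fragment = "MREIVHIQAGQCGNQIGAKFWEVISDEHGIDPTGSYHGDSDLQLERINVYYNE"
variant = fragment.replace("GNQIG", "GNAIG")  # assumed toy mutation
matrix = n_distance_matrix([fragment, variant], n=3)
```

No alignment is needed at any point, which is where the efficiency comes from: each sequence is scanned once to build its word profile, and profiles are compared directly.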