Bioinformatics - PowerPoint PPT Presentation

About This Presentation
Title:

Bioinformatics

Description:

Bioinformatics NSF Summer School 2003 Z. Luthey-Schulten, UIUC – PowerPoint PPT presentation

Slides: 58
Provided by: ZaidaLuth
Learn more at: http://www.ks.uiuc.edu
Transcript and Presenter's Notes


1
Bioinformatics, NSF Summer School 2003
Z. Luthey-Schulten, UIUC
2
Sequence-Sequence Alignment
  • Smith-Waterman
  • Needleman-Wunsch

Sequence-Structure Alignment
  • Threading
  • Hidden Markov Models

3
Sequence Alignment Dynamic Programming
number of possible alignments
Seq. 1: a1 a2 a3 - - a4 a5 ... an
Seq. 2: c1 - c2 c3 c4 c5 - cm
Smith-Waterman alignment algorithm
Score Matrix H Traceback
AWGHE AW--HE
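To give a feel for the number of possible alignments: one standard estimate for two sequences of equal length n is the central binomial coefficient C(2n, n), which Stirling's formula approximates as 2^(2n)/sqrt(pi n). The sketch below assumes this is the count meant on the slide; the function names are illustrative only. The rapid growth is why dynamic programming is used instead of enumeration.

```python
import math


def count_alignments(n: int) -> int:
    """Exact central binomial coefficient C(2n, n)."""
    return math.comb(2 * n, n)


def count_alignments_stirling(n: int) -> float:
    """Stirling approximation 2^(2n) / sqrt(pi * n)."""
    return 4.0 ** n / math.sqrt(math.pi * n)


if __name__ == "__main__":
    # Print exact and approximate counts for a few sequence lengths.
    for n in (5, 10, 50, 100):
        print(n, count_alignments(n), f"{count_alignments_stirling(n):.3e}")
```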
4
Smith-Waterman Local Alignment Score Matrix
AWGHE AW--HE
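A minimal Python sketch of Smith-Waterman local alignment, showing the score matrix H and the traceback. It is illustrative only: the match, mismatch, and linear gap values are assumptions standing in for a real substitution matrix and gap model, and the function name is hypothetical.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for one best local alignment.

    Scoring values are illustrative assumptions, not a BLOSUM matrix.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # first row/column stay at 0
    best, best_pos = 0, (0, 0)

    # Fill: each cell is the best of restart (0), diagonal, up, or left.
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Traceback: start at the highest score, stop at the first zero.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("HEAGAWGHEE", "PAWHEAE"))
```

When several local alignments tie for the best score, this sketch reports only the first one it encounters while filling the matrix.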
5
Blosum 40 Substitution Matrix
6
Protein Structural Relationships
Can protein structural relationships help us to
understand evolutionary dynamics? Is there a
connection between evolutionary events and
changes in protein structure? What is the
effect of gene duplication, horizontal gene
transfer, and other evolutionary mechanisms on
protein shape?
Substitution
Indel
Domain Insertion
O'Donoghue and Luthey-Schulten, UIUC 2003
7
Sequence Alignment Dynamic Programming
number of possible alignments
Seq. 1: a1 a2 a3 - - a4 a5 ... an
Seq. 2: c1 - c2 c3 c4 c5 - cm
Needleman-Wunsch alignment algorithm
Score Matrix H Traceback
Tutorial Wd
8
Needleman-Wunsch Global Alignment
Similarity Values
Initialization of Gap Penalties
http://www.dkfz-heidelberg.de/tbi/bioinfo/PracticalSection/AliApplet/index.html
9
Filling out the Score Matrix H
http://www.dkfz-heidelberg.de/tbi/bioinfo/PracticalSection/AliApplet/index.html
10
Traceback and Alignment
The Alignment
Traceback (blue) from optimal score
http://www.dkfz-heidelberg.de/tbi/bioinfo/PracticalSection/AliApplet/index.html
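The three Needleman-Wunsch steps on the preceding slides (initialization of gap penalties, filling out the score matrix H, and traceback from the optimal score) can be sketched in Python as follows. The similarity values and the linear gap penalty are assumed for illustration and are not taken from the applet; the function name is hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment.

    Similarity values and the linear gap penalty are illustrative assumptions.
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]

    # Initialization: leading gaps accumulate the gap penalty.
    for i in range(1, rows):
        H[i][0] = i * gap
    for j in range(1, cols):
        H[0][j] = j * gap

    # Fill: best of diagonal (match/mismatch), up (gap in b), left (gap in a).
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(H[i - 1][j - 1] + s,
                          H[i - 1][j] + gap,
                          H[i][j - 1] + gap)

    # Traceback: from the bottom-right corner back to the origin.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and H[i][j] == H[i - 1][j - 1] + s:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return H[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

Unlike the local variant, the matrix is never reset to zero and the traceback always spans both sequences end to end.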
11
Energy Landscape Theory of Structure Prediction
12
Protein Structure Prediction
1-D protein sequence
3-D protein structure
Ab Initio protein folding
SISSIRVKSKRIQLG.
Sequence Alignment
SISSRVKSKRIQLGLNQAELAQKV------GTTQ
QFANEFKVRRIKLGYTQTNVGEALAAVHGS
Target protein of unknown structure
Homologous/analogous protein of known structure
Sequence Alignment and the Energy Function
$E = E_{\text{match}} + E_{\text{gap}}$
13
Threading: Sequence-Structure Alignment
Scaffold structure
Target sequence
threading alignment between target and scaffold
Figure labels: A1, A3, A2, A4, A5
Threading Energy Function
R. Goldstein, Z. Luthey-Schulten, P. Wolynes (1992, PNAS)
14
Gap Penalties
Distribution of Gaps
Insertion
Deletion
Sequence-Structure Gap Energy
target
scaffold
Bulge
R. Goldstein, Z. Luthey-Schulten, P. Wolynes (1994) Proc. 27th Annu. Hawaii Int. Conf. Sys. Sci.: 306.
15
Similarity Measures
Sequence Identity: fraction of identically matched residues
Q (Structural Identity): fraction of native contacts
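Both measures can be sketched in a few lines of Python. The definitions below are assumptions chosen for illustration: sequence identity is taken over columns where both sequences have a residue, and Q uses a simple distance cutoff (8 Å between residue positions, at least 3 residues apart in sequence) to define a contact. Other definitions of both quantities are in use, and all function names and default values here are hypothetical.

```python
import math


def sequence_identity(aligned_a, aligned_b):
    """Identical columns divided by columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def contacts(coords, cutoff=8.0, min_sep=3):
    """Set of residue index pairs (i, j) closer than `cutoff` (assumed value)."""
    found = set()
    for i in range(len(coords)):
        for j in range(i + min_sep, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                found.add((i, j))
    return found


def q_native(native_coords, model_coords, cutoff=8.0, min_sep=3):
    """Fraction of native contacts that are also present in the model."""
    native = contacts(native_coords, cutoff, min_sep)
    if not native:
        return 0.0
    model = contacts(model_coords, cutoff, min_sep)
    return len(native & model) / len(native)


if __name__ == "__main__":
    print(sequence_identity("ACGT", "AGGT"))
    native = [(2.0 * k, 0.0, 0.0) for k in range(5)]
    model = [(2.5 * k, 0.0, 0.0) for k in range(5)]
    print(q_native(native, model))
```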
16
[Figure labels: <?Es/?E>, 2?, ?Es]
17
Homology Modeling - Threading
18
Results from CASP5 CM/FR
The prediction is never better than the
scaffold. Threading Energy function requires
improvement.
19
You are now entering the twilight zone of
sequence identity. We need profiles!
Watch for Bioinformants!!!
20
Profiles Evolution Revisited
  • "What molecular sequences taught us in the 1960s
    was that the genealogical history of an organism
    is written to one extent or another into the
    sequences of each of its genes, an insight that
    became the central tenet of a new discipline,
    molecular evolution."
  • Woese (PNAS, 2000)
    Pauling (1965)

21
Universal Tree
The Universal Phylogenetic Tree inferred from
comparative analyses of rRNA sequences
Woese (PNAS, 1990)
22
Horizontal Gene Transfer
O'Donoghue and Luthey-Schulten, UIUC 2003
23
Multiple Sequence Alignments
  • "The aminoacyl-tRNA synthetases, perhaps better
    than any other molecules in the cell, epitomize
    the current situation and help to understand
    (the effects) of HGT." Woese (PNAS, 2000; MMBR
    2000)

24
Standard Dogma: Molecular Biology
  • DNA → RNA → Proteins
  • Role of AARS?
  • Charging of tRNA

25
NCBI 3D
26
LeuRS Canonical Tree
Woese, Olsen (UIUC), Ibba (Panum Inst.), Soll (Yale), Micro. Mol. Biol. Rev., March 2000.
27
D,N Sequence Phylogenetic Trees
Woese, Olsen (UIUC), Ibba (Panum Inst.), Soll (Yale), Micro. Mol. Biol. Rev., March 2000.
28
Fold Motifs of AARSs
O'Donoghue and Luthey-Schulten, UIUC 2003
29
Structure Conserved More than Sequence
Structural Overlap of Class II AARS
Conserved helices
Conserved sheets
30
Subset of Class II Structural Tree
O'Donoghue and Luthey-Schulten, UIUC 2003
32
Novel Evolutionary Connections from Sequence and
Structure
Canonical Pattern D E F L W Y
Canonical Pattern A B I H P M
Gemini K1 K2 C S G N Q
Basal Canonical V T A R
B
A
No canonical pattern: horizontal transfer after B-AE split.
O'Donoghue, Luthey-Schulten, UIUC 2003
Woese, Olsen (UIUC), Ibba (Panum Inst.), Soll (Yale), Micro. Mol. Biol. Rev., March 2000.
33
Gap Distribution Functions
Spatial Gap Distribution Function
Length Gap Distribution Function
l, gap length (residues)
rij, spatial gap distance (Å)
B. Qian & R. Goldstein (2001) Proteins 45: 102.
34
Structural Alignment Methods
  • PDB - Structural Neighbors CE (Bourne)
  • STAMP - Russell

35
Multiple Structural Alignments
  • STAMP
  • Initial Alignment
  • Multiple Sequence alignment
  • Rigid Body Scan
  • Refine Initial Alignment & Produce Multiple
    Structural Alignment
  • Dynamic Programming (Smith-Waterman) through P
    matrix gives optimal set of equivalent residues.
  • This set is used to re-superpose the two chains.
    Then iterate until alignment score is unchanged.
  • This procedure is performed for all pairs.

R. Russell, G. Barton (1992) Proteins 14: 309.
36
Multiple Structural Alignments
  • STAMP cont'd
  • Refine Initial Alignment & Produce Multiple
    Structural Alignment

Alignment score
  • Multiple Alignment
  • Create a dendrogram using the alignment score.
  • Successively align groups of proteins (from
    branch tips to root).
  • When 2 or more sequences are in a group,
    average coordinates are used.
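The dendrogram step can be illustrated with a small Python sketch that clusters structures from a matrix of pairwise alignment scores and records the order in which groups would be aligned, from branch tips to root. Average linkage is an assumption made here for simplicity and is not claimed to be STAMP's own clustering rule; the function name and the example scores are hypothetical, and the coordinate averaging within a group is not shown.

```python
from itertools import combinations


def build_dendrogram(names, scores):
    """Agglomerative clustering on pairwise scores (higher = more similar).

    Returns (tree, merges): the tree as nested tuples, and the list of
    merges in the order a progressive alignment would perform them.
    Average linkage is an illustrative assumption.
    """
    # Each cluster is (subtree, indices of the member structures).
    clusters = [(name, [i]) for i, name in enumerate(names)]
    merges = []

    def avg_score(m1, m2):
        return sum(scores[i][j] for i in m1 for j in m2) / (len(m1) * len(m2))

    while len(clusters) > 1:
        # Pick the pair of clusters with the highest average pairwise score.
        x, y = max(combinations(range(len(clusters)), 2),
                   key=lambda p: avg_score(clusters[p[0]][1], clusters[p[1]][1]))
        (tx, mx), (ty, my) = clusters[x], clusters[y]
        merges.append((tx, ty))
        merged = ((tx, ty), mx + my)
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return clusters[0][0], merges


if __name__ == "__main__":
    names = ["A", "B", "C", "D"]
    scores = [[0, 9, 2, 3],
              [9, 0, 1, 2],
              [2, 1, 0, 8],
              [3, 2, 8, 0]]
    tree, merges = build_dendrogram(names, scores)
    print(tree)
    print(merges)
```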

37
Stamp Output/Secondary Structure
O'Donoghue and Luthey-Schulten, UIUC 2003
38
Stamp Output/Clustal Format
O'Donoghue and Luthey-Schulten, UIUC 2003
39
Examples of Useful Web Tools
  • Genomes Sequence and Gene Information
  • Domain Architecture
  • Multiple Sequence Alignments
  • Phylogenetic Trees
  • Structural Databases
  • Hidden Markov Methods

40
NCBI Genomes
41
Charging the tRNA
Woese, Olsen (UIUC), Ibba (Panum Inst.), Soll (Yale), Micro. Mol. Biol. Rev., March 2000.
42
NCBI 3D
43
Report from SWISS-PROT
44
PFAM Report
46
Sequence Dendrogram from Clustal
Luthey-Schulten, UIUC 2003
47
Phylogenetic Tree in Tutorial
Pogorelov and Luthey-Schulten, UIUC 2003
49
Alignment in MOE
50
Alignment in MOE
51
Transmembrane Proteins - HMM
Example: Bacteriorhodopsin (Anurag Sethi, UIUC)
52
Stamp Profile
Sethi and Luthey-Schulten, UIUC 2003
54
HMMer Profile-Profile Alignment
Sethi and Luthey-Schulten, UIUC 2003
55
Clustal Profile-Profile Alignment
Sethi and Luthey-Schulten, 2003
56
Structure Prediction: Modeller 6.2/Hmmer
Sethi and Luthey-Schulten, UIUC 2003
Modeller 6.2: A. Sali et al.
57
Acknowledgements
  • Felix Autenrieth
  • Barry Isralewitz
  • Patrick O'Donoghue
  • Taras Pogorelov
  • Anurag Sethi