Protein Evolution, Coevolution and Interaction Networks Day 2 - PowerPoint PPT Presentation

Provided by: matteope
Transcript and Presenter's Notes

1
Protein Evolution, Co-evolution and Interaction
Networks (Day 2)
  • Matteo Pellegrini
  • Rosetta Inpharmatics

2
Identifying The Components of Cellular Pathways
and Protein Complexes using Co-evolution
3
Proteins are Components of Molecular Machines
Hartwell LH, Hopfield JJ, Leibler S, Murray AW.
From molecular to modular cell biology. Nature.
1999 Dec 2;402(6761 Suppl):C47-52.
4
Techniques to Study Protein Interactions
Protein Interactions
5
Bacterial Diversity
  • 150 fully sequenced genomes in Genbank
  • 30,000 species represented in Genbank
  • Sea may support 2,000,000
  • Soil may support 4,000,000

T.P. Curtis, W.T. Sloan, and J.W. Scannell.
2002. Estimating prokaryotic diversity and its
limits. Proc Natl Acad Sci USA 99:10494-10499.
6
The Study of the Co-Evolution of Non-Homologous
Proteins
  • Because selection generally acts to maintain or
    delete entire complexes and pathways, pairs of
    proteins that are part of these will appear to
    co-evolve across bacteria
  • By studying the co-evolution of non-homologous
    proteins across these bacteria we attempt to
    reconstruct the components of complexes and
    pathways

7
Methods to Infer Co-evolution
Method: Basis
  • Phylogenetic Profile: pairs of genes that are
    always present or absent together in genomes
  • Rosetta Stone: pairs of proteins that are fused
    in some organism
  • Gene Neighbor: pairs of genes that are coded
    nearby in multiple organisms
  • Gene Cluster: gene proximity within genome
8
Phylogenetic Profile
Pellegrini M, Marcotte EM, Thompson MJ, Eisenberg
D, Yeates TO. Assigning protein functions by
comparative genome analysis: protein phylogenetic
profiles. Proc Natl Acad Sci U S A.
96(8):4285-8, 1999.
9
Flagellar Proteins Phylogenetic Profiles
10
Hypergeometric Function
Symbols shown on the slide: m, n, k, N
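The transcript keeps only the symbols from this slide, not the formula. As a minimal sketch, assuming N is the number of genomes, n and m are the numbers of genomes containing each of two genes, and k is the number of genomes containing both, the hypergeometric distribution gives the chance of seeing at least k co-occurrences if the two genes were distributed independently. The function names and the example profiles below are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, N, n, m):
    """Probability of exactly k shared genomes under independence."""
    return comb(n, k) * comb(N - n, m - k) / comb(N, m)


def cooccurrence_pvalue(profile_a, profile_b):
    """Probability of at least the observed number of shared genomes."""
    N = len(profile_a)                      # total genomes
    n = sum(profile_a)                      # genomes containing gene a
    m = sum(profile_b)                      # genomes containing gene b
    k = sum(a and b for a, b in zip(profile_a, profile_b))  # shared genomes
    return sum(hypergeom_pmf(j, N, n, m) for j in range(k, min(n, m) + 1))


# Hypothetical presence/absence profiles across 8 genomes (1 = gene present).
gene_a = [1, 1, 1, 1, 0, 0, 0, 0]
gene_b = [1, 1, 1, 1, 0, 0, 0, 0]  # identical to gene_a
gene_c = [1, 0, 1, 0, 1, 0, 1, 0]  # shares 2 genomes with gene_a

print(cooccurrence_pvalue(gene_a, gene_b))  # 1/70
print(cooccurrence_pvalue(gene_a, gene_c))  # 53/70
```

A small value suggests the two genes are present and absent together more often than chance would explain, which is the signal the phylogenetic profile method looks for.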
11
Gene Neighbor Method
Pellegrini M, Thompson MJ, Fierro J, Bowers P. A
Computational Method to Assign Microbial Genes to
Pathways. Journal of Cellular Biochemistry Suppl
37:106-9, 2001.
12
Linking Dihydrofolate Reductase and Thymidylate
Synthase
13
Gene Neighbor Probability
14
Rosetta Stone Method Identifies Protein Fusions
  • Monomeric proteins that are found fused in
    another organism are likely to be functionally
    related and physically interacting.

Marcotte EM, Pellegrini M, Ng HL, Rice DW, Yeates
TO, Eisenberg D. Detecting protein function and
protein-protein interactions from genome
sequences. Science 285(5428):751-3, 1999.
15
Rosetta Stone Probability
Protein i has m homologs
Protein j has n homologs
K Rosetta Stone fusion proteins
16
Gene Cluster
Figure labels: genes, genomic DNA
17
Tryptophan Operon
Genes, in the order shown: yciG, trpA, trpB, trpC,
trpD, trpE, trpL, yciV
P-values, in the order shown: P=0.67, P=0.53,
P<0.01, P=0.09, P<0.01, P<0.01, P=0.91
Here, a p-value threshold of 0.1 captures all but
one of the genes for this operon.
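The thresholding step can be sketched as follows. This is an illustration only: it assumes that each of the seven P-values on the slide belongs to the gap between two adjacent genes, taken in the order they appear in the transcript, and it stores "P<0.01" as the upper bound 0.01. The function name cluster_by_gap is hypothetical.

```python
genes = ["yciG", "trpA", "trpB", "trpC", "trpD", "trpE", "trpL", "yciV"]
# One value per gap between adjacent genes (assumed pairing, see above).
# Entries of 0.01 stand for "P<0.01".
gap_p = [0.67, 0.53, 0.01, 0.09, 0.01, 0.01, 0.91]


def cluster_by_gap(genes, gap_p, threshold=0.1):
    """Join adjacent genes into one cluster when their gap P-value is below threshold."""
    clusters = [[genes[0]]]
    for gene, p in zip(genes[1:], gap_p):
        if p < threshold:
            clusters[-1].append(gene)   # same cluster as the previous gene
        else:
            clusters.append([gene])     # start a new cluster
    return clusters


for cluster in cluster_by_gap(genes, gap_p):
    print(cluster)
```

Under that assumed pairing, the threshold of 0.1 groups trpB through trpL and leaves trpA outside the cluster, which agrees with the statement that all but one gene of the operon is captured.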
18
Combining Inferences of Co-evolution from
Previous Methods
  • We use a Bayesian approach to combine the
    probabilities from the previous four methods to
    arrive at a single probability that two proteins
    co-evolve

Here, positive pairs are proteins with a common
pathway annotation, and negative pairs are
proteins with different annotations.
Jansen R, Yu H, Greenbaum D, Kluger Y, Krogan NJ,
Chung S, Emili A, Snyder M, Greenblatt JF,
Gerstein M. A Bayesian networks approach for
predicting protein-protein interactions from
genomic data. Science. 2003 Oct
17;302(5644):449-53.
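The formula from this slide is not preserved in the transcript. One common way to combine evidence in a Bayesian setting is a naive Bayes update, sketched below under the assumption that the four methods are conditionally independent. Each method contributes a likelihood ratio, the probability of its score among positive pairs divided by the probability among negative pairs. The function name and all numbers are hypothetical, and this may differ from what the slide actually showed.

```python
def combine_evidence(prior, likelihood_ratios):
    """Naive Bayes update: posterior odds = prior odds * product of likelihood ratios."""
    odds = prior / (1 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)   # convert odds back to a probability


# Hypothetical inputs: 1% of random pairs share a pathway, and the four
# methods give likelihood ratios of 10, 5, 2 and 1 for this pair.
posterior = combine_evidence(prior=0.01, likelihood_ratios=[10, 5, 2, 1])
print(round(posterior, 4))
```

A likelihood ratio of 1 leaves the odds unchanged, so a method that carries no information about a pair does not affect the combined probability.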
19
True and False Interactions are derived from
Pathway Classification Schemes
Pathway categorization scheme
Information Storage and Processing
  • Translation, ribosomal structure and biogenesis
  • Transcription
  • DNA replication, recombination and repair
Cellular processes
  • Cell division and chromosome partitioning
  • Posttranslational modification, protein
    turnover, chaperones
  • Cell envelope biogenesis, outer membrane
  • Cell motility and secretion
  • Inorganic ion transport and metabolism
  • Signal transduction mechanisms
Metabolism
  • Energy production and conversion
  • Carbohydrate transport and metabolism
  • Amino acid transport and metabolism
  • Nucleotide transport and metabolism
  • Coenzyme metabolism
  • Lipid metabolism
  • Secondary metabolites biosynthesis, transport
    and catabolism
20
Networks of Co-evolving Proteins
  • We can generate networks of co-evolution by
    selecting only pairs of proteins whose
    probability of co-evolution is above a threshold
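A minimal sketch of this thresholding step, assuming the pairwise probabilities are already available as a dictionary keyed by gene pairs. The gene names, probabilities, and the function name build_network are hypothetical.

```python
from collections import defaultdict


def build_network(pair_probs, threshold):
    """Keep only pairs whose co-evolution probability exceeds the threshold."""
    network = defaultdict(set)
    for (a, b), p in pair_probs.items():
        if p > threshold:
            network[a].add(b)   # undirected edge, stored in both directions
            network[b].add(a)
    return dict(network)


# Hypothetical probabilities for four genes.
pair_probs = {
    ("geneA", "geneB"): 0.95,
    ("geneB", "geneC"): 0.80,
    ("geneA", "geneD"): 0.10,
}

network = build_network(pair_probs, threshold=0.6)
print(network)
```

Raising the threshold yields a sparser network with fewer, higher-confidence links; geneD drops out here because its only link falls below 0.6.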

21
Bacterial Flagella Network Using Combined Methods
22
Alternative Representations of Network
Strong M, Graeber TG, Beeby M, Pellegrini M,
Thompson MJ, Yeates TO, Eisenberg D. Inference
and Visualization of Protein Networks in
Mycobacterium tuberculosis Based on Hierarchical
Clustering of Whole Genome Functional Linkage
Maps. Submitted to Nucleic Acids Research.
23
Hierarchical Clustering Reveals Modular Evolution
24
Clusters are Enriched for Pathways and Complexes
25
Examples of Clusters that Contain Components of
Biochemical Pathways
26
Cluster Reveals Additional ORFs Involved in
Lipopolysaccharide Biosynthesis
27
Clusters are also Enriched for Subunits of
Protein Complexes
True positive interactions are between subunits
of the same known complex, and false positive
ones are between subunits of different complexes.
For high-confidence links, we recover one third
of the true interactions and only one thousandth
of the false positive ones.
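The two recovery rates can be computed as sketched below, assuming complex membership is known for each subunit. The subunit names, the predicted links, and the function name recovery_rates are hypothetical. For the figures quoted on the slide, the ratio of the two rates is (1/3) / (1/1000), roughly a 333-fold enrichment of true over false links.

```python
from itertools import combinations


def recovery_rates(predicted_links, complexes):
    """Return (fraction of same-complex pairs recovered,
    fraction of different-complex pairs recovered)."""
    member_of = {s: name for name, subunits in complexes.items() for s in subunits}
    predicted = {frozenset(link) for link in predicted_links}
    true_pairs, false_pairs = set(), set()
    for a, b in combinations(sorted(member_of), 2):
        pair = frozenset((a, b))
        if member_of[a] == member_of[b]:
            true_pairs.add(pair)    # subunits of the same complex
        else:
            false_pairs.add(pair)   # subunits of different complexes
    true_rate = len(predicted & true_pairs) / len(true_pairs)
    false_rate = len(predicted & false_pairs) / len(false_pairs)
    return true_rate, false_rate


# Hypothetical complexes and predicted links.
complexes = {"X": ["x1", "x2", "x3"], "Y": ["y1", "y2"]}
predicted_links = [("x1", "x2"), ("x2", "x3"), ("x1", "y1")]

true_rate, false_rate = recovery_rates(predicted_links, complexes)
print(true_rate, false_rate)
```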
28
Clusters Containing Subunits of Protein Complexes
Cytochrome c oxidase controls the last step of
food oxidation
ATP Synthase
29
Identification of an Uncharacterized Protein
Complex in Pseudomonas aeruginosa
30
Parallel Pathways and Protein Complexes
  • Clustered Maps of co-evolving genes may be used
    not only to identify groups of proteins that are
    part of a complex or pathway but also to identify
    duplicated complexes and pathways

Li H, Pellegrini M, Eisenberg D. Discovering
parallel pathways and protein complexes from
genome sequences. In preparation.
31
Schematic of Pathway Duplication Identification
32
Nitrogenases in Rhodopseudomonas palustris
N2 + 8e- + 8H+ + 16ATP → 16ADP + 16Pi + 2NH3 + H2
Iron protein (NifH)
Mo-Fe protein (α2β2): NifD, α subunit; NifK, β
subunit
33
Co-evolution Network of Nitrogenases
Full network
Distinct sub-networks
34
Predicting Protein Functions In Yeast
Xiaoqun Joyce Duan, Matteo Pellegrini, and David
Eisenberg. Discovering Biological Modules and
Function from Various Genome-scale Protein
Networks. Submitted to PLoS Biology.
35
Guilt by Association: Predicting MIPS Categories
for Yeast Genes
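The transcript does not describe how the prediction was scored. As a generic illustration of guilt by association, and not necessarily the procedure used here, an unannotated gene can be assigned the category most common among its annotated network neighbors. All gene names, categories, and the function name predict_category are hypothetical.

```python
from collections import Counter


def predict_category(gene, network, annotations):
    """Assign the most frequent category among the gene's annotated neighbors."""
    votes = Counter(
        annotations[neighbor]
        for neighbor in network.get(gene, ())
        if neighbor in annotations
    )
    if not votes:
        return None   # no annotated neighbors, so no prediction
    return votes.most_common(1)[0][0]


# Hypothetical network and annotations.
network = {
    "geneU": ["gene1", "gene2", "gene3", "gene4"],
    "geneV": ["gene5"],
}
annotations = {
    "gene1": "transport",
    "gene2": "transport",
    "gene3": "metabolism",
}

print(predict_category("geneU", network, annotations))  # transport
print(predict_category("geneV", network, annotations))  # None
```

A simple majority vote ignores link confidence; weighting each neighbor's vote by the probability of the link is a natural refinement.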
36
Combining Methods to Improve Function Prediction
Figure labels: Network 1, Network 2, P, S1, S2
37
Using Bayesian Formalism to Combine Methods
38
Benchmarking Combined Method
Plot labels: Accuracy, Coverage, Recovery, CS
39
MIPS Category Accuracy and Coverage
Plots: Accuracy by method (CS, GN, PP, RS, TFBP)
and Coverage by method (CS, GN, PP, RS, TFBP)
MIPS categories shown:
  • 67.07 C-compound transporters
  • 67.04.07 anion transporters
  • 67.04.01 cation transporters
  • 67.04.01.07 other cation transport
  • 67.04.01.01 heavy metal transporter
  • 67.04 ion transporter
  • 67 Transport facilitation
40
Identifying the function of a Histone Deacetylase
Complex
Proteins in the network (each label appears twice
on the slide): YDR155C, YIL112W, YOL068C, YMR273C,
YGL194C, YKR029C, YCR033W, YBR103W
Three proteins are unannotated (empty circles),
two share the annotation transcriptional control
(dark gray circles), and three are annotated with
various other functions (light gray circles).
Our combined scoring algorithm assigns the
function transcriptional control to seven of the
eight proteins (dark gray circles).
41
Conclusions
  • Protein modules appear to co-evolve across
    bacterial species
  • Modules are enriched for proteins that
    participate in the same pathway or complex
  • We can identify and reconstruct duplicated
    complexes and pathways
  • Co-evolution may be used to identify functions of
    yeast proteins

42
PROLINKS Database
  • We have constructed a database that contains
    co-evolution links between the genes of 83 (soon
    to be 150) fully sequenced genomes
  • The Prolinks database may be accessed through the
    Proteome Navigator web browser interface at
    dip.doe-mbi.ucla.edu/pronav

Peter M. Bowers, Matteo Pellegrini, Mike J.
Thompson, Joe Fierro, Todd O. Yeates, David
Eisenberg. PROLINKS: A Database of Protein
Functional Linkages Derived from Co-evolution.
Genome Biology, in press.
43
Proteome Navigator Access Page
44
Proteome Navigator Network Page
45
Future Directions
  • Distinguish pathways from complexes
  • Combine multiple data types
  • Compute co-evolution statistics for three or
    more genes
  • Account for phylogenies

46
Acknowledgements
Michael Thompson
Peter Bowers
Michael Strong
Huiying Li
Todd Yeates
Joseph Fierro
David Eisenberg
Joyce Duan
Edward Marcotte