Protein-Protein Interactions Networks - PowerPoint PPT Presentation

About This Presentation

Title:

Protein-Protein Interactions Networks

Description:

Protein-Protein Interactions Networks: A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae, P. Uetz et al., Nature 2000

Slides: 68

Provided by: a1565

Transcript and Presenter's Notes

1
Protein-Protein Interactions Networks

A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. P. Uetz et al., Nature 2000
Functional organisation of the yeast proteome by systematic analysis of protein complexes. A.-C. Gavin et al., Nature 2002
Global Mapping of the Yeast Genetic Interaction Network. Tong et al., Science 2004
Global analysis of protein activities using proteome chips. Zhu, H. et al., Science 2001
Conserved patterns of protein interaction in multiple species. R. Sharan et al., PNAS 2005

2
Genomics

Genomics: the large-scale study of genomes and their functions.
Why protein network?

3
Why protein network?

Assemblies represent more than the sum of their parts.
'Complexity' may partly rely on the contextual combination of the gene products.

4
Yeast as a model

Why yeast genomics? A model eukaryotic organism.

6
The best-studied organism

5,500 genes.
16(!) chromosomes.
13 Mb of DNA (humans have 3,000 Mb).

We know (?) the function of >1/2 of the yeast genes.
All the essential functions are conserved from
yeast to humans.

7
Example: cell cycle
Lee Hartwell, Nobel Prize 2001
8
4 methodologies for high throughput research

Two hybrid systems
Analysis of protein complexes
Synthetic lethal
Protein Chips (?)

9
Two hybrid system

Aim:
Identify pairs of physically interacting proteins.
Solution:
Use the transcription mechanism of the cell.

10
The central dogma
3
11
Transcription factors
Movie transcription (molecular model, real
time) 7.2
12
Transcription in real time (video)
13
Reporter gene
14
Two hybrid system

Isolate cells carrying both plasmids using reporter or selection methods.
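
The reporter logic behind a two-hybrid screen can be sketched as a toy model. As background not spelled out on the slide: the bait is fused to a DNA-binding domain and the prey to an activation domain, so the reporter gene is transcribed only when the two physically interact. The function names and protein labels below are hypothetical placeholders.

```python
from itertools import product

def reporter_on(bait, prey, true_interactions):
    """The reporter is transcribed only when bait and prey physically interact."""
    return frozenset((bait, prey)) in true_interactions

def two_hybrid_screen(baits, preys, true_interactions):
    """Test every bait against every prey; return the reporter-positive pairs."""
    return [(b, p) for b, p in product(baits, preys)
            if reporter_on(b, p, true_interactions)]

# Hypothetical toy data: the set of real interactions is hidden from the screen.
true_interactions = {frozenset(("A", "X")), frozenset(("B", "Z"))}
hits = two_hybrid_screen(["A", "B"], ["X", "Y", "Z"], true_interactions)
print(hits)  # [('A', 'X'), ('B', 'Z')]
```

A real screen also produces false positives and false negatives, which this idealized model ignores.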

15
All against All
16
Focus on the baits

Baits are analyzed separately.
192 baits vs. 6000 prey yeast strains.

A component of RNA polymerases I and III: identification of three new interacting proteins
17
Two hybrid system
18
Two hybrid system

A comprehensive two-hybrid analysis to explore the yeast protein interactome. Ito T. et al., PNAS 2001.

19
Analysis of protein complexes

Aim: identification of complexes and their subunits.
Solution: a two-step method.
Isolation of only the relevant complexes.
Identification of the complex units.

20
Double Isolation
21
Identification of the members

Divide and conquer:

Denature the assembly

Digest with protease

Mass spectrometry

22
How does it work?

The deflection path of ionized molecules is used to determine each molecule's mass.
The output

23
Analysis of protein complexes

Compare the measured peptide masses against a protein database.

Mass spectrometry can be applied again, this time to the peptides, if the data is not sufficient.
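
Matching peptide masses against a database can be sketched as follows. This is a minimal illustration, assuming a trypsin-like protease that cuts after K or R, standard monoisotopic residue masses, and a made-up two-protein database; the function names and the 0.01 Da tolerance are illustrative choices, not the pipeline used in the study.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

def tryptic_digest(sequence):
    """Cleave after every K or R (the proline exception is ignored for simplicity)."""
    peptides, current = [], ""
    for aa in sequence:
        current += aa
        if aa in "KR":
            peptides.append(current)
            current = ""
    if current:
        peptides.append(current)
    return peptides

def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def best_match(observed_masses, database, tolerance=0.01):
    """Return the protein whose predicted peptides explain the most observed masses."""
    def score(seq):
        predicted = [peptide_mass(p) for p in tryptic_digest(seq)]
        return sum(any(abs(m - q) <= tolerance for q in predicted)
                   for m in observed_masses)
    return max(database, key=lambda name: score(database[name]))

# Hypothetical database and "measured" masses derived from protA.
database = {"protA": "GASKVLR", "protB": "MKWTR"}
observed = [peptide_mass(p) for p in tryptic_digest(database["protA"])]
print(best_match(observed, database))  # protA
```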

24
Analysis of protein complexes

Systematic (1): 1739 bait proteins.
232 complexes with 589 baits.
Systematic (2): 725 bait proteins.
3,617 interactions with 493 baits.

26
Analysis of protein complexes

About 25% false positive rate.
Covers 56/60, 10/35 in Y2H, of known complexes.
Only 7% of the interactions were seen by Y2H assays.
But it can evaluate protein:
Concentration.
Localization.
Post-translational modifications.

27
Synthetic lethality

First, a few words on essentiality.
Create new strains, each strain with one gene deleted (96% coverage).
Tag each strain with a unique sequence.
Grow all the strains.
Measure the amount of each tag sequence.
Some 18.7% (1,105) are essential.
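
The tag-and-measure step can be sketched as a barcode count. This is a minimal illustration, assuming each sequencing or array read has already been mapped to a strain tag; the function names and the 0.5 depletion threshold are hypothetical.

```python
from collections import Counter

def relative_abundance(barcode_reads):
    """Share of the pool held by each strain tag."""
    counts = Counter(barcode_reads)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}

def depleted_strains(before, after, min_ratio=0.5):
    """Tags whose share of the pool fell below min_ratio of their starting share."""
    start = relative_abundance(before)
    end = relative_abundance(after)
    return sorted(tag for tag in start if end.get(tag, 0.0) / start[tag] < min_ratio)

# Hypothetical pool of three tagged strains, sampled before and after growth.
before = ["s1", "s2", "s3"] * 10
after = ["s1"] * 15 + ["s2"] * 14 + ["s3"] * 1
print(depleted_strains(before, after))  # ['s3']
```

A strain that drops out of the pool carries a deletion that impairs growth under the tested condition.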

28
Synthetic lethality

High genetic redundancy hinders the discovery of many gene functions (30%).
Only the double mutation is lethal; either single mutation alone is viable.
Why?
Single biochemical pathway.
Two distinct pathways for one process.
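
The definition can be written as a small classifier. This is a sketch under stated assumptions: fitness is normalized so that wild type is 1.0, the thresholds are arbitrary, and the expected double-mutant fitness follows a multiplicative model, which is a common convention rather than something taken from the slides.

```python
def classify_pair(fitness_a, fitness_b, fitness_ab, lethal_below=0.1, sick_margin=0.2):
    """Classify a gene pair from single- and double-mutant fitness (wild type = 1.0)."""
    if fitness_a < lethal_below or fitness_b < lethal_below:
        return "single mutant already inviable"
    if fitness_ab < lethal_below:
        return "synthetic lethal"
    expected = fitness_a * fitness_b  # multiplicative expectation (an assumption)
    if fitness_ab < expected - sick_margin:
        return "synthetic sick"
    return "no interaction"

print(classify_pair(1.0, 0.9, 0.0))  # synthetic lethal
print(classify_pair(1.0, 1.0, 0.6))  # synthetic sick
print(classify_pair(0.9, 0.9, 0.8))  # no interaction
```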

29
The naïve approach

But how do you scale it up to the whole genome?

30
All vs. All

5100 non-essential mutants.

Main tricks
1. Haploid strains
2. Resistance markers.
3. Extra marker for the library haploid.

31
Synthetic lethality: making it genomic

Mass analysis: crossing the query haploid with a library (synthetic genetic array).

Tetrad analysis: validation and finding synthetic sick pairs.

32
The genetic interaction map

8 genes against all produced a network of
synthetic lethal pairs.

33
Synthetic lethality: making it genomic

132 query genes vs. 4700
False negatives: 17-42%.
At least 4 times denser than the PPI network.
Predicting 100,000 interactions (?)

34
PPI Summary (2003)
35
PPI Summary

S. cerevisiae (Yeast)
4389 proteins
14319 interactions

C. elegans (Worm)
2718 proteins
3926 interactions

D. melanogaster (Fly)
7038 proteins
20720 interactions

Sharan et al. PNAS 2005
36
We like Networks

Exploit graph theory methods.
Provide a general solution for data integration.

37
Network Structure and Function

Identify highly nonrandom network structural
patterns that reflect function
Ideker et al.: finding co-regulated sub-graphs.
Lee et al.: the repeated instances of each motif are the result of evolutionary convergence.
Barabasi et al.: network motifs are associated with specific cellular tasks.

38
Conserved patterns of PPI in multiple species
Baker's yeast (Saccharomyces cerevisiae): 15000 interactions, 5000 interacting genes
Bacterial pathogen (Helicobacter pylori): 1500 interactions, 700 interacting genes
Kelley et al. PNAS 2003
39
Goals

Separating true PPI from false positives.
Assign functional roles to interactions.
Predict interactions.
Organizing the data into models of cellular signaling and regulatory machinery.
How?
Use an approach based on evolutionary cross-species comparisons.

40
Interaction graph (per species)

Vertices are the organism's interacting proteins.
Edges are pairwise interactions between proteins.
Edges are weighted using a logistic regression model with three inputs:
A: Number of times an interaction was observed.
For fly and worm: observation in one experiment.
B: Correlation coefficient of the gene expression.
Shown to be correlated with interaction.
C: The proteins' small-world clustering coefficient.
Sum of the neighbors' logHG probabilities.
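
Combining the three inputs into one edge weight can be sketched with the standard logistic form. The coefficient values below are made up for illustration; only the shape of the function follows from the slide.

```python
import math

def edge_reliability(observations, expression_corr, clustering, betas):
    """Logistic combination of inputs A, B and C into a probability in (0, 1)."""
    b0, b1, b2, b3 = betas
    z = b0 + b1 * observations + b2 * expression_corr + b3 * clustering
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients; real values would be fitted on training examples.
betas = (-2.0, 1.0, 1.5, 0.5)
weak = edge_reliability(1, 0.1, 0.0, betas)
strong = edge_reliability(3, 0.8, 1.0, betas)
print(weak < strong)  # True
```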

41
How do we find Sub-network conservation?

Interactions within each species should approximate the desired structure:
Pathway: signal transduction.
Cluster: protein complex.
Many-to-many correspondence between the sets of proteins.

42
Network alignment graph

Each node corresponds to k sequence-similar proteins.
BLAST E-value < 10^-7, considering the 10 best matches only.
Cannot be split into two parts with no sequence similarity between them.
An edge represents a conserved interaction:
Match: one pair of proteins directly interacts and all other pairs include proteins at distance at most 2 in the interaction maps.
Gap: all protein pairs are at distance 2 in the interaction maps.
Match-Gap: at least max{2, k-1} protein pairs directly interact.
A subgraph corresponds to a conserved sub-network.
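
The three edge rules can be sketched as a classifier over per-species distances. This is an illustrative reading of the slide, assuming each alignment node holds one protein per species, that "distance" means shortest-path length in that species' interaction map, and that the two nodes do not share a protein. The function names and the toy graphs are hypothetical.

```python
def pair_distance(graph, u, v):
    """Return 1 or 2 if u and v are that far apart in the map, else None."""
    if v in graph.get(u, set()):
        return 1
    if graph.get(u, set()) & graph.get(v, set()):
        return 2
    return None

def conserved_edge_type(node_a, node_b, graphs):
    """node_a and node_b hold one protein per species; graphs holds one map per species."""
    dists = [pair_distance(g, u, v) for g, u, v in zip(graphs, node_a, node_b)]
    k = len(dists)
    direct = dists.count(1)
    if direct >= 1 and all(d is not None for d in dists):
        return "match"
    if all(d == 2 for d in dists):
        return "gap"
    if direct >= max(2, k - 1):
        return "match-gap"
    return None

# Hypothetical maps: a1-a2 interact directly; b1 and b2 share the neighbor b3.
g1 = {"a1": {"a2"}, "a2": {"a1"}}
g2 = {"b1": {"b3"}, "b2": {"b3"}, "b3": {"b1", "b2"}}
print(conserved_edge_type(("a1", "b1"), ("a2", "b2"), [g1, g2]))  # match
```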

43
A probabilistic model
(Equation not captured in the transcript.)
q(e): interaction similarity
44
Searching for conserved sub-networks

Identifying high-scoring subgraphs of the network alignment graph.
This problem is computationally hard.
Find seeds exhaustively: paths with 4 nodes.
Expand high-scoring seeds: greedily add/remove nodes.
Filter subgraphs with a high degree of overlap (>80%).
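
The greedy expansion step can be sketched as a local search. This is a minimal illustration that uses a toy additive edge score rather than the probabilistic score of the talk; the size cap of 15 and all names are assumptions made for the example.

```python
def subgraph_score(nodes, weights):
    """Sum of weights of edges with both ends inside nodes."""
    return sum(w for edge, w in weights.items() if edge <= nodes)

def greedy_expand(seed, weights, max_size=15):
    """Repeatedly apply the single node addition or removal that most improves the score."""
    current = frozenset(seed)
    all_nodes = frozenset().union(*weights)
    best = subgraph_score(current, weights)
    improved = True
    while improved:
        improved = False
        candidates = []
        if len(current) < max_size:
            candidates += [current | {n} for n in all_nodes - current]
        if len(current) > 1:
            candidates += [current - {n} for n in current]
        for cand in candidates:
            score = subgraph_score(cand, weights)
            if score > best:
                best, best_set, improved = score, cand, True
        if improved:
            current = best_set
    return current, best

# Hypothetical edge scores; negative values stand for unlikely interactions.
weights = {
    frozenset("ab"): 2.0, frozenset("bc"): 1.5,
    frozenset("cd"): -1.0, frozenset("bd"): -0.5,
}
print(greedy_expand({"a", "b"}, weights))  # node set {a, b, c} with score 3.5
```

Greedy search only reaches a local optimum, which is why several seeds are expanded and overlapping results are filtered afterwards.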

45
Statistical evaluation of sub-networks

Randomized data is produced:
Random shuffling of each of the interaction graphs.
Randomizing the sequence-similarity relationships.
Find the highest-scoring sub-networks of a given size.
The p-value is computed from the distribution of the top scores.
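
Turning the distribution of top scores into a p-value can be sketched as an empirical tail fraction. The add-one correction and the function name are assumptions of this sketch; the slides do not say how the distribution was summarized.

```python
def empirical_p_value(observed_score, random_top_scores):
    """Fraction of randomized runs whose top score reaches the observed score."""
    at_least = sum(1 for s in random_top_scores if s >= observed_score)
    # Add-one correction keeps the estimate away from an exact zero.
    return (at_least + 1) / (len(random_top_scores) + 1)

# Hypothetical top scores from four randomized data sets.
print(empirical_p_value(10, [1, 2, 3, 11]))  # 0.4
```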

46
The final product
47
3-way Comparison

S. cerevisiae
4389 proteins
14319 interactions

C. elegans
2718 proteins
3926 interactions

D. melanogaster
7038 proteins
20720 interactions

Sharan et al. PNAS 2005
48
Multiple Network Alignment
Subnetwork search
Network alignment
Preprocessing: interaction scores from logistic regression on observations, expression correlation, clustering coefficient.
Filtering and visualizing: p-value < 0.01, ≤80% overlap
Conserved paths
Conserved clusters
Protein groups
Conserved interactions
50
Reduced false positives

Compared these conserved clusters to known complexes in yeast:
Pure cluster: contains >2 annotated proteins, and >1/2 of these share the same annotation.
94% of the clusters are pure (>83% in the single-species analysis).
Did sticky proteins bias the clusters?
Of 39 proteins with >50 neighbors, only 10 were included in conserved clusters, and they were annotated so.
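
The purity criterion can be sketched as a short check. This assumes each protein carries at most one complex label and that unannotated proteins are marked with None; the function name is hypothetical.

```python
from collections import Counter

def is_pure(cluster_annotations, min_annotated=3):
    """True if >2 proteins are annotated and >1/2 of those share one label."""
    labels = [a for a in cluster_annotations if a is not None]
    if len(labels) < min_annotated:
        return False
    top_count = Counter(labels).most_common(1)[0][1]
    return top_count > len(labels) / 2

print(is_pure(["c1", "c1", "c2", None]))  # True
print(is_pure(["c1", "c2", "c3"]))        # False
```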

51
Cross validation: function

Guilt by association.
Enrichment of GO annotation (p < 0.01).
More than half of the annotated proteins had the annotation.

Outperforms sequence-based approach at 37-53.
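
An enrichment p-value of this kind is commonly computed with a hypergeometric tail. The sketch below assumes that test; the slides do not name the exact statistic used for GO enrichment, and the numbers are invented.

```python
from math import comb

def hypergeom_p(population, annotated, cluster_size, cluster_annotated):
    """P(X >= cluster_annotated) when cluster_size proteins are drawn without replacement."""
    upper = min(annotated, cluster_size)
    total = comb(population, cluster_size)
    tail = sum(comb(annotated, i) * comb(population - annotated, cluster_size - i)
               for i in range(cluster_annotated, upper + 1))
    return tail / total

# Hypothetical numbers: 1000 proteins, 20 carry the annotation,
# and a cluster of 10 proteins contains 5 of them.
p = hypergeom_p(1000, 20, 10, 5)
print(p < 0.01)  # True
```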

52
Cross validation: interaction

1. Evidence that proteins with similar sequences interact within other species.
2. Co-occurrence of these proteins in the same conserved cluster.

53
Wet validation: interaction

The tests were performed using two-hybrid assays.
Of the 65 predicted yeast interactions:
5 were self-inducing.
31 tested positive.

54
Conclusions

Associate proteins that are not necessarily each other's best sequence match.
177/679 conserved clusters.
31/129 conserved paths.
Inter-module interaction is reinforced by inter-species observations.
40-52 >> 0.042 for a random PPI prediction.
Many PPI circuits are conserved over evolution.

55
Thanks!!!

Recoverin, a calcium-activated myristoyl switch.

56
GO Gene Ontology

all : all ( 171472 )
  GO:0008150 : biological_process ( 109503 )
    GO:0007582 : physiological process ( 70981 )
      GO:0008152 : metabolism ( 41395 )
        GO:0009058 : biosynthesis ( 10256 )
          GO:0009059 : macromolecule biosynthesis ( 6876 )
            GO:0006412 : protein biosynthesis ( 4611 )
        GO:0043170 : macromolecule metabolism ( 17198 )
          GO:0009059 : macromolecule biosynthesis ( 6876 )
            GO:0006412 : protein biosynthesis ( 4611 )
          GO:0019538 : protein metabolism ( 12856 )
            GO:0006412 : protein biosynthesis ( 4611 )
  GO:0005575 : cellular_component ( 98453 )
  GO:0003674 : molecular_function ( 108120 )

back
57
Interaction distribution
58
Expression data

Yeast - 794 conditions.
Fly - over 90 CC time points, 170 profiles.
Worm - over 553 conditions.

back
59
Edge weight

where β0, ..., β3 are the parameters of the distribution.
Maximize the likelihood:
Positive: MIPS interactions.
Negative: random or false positives in the cross-validation test.
Yeast - 1006 positive and negative examples.
Fly - 96 positive and negative examples.
Worm - 24 positive and 50 negative examples.
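
Fitting the parameters by maximizing the likelihood can be sketched with plain gradient ascent. This is a generic logistic regression fit on made-up data, not the authors' training procedure; the learning rate, epoch count and function names are assumptions.

```python
import math

def predict(betas, x):
    """Logistic probability that an example with features x is a true interaction."""
    z = betas[0] + sum(b * xi for b, xi in zip(betas[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(samples, labels, lr=0.1, epochs=2000):
    """Maximize the log-likelihood by batch gradient ascent; returns [b0, b1, ...]."""
    betas = [0.0] * (len(samples[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(betas)
        for x, y in zip(samples, labels):
            err = y - predict(betas, x)
            grad[0] += err
            for j, xi in enumerate(x, start=1):
                grad[j] += err * xi
        betas = [b + lr * g / len(samples) for b, g in zip(betas, grad)]
    return betas

# Hypothetical training set: one feature, label 1 = positive, 0 = negative.
samples = [(0.0,), (0.2,), (0.8,), (1.0,)]
labels = [0, 0, 1, 1]
betas = fit_logistic(samples, labels)
print(predict(betas, (1.0,)) > 0.5 > predict(betas, (0.0,)))  # True
```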

back
60
back
71 conserved regions: 183 significant clusters and 240 significant paths.
61
A probabilistic model

M_s - the sub-network model.
M_n - the null model.
O_uv - the set of available observations on (u,v).
P_uv - fraction of graphs in the order-preserving graph family that contain (u,v).
T_uv / F_uv - True/False edge (u,v).
back
62
A probabilistic model

Each species' interaction map was randomly constructed.
Randomizing assumptions:
Each interaction should be present independently with high probability.
The probability depends on the proteins' total number of connections in the network.

63
Why Yeast?

back
Comparative Genomics of the Eukaryotes. Rubin GM et al., Science 2000
64
Analysis of protein complexes

Isolation: a straightforward method, using affinity chromatography. A target protein is attached to polymer beads that are packed into a column. Cell proteins are washed through the column. Proteins that interact with the target protein adhere to the affinity matrix and are eluted later.

65
Analysis of protein complexes

Isolation: co-immunoprecipitation. An antibody that recognizes the target protein is used to isolate the protein. Usually there isn't a highly specific antibody for the target protein, so a chimeric protein is formed from the target protein and an epitope tag. A common tag is the enzyme glutathione S-transferase (GST).

66
Analysis of protein complexes