Bioinformatic analysis of protein complexes

1
Bioinformatic analysis of protein complexes
  • Roland Krause
  • Cellzome AG, Heidelberg

2
Overview
  • The proteome: proteins and their interactions
  • The yeast proteome project at Cellzome
  • Experimental data generation
  • Functional analysis
  • Obtaining protein complexes
  • Comparing protein complexes and interacting
    proteins
  • Shared components: building blocks or biochemical
    artifacts?

3
Proteomics
  • The study of the protein repertoire expressed in
    the cell
  • Protein expression levels
  • Qualitative
  • Quantitative
  • Localization
  • Protein interactions
  • Powerful tool for the elucidation of protein
    function
  • Pair-wise interactions
  • Protein complexes
  • Protein complexes
  • Visible structural bodies
  • Important players in molecular life

4
Acknowledgements
  • Cellzome AG
  • Dr. Giulio Superti-Furga
  • Dr. Gitte Neubauer
  • Yeast biology
  • Dr. Anne-Claude Gavin
  • Dr. Paola Grandi
  • Dr. Christina Rau
  • Mass spectrometry
  • Bernhard Küster, PhD
  • Markus Bösche
  • Bioinformatics
  • Dr. Georg Casari
  • Jens Rick
  • Julien Gagneur
  • EMBL
  • Dr. Peer Bork
  • Dr. Christian von Mering
  • Biozentrum Würzburg
  • Prof. Dr. Thomas Dandekar
  • Prof. Dr. Jörg Schultz

5
The Yeast Proteome Project at Cellzome
  • Tandem Affinity Purification (TAP)
  • Mass-Spectrometry (MS)

6
Workflow TAP-MS
  • Homologous transformation (addition of TAP-tag)
  • Test for expression
  • Large scale culture
  • Purification
  • Gel separation of complexes
  • Mass spectrometry (MALDI)

7
An integrated workflow
[Figure labels: Transformation; Large scale culture; Separation; Mass spectrometry; LIMS system]
8
Handling laboratory information and annotation
  • Selection of protocol according to features of
    gene
  • Size, membrane association
  • Process information/ user management
  • Annotation of new complexes and novel findings
  • Collection of information for patent
  • Database of known drug targets

9
Key figures of the screen
  • Purifications: 589
  • Proteins retrieved: 1440, of which 300 novel
  • Complexes discovered: 232, of which 60 appear novel
  • Overlap with the Y2H data: 165 of 1500
    interactions
  • 37 percent of proteins in complexes are shared
  • Gavin, AC., Bösche, M., Krause, R., et al.
    (2002) Nature

10
Functional analysis
  • Protein complexes share many components
  • The resulting network of complexes builds a
    higher order network
  • Highly conserved and essential proteins tend to
    interact with each other
  • All localizations are sampled well but for
    membrane proteins
  • Small proteins are underrepresented

11
Examples of new findings
  • New complexes
  • 90S Pre-Ribosome
  • Gives rise to the primordial, nucleolar ribosome
  • Third largest complex in the yeast cell
  • Established functionally by Grandi et al. (2002)
    Mol. Cell and Dragon et al. (2002) Nature
  • COP9/Signalosome
  • Previously missing in yeast; known in human, fly
    and Arabidopsis
  • Known to be related to the 19S regulatory part of
    the proteasome
  • Shares components with the proteasome in yeast
  • New interactors for known complexes
  • Iwr1 with RNA polymerase II
  • YFL049w with SWI/SNF complex
  • Apparent underestimate of protein complexes in
    the reference literature

12
A comprehensive list of protein complexes
  • Cluster analysis for protein complexes

13
Obtaining complexes


[Figure labels: TAP purifications; TAP-tagged protein (entry point); yTAP-complexes (232)]
14
A comprehensive list of protein complexes
  • Assembly of individual interactions into
    physiological protein complexes
  • Allows interpretation and annotation of results
  • Manually performed for the publication in Nature
  • Used known complexes as a guide
  • Contains several inconsistencies
  • An automatic procedure would be beneficial
  • Cluster analysis
  • Should preserve features of real complexes
  • Possible clusters
  • Of proteins
  • Of purifications
  • Large number of clusters compared to clustering
    of transcription profiles

15
Benchmarking protein complexes
  • There is no standard for comparing sets of
    complexes
  • How shall we treat the intricate structure of
    protein complexes?
  • Variant complexes
  • RSC complex
  • Lsm1-Lsm7 vs Lsm2-Lsm8
  • Cyclin dependent kinases
  • Megacomplexes
  • Assemblies of complexes
  • Transient interactions
  • Definitions vary
  • Kinetics
  • Cell cycle
  • Functional associations

16
Ribosomal biogenesis
Figure from Schäfer et al., EMBO Journal (2003)
17
Clustering of proteins
  • Clustering of proteins
  • Shared components are not preserved
  • Each protein is assigned to a single complex
  • Megacomplexes did not allow for a good separation
  • Very few data points: 80% of the proteins have
    fewer than 3 identifications
  • Simpler approach: cluster the purifications
  • A purification should already contain complexes

18
Clustering of purifications
19
Similarity indices for comparing complexes or purifications
  • Dice index
  • Jaccard index
  • Geometric index
  • Simpson index
  • na, nb: number of components in group a or b
  • ni: number of components in the intersection
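
The formulas for these indices are not preserved in the transcript. The sketch below uses the common textbook definitions in terms of na, nb and ni; whether the talk used exactly these forms is an assumption, and the component names in the example are purely illustrative.

```python
def similarity_indices(a, b):
    """Compare two groups of components (complexes or purifications).

    Uses the common textbook forms of each index (an assumption; the
    slide's own formulas are not available):
      Dice      = 2*ni / (na + nb)
      Jaccard   = ni / (na + nb - ni)
      Geometric = ni**2 / (na * nb)
      Simpson   = ni / min(na, nb)
    """
    a, b = set(a), set(b)
    na, nb = len(a), len(b)
    ni = len(a & b)  # size of the intersection
    if na == 0 or nb == 0:
        raise ValueError("both groups must contain at least one component")
    return {
        "dice": 2 * ni / (na + nb),
        "jaccard": ni / (na + nb - ni),
        "geometric": ni ** 2 / (na * nb),
        "simpson": ni / min(na, nb),
    }


# Illustrative input only: two groups of four components sharing three.
group_a = {"Lsm1", "Lsm2", "Lsm3", "Lsm4"}
group_b = {"Lsm2", "Lsm3", "Lsm4", "Lsm8"}
indices = similarity_indices(group_a, group_b)
```

Note that the Simpson index equals 1 whenever the smaller group is fully contained in the larger one, so it tolerates missing identifications more than the Jaccard index does.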

20
Experimental complications
  • Sensitive identification of background proteins
  • Ribosomal proteins
  • Heat shock proteins
  • Abundant enzymes
  • Filtering by class and detection frequency
  • Missing identifications
  • Differences in expression levels
  • Small proteins
  • Membrane associated proteins
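
Filtering by detection frequency could look roughly like the sketch below. The function name, the data layout (bait mapped to a set of identified proteins) and the cut-off value are all hypothetical; the talk does not state the threshold actually used, and filtering by protein class is left out here.

```python
from collections import Counter


def filter_frequent(purifications, max_fraction=0.5):
    """Remove proteins detected in too many purifications.

    purifications: dict mapping a bait name to the set of proteins
    identified in that purification (hypothetical layout).
    max_fraction: hypothetical cut-off; a protein seen in more than this
    fraction of all purifications is treated as background.
    Returns (filtered purifications, set of removed proteins).
    """
    n = len(purifications)
    counts = Counter(
        protein
        for components in purifications.values()
        for protein in set(components)
    )
    frequent = {p for p, c in counts.items() if c / n > max_fraction}
    filtered = {
        bait: set(components) - frequent
        for bait, components in purifications.items()
    }
    return filtered, frequent


# Toy data: "Hsp" stands in for an abundant background protein.
toy = {
    "bait1": {"P1", "P2", "Hsp"},
    "bait2": {"P3", "Hsp"},
    "bait3": {"P4", "P5", "Hsp"},
    "bait4": {"P1", "P6"},
}
filtered, removed = filter_frequent(toy, max_fraction=0.5)
```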

21
Refinements of similarity indices
  • Normalized Dice-like index (by column)
  • Normalized Simpson-like-index


f: frequency of detection
22
Comparing clustering results
  • Manually refined the MIPS and YPD complex sets
    for benchmark
  • Parameter exploration, comparing the results to
    the benchmark set
  • 80 complexes are contained in the curated
    complex set
  • No increase when expanding beyond 250 complexes
  • 252 complexes from the TAP set using means
    clustering and a threshold of 0.3
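
A minimal sketch of clustering purifications with a similarity threshold follows. It reads "means clustering" as average-linkage agglomerative clustering and uses the plain Jaccard index; both are assumptions, since the transcript specifies neither the linkage nor the refined index. Only the threshold of 0.3 comes from the slide.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets of components."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def cluster_purifications(purifications, threshold=0.3):
    """Average-linkage clustering of purifications (assumed method).

    Repeatedly merges the two clusters with the highest mean pairwise
    similarity and stops once that value drops below the threshold.
    Returns a list of clusters, each a list of purification names.
    """
    clusters = [[name] for name in purifications]

    def mean_similarity(c1, c2):
        sims = [jaccard(purifications[x], purifications[y])
                for x in c1 for y in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        (i, j), best = max(
            (((i, j), mean_similarity(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Toy purifications with made-up component names.
toy = {
    "p1": {"A", "B", "C"},
    "p2": {"A", "B", "C", "D"},
    "p3": {"X", "Y"},
    "p4": {"X", "Y", "Z"},
}
clusters = cluster_purifications(toy, threshold=0.3)
```

Because whole purifications are grouped rather than individual proteins, a protein found in purifications from two different clusters stays a member of both resulting complexes, so shared components are preserved.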

23
Results and conclusions
  • Combined HMS-PCI and the TAP set
  • 494 clusters (a reasonable total number of
    complexes)
  • 46 of 94 identical entry points occur in the same
    cluster
  • Refining the distance index is crucial to the
    clustering
  • Future work
  • Clustering of proteins (bi-clustering) and
    classification of proteins
  • Different clustering algorithms
  • Including more information into distance measure
  • Bait protein
  • Refine benchmarking
  • Krause R., et al. (2003) Bioinformatics.

24
Comparison of protein-protein interaction screens
  • Differences between individual methods and
    reference sets

25
Comparison of different data sets
  • Biochemical purifications
  • Gavin et al. (2002) (TAP)
  • Ho et al. (2002) (HMS-PCI)
  • Yeast-two hybrid
  • Ito et al. (2000, 2001), Uetz et al. (2000)
  • mRNA-co-expression
  • Eisen et al. (1998), Marcotte (2000)
  • In silico predictions
  • STRING (von Mering et al., 2003)
  • Synthetic lethals

26
Interaction density
27
Functional biases
28
Comparison
29
Conclusions
  • The overlap between the individual methods is
    surprisingly small
  • Different methods complement each other
  • Individual methods are not exhausted
  • Single experimental methods can be as reliable as
    combined sets
  • Integration
  • Bader, G. and Hogue, C. (2002) Nat. Biotechnol.
  • Kemmeren, P., et al. (2002) Mol. Cell
  • von Mering, C., Krause, R., et al. (2002) Nature
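
The overlap between two interaction sets can be counted along these lines. This is a sketch under stated assumptions: interactions are treated as unordered protein pairs, self-interactions are dropped, and the function names and toy pairs are hypothetical rather than taken from the cited studies.

```python
def normalise(pairs):
    """Turn (a, b) tuples into unordered pairs, dropping self-interactions."""
    return {frozenset(pair) for pair in pairs if pair[0] != pair[1]}


def overlap(pairs_a, pairs_b):
    """Return (shared, total in a, total in b) for two interaction lists."""
    a, b = normalise(pairs_a), normalise(pairs_b)
    return len(a & b), len(a), len(b)


# Toy data: ("P1", "P2") and ("P2", "P1") count as the same interaction.
screen_a = [("P1", "P2"), ("P2", "P1"), ("P3", "P4"), ("P5", "P5")]
screen_b = [("P2", "P1"), ("P4", "P6")]
shared, size_a, size_b = overlap(screen_a, screen_b)
```

Treating pairs as unordered matters for two-hybrid data, where bait and prey roles would otherwise make the same interaction appear twice.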

30
Shared components of protein complexes
  • Biochemical artifacts or versatile building
    blocks?

31
Shared components in the Cellzome screen
[Figure labels: Co-activator of Pol II transcription; SAGA complex; Cytoskeleton; NuA4 histone acetylase; Chromatin remodelling; Histone deacetylase complex]
32
Motivation and approach
  • Artifacts or structural principle?
  • Relevance to medical target discovery
  • Target to several processes
  • Understanding side effects
  • Evolutionary insights
  • Study of known shared components

33
Functional arrangements
  • Dihydrolipoamide dehydrogenase (Lpd1)
  • 2-Oxoglutarate dehydrogenase
  • Glycine decarboxylase
  • Pyruvate dehydrogenase
  • Common enzymatic function
  • RNA polymerases
  • Shared proteins are not the business end
  • Regulatory structural roles

Cramer, P., et al. (2000) Science
34
Structural arrangements for shared components
  • Example: Spt6 tethers the exosome to the RNA
    polymerase for surveillance
  • Example: Lsm1-7 complex and Lsm2-8 complex
  • Example: signaling networks
  • Manuscript in preparation
36
Research interests
  • Improve clustering approaches
  • Find a sensible structure for protein complexes
    and their interactions
  • Benchmark set of protein complexes in yeast
  • Functional properties of protein complexes
  • Conquering the human proteome and experimental
    planning
  • Hypothesis-free research

37
Thank you!