Identifying functional subnetworks in large-scale datasets

Description:

Title: Sample Page Title. Author: Alan Urdan. Last modified by: BS. Created Date: 8/7/2002 8:16:11 PM. Document presentation format: On-screen Show.

1
Identifying functional subnetworks in large-scale datasets
  • Benno Schwikowski
  • Institut Pasteur Systems Biology Group
  • http://systemsbiology.fr

2
The three levels of this talk
  1. Discovery of pathways active in HepC infection
  2. Cytoscape plug-ins
  3. Cytoscape platform

3
Hepatitis C infection
  • One person out of 30 is infected
  • No vaccine exists
  • In 20% of chronic infections, liver fibrosis and
    cirrhosis
  • Frequently requires liver transplants

4
Studying HepC infection: mRNA changes
  • 50% of transplant livers become re-infected with
    Hepatitis C
  • Study expression of 7000 genes in re-infected
    livers after transplantation
  • 1-24 months post-transplant
  • Samples at 3-6 month intervals
  • 28 biopsies from 11 patients
  • Mixture of hepatocytes, hepatic stellate cells,
    Kupffer cells, various types of blood cells
  • Compare against pre-transplant reference pool

5
Result of mRNA expression analysis
  • Most genes (5968 of 7000) were significantly
    under- or overexpressed in one or more
    experiments
  • High patient-to-patient variation

6
Our approach
  1. Construct seed network among known molecular
    players
  2. Expand seed network to include differentially
    expressed genes
  3. Identify putative pathways by the Active Modules
    approach

7
Seed network
8
InteractionFetcher plug-in
  • Purpose
  • Dynamically retrieves remote information for
    selected nodes
  • From SQL database
  • Requests data via XML-RPC protocol
  • Currently implemented types
  • Protein/gene synonyms
  • Orthologs
  • Sequences (DNA, protein, DNA upstream)
  • Gene, protein,
  • Interactions/associations
  • Options
  • Cross-species queries
  • Ortholog information from Homologene
  • Inferred interactions (interologs)
  • Interactive links to source Web pages
  • 100% open-source (client and server)
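As a rough illustration of the XML-RPC request step listed above, the following sketch builds and decodes a request payload with Python's standard library. It does not contact any server. The method name `fetcher.getSynonyms` and its parameter layout are hypothetical and are not taken from InteractionFetcher.

```python
import xmlrpc.client


def build_request(method_name, node_ids, species):
    """Serialize an XML-RPC call for the selected node identifiers."""
    params = (list(node_ids), species)
    return xmlrpc.client.dumps(params, methodname=method_name)


def parse_request(payload):
    """Decode an XML-RPC request back into (params, method_name)."""
    return xmlrpc.client.loads(payload)


# Hypothetical method name; the node names are borrowed from the sample
# interaction file shown later in the talk.
payload = build_request(
    "fetcher.getSynonyms", ["YDR216W", "YIL056W"], "Saccharomyces cerevisiae"
)
params, method = parse_request(payload)
print(method, params)
```

A real client would send such a payload through `xmlrpc.client.ServerProxy`; the server URL is not given in the slides, so it is left out here.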

9
2. Expand seed network
  • Purpose
  • Bring significantly up-/downregulated genes into
    the picture
  • Approach
  • Add interactions with differentially expressed
    genes (in silico pull-down)
  • Use BIND, HPRD databases
  • Only human-curated interactions
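The expansion step above can be sketched as a simple filter over a list of curated interactions: keep every interaction that links a seed node to a differentially expressed gene. This is a minimal sketch under that assumption; the function name and the toy node names are hypothetical, and no BIND or HPRD query is performed.

```python
def expand_seed_network(seed_nodes, curated_interactions, differential_genes):
    """Add interactions that connect a seed node to a differentially
    expressed gene (a one-step "in silico pull-down")."""
    seed = set(seed_nodes)
    differential = set(differential_genes)
    added_edges = []
    for a, b in curated_interactions:
        links_seed_to_diff = (a in seed and b in differential) or (
            b in seed and a in differential
        )
        if links_seed_to_diff:
            added_edges.append((a, b))
    expanded_nodes = seed | {node for edge in added_edges for node in edge}
    return expanded_nodes, added_edges


# Toy data with made-up node names.
nodes, edges = expand_seed_network(
    seed_nodes=["A", "B"],
    curated_interactions=[("A", "X"), ("X", "Y"), ("B", "C"), ("C", "Z")],
    differential_genes=["X", "Y", "Z"],
)
print(sorted(nodes), edges)
```

Only the edge between seed node A and differentially expressed gene X is added; interactions that do not touch the seed network are ignored in this sketch.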

10
  • Network after InteractionFetcher expansion

11
Identifying putative pathways: Why clustering can
be problematic
  • Many clustering methods are not model-based →
    significance of clusters is unclear
  • Any given cluster may not be supported by all
    experiments (noise problem)
  • Clusters tend to contain unrelated genes with
    vaguely similar profiles

12
The three levels of this talk
  1. Discovery of pathways active in HepC infection
  2. Cytoscape plug-ins
  3. Cytoscape platform

13
How can the clustering issues be addressed? The
ActiveModules Plug-in
  • Define up-/downregulated on the basis of a
    well-defined statistical model
  • Also derive clusters from some of the input
    experiments
  • Use additional evidence to focus on plausible
    clusters → protein interactions

14
Interaction networks
Schwikowski, Uetz, Fields, Nature Biotechnology
(2000)
15
Modular organization of interaction networks
16
A lot of interaction data is becoming available
  • Databases on...
  • Protein-protein interactions
  • Protein-DNA interactions
  • Genetic interactions
  • Metabolic pathways
  • Cell signaling pathways, similarity
    relationships, literature-based relationships

17
Multi-criteria detection of modules
1. Interaction network between genes/proteins
2. Differential gene/protein abundances/activities
(Figure: data matrix with axes Genes and Experiments)
18
Scoring a module candidate
Perturbations / conditions
$P_z = 1 - \Phi(z_{A(j)})$
Rank adjustment: binomial summation
$r_{A(j)} = \Phi^{-1}(1 - p_{A(j)})$
m: total number of conditions; j: size of subset
of conditions
Ideker, Ozier, Schwikowski, Siegel (2002).
Bioinformatics 18, S233-240
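The scoring steps on this slide can be sketched in Python as follows. This is a sketch under stated assumptions, not the plug-in's code: the function in the first and last step is taken to be the standard normal CDF, and the "binomial summation" is read as the probability that at least j of m conditions reach a z-score as high as the j-th best one. The function names are hypothetical.

```python
from math import comb
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumption: the CDF on the slide is standard normal


def binomial_tail(m, j, p):
    """Probability of at least j successes in m trials with success rate p."""
    return sum(comb(m, h) * p**h * (1 - p) ** (m - h) for h in range(j, m + 1))


def rank_adjusted_scores(z_scores):
    """Return r_A(j) for j = 1..m, given one z-score per condition."""
    m = len(z_scores)
    ordered = sorted(z_scores, reverse=True)  # best condition first
    scores = []
    for j, z in enumerate(ordered, start=1):
        p_z = 1.0 - STD_NORMAL.cdf(z)          # P_z = 1 - Phi(z_A(j))
        p_aj = binomial_tail(m, j, p_z)        # rank adjustment
        # Keep the argument strictly inside (0, 1) so inv_cdf is defined.
        q = min(max(1.0 - p_aj, 1e-12), 1.0 - 1e-12)
        scores.append(STD_NORMAL.inv_cdf(q))   # r_A(j) = Phi^-1(1 - p_A(j))
    return scores


# Toy z-scores for one candidate module under three conditions.
scores = rank_adjusted_scores([3.0, 3.0, 0.0])
best_subset_size = scores.index(max(scores)) + 1
print(scores, best_subset_size)
```

Taking the maximum over j selects the subset of conditions under which the candidate module is most active, which is how a module can be supported by only some of the input experiments.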
19
Pathways in Rosetta's compendium (300 conditions)
20
The three levels of this talk
  1. Discovery of pathways active in HepC infection
  2. Cytoscape plug-ins
  3. Cytoscape platform

21
Active Modules plug-in applied to HCV
re-infection data
  • Iterative application results in four significant
    highly overlapping subnetworks
  • Repeat analysis only retaining late-active
    re-infection experiments
  • Eliminates pathways activated by transplant
    operation
  • Cutoff: 8 months

22
Which observations can we make locally?
Network after InteractionFetcher expansion
Bold: Differentially regulated subnetwork
Red/Green: Late-active subnetwork
23
Cytotalk plug-in
  • Analysis of overrepresentation of genes in Gene
    Ontology classes, using the Cytotalk plug-in
    and R
  • Cytotalk enables interactive communication with
  • C/C++ programs
  • Java processes
  • Python
  • UNIX shell scripts
  • R, R scripts
  • Can be run on same machine or any other
    Internet-connected machine
  • Can function as Cytoscape plug-in
  • 100% open-source
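The slides do not say which statistical test was used for the Gene Ontology overrepresentation analysis. A common choice is the one-sided hypergeometric test, sketched below in Python rather than R; the function name and the toy numbers are assumptions for illustration only.

```python
from math import comb


def overrepresentation_p_value(
    population_size, annotated_in_population, sample_size, annotated_in_sample
):
    """One-sided hypergeometric p-value: probability of drawing at least
    `annotated_in_sample` annotated genes when `sample_size` genes are
    drawn without replacement from the population."""
    total = comb(population_size, sample_size)
    upper = min(annotated_in_population, sample_size)
    tail = sum(
        comb(annotated_in_population, k)
        * comb(population_size - annotated_in_population, sample_size - k)
        for k in range(annotated_in_sample, upper + 1)
    )
    return tail / total


# Toy numbers: 10 genes in total, 5 carry the GO class,
# and all 5 genes of the subnetwork carry it.
p = overrepresentation_p_value(10, 5, 5, 5)
print(p)
```

With many GO classes tested at once, the resulting p-values would normally need a multiple-testing correction, which this sketch leaves out.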

24
The three levels of this talk
  1. Discovery of pathways active in HepC infection
  2. Cytoscape plug-ins
  3. Cytoscape platform

25
Some Network Visualization Tools
  • Pajek - Slovenia
  • Osprey - SLRI, Toronto
  • VisANT - BU
  • Biolayout - EBI
  • GraphViz
  • PowerPoint
  • Others
  • Cytoscape (the only open-source tool for biology)

26
Cytoscape
27
Cytoscape Basic Concepts
  • Objects: visualized as nodes
  • Relationships: visualized as edges
  • Attributes (name, sequence, source,...)
  • Mapping: attributes → drawing, customizable
    through visual mapper

28
Cytoscape file formats
Sample interaction file
  • YDR216W pd YIL056W
  • YDR216W pd YKR042W
  • YDR216W pd YGL096W
  • YDR216W pd YDR077W
  • ...
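Each line of the sample interaction file above has the form "source, interaction type, target". A minimal Python reader for that layout might look like the sketch below. It assumes whitespace-separated fields and treats any further fields on a line as additional targets of the same source; the function name is hypothetical.

```python
def parse_sif(text):
    """Parse lines of the form 'source type target [target ...]' into
    (source, interaction_type, target) tuples."""
    edges = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            # Blank lines and isolated-node lines carry no edge; skip them here.
            continue
        source, interaction_type = fields[0], fields[1]
        for target in fields[2:]:
            edges.append((source, interaction_type, target))
    return edges


sample = """YDR216W pd YIL056W
YDR216W pd YKR042W
YDR216W pd YGL096W
YDR216W pd YDR077W
"""
edges = parse_sif(sample)
print(edges)
```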

Sample expression file
GENE   DESC  exp0.sig  exp1.sig  exp0.sig  exp1.sig
GENE0  G0    0.0       0.0       23.2      11.5
GENE1  G1    0.0       0.0       34.6      5.2
GENE2  G2    0.0       0.0       10.0      28.0
GENE3  G3    0.0       0.0       1.64      4.77
...
29
Cytoscape
  • Display
  • gene and protein expression
  • protein interactions (physical and non-physical)
  • protein classifications
  • Analysis plug-in modules
  • http://www.cytoscape.org/
  • Java, platform-independent, web-start
  • 100% open-source

30
Visual Styles
Display gene expression as clear text
31
Visual Styles
Map expression values to node colors using
a continuous mapper
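The idea of a continuous mapper can be illustrated with linear interpolation between two colors. The sketch below is an assumption about how such a mapping might work, not Cytoscape's implementation; the function name, the value range, and the green-to-red color choice are made up for the example.

```python
def map_value_to_color(value, low, high, low_color, high_color):
    """Linearly interpolate an RGB color for `value` between `low` and
    `high`; values outside the range are clamped to the end colors."""
    if high == low:
        raise ValueError("low and high must differ")
    t = (value - low) / (high - low)
    t = min(max(t, 0.0), 1.0)
    return tuple(
        round(start + t * (end - start))
        for start, end in zip(low_color, high_color)
    )


GREEN = (0, 255, 0)  # assumed color for underexpressed genes
RED = (255, 0, 0)    # assumed color for overexpressed genes

for expression in (-1.0, 1.0, 5.0):
    print(expression, map_value_to_color(expression, -1.0, 1.0, GREEN, RED))
```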
32
Visual Styles
Expression data mapped to node colors
33
Multidimensional attributes
Cytoscape, pre-release plug-in. Data from Ideker
et al., Science (2001)
34
Layout
  • 16 algorithms available through plug-ins
  • Zooming, hide/show, alignment

35
yFiles Circular
37
Cytoscape Core: Differences from most other
approaches
  • Emphasis on data analysis integration
  • No built-in semantics (added by plug-ins)
  • Very simple concepts
  • Human-readable input formats
  • Extensibility

38
Cytoscape extensibility
  • Core: 100% open-source Java
  • Plug-in API
  • Plug-ins are independently licensed
  • Just need to do the biology
  • Template code samples

Plug-in
39
Biomodules plug-in
Prinz S, Avila-Campillo I, Aldridge C, Srinivasan
A, Dimitrov K, Siegel AF, and Galitski T. Genome
Res. 2004; 14: 380-390
40
Cytoscape Plugins
  • Modules in Complex Networks: Iliana
    Avila-Campillo, Tim Galitski
  • Discovering Regulatory and Signaling Circuits in
    Molecular Interaction Networks: Trey Ideker, Owen
    Ozier, Benno Schwikowski, Andrew Siegel
  • Data Integration in Juvenile Diabetes
    Research: Marta Janer, Paul Shannon
  • A network motif sampler: David Reiss, Benno
    Schwikowski
41
Cytoscape Core Features
  • Visualize and lay out networks
  • Display network data using visual styles
  • Easily organize multiple networks
  • Bird's-eye view navigation of large networks
  • Supports SIF and GML, molecular profiling
    formats, node/edge attributes
  • Functional annotation from GO and KEGG
  • Metanode support (hierarchical groupings)
  • Extensible through plugins (20 developed)

42
Baliga et al., Genome Research, June 2004
43
Collaborators: HCV
  • Institute for Systems Biology, Seattle, WA
  • David Reiss
  • Iliana Avila-Campillo
  • Vesteinn Thorsson
  • Tim Galitski

45
Collaborators: Cytoscape
  • ISB: Leroy Hood, Rowan Christmas
  • Agilent Technologies
  • Unilever PLC
  • Long-term funding from NIH and participating
    institutions
  • UCSD: Trey Ideker, Chris Workman
  • Memorial Sloan-Kettering Cancer Center: Chris
    Sander, Gary Bader, Ethan Cerami
  • Pasteur: Melissa Cline, Andrea Splendiani, Tero
    Aittokallio

46
Shannon, P., et al. (2003). Cytoscape: A software
environment for integrated models of biomolecular
interaction networks. Genome Res 13, 2498-504.
47
Collaborators: Active Networks
  • Trey Ideker
  • Owen Ozier
  • Andrew Siegel
  • Richard Karp

49
Levels of Biological Information
DNA, mRNA, Protein, Pathways, Networks, Cells, Tissues,
Organs, Individuals, Populations, Ecologies