Title: Identifying Changes in Signaling from High-Throughput Data
1Identifying Changes in Signaling from
High-Throughput Data
Michael Ochs Fox Chase Cancer Center
2The New Paradigm
Group 1
Group 2
Targeted Therapies
Personalized Medicine
Your Chromosomes Here
3Outline
- Signaling and Gene Expression
- Bayesian Decomposition
- Examples of Analyses
4Cellular Signaling
Extracellular Signal
Signal Transduction
Metabolic Changes
Transcription
Downward, Nature, 411, 759, 2001
5Gene Expression
6Identifying Pathways
A B C D E
7Goal of Analysis
8Biological Model
But the Gene Lists are Incomplete, as are
the Network Diagrams!
9Issues to Solve
- Overlapping Signals
- Genes are involved in multiple processes
- Various processes are active simultaneously in any observed data
- Identification of Process Behind Signal
- If we find a signal, what is the cause?
- Do identification without a complete model
10Outline
- Signaling and Gene Expression
- Bayesian Decomposition
- Examples of Analyses
11Data
(Spellman et al, Mol Biol Cell, 9, 3273, 1998)
(Cho et al, Mol Cell, 2, 65, 1998)
12BD Identification of Signals
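[Figure: the Data matrix (genes 1..N by conditions 1..M) shown as the product of a gene-by-pattern matrix (genes 1..N by patterns 1..k) and a pattern-by-condition matrix (patterns 1..k by conditions 1..M)]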
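The decomposition pictured on this slide can be illustrated with a small numerical sketch. The names `A` (gene-by-pattern weights) and `P` (pattern-by-condition behavior), the toy sizes, the non-negative entries, and the noise level are all assumptions made for illustration. The sketch only shows the shape of the model, not how Bayesian Decomposition estimates the two matrices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes standing in for N genes, M conditions, k patterns (assumed values).
n_genes, n_conditions, n_patterns = 6, 5, 2

# Hypothetical non-negative factors:
# A says how strongly each gene belongs to each pattern,
# P says how each pattern behaves across the conditions.
A = rng.uniform(0.0, 1.0, size=(n_genes, n_patterns))
P = rng.uniform(0.0, 1.0, size=(n_patterns, n_conditions))

# Model: Data (N x M) is approximately A (N x k) times P (k x M), plus noise.
noise = rng.normal(0.0, 0.01, size=(n_genes, n_conditions))
D = A @ P + noise

# A gene with weight in more than one column of A carries overlapping signals.
residual = D - A @ P
print(D.shape, float(np.abs(residual).max()))
```

Because a gene can have nonzero weight in several patterns, this form allows a gene to take part in more than one process, which is the overlapping-signal issue raised earlier in the talk.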
13Markov Chain Monte Carlo
We cannot always solve the problem directly; we
can only estimate relative probabilities of
possible solutions
Markov Chain Monte Carlo is used to explore the
possible solutions
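As a minimal sketch of the idea, the random-walk Metropolis sampler below explores a one-dimensional space using only relative probabilities. It is a generic textbook sampler, not the sampler used in Bayesian Decomposition; the function names, the standard-normal stand-in target, and the step size are assumptions for illustration.

```python
import math
import random


def log_target(x):
    # Unnormalized log probability. A standard normal stands in for a
    # posterior whose normalizing constant is unknown (assumed example).
    return -0.5 * x * x


def metropolis(log_p, x0, n_steps, step_size, seed=0):
    """Random-walk Metropolis: returns the list of visited states."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        # Only the ratio p(proposal) / p(x) is needed, so any constant
        # factor in the target cancels out.
        log_ratio = log_p(proposal) - log_p(x)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)
    return samples


samples = metropolis(log_target, x0=0.0, n_steps=20000, step_size=1.0)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(round(mean, 2), round(variance, 2))
```

The acceptance step needs only the ratio of the two probabilities, which is why relative probabilities are enough to explore the space of solutions.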
14Bayesian Statistics
$$p(\text{model} \mid \text{data}) = \frac{p(\text{data} \mid \text{model})\, p(\text{model})}{p(\text{data})}$$
[Figure: the Data matrix (genes 1..N by conditions 1..M) and its decomposition into a gene-by-pattern matrix (genes 1..N by patterns 1..k) times a pattern-by-condition matrix (patterns 1..k by conditions 1..M)]
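A small worked example may help make Bayes' rule concrete. The two candidate models, their priors, and their likelihood values below are invented purely for illustration; `posterior` is a hypothetical helper, not part of any Bayesian Decomposition software.

```python
def posterior(prior, likelihood):
    """Apply Bayes' rule over a discrete set of candidate models.

    prior[m]      = p(model m)
    likelihood[m] = p(data | model m)
    """
    # p(data) is the sum over models of p(data | model) * p(model).
    evidence = sum(prior[m] * likelihood[m] for m in prior)
    return {m: prior[m] * likelihood[m] / evidence for m in prior}


# Invented numbers: two equally likely models a priori,
# one of which explains the data four times better.
prior = {"model_a": 0.5, "model_b": 0.5}
likelihood = {"model_a": 0.08, "model_b": 0.02}

post = posterior(prior, likelihood)
print(post)
```

With equal priors, the posterior odds equal the likelihood ratio, so the model that explains the data four times better ends up four times more probable.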
15Outline
- Signaling and Gene Expression
- Bayesian Decomposition
- Examples of Analyses
16Acknowledgements
- Tom Moloshok (Cell Cycle, Mouse)
- Ghislain Bidaut (Yeast Deletion Mutants)
- Andrew Kossenkov (TFs, YDMs)
- Bill Speier, DJ Datta, Daniel Chung, Ryan
Goldstein, Matt Lewandowski
17Cell Cycle
Tobin and Morel, Asking About Cells, Harcourt
Brace, 1997
18Data
- Data: Expression data for 788 yeast cell-cycle regulated genes (Cho, 1998) across 17 different time points were taken for analysis.
- Coregulation: 11 groups (from 5 to 17 genes in each group; 67 genes in total, 18 of which belong to more than one group) were composed, based on literature review (not cell cycle literature).
- Analysis with and without coregulation information
19Validation
Cherepinsky et al, PNAS, 100, 9668, 2003
20ROC Analysis
ROC: Receiver Operating Characteristic
Sensitivity = TP / (TP + FN): fraction of actual positives that are called positive
Specificity = TN / (TN + FP): fraction of actual negatives that are called negative
TP: true positive; TN: true negative; FP: false positive; FN: false negative
The ROC curve plots Sensitivity against 1 - Specificity
Area under the curve is the measurement of
algorithm efficacy
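The ROC construction can be sketched in a few lines. The helper names `roc_points` and `auc` and the toy scores are assumptions for illustration; the sketch also assumes distinct scores (ties are not grouped) and at least one positive and one negative label.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) points for every threshold.

    scores: higher means "more likely positive"; labels: 1 = positive, 0 = negative.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Lower the threshold one item at a time, from the highest score down.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy example: three true positives and three true negatives.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
area = auc(points)
print(points, area)
```

An area of 1.0 corresponds to a perfect ranking of positives above negatives, and 0.5 to a random one.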
21Hierarchical Clustering
ROC Curve
Cherepinsky et al, PNAS, 100, 9668, 2003
22Bayesian Decomposition
Sensitivity
1 - Specificity
23Deletion Mutant Data Set
(Hughes et al, Cell, 102, 109, 2000)
- 300 Deletion Mutants in S. cerevisiae
- Biological/Technical Replicates with Gene Specific Error Model
- Filter Genes
- >25% Data Missing in Ratios or Uncertainties
- <2 Experiments with 3 Fold Change
- Filter Experiments
- <2 Genes Changing by 3 Fold
- 228 Experiments/764 Genes
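The filtering rules listed on this slide could be sketched as follows. This is an illustrative reading, not the authors' pipeline: the function name `filter_matrix`, the use of NaN for missing values, the 25% cutoff read as a fraction of experiments, the order of the two filters (genes first, then experiments on the surviving genes), and the omission of the uncertainty matrix are all assumptions.

```python
import numpy as np


def filter_matrix(ratios, fold=3.0, max_missing=0.25, min_count=2):
    """Return boolean masks (keep_genes, keep_experiments).

    ratios: genes x experiments array of positive expression ratios,
            with NaN marking missing values (assumed encoding).
    """
    log_fold = np.abs(np.log(ratios))
    # Treat missing values as "no change" when counting fold changes.
    log_fold = np.where(np.isnan(log_fold), 0.0, log_fold)
    changed = log_fold >= np.log(fold)

    # Gene filter: drop genes with too much missing data or with
    # fewer than min_count experiments showing the fold change.
    missing_frac = np.isnan(ratios).mean(axis=1)
    keep_genes = (missing_frac <= max_missing) & (changed.sum(axis=1) >= min_count)

    # Experiment filter: among surviving genes, drop experiments in which
    # fewer than min_count genes change by the given fold.
    keep_experiments = changed[keep_genes].sum(axis=0) >= min_count
    return keep_genes, keep_experiments


# Toy data: 4 genes x 4 experiments.
ratios = np.array([
    [4.0, 0.2, 1.0, 1.0],        # changes in two experiments: kept
    [5.0, 1.0, 1.0, 0.25],       # changes in two experiments: kept
    [1.0, 1.1, 0.9, 1.0],        # never changes 3-fold: dropped
    [np.nan, np.nan, 4.0, 4.0],  # half the values missing: dropped
])
keep_genes, keep_experiments = filter_matrix(ratios)
filtered = ratios[keep_genes][:, keep_experiments]
print(keep_genes, keep_experiments, filtered.shape)
```

Fold change is measured symmetrically on the log scale, so a ratio of 0.25 counts as a change just as a ratio of 4 does.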
24BD Matrix Decomposition
[Figure: Data (genes 1..N by mutants 1..M) = Distribution of Patterns (genes 1..N by patterns 1..k; what genes are in patterns) X Patterns of Behavior (patterns 1..k by mutants 1..M; does mutant contain pattern)]
25Analysis
- Bayesian Decomposition
- Identify patterns and linked genes
- Use genes to determine function
- Interpretation of Functions
- Gene Ontology
- Transcription factor data
- Validation
26Use of Ontology: Pattern 13
[Figure; labels: 13, 15]
27The Other Pattern: 15
[Figure; labels: 13, 15]
28Transcription Factors
Signaling Pathways
29Genes from Pattern 13
Fig1, Prm6, Fus1, Ste2, Aga1, Fus3, Pes4, Prm1, ORF, Bar1
known to be involved in mating response
known to be regulated by Ste12p
30Validation
(Posas, et al, Curr Opin Microbiology, 1, 175,
1998)
31Pattern 13 Mutants
32Pattern 15 Mutants
33Conclusions
- Transcriptional Response Provides Signatures of Pathway Activity
- Ontologies Can Guide Interpretation
- Bayesian Decomposition Can Dissect Strongly Overlapping Signatures
34Acknowledgements
Fox Chase
- Tom Moloshok
- Jeffrey Grant
- Yue Zhang
- Elizabeth Goralczyk
- Liat Shimoni
- Luke Somers (UPenn)
- Olga Tchuvatkina
- Michael Slifker
- Sinoula Apostolou
- Brendan Reilly
Ghislain Bidaut (UPenn CBIL)
Andrew Kossenkov
Vladimir Minayev (MPEI)
Garo Toby (Dana Farber)
Yan Zhou
Aidan Petersen
Bill Speier (Johns Hopkins)
Daniel Chung (Columbia)
DJ Datta (UCSF)
Elizabeth Faulkner (UPenn)
Frank Manion
Bob Beck
- Collaborators
- A. Godwin (FCCC)
- A. Favorov (GosNIIGenetika)
- J.-M. Claverie (CNRS)
- G. Parmigiani (JHU)
- O. Favorova (RMSU)
35Patterns as Basis Vectors
BD
36MakingProteins(Phenotype)
37ROSETTA DATA
- From 5 to 20 patterns were posited in the analysis.
- Results were checked against information about metabolic pathways taken from the Saccharomyces Genome Database: 11 groups of 4-6 genes known to be involved in the same metabolic pathways.
- ROC analysis was performed
38ROSETTA DATA