Title: Challenges in Describing Signaling Networks
2. Challenges in Describing Signaling Networks
3. Cell Morphology Signaling State
4. Understanding How Signaling Networks that Regulate Morphology Are Organized and How Information Flows through These Networks
Rho GTPases
5. Acquiring Morphological Signatures from Complex Images
[Image channel labels: DAPI, GFP, F-actin]
6. Raw Morphological Data and Data Reduction
145 phenotypic features
7. Using Feature Graphs to Model Single-cell Distributions
[Figure labels: dsRNA y; feature values i, j, k; (4i, 4j, -4k); (2i, 2j, -2k)]
8. Inference Based on Feature Graphs
This is the unknown signaling network we will infer. For this slide, assume we know the signaling network ahead of time.
Intuition for Inference Based on Feature Graphs
Question: What is the relationship between the feature graphs of genes in a signaling pathway?
[Diagram labels: F1, F2; F1, F2, F4; F1, F2, F3]
RNAi Gene A: Expect the feature graph to have a relatively large number of edges.
RNAi Gene B: Expect the feature graph to have a relatively large number of edges.
RNAi Gene C: Expect the feature graph to have a relatively small number of edges. The feature graph is approximately the intersection of the feature graphs for RNAi A and RNAi B.
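The intersection intuition can be illustrated with a minimal sketch. It assumes a feature graph is stored as a set of undirected edges between feature names. The specific edges below are invented for illustration, and the assignment of F4 to RNAi A and F3 to RNAi B is an assumption; only the labels F1 to F4 come from the slide.

```python
# Hypothetical feature graphs stored as sets of undirected edges.
# The edges are made up for illustration; only the names F1..F4 echo the slide.

def edge(a, b):
    """Return an undirected edge as an order-independent frozenset."""
    return frozenset((a, b))

fg_rnai_a = {edge("F1", "F2"), edge("F1", "F4"), edge("F2", "F4")}
fg_rnai_b = {edge("F1", "F2"), edge("F1", "F3"), edge("F2", "F3")}

# Under the slide's intuition, the graph for RNAi of gene C is approximately
# the set of edges shared by the graphs for RNAi A and RNAi B.
fg_rnai_c_expected = fg_rnai_a & fg_rnai_b

print(sorted(sorted(e) for e in fg_rnai_c_expected))
```

With these toy graphs the intersection keeps only the F1-F2 edge, so the expected graph for gene C is smaller than either input graph.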
9. Focus on Details of Feature Graph Construction
(future data sets will be larger)
For each feature graph (FG), we need to compute the linear correlation over C = 50 data points for all F(F-1)/2 = 150 × 149/2 pairs of features. Since there are N = 250 TCs, there are a total of N × F(F-1)/2 linear correlations to compute.
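To give a sense of scale, the counts on this slide can be evaluated directly. This is only arithmetic on the values the slide states (F = 150, N = 250); the variable names are illustrative.

```python
from math import comb

F = 150  # number of features, as stated on the slide
N = 250  # number of TCs, as stated on the slide

pairs_per_fg = comb(F, 2)               # F(F-1)/2 feature pairs per feature graph
total_correlations = N * pairs_per_fg   # across all TCs

print(pairs_per_fg, total_correlations)  # 11175 2793750
```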
10. Focus on Details of Feature Graph Construction
For each feature graph (FG), we need to compute the linear correlation over C = 50 data points for all F(F-1)/2 = 150 × 149/2 pairs of features.
How can all pairwise correlations be computed efficiently?
- matmul of F×C and C×F (the computation is dominated by this step)
- matmul of F×1 and 1×F
- The MATLAB built-in corr does not work with ppeval
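One way to read the two matrix products above is sketched below. The (F×C)(C×F) product gives the sums of products for every feature pair, and the (F×1)(1×F) outer products supply the mean and standard-deviation terms. This is an assumption about how the slide's decomposition fits together, written in plain Python rather than MATLAB; the helper names and toy data are hypothetical. In practice the two products would be handed to an optimized matrix library.

```python
import math

def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    b_cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]

def correlation_matrix(x):
    """All pairwise Pearson correlations between the rows of an F x C matrix.

    Assumes no feature is constant (a zero standard deviation would divide by zero).
    """
    f, c = len(x), len(x[0])
    sum_products = matmul(x, transpose(x))            # (F x C)(C x F) -> F x F
    means = [[sum(row) / c] for row in x]             # F x 1 column of means
    mean_products = matmul(means, transpose(means))   # (F x 1)(1 x F) -> F x F
    cov = [[sum_products[i][j] / c - mean_products[i][j] for j in range(f)]
           for i in range(f)]
    std = [[math.sqrt(cov[i][i])] for i in range(f)]  # F x 1 column of std devs
    std_products = matmul(std, transpose(std))        # (F x 1)(1 x F) -> F x F
    return [[cov[i][j] / std_products[i][j] for j in range(f)] for i in range(f)]

# Toy data: F = 4 features (rows), C = 4 cells (columns).
x = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],   # perfectly correlated with row 0
    [4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated with row 0
    [1.0, 3.0, 2.0, 4.0],
]
r = correlation_matrix(x)
for row in r:
    print([round(v, 3) for v in row])
```

The F×F product over C cells is the only step whose cost grows with the number of cells, which is consistent with the slide's remark that it dominates the computation.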
11. Parallelize in the Dimension of TCs
Speed-up?
[Plot legend: Parallel, Serial]
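Because each TC's feature graph depends only on that TC's data, the work can be mapped over TCs as independent tasks. The sketch below shows that task structure with Python's standard-library thread pool. It is not the Star-P/ppeval setup used in the talk, and the data and function names are hypothetical. Threads will not speed up CPU-bound pure-Python code; a process pool or compute server would take the executor's place in a real run.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations
import math
import random


def pearson(a, b):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlations_for_tc(tc):
    """Correlate every pair of feature rows in one TC (F rows, C values each)."""
    return {(i, j): pearson(tc[i], tc[j])
            for i, j in combinations(range(len(tc)), 2)}


# Hypothetical toy data: 6 TCs, each with 5 features measured on 50 cells.
rng = random.Random(0)
tcs = [[[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(5)]
       for _ in range(6)]

# Serial: one TC after another.
serial = [correlations_for_tc(tc) for tc in tcs]

# Parallel in the dimension of TCs: each TC is an independent task.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(correlations_for_tc, tcs))

print(parallel == serial)
```

Both paths compute the same correlations, so any difference in wall-clock time comes from scheduling and data movement rather than from the arithmetic itself.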
12. What if the TCs Have Different Numbers of Cells (C)?
13. What if the TCs Have Different Numbers of Cells (C)?
[Plot legend: Serial, Parallel]
14. Summary and Conclusions
- Feature graph construction depends on the computation of numerous linear correlations.
- Parallelization was implemented.
- But speed-ups were not realized (why not?).
- In fact, it was slower because of the time required to move data to/from the server.
- Speed-ups are realized for very large data sets because the server can handle larger data more smoothly than a typical PC. But this is due to hard drive usage rather than parallelization.
- Why didn't parallelization result in gains in speed?
- Interactive Supercomputing doesn't preallocate matrices in MATLAB
- Structure of problem?
- Coding?
15. Acknowledgments
Chris Bakal
Bonnie Berger
John Aach
Norbert Perrimon
George Church