Title: Nested Effects Models for the Reconstruction of Signalling Pathways
2Overview
- Definition of Nested Effects Models
- Model Identifiability
- Fast update Formula
- Exhaustive Search
- Feature Selection
- Results on simulated Measurements and on the
Boutros Drosophila Dataset
- Responsivity Prior
- Perspectives
6Definition of Nested Effects Models
Information flow through a graph of components
Task: Reconstruct the arrows, without measuring
all components, from noisy observational data.
(Too ambitious.)
7Definition of Nested Effects Models
Information flow through a graph of components
Focus your question.
Task: Reconstruct the wiring of a small subset of
components. Perform interventions on these
components and make use of all observable components.
10Definition of Nested Effects Models
Effect of action a on observation s
[Figure: grid of actions against observables; the
cell (a,s) holds the effect of action a on
observable s. Panels: predicted effects Ft,
measured effects Dt.]
11Definition of Nested Effects Models
Let N = 0 be the null matrix, i.e. the model
predicting no effects at all. Then,
12Definition of Nested Effects Models
It follows that
Note: Missing data is handled easily. Simply set
$R_{s,a} = 0$.
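The formulas on these two slides did not survive extraction. The sketch below shows one way such a score could be evaluated, under assumptions that are not stated on the slide: R[s, a] is taken to be the log-likelihood ratio ("effect" versus "no effect") of the measurement of observable s under action a, and the score of a model F against the null model is taken to be the sum of R[s, a] over all predicted effects. The names `log_ratio_score`, `F` and `R` are illustrative.

```python
import numpy as np

def log_ratio_score(F, R):
    """Sum of R[s, a] over all pairs with a predicted effect F[a, s] == 1.

    F: (actions x observables) 0/1 matrix of predicted effects.
    R: (observables x actions) matrix of log-likelihood ratios (assumed meaning).
    The result equals trace(F @ R).
    """
    F = np.asarray(F, dtype=float)
    R = np.asarray(R, dtype=float)
    return float(np.sum(F * R.T))

F = np.array([[1, 1, 0],
              [0, 1, 0]])            # 2 actions, 3 observables
R = np.array([[ 2.0, -1.0],
              [ 1.5,  0.5],
              [-0.5, -2.0]])         # 3 observables, 2 actions

R_missing = R.copy()
R_missing[1, 0] = 0.0                # missing measurement: entry set to 0

score = log_ratio_score(F, R)
score_missing = log_ratio_score(F, R_missing)
null_score = log_ratio_score(np.zeros_like(F), R)   # null model predicts nothing
```

Under this assumed score, an entry set to 0 neither favours nor penalises any model, which is why missing data needs no special treatment.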
13Definition of Nested Effects Models
Actions graph: adjacency matrix G
Assumption: The diagonal of G consists of 1s.
Effects graph: adjacency matrix T
Assumption: Each observable is linked to exactly
one action, i.e. T can be equivalently encoded by
a function Observations → Actions.
Predicted effects: Ft
Definition: A nested effects model (NEM) is a
model F for which $F = G T$.
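A minimal sketch of the definition, assuming the convention that G[a, b] = 1 means "a has an edge to b" and that the product is taken in the Boolean sense. The helper names and the list `theta` (the action each observable is attached to) are illustrative, not taken from the slides.

```python
import numpy as np

def effects_matrix(theta, n_actions):
    """Build T (actions x observables) from theta[s] = action attached to s."""
    T = np.zeros((n_actions, len(theta)), dtype=int)
    T[theta, np.arange(len(theta))] = 1
    return T

def predicted_effects(G, T):
    """Boolean matrix product F = G T (OR of ANDs)."""
    return (np.asarray(G) @ np.asarray(T) > 0).astype(int)

# Chain 0 -> 1 -> 2 with self-loops, transitively closed (0 -> 2 included).
G = np.array([[1, 1, 1],
              [0, 1, 1],
              [0, 0, 1]])
theta = [0, 1, 1, 2]                 # four observables
T = effects_matrix(theta, n_actions=3)
F = predicted_effects(G, T)
```

Because every observable is attached to exactly one action, column s of F is simply column theta[s] of G.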
14Definition of Nested Effects Models
Why nested?
If the actions graph is transitively closed, then
the effects are nested in the sense that a → a'
implies effects(a) ⊇ effects(a').
[Figure: actions graph, observables and predicted effects.]
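The nesting property can be checked directly on a small example. This is an illustrative sketch, assuming G[a, b] = 1 encodes an edge from a to b and `theta[s]` names the action that observable s is attached to; the function names are hypothetical.

```python
import numpy as np

def effects(G, theta, a):
    """Observables affected by action a: those attached to an action b with G[a, b] == 1."""
    return {s for s, b in enumerate(theta) if G[a, b]}

def is_nested(G, theta):
    """True if every edge a -> b satisfies effects(b) <= effects(a)."""
    n = G.shape[0]
    return all(effects(G, theta, b) <= effects(G, theta, a)
               for a in range(n) for b in range(n) if G[a, b])

theta = [0, 1, 2]                      # one observable per action

closed = np.array([[1, 1, 1],
                   [0, 1, 1],
                   [0, 0, 1]])         # 0 -> 1 -> 2 plus the shortcut 0 -> 2
open_chain = np.array([[1, 1, 0],
                       [0, 1, 1],
                       [0, 0, 1]])     # shortcut 0 -> 2 missing

nested_closed = is_nested(closed, theta)
nested_open = is_nested(open_chain, theta)
```

In the second graph the edge 0 → 1 exists, but the observable attached to action 2 is an effect of action 1 and not of action 0, so the effects are no longer nested.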
15Model Identifiability
For nested effects models, we have
Theorem: Let $F_{\mathrm{true}} = G_{\mathrm{true}} T_{\mathrm{true}}$. If the data is
perfect, i.e. $R > 0$ iff $F_{\mathrm{true}} = 1$, then
17Model Identifiability
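- Theorem
- If the effects graph links every action to at least one
observation, then the actions graph is unique.
- If no two actions have the same parents in the actions graph,
then the effects graph is unique.
[Figure: example graph on the nodes 1 to 4 in which
parents(3) = {1,2,3,4} = parents(4).]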
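The condition on parent sets is easy to test mechanically. The sketch below assumes that G[a, b] = 1 encodes an edge from a to b, so that the parents of b are the rows with a 1 in column b. The example graph is a guess at the lost figure (only the statement about parents(3) and parents(4) is taken from the slide), and nodes 1 to 4 are stored as indices 0 to 3.

```python
import numpy as np
from itertools import combinations

def actions_with_same_parents(G):
    """Return index pairs (a, b) whose parent sets (columns of G) coincide."""
    G = np.asarray(G)
    return [(a, b) for a, b in combinations(range(G.shape[0]), 2)
            if np.array_equal(G[:, a], G[:, b])]

# Hypothetical example: nodes 3 and 4 (indices 2 and 3) both have
# parents {1, 2, 3, 4}; nodes 1 and 2 only have their self-loops.
G = np.array([[1, 0, 1, 1],
              [0, 1, 1, 1],
              [0, 0, 1, 1],
              [0, 0, 1, 1]])

ambiguous = actions_with_same_parents(G)
```

An observable attached to node 3 predicts exactly the same effects as one attached to node 4, so the data cannot tell the two attachments apart.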
18Fast update Formula
Main objective: Find the best-scoring actions
graph.
Secondary objective: Given an actions graph G,
find the best-scoring effects graph.
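The slide gives no formula, so the following is only a sketch of how the secondary objective and a fast update could look. It assumes the score of a model is the sum of R[s, a] over all predicted effects F[a, s] = 1, with F = G T. Under that assumption the score splits over observables, and S = R G holds the contribution of attaching observable s to action b. The names `best_effects_graph` and `flip_edge` are hypothetical.

```python
import numpy as np

def best_effects_graph(G, R):
    """For each observable pick the action with the largest entry of S = R @ G.

    Returns (theta, score): theta[s] is the chosen action, score the sum of
    the chosen entries.
    """
    S = np.asarray(R, dtype=float) @ np.asarray(G, dtype=float)
    theta = S.argmax(axis=1)
    return theta, float(S.max(axis=1).sum())

def flip_edge(G, S, R, a, b):
    """Insert or delete the edge a -> b in place and update S = R @ G.

    Only column b of S changes, so the update costs one pass over the observables.
    """
    sign = -1.0 if G[a, b] else 1.0
    G[a, b] = 1 - G[a, b]
    S[:, b] += sign * R[:, a]

rng = np.random.default_rng(0)
R = rng.normal(size=(6, 3))          # 6 observables, 3 actions (toy data)
G = np.eye(3, dtype=int)
S = R @ G
flip_edge(G, S, R, 0, 1)             # insert the edge 0 -> 1
theta, score = best_effects_graph(G, R)
```

Keeping S up to date across single-edge changes avoids recomputing the full matrix product for every candidate graph.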
19Exhaustive Search
Main objective: Find the best-scoring actions
graph.
An elementary move is the transition from one
actions graph to another by the insertion or the
deletion of one edge.
Idea: Traverse the space of actions graphs in
elementary moves. Use a Gray code to avoid
redundancy.
[Figure: five actions graphs on the nodes 1, 2, 3,
illustrating elementary moves.]
N.B.: This would not be possible if we required the
actions graph to be a DAG or to be transitively
closed.
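A sketch of the Gray code traversal, under stated assumptions: the score is the sum of R[s, a] over predicted effects, the best effects graph is chosen per observable as the row maximum of S = R G, and G[a, b] = 1 encodes an edge from a to b. The toy data are idealised (R is +1 where the true model predicts an effect and -1 elsewhere). All names are illustrative.

```python
import numpy as np

def gray_code_flips(k):
    """Yield the bit index to flip at each step of a reflected Gray code on k bits."""
    for i in range(1, 2 ** k):
        yield (i & -i).bit_length() - 1     # index of the lowest set bit of i

def exhaustive_search(R, n_actions):
    """Visit every actions graph (self-loops fixed) once, one edge flip per step."""
    R = np.asarray(R, dtype=float)
    edges = [(a, b) for a in range(n_actions) for b in range(n_actions) if a != b]
    G = np.eye(n_actions, dtype=int)
    S = R @ G                               # S[s, b] = sum_a R[s, a] * G[a, b]
    best_score, best_G, visited = S.max(axis=1).sum(), G.copy(), 1
    for bit in gray_code_flips(len(edges)):
        a, b = edges[bit]
        # Elementary move: only column b of S changes.
        S[:, b] += R[:, a] if G[a, b] == 0 else -R[:, a]
        G[a, b] ^= 1
        score = S.max(axis=1).sum()
        visited += 1
        if score > best_score:
            best_score, best_G = score, G.copy()
    return best_G, float(best_score), visited

G_true = np.triu(np.ones((4, 4), dtype=int))   # chain 0 -> 1 -> 2 -> 3, closed
theta_true = np.repeat(np.arange(4), 2)        # two observables per action
F_true = G_true[:, theta_true]                 # F[a, s] = G[a, theta(s)]
R = np.where(F_true.T == 1, 1.0, -1.0)         # idealised, noise-free data
best_G, best_score, visited = exhaustive_search(R, 4)
```

With 4 actions there are 12 possible edges and hence 2^12 = 4096 graphs, each reached from its predecessor by a single insertion or deletion.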
20Results on simulated Data
True graphs G, T
simulated measurements (R)
ideal measurements (GT)
22Results on simulated Data
True graph
Estimated graph
12 edges, $2^{12} = 4096$ action graphs, 4 seconds
Distribution of the likelihoods
23Automatic Feature Selection I
Problem: Some (most) of the observables may not
be assigned to any action.
Solution: Introduce a null action, i.e. add a null
row to the actions graph. Observations that are
assigned to this action do not contribute to the
likelihood.
Advantages: The variable selection is included
in the estimation procedure; the additional time
cost is approximately zero.
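A sketch of how a null action could be added to the assignment step, assuming the per-observable scores are the entries of S = R G (with G[a, b] = 1 for an edge from a to b). Since the null action predicts no effects, it is given the score 0 here. The function name and the use of -1 as the "unassigned" label are illustrative choices.

```python
import numpy as np

def assign_with_null_action(G, R):
    """Assign each observable to its best action, or to the null action (-1).

    An observable is left unassigned when every real action scores below 0;
    ties at exactly 0 go to the real action.
    """
    S = np.asarray(R, dtype=float) @ np.asarray(G, dtype=float)
    S_null = np.hstack([S, np.zeros((S.shape[0], 1))])   # extra all-zero column
    best = S_null.argmax(axis=1)
    theta = np.where(best == S.shape[1], -1, best)
    return theta, float(S_null.max(axis=1).sum())

G = np.array([[1, 1],
              [0, 1]])
R = np.array([[ 2.0, -1.0],
              [ 1.0,  1.0],
              [-1.0, -2.0],
              [-0.5, -0.5]])         # last two observables respond to nothing

theta, score = assign_with_null_action(G, R)
```

The extra column costs one comparison per observable, which is consistent with the claim that feature selection comes at almost no additional time cost.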
24Results on simulated Data
30 genes assigned, 60 genes unassigned
True graph
Estimated graph
25Automatic Feature Selection II
50 observations assigned, 5000 not assigned
Penalize R: R → R − d
Measurements matrix R
True graphs G,T
26Automatic Feature Selection II
Define the Posterior per Observation Score (PpO)
false Negatives
Scoring function
27Results on the Boutros Drosophila Dataset
Estimated graph (70 genes selected)
28MAP Estimation, Structure Prior
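Assume independent priors P(G) and P(T) for the
actions graph and the effects graph. Then, the
posterior is
$P(G, T \mid D) \propto P(D \mid G, T)\, P(G)\, P(T)$
or
$\log P(G, T \mid D) = \log P(D \mid G, T) + \log P(G) + \log P(T) + \mathrm{const}$
Task: Find an actions graph prior which
effectively reduces the search space, and which
favours biologically sensible solutions.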
29MAP Estimation, Structure Prior
Perform calculations in the Boolean algebra
({0,1}, OR, AND).
G is transitively closed iff $G^2 = G$
(note: each vertex in G has a self-loop).
30MAP Estimation, Structure Prior
G is transitively closed iff $G^2 = G$. Define
where dist is the Hamming distance (the number
of non-identical entries) and $t \in [0, \infty)$ is a penalty
weight.
Remark (fast update): Let G' be an actions graph
that differs from G in only one edge. If P(G) is
known, then P(G') can be calculated in linear
time.
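The defining formula of the prior is missing from this slide. The sketch below computes the Boolean square and the Hamming distance named in the text, and then uses one plausible functional form, log P(G) = -t · dist(G², G) up to a constant, which is an assumption and not the slide's definition. It recomputes the distance from scratch and does not implement the linear-time update mentioned in the remark.

```python
import numpy as np

def boolean_square(G):
    """G^2 in the Boolean algebra ({0,1}, OR, AND)."""
    G = np.asarray(G, dtype=int)
    return (G @ G > 0).astype(int)

def closure_distance(G):
    """Hamming distance between G^2 and G.

    Zero iff G is transitively closed, given a self-loop on every vertex.
    """
    return int(np.sum(boolean_square(G) != np.asarray(G)))

def log_structure_prior(G, t):
    """Unnormalised log prior, ASSUMED to be -t * dist(G^2, G)."""
    return -t * closure_distance(G)

closed = np.array([[1, 1, 1],
                   [0, 1, 1],
                   [0, 0, 1]])         # 0 -> 1 -> 2 with the shortcut 0 -> 2
open_chain = np.array([[1, 1, 0],
                       [0, 1, 1],
                       [0, 0, 1]])     # shortcut 0 -> 2 missing

d_closed = closure_distance(closed)
d_open = closure_distance(open_chain)
penalty_open = log_structure_prior(open_chain, t=2.0)
```

With t = 0 all graphs are equally likely under this assumed form; larger t pushes the search towards transitively closed graphs without forbidding the others.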
31Perspectives
- Multiple knockdowns
- Partially unknown action graphs
- MCMC sampling / Simulated annealing for larger
action graphs
Acknowledgements
- Florian Markowetz, Lewis-Sigler Institute,
Princeton University
References
- F. Markowetz, O. G. Troyanskaya. Computational
identification of cellular networks and pathways.
Molecular BioSystems, to appear, 2007.
- F. Markowetz, J. Bloch, R. Spang.
Non-transcriptional Pathway Features
Reconstructed from Secondary Effects of RNA
Interference. Bioinformatics 2005, 21: 4026-4032.
32Thank you!
Questions?
Comments?