Structure Learning in Nested Effects Models - PowerPoint PPT Presentation

1
Structure Learning in Nested Effects Models
  • Achim Tresch

2
Motivation
Information flow through a signalling cascade
3
Motivation
Information flow through a signalling cascade
Task: Reconstruct the arrows.
4
Motivation
Information flow through a signalling cascade
Task: Reconstruct the arrows without measuring all components.
Figure labels: extracellular hormone concentration, mRNA expression, intracellular metabolite concentration, protein concentration.
5
Motivation
Information flow through a signalling cascade
Task: Reconstruct the arrows without measuring all components, from noisy observational data.
This is too ambitious.
6
Motivation
Information flow through a signalling cascade
Focus your question.
Task: Reconstruct the wiring of a small subset of components, perform interventions on these components, and make use of all observable components.
7
Definition of Nested Effects Models (NEMs)
Actions
Actions graph: adjacency matrix G
Effects graph: adjacency matrix T
Observables
Assumption: Each observable is linked to exactly one action.
Predicted effects: F^t
Definition: A nested effects model (NEM) is a model F for which F = GT.
Figure annotation: predicted effect of the leftmost action on the bottom observable (0 = no effect, 1 = effect).
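The definition F = GT can be tried out on a toy example. The sketch below is a minimal illustration, not the author's implementation. It assumes that G[a][b] = 1 means "a points to b", that rows of F index actions and columns index observables, and that the product is taken in the Boolean algebra; the example matrices are hypothetical.

```python
def predicted_effects(G, T):
    """Boolean matrix product F = G T.

    G: n x n adjacency matrix of the actions graph (assumed: G[a][b] = 1 if a -> b).
    T: n x m adjacency matrix of the effects graph (T[b][s] = 1 if
       observable s is attached to action b).
    Returns F with F[a][s] = 1 if perturbing a is predicted to affect s.
    """
    n, m = len(G), len(T[0])
    return [[int(any(G[a][b] and T[b][s] for b in range(n)))
             for s in range(m)]
            for a in range(n)]

# Hypothetical example: chain a -> b -> c, transitively closed, with self-loops.
G = [[1, 1, 1],
     [0, 1, 1],
     [0, 0, 1]]
# Four observables; each column holds exactly one 1 (one action per observable).
T = [[1, 0, 0, 0],
     [0, 1, 1, 0],
     [0, 0, 0, 1]]
F = predicted_effects(G, T)
for row in F:
    print(row)
```

Because every observable is attached to exactly one action, each entry of F simply copies one entry of G, so the Boolean and the ordinary matrix product coincide in this setting.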
8
Definition of Nested Effects Models (NEMs)
Why "nested"?
Actions
If the actions graph is transitively closed, then the effects are nested in the sense that a → b implies effects(a) ⊇ effects(b).
Observables
The present formulation of a NEM does not rely on this assumption.
Predicted effects
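The nesting property can be checked mechanically. The following sketch assumes the same conventions as the definition (G[a][b] = 1 for an edge a → b, T[b][s] = 1 if observable s is attached to action b); the function names and the two example graphs are hypothetical.

```python
def effects(G, T, a):
    """Set of observables predicted to respond when action a is perturbed."""
    n, m = len(G), len(T[0])
    return {s for s in range(m) if any(G[a][b] and T[b][s] for b in range(n))}

def is_nested(G, T):
    """True if every edge a -> b satisfies effects(a) >= effects(b)."""
    n = len(G)
    return all(effects(G, T, a) >= effects(G, T, b)
               for a in range(n) for b in range(n) if G[a][b])

# Observable i is attached to action i.
T = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 1]]
closed = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]      # a -> b -> c plus a -> c
not_closed = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]  # the edge a -> c is missing

print(is_nested(closed, T))      # transitively closed graph
print(is_nested(not_closed, T))  # nesting fails on the edge a -> b
```

In the second graph, a reaches only its own observable and that of b, while b also reaches the observable of c, so effects(b) is not contained in effects(a).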
9
Likelihood of Nested Effects Models
Effect of action a on observation s: R_{s,a}
Figure labels: actions (a), observables (s), predicted effects F^t, measured effects R.
10
Likelihood of Nested Effects Models
Figure labels: actions (a), observables (s), predicted effects, measured effects R.
11
Likelihood of Nested Effects Models
Recall the definition of R
Hence R_{s,a} > 0 iff an effect of a on s is more likely than no effect, and R_{s,a} < 0 iff the data favor no effect over an effect.
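The slide's formula for R is not preserved in this transcript. One definition consistent with the statement above, given here as an assumption, is a log-likelihood ratio, where D_{s,a} (a symbol introduced only for this illustration) denotes the data measured for observable s under action a:

```latex
R_{s,a} \;=\; \log
  \frac{P\left(D_{s,a} \mid \text{action } a \text{ has an effect on } s\right)}
       {P\left(D_{s,a} \mid \text{action } a \text{ has no effect on } s\right)}
```

Under this reading, the sign of R_{s,a} records which of the two hypotheses the data favor, and R_{s,a} = 0 means the data are uninformative.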
12
Likelihood of Nested Effects Models
It follows that
Note: Missing data are handled easily. Simply set R_{s,a} = 0.
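The likelihood formula itself did not survive in the transcript. As a hedged sketch, assume that R_{s,a} is a log-likelihood ratio (effect versus no effect) and that the measurements are independent. Then the log-likelihood of a model F, relative to the model that predicts no effects at all, is the sum of R_{s,a} over all pairs with a predicted effect. The function name and the numbers below are hypothetical.

```python
def log_likelihood_ratio(F, R):
    """Sum of R[s][a] over all (action a, observable s) with F[a][s] == 1.

    F: actions x observables matrix of predicted effects (0/1).
    R: observables x actions matrix of measured effects (assumed log ratios).
    """
    return sum(R[s][a]
               for a in range(len(F))
               for s in range(len(F[0]))
               if F[a][s])

# Hypothetical numbers: 2 actions, 3 observables.
F = [[1, 1, 1],
     [0, 1, 1]]
R = [[2.0, -1.0],
     [1.5,  0.5],
     [0.0,  3.0]]   # R[2][0] = 0.0 marks a missing measurement
score = log_likelihood_ratio(F, R)
print(score)
```

The missing entry R[2][0] = 0 adds nothing to the sum, whether or not the model predicts an effect there, which is why setting missing values to zero is harmless under this assumed form.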
14
Topics
  • Model Identifiability
  • Computational complexity of the calculation of
    the likelihood
  • Model Search
  • Feature Selection (Selection of Observables)
  • Validation on simulated and real Data
  • Extensions of the NEM model
  • Incorporation of Priors
  • Multiple Assignments / Multiple Actions

15
Model Identifiability
For nested effects models, we have
Theorem: Let F_true = G_true T_true. If the data are
perfect, i.e. R > 0 iff F_true = 1, then
16
Model Identifiability
No!
Figure: vertices a, b, c with parents(b) = {a, b, c} = parents(c).
But this is a limitation of any effects model, not only of a NEM.
Assumption: No two vertices have the same set of parents.
17
Model Identifiability
No!
Reversals
Figure: two actions graphs on the vertices a, b, c, each with observables 1, 2, 3.
But
Theorem: A nested effects model is unique up to reversals.
18
Computational Complexity
Main objective: Find the best-scoring actions graph.
Secondary objective: Given an actions graph G, find the best-scoring effects graph.
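For the secondary objective, one observation helps: if each observable is attached to exactly one action, the observables can be placed independently of each other. The sketch below assumes an additive score (the sum of R_{s,a} over all predicted effects) and the convention G[a][b] = 1 for an edge a → b. Attaching observable s to action b then contributes the sum of R[s][a] over all a with G[a][b] = 1. The function name and the data are hypothetical.

```python
def best_effects_graph(G, R):
    """For each observable s, pick the action b maximising sum_a G[a][b] * R[s][a].

    G: n x n actions graph (0/1); R: m x n matrix of measured effects.
    Returns (assignment, total), where assignment[s] is the chosen action.
    """
    n = len(G)
    assignment, total = [], 0.0
    for row in R:                      # row = R[s], one entry per action
        scores = [sum(G[a][b] * row[a] for a in range(n)) for b in range(n)]
        best = max(range(n), key=scores.__getitem__)
        assignment.append(best)
        total += scores[best]
    return assignment, total

# Hypothetical example: transitively closed chain 0 -> 1 -> 2.
G = [[1, 1, 1],
     [0, 1, 1],
     [0, 0, 1]]
R = [[2, -1, -1],   # responds only when action 0 is perturbed
     [1,  1, -2],   # responds to actions 0 and 1
     [1,  1,  1]]   # responds to every action
assignment, total = best_effects_graph(G, R)
print(assignment, total)
```

Under these assumptions the effects graph costs one pass over R for each candidate actions graph, so the search effort concentrates on the actions graph.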
19
Exhaustive Search
Main objective: Find the best-scoring actions graph.
An elementary move is the transition from one actions graph to another by the insertion or deletion of one edge.
Idea: Traverse the space of actions graphs in elementary moves. Use a Gray code to avoid redundancy.

Figure: a sequence of five actions graphs on the vertices 1, 2, 3.
N.B.: This would not be possible if we required the actions graph to be a DAG or to be transitively closed.
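A Gray-code traversal of all actions graphs can be sketched as follows. This is an illustration of the idea, not the code of any package: each possible directed edge (self-loops excluded, an assumption) is mapped to one bit, and the standard reflected Gray code visits every bit pattern exactly once while flipping a single bit per step.

```python
from itertools import permutations

def gray_code_graphs(n):
    """Yield (flipped_edge, edge_set) for every directed graph on n vertices.

    Consecutive graphs differ by the insertion or deletion of exactly one edge.
    """
    edges = list(permutations(range(n), 2))   # ordered pairs, no self-loops
    current = set()
    yield None, frozenset(current)            # start from the empty graph
    for i in range(1, 2 ** len(edges)):
        # Consecutive Gray codes i ^ (i >> 1) differ in exactly one bit:
        # the position of the lowest set bit of i.
        bit = (i & -i).bit_length() - 1
        edge = edges[bit]
        current ^= {edge}                     # insert the edge or delete it
        yield edge, frozenset(current)

graphs = [g for _, g in gray_code_graphs(3)]
print(len(graphs))                            # number of graphs visited
```

Since each step changes one edge, a score that can be updated incrementally never has to be recomputed from scratch. Restricting the space to DAGs or transitively closed graphs would break this, because a single flip can leave the restricted set.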
20
Results on simulated Data
R/Bioconductor package Nessy
True graphs G,T
simulated measurements (R)
ideal measurements (GT)
21
Results on simulated Data
True graph
Estimated graph
12 edges, 2^12 = 4096 action graphs, 4 seconds
Distribution of the likelihoods
22
Results on simulated Data
Each NEM: 4 actions, 5 edges, 50 observables.
For each noise level, 100 sample graphs were drawn
(std. dev. of the noise in fractions of the signal).
23
Results on simulated Data
Rank of the true model's likelihood
24
Automatic Feature Selection I
Problem: Some (most) of the observables may not be assigned to any action.
Solution: Introduce a null action, i.e. add a null row to the actions graph. Observables that are assigned to this action do not contribute to the likelihood.
Advantages: The variable selection is included in the estimation procedure, and the additional time cost is approximately zero.
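The null action fits naturally into a per-observable assignment step. The sketch below assumes an additive score in which attaching observable s to action b contributes the sum of R[s][a] over all a with G[a][b] = 1; the null action predicts no effects and therefore scores 0. The function name and the numbers are hypothetical.

```python
def assign_with_null_action(G, R):
    """Attach each observable to its best-scoring action, or to None (null action).

    G: n x n actions graph (0/1); R: m x n matrix of measured effects.
    """
    n = len(G)
    assignment = []
    for row in R:                      # row = R[s], one entry per action
        scores = [sum(G[a][b] * row[a] for a in range(n)) for b in range(n)]
        best = max(range(n), key=scores.__getitem__)
        # The null action contributes 0, so it wins unless some action scores > 0.
        assignment.append(best if scores[best] > 0 else None)
    return assignment

# Hypothetical example: two actions with an edge 0 -> 1, three observables.
G = [[1, 1],
     [0, 1]]
R = [[ 2.0, -1.0],
     [-1.0, -0.5],    # no action explains this observable
     [ 0.5,  1.0]]
print(assign_with_null_action(G, R))
```

The comparison against zero is the only extra work, which matches the remark that feature selection comes at almost no additional cost.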
25
Results on simulated Data
30 genes assigned, 60 genes unassigned
True graph
Estimated graph
26
Automatic Feature Selection II
50 observations assigned, 5000 not assigned
Penalize R: R → R - d
Measurements matrix R
True graphs G,T
27
Automatic Feature Selection II
Define the Posterior per Observation Score (PpO)
false Negatives
Scoring function
28
Automatic Feature Selection II
Reconstructions for different choices of the
regularization parameter d
(non-responding observables were removed from the
plot)
Model with 77 responders
Model with 300 responders
Model with 38 responders
29
Results on the Boutros Drosophila Dataset
LPS stimulation (unperturbed system)
Feature Selection with the help of the Control
experiment
Handcraft Model
30
Results on the Boutros Drosophila Dataset
Automatic Feature Selection, without Control experiment
Estimated graph (120 genes selected)
31
MAP Estimation, Structure Prior
Assume independent priors P(G) and P(T) for the
actions graph and the effects graph. Then, the
posterior is
or
Task: Find an actions graph prior that
effectively reduces the search space and
favours biologically sensible solutions.
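The two posterior formulas are not preserved in the transcript. With independent priors, Bayes' rule gives the following, where D (a symbol introduced only for this illustration) stands for the observed data; whether the slide used exactly this notation is an assumption.

```latex
P(G, T \mid D) \;=\; \frac{P(D \mid G, T)\, P(G)\, P(T)}{P(D)}
\;\propto\; P(D \mid G, T)\, P(G)\, P(T)

\log P(G, T \mid D) \;=\; \log P(D \mid G, T) + \log P(G) + \log P(T) + \text{const}
```

In the logarithmic form the prior on the actions graph enters as an additive term, so it can be combined with any additive likelihood score.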
32
MAP Estimation, Structure Prior
Each NEM: 4 actions, 5 edges, 50 observables.
For each prior type, 100 sample graphs were drawn.
Prior specifies 3 truly absent edges as absent
Increase number of known, truly present edges
Increase number of known, truly absent edges
33
MAP Estimation, Structure Prior
Perform calculations in the Boolean algebra
({0,1}, OR, AND).
G is transitively closed iff G^2 = G.
(Note: each vertex in G has a self-loop.)
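The criterion G^2 = G is easy to test directly. A minimal sketch, assuming G is given as a list of 0/1 rows with 1s on the diagonal (the self-loops); the function names are hypothetical.

```python
def bool_square(G):
    """Square of G in the Boolean algebra ({0,1}, OR, AND)."""
    n = len(G)
    return [[int(any(G[i][k] and G[k][j] for k in range(n)))
             for j in range(n)]
            for i in range(n)]

def is_transitively_closed(G):
    """With self-loops on every vertex, G is transitively closed iff G^2 == G."""
    return bool_square(G) == G

closed = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]      # 0 -> 1 -> 2 and 0 -> 2
not_closed = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]  # 0 -> 2 is missing

print(is_transitively_closed(closed))
print(is_transitively_closed(not_closed))
```

The self-loops matter: they guarantee that every edge of G survives in G^2, so the two matrices can only differ by two-step paths that lack a direct edge.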
34
MAP Estimation, Structure Prior
G is transitively closed iff G^2 = G. Define
where dist is the Hamming distance (the number
of non-identical entries) and t ∈ [0, ∞) is a penalty
weight.
Remark (fast update): Let G' be an actions graph
that differs from G in only one edge. If P(G) is
known, then P(G') can be calculated in linear
time.
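The defining formula of the prior is missing from the transcript. The sketch below computes the quantity the text does name, the Hamming distance between G^2 and G, and then uses it in one plausible form of the prior, log P(G) = -t * dist(G^2, G) up to an additive constant. That form is an assumption made for illustration, not a confirmed reading of the slide, and both function names are hypothetical.

```python
def closure_distance(G):
    """Hamming distance between the Boolean square G^2 and G."""
    n = len(G)
    G2 = [[int(any(G[i][k] and G[k][j] for k in range(n)))
           for j in range(n)]
          for i in range(n)]
    return sum(G2[i][j] != G[i][j] for i in range(n) for j in range(n))

def log_prior_unnormalised(G, t):
    """ASSUMED form: log P(G) = -t * dist(G^2, G) + const (constant dropped)."""
    return -t * closure_distance(G)

closed = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]
not_closed = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]  # one edge short of closure

print(closure_distance(closed), closure_distance(not_closed))
print(log_prior_unnormalised(not_closed, t=2.0))
```

With t = 0 such a prior would be flat, and larger values of t would penalise graphs that are far from transitively closed more strongly. The function above recomputes the distance from scratch and does not implement the fast update mentioned in the remark.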
35
Multiple Actions, Multiple Assignments
Multiple Assignments Graph N
Multiple Actions Graph M
Actions Graph G
Effects Graph T
Figure labels: a, b, c, ab, ac, abc; 1 to 6.
The likelihood function then keeps its convenient
form
36
Results on the Fellmann Dataset
15 genes, 17 knockdown experiments, 6 of them
double knockdowns
37
Results on the Fellmann Dataset
Same data, with prior knowledge.
38
Application of NEMs to Synthetic Lethality (SL)
Data
  • Hypotheses
  • SL between two genes occurs if the genes are
    located in different pathways
  • Genes sharing the same synthetic lethality
    partners have an increased chance of being
    located in the same pathway (Ye, Bader et al.,
    Mol. Systems Biology 2005)

Figure labels: Pathway I, Pathway II; a, b; 1, 2, 3.
  • Consequence
  • A gene a whose SL partners are nested into the SL
    partners of another gene b is likely to be
    located beneath b in the same pathway.

39
Application of NEMs to Synthetic Lethality (SL)
Data
40
Application of NEMs to Synthetic Lethality (SL)
Data
41
References
  • R/Bioconductor packages: NEM, Nessy
  • Computational identification of cellular networks
    and pathways. F. Markowetz, Olga G. Troyanskaya,
    Dennis Kostka, Rainer Spang. Molecular
    BioSystems, to appear, 2007.
  • Non-transcriptional Pathway Features
    Reconstructed from Secondary Effects of RNA
    Interference. F. Markowetz, J. Bloch, R. Spang.
    Bioinformatics 2005, 21: 4026-4032.
  • Structure Learning in Nested Effects Models. A.
    Tresch, F. Markowetz. Submitted to Biostatistics.
  • Nested Effects Models. H. Fröhlich, T.
    Beissbarth. To appear in BMC Bioinformatics.

42
Acknowledgements
  • Florian Markowetz (Lewis-Sigler Institute,
    Princeton)
  • Tim Beissbarth, Holger Fröhlich (German Cancer
    Research Center, Heidelberg)
  • Rainer Spang, Dennis Kostka, Juby
    Jacob (Computational Diagnostics Group, Regensburg)