Title: Module Networks
1. Module Networks
- Discovering Regulatory Modules and their Condition-Specific Regulators from Gene Expression Data
Cohen Jony
2. Outline
- The Problem
- Regulators
- Module Networks
- Learning Module Networks
- Results
- Conclusion
3. The Problem
- Inferring regulatory networks from gene expression data.
4. Regulators
5. Regulation types
6. Regulators example
This is an example of a regulatory module.
7. Known solution: Bayesian Networks
The problem: too many variables and too little data cause statistical noise to produce spurious dependencies, resulting in models that significantly overfit the data.
8. From Bayesian Networks to Module Networks
9. Module Networks
- We assume that we are given a domain of random variables X = {X1, ..., Xn}.
- We use Val(Xi) to denote the domain of values of the variable Xi.
- A module set C is a set of formal variables M1, ..., MK; all the variables assigned to a module share the same CPD.
- Note that all the variables in a module must have the same domain of values!
10. Module Networks
- A module network template T = (S, θ) for C defines, for each module Mj in C:
- 1) a set of parents PaMj from X;
- 2) a conditional probability template (CPT) P(Mj | PaMj), which specifies a distribution over Val(Mj) for each assignment in Val(PaMj).
- We use S to denote the dependency structure encoded by {PaMj : Mj in C} and θ to denote the parameters required for the CPTs P(Mj | PaMj), Mj in C.
11. Module Networks
- A module assignment function for C is a function A : X → {1, ..., K} such that A(Xi) = j only if Val(Xi) = Val(Mj).
- A module network is defined by both the module network template and the assignment function; together they induce a joint distribution (sketched below).
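As a sketch (following the formulation in the cited Segal et al. paper, with notation matching the slides), the template and the assignment function together induce a joint distribution in which every variable assigned to module Mj uses that module's shared CPT:

\[
P(X_1, \ldots, X_n) \;=\; \prod_{j=1}^{K} \; \prod_{X_i : A(X_i) = j} P(X_i \mid \mathrm{Pa}_{M_j})
\]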
12. Example
- In our example, we have three modules M1, M2, and M3.
- PaM1 = Ø, PaM2 = {MSFT}, and PaM3 = {AMAT, INTL}.
- We have that A(MSFT) = 1, A(MOT) = 2, A(INTL) = 2, and so on (encoded in the code sketch below).
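A minimal Python sketch (a hypothetical representation, not the authors' code) encoding this example as plain dictionaries: one parent set per module and one assignment of variables to modules.

# Hypothetical encoding of the slide's stock example; the data structures
# are illustrative, not taken from the paper's implementation.
parents = {
    1: [],                 # PaM1 = Ø
    2: ["MSFT"],           # PaM2 = {MSFT}
    3: ["AMAT", "INTL"],   # PaM3 = {AMAT, INTL}
}

# Module assignment function A: variable -> module index.
assignment = {
    "MSFT": 1,
    "MOT": 2,
    "INTL": 2,
    # ... remaining stocks
}

# Variables assigned to module 2; they all share one CPD (regulation program).
module_2_members = [x for x, j in assignment.items() if j == 2]
print(module_2_members)  # ['MOT', 'INTL']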
13. Learning Module Networks
- The iterative learning procedure searches for the model with the highest score using an Expectation-Maximization (EM) style algorithm.
- An important property of the EM algorithm is that each iteration is guaranteed to improve the score of the model, until convergence to a local maximum.
- Each iteration of the algorithm consists of two steps: an M-step and an E-step (see the loop sketch below).
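A high-level Python sketch of the alternating procedure; the helper callables (learn_regulation_program, best_module_for, score) are hypothetical placeholders, and the scoring and tree-learning details are abstracted away.

def learn_module_network(data, num_modules, init_assignment,
                         learn_regulation_program, best_module_for,
                         score, max_iters=100):
    """Alternate between an M-step (learn regulation programs) and an
    E-step (reassign genes) until the score stops improving.
    Modules are indexed 0 .. num_modules - 1."""
    assignment = dict(init_assignment)
    prev_score = float("-inf")
    for _ in range(max_iters):
        # M-step: learn the best regulation program (regression tree)
        # for each module given the current gene-to-module assignment.
        programs = {
            j: learn_regulation_program(
                data, [g for g, m in assignment.items() if m == j])
            for j in range(num_modules)
        }
        # E-step: reassign each gene to the module whose program best
        # predicts its expression profile.
        assignment = {g: best_module_for(g, data, programs) for g in assignment}
        current = score(data, programs, assignment)
        if current <= prev_score:   # converged to a local maximum
            break
        prev_score = current
    return programs, assignment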
14. Learning Module Networks cont.
M-step
- In the M-step, the procedure is given a partition of the genes into modules and learns the best regulation program (regression tree) for each module.
- The regulation program is learned via a combinatorial search over the space of trees.
- The tree is grown from the root to its leaves. At any given node, the query that best partitions the gene expression into two distinct distributions is chosen, until no such split exists (see the split-selection sketch below).
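A simplified sketch of greedy split selection for one tree node. The real procedure scores candidate splits with the Bayesian score; here a plain variance-reduction criterion stands in for it, and all names are hypothetical.

import numpy as np

def best_split(expr, regulator_values, candidate_regulators, thresholds):
    """Pick the query 'is regulator r above threshold t?' that best
    partitions the module's expression values (expr: genes x experiments)
    into two distinct groups. regulator_values[r] holds the regulator's
    value in each experiment. Returns None when no valid split exists."""
    def group_score(values):
        # Higher is better: negative total squared deviation from the mean.
        return -np.sum((values - values.mean()) ** 2) if values.size else 0.0

    base = group_score(expr.ravel())
    best = None
    for r in candidate_regulators:
        for t in thresholds:
            mask = regulator_values[r] > t
            left, right = expr[:, mask], expr[:, ~mask]
            if left.size == 0 or right.size == 0:
                continue
            gain = group_score(left.ravel()) + group_score(right.ravel()) - base
            if best is None or gain > best[0]:
                best = (gain, r, t)
    return best

Growing the tree then recurses on the two resulting subsets of experiments until best_split finds no further useful split.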
15. Learning Module Networks cont.
E-step
- In the E-step, given the inferred regulation programs, we determine the module whose associated regulation program best predicts each gene's behavior.
- We compute the probability of a gene's measured expression values in the dataset under each regulation program, obtaining an overall probability that this gene's expression profile was generated by that regulation program.
- We then select the module whose program gives the gene's expression profile the highest probability, and re-assign the gene to this module (see the sketch after this list).
- We take care not to assign a regulator gene to a module in which it is also a regulatory input.
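A minimal sketch of the reassignment rule; the interface is hypothetical (log_likelihood(program, profile) is assumed to return the log-probability of a gene's expression profile under a module's regulation program).

def reassign_gene(gene, profile, programs, regulators_of, log_likelihood):
    """Return the module whose regulation program gives this gene's
    expression profile the highest probability, skipping modules that use
    the gene itself as a regulatory input. All arguments other than gene
    and profile are hypothetical placeholders."""
    best_module, best_ll = None, float("-inf")
    for j, program in programs.items():
        if gene in regulators_of(j):  # never assign a regulator to a module it regulates
            continue
        ll = log_likelihood(program, profile)
        if ll > best_ll:
            best_module, best_ll = j, ll
    return best_module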
16. Bayesian score
- When the priors satisfy the assumptions described below, the Bayesian score decomposes into local module scores (see the reconstruction below).
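A reconstruction in the spirit of the cited Segal et al. paper (the exact notation may differ): the score of a module network is a sum of per-module terms, each depending only on that module's parents and on the variables assigned to it.

\[
\text{score}(S, A : D) \;=\; \sum_{j=1}^{K} \text{score}_j\big(\mathrm{Pa}_{M_j},\, A^{-1}(j) : D\big)
\]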
17. Bayesian score cont.
- Where Lj(U, X, θMj : D) is the likelihood function.
- Where P(θMj | Sj = U) is the parameter prior.
- Where Sj = U denotes that we chose a structure in which U are the parents of module Mj.
- Where Aj = X denotes that A assigns exactly the set of variables X to module Mj (the local score combining these terms is sketched below).
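A hedged reconstruction of the local module score that these terms enter, following the cited paper; the exact form, in particular how the structure and assignment priors are written, may differ from the original slide.

\[
\text{score}_j(U, X : D) \;=\; \log \int L_j(U, X, \theta_{M_j} : D)\, P(\theta_{M_j} \mid S_j = U)\, d\theta_{M_j} \;+\; \log P(S_j = U) \;+\; \log P(A_j = X)
\]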
18. Assumptions
- Let P(A), P(S | A), and P(θ | S, A) be the assignment, structure, and parameter priors.
- P(θ | S, A) satisfies parameter independence if it decomposes as a product of independent priors, one per module.
- P(θ | S, A) satisfies parameter modularity if the prior over module Mj's parameters is the same for all structures S1 and S2 in which Mj has the same set of parents (both conditions are sketched below).
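A reconstruction of the two conditions following the cited paper (notation here is mine):

\[
\text{Parameter independence:}\quad P(\theta \mid S, A) \;=\; \prod_{j=1}^{K} P\big(\theta_{M_j \mid \mathrm{Pa}_{M_j}} \mid S, A\big)
\]
\[
\text{Parameter modularity:}\quad P\big(\theta_{M_j \mid \mathrm{Pa}_{M_j}} \mid S_1, A\big) \;=\; P\big(\theta_{M_j \mid \mathrm{Pa}_{M_j}} \mid S_2, A\big) \quad \text{for all } S_1, S_2 \text{ with } \mathrm{Pa}^{S_1}_{M_j} = \mathrm{Pa}^{S_2}_{M_j}
\]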
19. Assumptions
- P(θ, S | A) satisfies assignment independence if P(θ | S, A) = P(θ | S) and P(S | A) = P(S).
- P(S) satisfies structure modularity if it is a product of local terms ρj(Sj), where Sj denotes the choice of parents for module Mj and ρj is a distribution over the possible parent sets for module Mj.
- P(A) satisfies assignment modularity if it is proportional to a product of local terms αj(Aj), where Aj is the choice of variables assigned to module Mj and {αj : j = 1, ..., K} is a family of functions from 2^X to the positive reals (see the sketch below).
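The corresponding formulas, again reconstructed following the cited paper and using the ρj and αj named on this slide:

\[
\text{Structure modularity:}\quad P(S) \;=\; \prod_{j=1}^{K} \rho_j(S_j)
\qquad
\text{Assignment modularity:}\quad P(A) \;\propto\; \prod_{j=1}^{K} \alpha_j(A_j)
\]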
20. Assumptions - Explanations
- Parameter independence, parameter modularity, and structure modularity are the natural analogues of standard assumptions in Bayesian network learning.
- Parameter independence implies that P(θ | S, A) is a product of terms that parallels the decomposition of the likelihood, with one prior term per local likelihood term Lj.
- Parameter modularity states that the prior for the parameters of a module Mj depends only on the choice of parents for Mj and not on other aspects of the structure.
- Structure modularity implies that the prior over the structure S is a product of terms, one per module.
21. Assumptions - Explanations
- The remaining two assumptions, assignment independence and assignment modularity, are new to module networks.
- Assignment independence makes the priors on the parents and parameters of a module independent of the exact set of variables assigned to the module.
- Assignment modularity implies that the prior on A is proportional to a product of local terms, one corresponding to each module.
- Thus, the reassignment of one variable from module Mi to another module Mj does not change our preferences on the assignment of variables in modules other than i and j.
22. Experiments
- The network learning procedure was evaluated on synthetic data, gene expression data, and stock market data.
- The data consisted solely of continuous values. As all of the variables have the same domain, the definition of the module set reduces to a specification of the total number of modules.
- Beam search was used as the search algorithm, with a lookahead of three splits to evaluate each operator.
- For comparison, Bayesian networks were learned with precisely the same structure learning algorithm, simply treating each variable as its own module.
23. Synthetic data
- The synthetic data was generated by a known module network.
- The generating model had 10 modules and a total of 35 variables that were a parent of some module. From the learned module network, 500 variables were selected, including the 35 parents.
- The procedure was run on training sets of various sizes, ranging from 25 to 500 instances, each repeated 10 times with different training sets.
24. Synthetic data - results
- Generalization to unseen test data was measured as the likelihood ascribed by the learned model to 4500 unseen instances.
- As expected, models learned with larger training sets do better, but when run with the correct number of 10 modules, the gain from increasing the number of data instances beyond 100 samples is small.
- Models learned with a larger number of modules had a wider spread of variable-to-module assignments and consequently achieved poorer performance.
25. Synthetic data results cont.
- Log-likelihood per instance assigned to held-out
data.
- For all training set sizes, except 25, the model
with 10 modules performs the best.
26. Synthetic data results cont.
- Fraction of variables assigned to the largest 10 modules.
- Models learned using 100, 200, or 500 instances and up to 50 modules assigned 80% of the variables to 10 modules.
27. Synthetic data results cont.
- Average percentage of correct parent-child relationships recovered.
- The total number of parent-child relationships in the generating model was 2250.
- The procedure recovers 74% of the true relationships when learning from a dataset of 500 instances.
28. Synthetic data results cont.
- As the variables begin fragmenting over a large number of modules, the learned structure contains many spurious relationships.
- Thus, in domains with a modular structure, statistical noise is likely to prevent overly detailed learned models such as Bayesian networks from extracting the commonality between different variables with shared behavior.
29. Gene Expression Data
- Expression data measuring the response of yeast to different stress conditions was used.
- The data consists of 6157 genes and 173 experiments.
- 2355 genes that varied significantly in the data were selected, and a module network was learned over these genes.
- A Bayesian network was also learned over this data set.
30. Candidate regulators
- A set of 466 candidate regulators was compiled from SGD and YPD.
- It includes both transcription factors and signaling proteins that may have transcriptional impact.
- It also includes genes described as similar to such regulators.
- Global regulators, whose regulation is not specific to a small set of genes or a single process, were excluded.
31. Gene Expression results
- The figure demonstrates that module networks generalize to unseen data much better than Bayesian networks for almost all choices of the number of modules.
32. Biological validity
- The biological validity of the learned module network with 50 modules was tested.
- The enriched annotations reflect the key biological processes expected in our dataset.
- For example, the "protein folding" module contains 10 genes, 7 of which are annotated as protein folding genes. In the whole data set, there are only 26 genes with this annotation. Thus, the p-value of this annotation, that is, the probability of choosing 7 or more genes in this category when choosing 10 random genes, is less than 10^-12 (see the calculation sketch below).
- 42 modules out of 50 had at least one significantly enriched annotation with a p-value less than 0.005.
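A minimal sketch of this enrichment p-value as a hypergeometric tail probability. The background population is assumed here to be the 2355 selected genes; the slide does not state which background was used, and a larger background (e.g. all 6157 genes) would give an even smaller p-value.

from scipy.stats import hypergeom

background = 2355   # assumption: background = the 2355 selected genes
annotated = 26      # genes annotated as protein folding
module_size = 10    # genes in the module
hits = 7            # annotated genes in the module

# P(X >= 7) when drawing 10 genes at random without replacement.
p_value = hypergeom.sf(hits - 1, background, annotated, module_size)
print(p_value)  # on the order of 1e-12, consistent with the slide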
33. Biological validity cont.
- The enrichment of both the HAP4 motif and STRE, recognized by Hap4 and Msn4 respectively, supports their inclusion in the module's regulation program.
- Lines represent 500 bp of genomic sequence located upstream of the start codon of each of the genes; colored boxes represent the presence of cis-regulatory motifs located in these regions.
34. Stock Market Data
- NASDAQ stock prices for 2143 companies, covering 273 trading days.
- Stock → variable, instance → trading day.
- The value of the variable is the log of the ratio between that day's and the previous day's closing stock price (see the snippet below).
- As potential controllers, the 250 of the 2143 stocks whose average trading volume was the largest across the dataset were selected.
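An illustrative computation of the variable values described above (hypothetical prices, not data from the paper):

import numpy as np

closing_prices = np.array([25.10, 25.60, 24.90, 25.05])  # hypothetical closing prices
# Log of the ratio between each day's and the previous day's closing price.
log_returns = np.log(closing_prices[1:] / closing_prices[:-1])
print(log_returns)  # one value per trading day, after the first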
35. Stock Market Data
- Cross-validation is used to evaluate the generalization ability of the different models.
- Module networks perform significantly better than Bayesian networks in this domain.
36. Stock Market Data
- Module networks were compared with AutoClass.
- Significant enrichment for 21 annotations, covering a wide variety of sectors, was found.
- In 20 of the 21 cases, the enrichment was far more significant in the modules learned using module networks than in those learned by AutoClass.
37. Conclusions
- The results show that learned module networks have much higher generalization performance than a Bayesian network learned from the same data.
- Parameter sharing between variables in the same module allows each parameter to be estimated from a much larger sample; this lets us learn dependencies that would be considered too weak based on the statistics of single variables (these are well-known advantages of parameter sharing).
- An interesting aspect of the method is that it determines automatically which variables have shared parameters.
38. Conclusions
- The assumption of shared structure significantly restricts the space of possible dependency structures, allowing us to learn more robust models than those learned in a classical Bayesian network setting.
- In a module network, a spurious correlation would have to arise between a possible parent and a large number of other variables before the algorithm would introduce the dependency.
39. Overview of Module Networks
40. Literature
- Reference: Discovering Regulatory Modules and their Condition-Specific Regulators from Gene Expression Data, by Eran Segal, Michal Shapira, Aviv Regev, Dana Pe'er, David Botstein, Daphne Koller, and Nir Friedman.
- Bibliography:
- P. Cheeseman, J. Kelly, M. Self, J. Stutz, W. Taylor, and D. Freeman. AutoClass: a Bayesian classification system. In ML '88, 1988.
41. THE END