Identifying Differentially Regulated Genes

Learn more at: https://www.cise.ufl.edu

1
Identifying Differentially Regulated Genes
  • Nirmalya Bandyopadhyay, Manas Somaiya,
  • Sanjay Ranka, and Tamer Kahveci
  • Bioinformatics Lab., CISE Department,
  • University of Florida

2
Gene interaction through regulatory networks
  • Gene networks: genes are nodes and interactions
    are directed edges.
  • Neighbors: a gene has incoming neighbors and
    outgoing neighbors.
  • A gene can change the state of other genes by
    activation or inhibition.

3
Perturbation experiments
[Figure: a perturbation applied to a signaling network (ERK, K-Ras, Raf, MEK, JNK, RalGDS, Ral, RalBP1, Cob42Rac, PLD1), with the differentially expressed genes marked.]
  • In a perturbation experiment, a stimulant
    (radiation, a toxic element, medication), also
    known as a perturbation, is applied to tissues.
  • Gene expression is measured before and after the
    perturbation.
  • A gene can change its expression as a result of
    the perturbation.
  • Differentially expressed (DE) gene: its
    expression changes after the perturbation.
  • Equally expressed (EE) gene: its expression does
    not change.
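
As a minimal sketch of how a DE/EE call could be made for one gene from before/after measurements, the Python below computes a paired t statistic and thresholds it. The function names, the toy expression values, and the cutoff of 2.0 are illustrative assumptions and do not come from the slides.

```python
import math
import statistics

def paired_t_statistic(before, after):
    """Paired t statistic for one gene measured on the same samples."""
    diffs = [a - b for b, a in zip(before, after)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    return mean_diff / (sd_diff / math.sqrt(len(diffs)))

def call_state(before, after, cutoff=2.0):
    """Label a gene 'DE' if |t| exceeds the (hypothetical) cutoff, else 'EE'."""
    t = paired_t_statistic(before, after)
    return "DE" if abs(t) > cutoff else "EE"

# Toy log-expression values for two genes across four samples (made up).
shifted = call_state([5.0, 5.2, 4.9, 5.1], [6.1, 6.0, 6.3, 6.2])
stable = call_state([5.0, 5.2, 4.9, 5.1], [5.1, 5.1, 5.0, 5.0])
```

A fixed cutoff is the simplest possible decision rule; a real analysis would typically convert the statistic to a p-value and correct for multiple testing.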

4
Perturbation experiment: single dataset
[Figure: the same signaling network (ERK, K-Ras, Raf, MEK, JNK, RalGDS, Ral, RalBP1, Cob42Rac, PLD1) under a perturbation; labels: primarily affected genes, secondarily affected genes.]
  • Primarily affected genes: directly affected by
    the perturbation.
  • Secondarily affected genes: affected in turn by
    the primarily affected genes.

5
Differentially and Equally regulated
[Figure: gene network over g1 to g5 shown for data groups DA and DB, with genes marked as differentially expressed or equally expressed.]
  • Some datasets inherently have two groups, for
    example fasting vs. non-fasting, or Caucasian
    American vs. African American.
  • For these datasets, a gene is
  • Differentially regulated (DR): DE in one group
    and EE in the other.
  • Equally regulated: DE in both groups or EE in
    both groups.
  • Here, gene g1 is DE in data group DA and EE in
    DB. Hence, it is DR.
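
The rule above can be sketched in a few lines of Python. The function name, the "ER" shorthand for equally regulated, and the states assigned to g2 and g4 are assumptions for illustration; only the g1 case comes from the slide.

```python
def regulation(state_a, state_b):
    """Return 'DR' if the gene's state differs between groups, else 'ER'."""
    for s in (state_a, state_b):
        if s not in ("DE", "EE"):
            raise ValueError(f"unknown state: {s!r}")
    return "DR" if state_a != state_b else "ER"

# Per-group states (DA, DB); g1 follows the slide, g2 and g4 are hypothetical.
states = {"g1": ("DE", "EE"), "g2": ("DE", "DE"), "g4": ("EE", "EE")}
labels = {gene: regulation(a, b) for gene, (a, b) in states.items()}
```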

6
Two datasets: primary and secondary effects
[Figure: gene network over g0 to g5 shown for data groups DA and DB; labels: primarily differentially expressed, secondarily differentially expressed, equally expressed.]
  • Primarily differentially regulated (PDR) genes:
    directly affected by the perturbation.
  • Secondarily differentially regulated (SDR) genes:
    affected in turn by the primarily affected genes.

7
Problem and method
  • Input: gene expression (control and non-control)
    for two data groups, DA and DB.
  • Problem: analyze the primary and secondary
    effects of the perturbation.
  • Estimate the probability that a gene is
    differentially regulated because of the
    perturbation or because of other genes (its
    incoming neighbors).
  • Which genes are primarily differentially
    regulated?
  • Method: a probabilistic Bayesian method that uses
    a Markov random field to leverage domain
    knowledge.

8
Notation
SAi  SBi  SAj  SBj  Zi  Zj  Xij
DE   DE   DE   DE   1   1    1
DE   DE   DE   EE   1   2    2
DE   DE   EE   DE   1   3    3
DE   DE   EE   EE   1   4    4
DE   EE   DE   DE   2   1    5
DE   EE   DE   EE   2   2    6
DE   EE   EE   DE   2   3    7
DE   EE   EE   EE   2   4    8
EE   DE   DE   DE   3   1    9
EE   DE   DE   EE   3   2   10
EE   DE   EE   DE   3   3   11
EE   DE   EE   EE   3   4   12
EE   EE   DE   DE   4   1   13
EE   EE   DE   EE   4   2   14
EE   EE   EE   DE   4   3   15
EE   EE   EE   EE   4   4   16
  • Observed variables
  • Microarray datasets
  • Two data groups DA, DB
  • A single gene gi in group C (C ∈ {A, B})
  • For all genes in group A

  • Neighborhood variables
  • Hidden variables
  • State variables
  • Regulation variables Zi
  • Interaction variables Xij
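
The table rows follow a regular pattern: Zi indexes the (SAi, SBi) pair in the order DE/DE, DE/EE, EE/DE, EE/EE, and Xij = 4 * (Zi - 1) + Zj. A small Python sketch of that encoding and its inverse is given below; the function names are illustrative and not from the slides.

```python
STATE_PAIRS = [("DE", "DE"), ("DE", "EE"), ("EE", "DE"), ("EE", "EE")]

def regulation_variable(state_a, state_b):
    """Z in 1..4 from a gene's states in groups A and B, per the table."""
    return STATE_PAIRS.index((state_a, state_b)) + 1

def interaction_variable(z_i, z_j):
    """X_ij in 1..16: the table rows follow 4 * (Z_i - 1) + Z_j."""
    return 4 * (z_i - 1) + z_j

def decode_interaction(x_ij):
    """Inverse mapping: recover (Z_i, Z_j) from X_ij."""
    z_i, z_j = divmod(x_ij - 1, 4)
    return z_i + 1, z_j + 1

# Table row "DE EE EE DE 2 3 7".
z_i = regulation_variable("DE", "EE")
z_j = regulation_variable("EE", "DE")
x_ij = interaction_variable(z_i, z_j)
```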

9
Problem formulation
  • Input to the problem
  • Microarray expression Y
  • Gene network V = (G, W)
  • G = {g0, g1, g2, ..., gM}, where g0 is the
    metagene.
  • Goal
  • Estimate the density p(Xij | X - Xij, Y, V,
    Wij = 1) for all Wij. This density estimates the
    probability that a gene is DR due to the
    perturbation or due to an incoming neighbor gene.
  • Note: a higher value of p(Xij ∈ {2, 3} | X - Xij,
    Y, V, Wij = 1) indicates a higher chance that gj
    is affected by gi.

10
Bayesian distribution
  • We propose a Bayesian model because it allows us
    to incorporate our beliefs into the model.
  • The joint probability distribution over X
  • We can derive the density of Xij, p(Xij | X -
    Xij, Y, V, Wij = 1), from the joint density
    function.
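
The derivation step follows from the definition of conditional probability. Writing X - Xij for all interaction variables other than Xij, and using the fact that Xij takes one of 16 values (see the notation table), the conditional density can be written as below. This is a generic identity and makes no assumption about the form of the joint density.

```latex
p(X_{ij} = x \mid X - X_{ij}, Y, V, W_{ij} = 1)
  = \frac{p(X_{ij} = x,\; X - X_{ij} \mid Y, V, W_{ij} = 1)}
         {\sum_{x'=1}^{16} p(X_{ij} = x',\; X - X_{ij} \mid Y, V, W_{ij} = 1)},
  \qquad x \in \{1, \dots, 16\}.
```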

11
Prior density function: Markov random field
  • The MRF is an undirected graph (X, E).
  • X = {Xij}, where each Xij represents an edge in
    the gene network.
  • E = {(Xij, Xpj) | Wpj = Wij = 1} ∪ {(Xij, Xik) |
    Wik = Wij = 1}
  • An edge in the MRF corresponds to two edges in the
    gene network.
  • For example, (X23, X25) corresponds to (g2, g3)
    and (g2, g5).

[Figure: (a) gene network over g0 to g5 for data groups DA and DB; (b) the corresponding Markov random field, with nodes X01 (2), X02 (1), X03 (1), X04 (4), X05 (3), X12 (5), X13 (5), X14 (8), X23 (1), X25 (7), X35 (3).]
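
A minimal Python sketch of the MRF construction follows, assuming the reading in which each gene-network edge becomes an MRF node and two MRF nodes are adjacent when their gene edges share a target gene (Xij, Xpj) or a source gene (Xij, Xik). The function names and the four-gene example network are hypothetical.

```python
from itertools import combinations

def build_mrf(gene_edges):
    """Build MRF nodes and undirected edges from directed gene-network edges.

    Each gene-network edge (i, j) becomes an MRF node X_ij. Two MRF nodes are
    linked when their gene edges share a target (X_ij, X_pj) or a source
    (X_ij, X_ik).
    """
    nodes = sorted(set(gene_edges))
    mrf_edges = set()
    for a, b in combinations(nodes, 2):
        shares_target = a[1] == b[1]
        shares_source = a[0] == b[0]
        if shares_target or shares_source:
            mrf_edges.add((a, b))
    return nodes, mrf_edges

def neighbors(node, mrf_edges):
    """MRF neighbors of one node, sorted."""
    return sorted(b if a == node else a for a, b in mrf_edges if node in (a, b))

# Hypothetical network: g1->g2, g1->g3, g2->g3, g2->g4, g3->g4.
nodes, mrf_edges = build_mrf([(1, 2), (1, 3), (2, 3), (2, 4), (3, 4)])
x23_neighbors = neighbors((2, 3), mrf_edges)
```

In this toy network X23 is linked to X13 (shared target g3) and to X24 (shared source g2).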
12
Prior density function: feature functions
  • Three beliefs are relevant to our model:
  • In a data group, the metagene g0 can affect the
    states of all other genes (modeled by adding
    directed edges from g0 to all other genes).
  • In a data group, a gene can affect the state of
    its outgoing neighbors.
  • A gene has a high probability of being equally
    regulated.
  • We incorporate these beliefs into the MRF graph
    using seven feature functions.
  • Feature function: a unary or binary function over
    the nodes of the MRF. A feature function allows us
    to encode a belief on the graph.

13
Feature Functions
  • Unary: captures the frequency of Xij.
  • Binary: encapsulates the second belief, that in a
    data group a gene can affect the state of its
    outgoing neighbors.
  • Unary: captures the third belief, that a gene has
    a high probability of being equally regulated.
  • Prior density function

Feature functions: Left External Equality, Right
External Equality, Left Internal Equality, Right
Internal Equality
14
Binary External feature functions
  • The external feature functions encapsulate the
    belief that in a data group, a gene can affect
    the state of its outgoing neighbors.
  • Left equality: for neighboring nodes Xij and Xpj,
    Zi = Zp.
  • Right equality: for neighboring nodes Xij and Xik,
    Zj = Zk.

[Figure: (a) gene network over g1, g2, g3, g4; (b) MRF network over X12, X13, X23, X24, X34, marking the left-equality and right-equality neighbors of X23.]
15
Unary Internal feature functions
  • The internal feature function represents the
    belief that a gene has high probability of being
    equally regulated.
  • gi is equally regulated:
  • Xij ∈ {1, 2, 3, 4}: Zi = 1 (DE)
  • Xij ∈ {13, 14, 15, 16}: Zi = 4 (EE)
  • gj is equally regulated:
  • Xij ∈ {1, 5, 9, 13}: Zj = 1 (DE)
  • Xij ∈ {4, 8, 12, 16}: Zj = 4 (EE)
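
The value sets above can be checked against the notation table with a short Python sketch. It assumes the internal feature functions act as 0/1 indicators and borrows the names "left internal" and "right internal" from the feature-function slide; the actual functional form used in the model is not given in this transcript.

```python
def decode(x_ij):
    """Recover (Z_i, Z_j), each in 1..4, from X_ij in 1..16."""
    z_i, z_j = divmod(x_ij - 1, 4)
    return z_i + 1, z_j + 1

def left_internal(x_ij):
    """1 if g_i is equally regulated (Z_i is 1 or 4), else 0."""
    return int(decode(x_ij)[0] in (1, 4))

def right_internal(x_ij):
    """1 if g_j is equally regulated (Z_j is 1 or 4), else 0."""
    return int(decode(x_ij)[1] in (1, 4))

# Values of X_ij for which each indicator fires.
left_on = [x for x in range(1, 17) if left_internal(x)]
right_on = [x for x in range(1, 17) if right_internal(x)]
```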

16
Objective function optimization
  • Obtain an initial estimate of the state variables.
  • Estimate parameters for the likelihood density.
  • Estimate parameters that maximize the prior
    density.
  • Estimate parameters that maximize the
    pseudo-likelihood density.
  • Rank the DE genes based on the likelihood w.r.t.
    the metagene.
  • Techniques named in the flowchart: differential
    evolution, Student's t, ICM.
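
Assuming ICM here stands for iterated conditional modes, the general idea can be sketched on a toy pairwise model as below. This is a generic illustration, not the objective function from the slides: the scores, the agreement bonus, and the three-node chain are made up.

```python
def icm(labels, unary, neighbors, agree_bonus=1.0, max_sweeps=50):
    """Iterated conditional modes on a small pairwise model.

    labels:    dict node -> current label
    unary:     dict node -> dict label -> score (higher is better)
    neighbors: dict node -> list of adjacent nodes
    Each node is set, in turn, to the label maximizing its local score while
    all other nodes are held fixed; sweeps repeat until nothing changes.
    """
    labels = dict(labels)
    for _ in range(max_sweeps):
        changed = False
        for node in sorted(labels):
            def local_score(lab):
                agree = sum(labels[n] == lab for n in neighbors[node])
                return unary[node][lab] + agree_bonus * agree
            best = max(sorted(unary[node]), key=local_score)
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return labels

# Toy chain a - b - c with labels "DE"/"EE"; all scores are made up.
unary = {
    "a": {"DE": 2.0, "EE": 0.0},
    "b": {"DE": 0.4, "EE": 0.5},
    "c": {"DE": 2.0, "EE": 0.0},
}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
start = {"a": "EE", "b": "EE", "c": "EE"}
result = icm(start, unary, neighbors)
```

ICM converges to a local optimum that depends on the starting labels, which is one reason an initial estimate of the state variables matters.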
17
Dataset and experimental setup
  • Dataset
  • Real: adapted from Smirnov et al., generated using
    10 Gy ionizing radiation on immortalized B
    cells obtained from 155 donors.
  • Real/synthetic: we created synthetic data to
    simulate the perturbation experiment based on the
    real dataset. The simulation model is taken from
    "Modeling of Multiple Valued Gene Regulatory
    Networks" by Garg et al.
  • Gene regulatory network: 24,663 genetic
    interactions over 2,335 genes, collected from the
    KEGG database.
  • Experimental setup
  • Implemented our method in MATLAB and Java.
  • Ran our code on a quad-core AMD Opteron 2 GHz
    workstation with 32 GB memory.

18
Comparison with other methods
  • We compared our method with three other methods:
  • SMRF: our earlier method, developed to analyze the
    effect of an external perturbation on a single
    data group.
  • SSEM: a method to differentiate between the
    primary and secondary effects of a perturbation
    on a gene expression dataset.
  • Two-sample t-test (Student's t-test)

19
Comparison with other methods
20
Conclusions
  • Our method could find primarily affected genes
    with high accuracy.
  • It achieved significantly better accuracy than
    SMRF, SSEM, and Student's t-test.
  • Our method produces a probability distribution
    rather than a fixed binary decision.

21
Acknowledgement
  • This work was supported partially by NSF under
    grants CCF-0829867 and IIS-0845439.

22
Thank you!