Simulation and Application on learning gene causal relationships

Slides: 24
Provided by: Xin110

1
Simulation and Application on learning gene
causal relationships
  • Xin Zhang

2
Introduction
  • High-throughput genetic technologies make it
    possible to study how genes interact with each
    other
  • We use simulation to evaluate how well the IC
    algorithm learns gene causal relationships
  • We present an algorithm (the mIC algorithm) for
    learning causal relationships with knowledge of
    topological ordering information
  • We apply the mIC algorithm to a melanoma dataset

3
Steps for Simulation Study
  • Construct a causal network N
  • Generate datasets based on the causal network
  • Learn from the simulated data using causal
    algorithms (e.g. the IC algorithm) to obtain a
    network N'
  • Compare the original network N with the obtained
    network N' w.r.t. precision and recall

4
Modeling and simulation of a causal Boolean
network (BN)
  • Boolean network
  • Constructing a causal structure
  • Assign parameters (proper functions) for each
    node with causal parents
  • Assign probability distribution

5
Constructing Boolean Network
  • 1. Generate M BNs with up to 3 causal parents for
    each node
  • 2. For each BN, generate a random proper function
    for each node
  • 3. Assign random probabilities for the root
    gene(s)
  • 4. Given one configuration, get probability
    distribution
  • 5. Collect 200 data points for each network
  • 6. Repeat above steps 3-5 for all M networks.
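
The construction steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used in the study: nodes are numbered in a topological order, each node draws its parents from lower-numbered nodes, a "proper function" is taken to mean a truth table that depends on every parent, and data points are drawn by sampling the root genes and propagating values to their descendants. The names `is_proper`, `random_proper_function`, `random_network`, and `sample` are hypothetical.

```python
import itertools
import random


def is_proper(table, k):
    """True if the k-input truth table depends on every one of its inputs."""
    for pos in range(k):
        # Input `pos` is a pseudo predictor if flipping it never changes the output.
        if all(
            table[bits] == table[bits[:pos] + (1 - bits[pos],) + bits[pos + 1:]]
            for bits in table
        ):
            return False
    return True


def random_proper_function(k, rng):
    """Draw random truth tables until one depends on all k inputs."""
    inputs = list(itertools.product((0, 1), repeat=k))
    while True:
        table = {bits: rng.randint(0, 1) for bits in inputs}
        if is_proper(table, k):
            return table


def random_network(n, rng, max_parents=3):
    """Random causal Boolean network on nodes 0..n-1 (assumed topological order)."""
    parents, tables, root_prob = {}, {}, {}
    for node in range(n):
        k = rng.randint(0, min(max_parents, node))
        parents[node] = tuple(sorted(rng.sample(range(node), k)))
        if k == 0:
            root_prob[node] = rng.random()  # P(root gene = 1)
        else:
            tables[node] = random_proper_function(k, rng)
    return parents, tables, root_prob


def sample(network, n_points, rng):
    """Sample root genes, then compute every other gene from its parents."""
    parents, tables, root_prob = network
    data = []
    for _ in range(n_points):
        state = {}
        for node in sorted(parents):
            if parents[node]:
                key = tuple(state[p] for p in parents[node])
                state[node] = tables[node][key]
            else:
                state[node] = int(rng.random() < root_prob[node])
        data.append([state[v] for v in sorted(parents)])
    return data
```

Calling `sample(random_network(6, random.Random(0)), 200, random.Random(1))` would give 200 binary rows for one random 6-gene network; repeating this for M networks mirrors step 6.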

6
Constructing Causal Structure
7
Steps for constructing causal structure
8
Proper function (1)
Proper function: a function that reflects the
influence of the operators.
Example: [formula for f not preserved in the transcript]
By simplifying f, c is a function of a alone, with
c = a; b is a pseudo predictor of c and has no
effect on c.
f is not a proper function.
9
Proper function (2)
  • Definition
  • With n predictors, the number of proper functions
    is given by [formula not preserved in the transcript]
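
The slide's own formula is not recoverable here, so the following is only a sketch under an assumption: that a proper function on n predictors is a Boolean function that depends on every one of them (no pseudo predictors). Under that assumption the count follows from inclusion-exclusion over the subsets of predictors the function actually uses, and it can be cross-checked by brute force for small n. Both function names are hypothetical.

```python
from itertools import product
from math import comb


def count_proper_brute_force(n):
    """Enumerate all Boolean functions of n inputs; count those using every input."""
    inputs = list(product((0, 1), repeat=n))
    count = 0
    for outputs in product((0, 1), repeat=len(inputs)):
        table = dict(zip(inputs, outputs))
        # An input matters if flipping it changes the output for some assignment.
        if all(
            any(table[b] != table[b[:p] + (1 - b[p],) + b[p + 1:]] for b in inputs)
            for p in range(n)
        ):
            count += 1
    return count


def count_proper_formula(n):
    """Inclusion-exclusion: sum over k of (-1)^(n-k) * C(n, k) * 2^(2^k)."""
    return sum((-1) ** (n - k) * comb(n, k) * 2 ** (2 ** k) for k in range(n + 1))
```

For n = 1, 2, 3 both functions give 2, 10, and 218, which covers the "up to 3 causal parents" case used in the simulation.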

10
Probability Distribution
11
Generating dataset
12
Steps of learning gene causal relationships
  • Step 1: obtain the probability distribution and
    sample the data
  • Step 2: apply algorithms to find causal relations
  • Step 3: compare the original and obtained networks
    based on the two notions of precision and recall
  • Step 4: repeat steps 1-3 for every random network

13
Comparing two networks
[Figure: the original network and the obtained network, each drawn over nodes A, B, C, D]
14
Precision and Recall
  • Original graph is a DAG, while obtained graph has
    both directed and undirected edges

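[Table: edges of the original graph versus the obtained graph,
classified as FN, TP, TN, FP, with partial cases PFN, PTP and PTN, PFP]
Recall = A_TP / (A_FN + A_TP)
Precision = A_TP / (A_TP + A_FP)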
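
The recall and precision ratios above can be illustrated with a small sketch. It makes a simplifying assumption that is not from the slides: only fully directed edges are compared, so the partial cases (PFN, PTP, PTN, PFP) that arise from undirected edges are ignored, since their weighting is not recoverable from the transcript. The function names and the example edges are hypothetical.

```python
def directed_counts(original, obtained):
    """Count TP, FP, FN over directed edges given as (source, target) pairs."""
    original, obtained = set(original), set(obtained)
    tp = len(original & obtained)   # edges recovered with the right direction
    fp = len(obtained - original)   # edges reported but absent from the original
    fn = len(original - obtained)   # original edges that were missed
    return tp, fp, fn


def precision_recall(a_tp, a_fp, a_fn):
    """Precision = A_TP / (A_TP + A_FP), Recall = A_TP / (A_FN + A_TP)."""
    precision = a_tp / (a_tp + a_fp)
    recall = a_tp / (a_fn + a_tp)
    return precision, recall


# Hypothetical example: the obtained network reverses the edge C -> D.
original = {("A", "B"), ("B", "C"), ("C", "D")}
obtained = {("A", "B"), ("B", "C"), ("D", "C")}
tp, fp, fn = directed_counts(original, obtained)
precision, recall = precision_recall(tp, fp, fn)
```

The sketch assumes non-zero denominators; a reversed edge counts as both one false positive and one false negative in this simplified scheme.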
15
Observational equivalence and Transitive Closure
  • Two DAGs are said to be observationally
    equivalent (OE) if they have the same skeleton
    and the same set of v-structures
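
A minimal sketch of this check, assuming each DAG is given as a set of directed (parent, child) pairs; the function names are hypothetical and this is an illustration of the definition rather than the authors' code.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected version of the graph: each edge with its direction dropped."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """All a -> c <- b where a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, ps in parents.items():
        for a, b in combinations(sorted(ps), 2):
            if frozenset((a, b)) not in skel:
                found.add((frozenset((a, b)), child))
    return found


def observationally_equivalent(edges1, edges2):
    """Same skeleton and same set of v-structures."""
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))
```

For example, the chains A -> B -> C and A <- B <- C are equivalent under this test, while A -> B <- C is not equivalent to either, because it introduces a v-structure at B.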

  • Transitive closure (TC)
  • A -> B -> C with A -> C
  • cc(x,y) is true if there is a directed or an
    undirected edge from x to y
  • pcc(x,y) is true if there is a path from x to y
    consisting of properly directed and undirected
    edges
  • pcc(x,y) = cc(x,y) OR (pcc(x,z) AND pcc(z,y))
    for some z
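
The recursive definition of pcc can be sketched as a fixed-point computation. This assumes cc is supplied as a set of (x, y) pairs, and that an undirected edge between x and y is encoded by the caller as both (x, y) and (y, x); that encoding is an assumption, not something stated on the slide. The function name is hypothetical.

```python
def pcc_closure(cc):
    """Smallest relation containing cc and closed under
    pcc(x, y) <- pcc(x, z) and pcc(z, y)."""
    pcc = set(cc)
    changed = True
    while changed:
        changed = False
        for x, z in list(pcc):
            for z2, y in list(pcc):
                if z == z2 and (x, y) not in pcc:
                    pcc.add((x, y))
                    changed = True
    return pcc


# A -> B -> C yields the extra pair (A, C), as in the transitive-closure example.
closure = pcc_closure({("A", "B"), ("B", "C")})
```

The nested loop is quadratic in the number of pairs per pass, which is adequate for networks of the size discussed here.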

16
Result for IC algorithm
17
How to improve IC algorithm
  • The original IC algorithm did not have good
    results on learning gene causal relationships
  • A possible way to improve the performance is to
    incorporate extra information
  • If we know the topological ordering of the
    regulatory network, it would be helpful to
    improve the learning result

18
Gene topological ordering
  • If a specific gene is the causal parent of
    another gene
  • In a pathway, if one gene appears before another
    gene
  • If one gene is at the beginning or at the end of
    the pathway

IC algorithm + topological ordering information
19
mIC algorithm
  • The mIC algorithm is based on IC, but combines
    topological ordering information with steady-state
    data to infer causality
  • 3 steps of the mIC algorithm
  • Step 1: Find conditional independence
  • For each pair of genes g_i and g_j in a dataset,
    test pairwise conditional independence. If they
    are dependent, search for a set
  • S_ij = {g_k : g_i and g_j are independent given
    g_k, with i < k < j or j < k < i}
  • Construct an undirected graph G such that g_i and
    g_j are connected with an edge if and only if
    they are pairwise dependent and no S_ij can be
    found
  • Step 2: Find v-structures
  • For each pair of nonadjacent genes g_i and g_j
    with common neighbor g_k, if g_k is not in S_ij,
    and k > i, k > j, add arrowheads pointing at
    g_k, such as g_i -> g_k <- g_j
  • Step 3: Orient more directed edges according to
    rules
  • Orient the undirected edges without creating
    new cycles or v-structures
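
The first two steps above can be sketched as follows. This is an illustrative reading of the slide, not the authors' implementation: genes are identified by their position 0..n-1 in the known topological ordering, the conditional independence test is a caller-supplied function `independent(i, j, cond)` (hypothetical; in practice it would be a statistical test on the steady-state data), and the third step (orienting the remaining edges by rules) is left out.

```python
from itertools import combinations


def mic_skeleton_and_colliders(n, independent):
    """Return (undirected edges, arrowheads) for genes 0..n-1.

    independent(i, j, cond) -> bool, where cond is a tuple of gene indices.
    """
    sep = {}        # S_ij for every pair i < j
    edges = set()   # undirected edges stored as (i, j) with i < j
    for i, j in combinations(range(n), 2):
        if independent(i, j, ()):
            sep[(i, j)] = set()   # not even pairwise dependent: no edge
            continue
        # Only genes lying between i and j in the ordering may separate them.
        sep[(i, j)] = {k for k in range(i + 1, j) if independent(i, j, (k,))}
        if not sep[(i, j)]:
            edges.add((i, j))

    def adjacent(a, b):
        return (min(a, b), max(a, b)) in edges

    arrows = set()  # (source, target) pairs meaning source -> target
    for i, j in combinations(range(n), 2):
        if adjacent(i, j):
            continue
        for k in range(n):
            if k in (i, j) or not (adjacent(i, k) and adjacent(j, k)):
                continue
            # The collider g_k must come after both g_i and g_j in the ordering.
            if k not in sep[(i, j)] and k > i and k > j:
                arrows.update({(i, k), (j, k)})
    return edges, arrows


def collider_oracle(i, j, cond):
    """Toy oracle for 0 -> 2 <- 1: genes 0 and 1 are independent only marginally."""
    return {i, j} == {0, 1} and cond == ()


def chain_oracle(i, j, cond):
    """Toy oracle for 0 -> 1 -> 2: genes 0 and 2 are independent given gene 1."""
    return {i, j} == {0, 2} and cond == (1,)
```

With `collider_oracle` the sketch keeps edges 0-2 and 1-2 and orients both toward gene 2; with `chain_oracle` it keeps 0-1 and 1-2 and adds no arrowheads, because gene 1 belongs to S_02.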

20
Results from mIC algorithm
21
Melanoma dataset
  • The 10 genes in this study were chosen from
    587 genes in the melanoma data
  • Previous studies have identified WNT5A as a gene
    of interest involved in melanoma
  • Controlling the influence of WNT5A in the
    regulation can reduce the chance of melanoma
    metastasizing

22
Applying mIC algorithm on Melanoma Dataset
Partial biological prior knowledge: MMP3 is
expected to be at the end of the pathway
23
Conclusion
  • We evaluated the IC algorithm using simulated
    data
  • We presented the mIC algorithm, which can infer
    gene causal relationships from steady-state data
    with gene topological ordering information
  • We performed simulations based on Boolean
    networks to evaluate the performance of the
    causal algorithms
  • We applied the mIC algorithm to real biological
    microarray data (the melanoma dataset)
  • The result showed that some of the important
    causal relationships associated with the WNT5A
    gene were identified using the mIC algorithm.