Title: Smooth Response Surface
1Patching the Puzzle of Genetic Network
Grace S. Shieh
Institute of Statistical Science, Academia Sinica
gshieh_at_stat.sinica.edu.tw
3Outline
- What is a genetic network?
- Why is the area one of the frontiers?
- How do statistical modeling and computational algorithms simplify the complex puzzle?
- Applications
4 Dogma of biology
- DNA -> mRNA -> Protein
- Proteins: the elements that function in organisms, e.g. yeast and human.
5Somatic mutations affect key pathways in lung adenocarcinoma (Nature, Oct. 2008)
6Science, Sept, 2008
8Complex human disease
- Digenic effects may underlie
- Type II diabetes
- Schizophrenia
- Retinitis pigmentosa
- Glaucoma
- Tong et al., Science 2004
9Complex human disease
- These diseases may have similar synthetic effects in the yeast genetic interaction map.
- Elements of a genetic network derived from a model organism, e.g. yeast, are likely to be conserved.
The topology of the genetic network in the neighborhood of SGS1 (Tong et al., 2004)
10Experimental method to reveal genetic interactions
- Systematic Genetic Analysis with Ordered Arrays of Yeast Deletion Mutants (Tong et al., 2001, Science)
- Global Mapping of the Yeast Genetic Interaction Network (Tong et al., 2004, Science)
- The Genetic Landscape of a Cell (Costanzo et al., 2010, Science)
11Costanzo et al., Science 2010
12- Synthetic sick or lethal (SSL) gene pairs: when both genes are mutated, the organism dies or is sick, but neither single mutation is lethal.
- SSL is important for understanding how an organism tolerates genetic mutations.
- Hartman, Garvik and Hartwell, 2001, Science
14Scenarios resulting in synthetic interaction
15A Pattern Recognition Approach to Infer Gene
Networks
Grace S. Shieh
joint work with C.-L. Chuang, C.-H. Jen and C.-M. Chen
Bioinformatics, 2008
16Excerpted from Tong et al. (2001) Science
17- Transcriptional compensation (TC) and transcriptional reverse compensation (TRC) interactions (Lesage et al., 2004; Wong and Roth, 2005, Genetics; Kafri et al., 2005, Nature Genetics)
- Among paralogues or SSL gene pairs, when one gene is mutated, its partner gene's expression increases (TC) or decreases (TRC).
- Goal: to predict TC and TRC interactions among SSL gene pairs.
18Four sets of yeast (Saccharomyces cerevisiae) microarray gene expression data (Spellman et al., 1998) were used.
Red channel (R): intensities of yeast synchronized by alpha-factor arrest, by arrest of a cdc15 or cdc28 mutant, and by elutriation.
Green channel (G): average of non-synchronized yeast.
19Cell cycles of CLN2 gene
20qRT-PCR experiments
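- For a given pair of SSL genes, A and B:
- Experimental group: gene A's expression with gene B knocked out
- Control group: gene A's expression with gene B wild type
- If gene A's expression in the experimental group >> that in the control group, A and B may be TC
- If gene A's expression in the experimental group << that in the control group, A and B may be TRC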
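The comparison above can be sketched in Python as follows. This is only an illustration: the function name `classify_ssl_pair` and the fold-change cutoff `fold` are assumptions, since the slides do not say how large a difference counts as ">>" or "<<".

```python
def classify_ssl_pair(expr_knockout: float, expr_wildtype: float,
                      fold: float = 2.0) -> str:
    """Label an SSL pair from gene A's expression with gene B knocked out
    (experimental group) versus gene B wild type (control group).

    `fold` is a hypothetical cutoff, not a value taken from the slides.
    """
    if expr_knockout <= 0 or expr_wildtype <= 0:
        raise ValueError("expression levels must be positive")
    ratio = expr_knockout / expr_wildtype
    if ratio >= fold:
        return "TC"            # partner's expression increases
    if ratio <= 1.0 / fold:
        return "TRC"           # partner's expression decreases
    return "undetermined"      # change too small to call either way
```

For example, `classify_ssl_pair(8.0, 2.0)` labels the pair as a TC candidate, while a near-equal pair of measurements is left undetermined.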
22Gene expression of Transcription Compensation
(TC) pairs
23Gene expression of Transcription Reverse
Compensation (TRC) pairs
24The dependence of patterns and their associated
interactions
- Assumption for PARE: the dependence between CP (SP) patterns and TC (TD) interactions is significant.
- To test this hypothesis: Fisher's exact test
25The Proportion of Complementary Pattern (CP) in TC
- Screening genes for significant changes over time resulted in 35 gene pairs

        CP   SP   Total
TC      13    9     22
TD       2   11     13
Total   15   20     35

Fisher's exact test: p-value < 0.02, significant at the 95% level
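The test on the 2x2 table above can be reproduced with a short Python sketch. It uses only the standard library and the usual two-sided definition (sum the probabilities of all tables with the same margins that are no more likely than the observed one). The function name is an assumption, and the slides do not say whether a one-sided or two-sided test was used.

```python
from math import comb

def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small tolerance so ties with the observed table are included.
    return sum(p for p in (prob(k) for k in range(k_min, k_max + 1))
               if p <= p_obs * (1 + 1e-9))

# Counts from the slide: rows TC/TD, columns CP/SP.
p_value = fisher_exact_two_sided(13, 9, 2, 11)
```

With these counts the two-sided p-value falls below 0.02, in line with the figure quoted on the slide.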
26PARE
The gene expression of the regulating gene is treated as the object contour, and the lag-1 expression of the target gene as the boundary of interest, as in an image segmentation algorithm.
27Discrete Signals
Because gene expression is a discrete signal, the 1st- and 2nd-order partial differential terms can be modified into discrete form.
The interaction can then be determined as a weighted sum of the internal and external energies.
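As an illustration of the discrete form, the sketch below uses the standard forward and central finite differences in Python. The exact terms and weights used by PARE are not shown in this transcript, so the difference formulas, the function names and the weights `w_int` and `w_ext` are all assumptions.

```python
def first_difference(x):
    """Forward difference x[t+1] - x[t], a discrete stand-in for the 1st derivative."""
    return [x[t + 1] - x[t] for t in range(len(x) - 1)]

def second_difference(x):
    """Central difference x[t+1] - 2*x[t] + x[t-1], a stand-in for the 2nd derivative."""
    return [x[t + 1] - 2 * x[t] + x[t - 1] for t in range(1, len(x) - 1)]

def weighted_energy(internal, external, w_int=0.5, w_ext=0.5):
    """Weighted sum of two energy terms; the weights are illustrative placeholders."""
    return w_int * internal + w_ext * external

signal = [1.0, 2.0, 4.0, 7.0]     # toy expression profile over four time points
d1 = first_difference(signal)     # [1.0, 2.0, 3.0]
d2 = second_difference(signal)    # [1.0, 1.0]
```

In practice the weights would be fitted to training pairs rather than fixed by hand.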
28PARE
- In this study, each gene is represented by a node in a graphical model, with nodes indexed by i = 1, 2, ..., N. An edge represents the gene-gene interaction between two nodes, where the enhancer gene plays a key role in activating or repressing the target gene.
29Training set vs test set
- Leave-one-out cross validation
- Among n pairs, use n-1 pairs to train PARE, then predict the remaining pair; repeat for each of the n pairs.
- 3-fold cross validation
- Among all pairs, use 2/3 of the pairs to train, then predict the remaining 1/3, over all combinations; iterate this N times.
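The two validation schemes can be sketched as index generators in Python. This is a minimal sketch, not the authors' code: the function names, the random seed and the use of 35 pairs (the count from the CP/SP table) are assumptions made for illustration.

```python
import random

def leave_one_out(n):
    """Yield (train_indices, test_indices), holding out each pair once."""
    for i in range(n):
        yield [j for j in range(n) if j != i], [i]

def repeated_three_fold(n, repeats, seed=0):
    """Yield (train_indices, test_indices) for `repeats` random 3-fold partitions."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        folds = [idx[k::3] for k in range(3)]   # three near-equal folds
        for k in range(3):
            train = [j for f, fold in enumerate(folds) if f != k for j in fold]
            yield train, folds[k]

loo_splits = list(leave_one_out(35))                    # 35 train/test splits
cv_splits = list(repeated_three_fold(35, repeats=2))    # 2 repeats x 3 folds
```

Each split would be used to fit the model on the training indices and score it on the held-out indices; averaging over many repeats gives the mean TPR and its standard deviation.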
30Experimental Results (TC/TRC): alpha data set (18 time points)
Table 1. The prediction results (in %), checked against the qRT-PCR experiments

Method        CV      Training TPR  Training FPR  Test TPR  Test Std  Test FPR
Lagged Corr.  -       -             -             46        -         -
EB-GGMs       -       -             -             52        -         -
PARE          n-fold  76            20            73        -         23
PARE          3-fold  78            18            71        3         23

Since 3-fold CV was repeated 500 times, only the averages of the TPRs are reported.
31Experimental Results (TC/TRC)
- For the alpha dataset, PARE yields
- a true-positive rate of 71-73%
- a prediction accuracy of 81%
- The FPR for predicting TC (TD) interactions was bounded by 12% (10%) genome-wide.
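For reference, the rates quoted above follow the usual confusion-matrix definitions, sketched here in Python. The counts in the example are made up for illustration and are not taken from the study.

```python
def rates(tp: int, fp: int, tn: int, fn: int):
    """Return (TPR, FPR, accuracy) from confusion-matrix counts."""
    tpr = tp / (tp + fn)                        # true-positive rate (sensitivity)
    fpr = fp / (fp + tn)                        # false-positive rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)  # fraction of correct predictions
    return tpr, fpr, accuracy

# Hypothetical counts, for illustration only.
tpr, fpr, accuracy = rates(tp=8, fp=1, tn=9, fn=2)
```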
32Experimental Results (TC/TRC)
33Checking against published literature
- These genetic interactions are consistent with the following experimental results:
- Sgs1 and Srs2 are known redundant pathways in replication (Ira et al., 1999; Lee et al., 1999)
- E.g. Srs2 and Sgs1-Top3 suppress crossovers during double-strand break repair in yeast.
34- The Sgs1/Top3/Rmi1 and Mus81/Mms4 complexes are involved in both double-strand break repair and homologous recombination (Fabre et al., 2002).
- This indicates that Sgs1/Top3/Rmi1 and Mus81/Mms4 are alternative pathways to resolve recombination intermediates.
35Inferring transcriptional interactions
- 132 pairs of activator-target (AT) and repressor-target (RT) gene interactions were collected from the published literature
- (MIPS, Mewes et al., 1999, Nucleic Acids Research; Gancedo, 1998, Microbiology and Molecular Biology; Draper et al., 1994, Molecular and Cellular Biology, etc.)
37Test for CP (SP) associated with RT (AT) pairs in the data
Chi-squared test
38Experimental Results (AT/RT)
Table 2. The prediction results (in %) using the Elu data set, checked against the 132 TIs from the literature

Method        CV      Training TPR  Training FPR  Test TPR  Test Std  Test FPR
Lagged Corr.  -       -             -             51        -         -
EB-GGMs       -       -             -             59        -         -
PARE          n-fold  79            16            77        -         17
PARE          3-fold  81            16            74        3         19

The 3-fold values are the average of 500 repeats.
The FPRs are for genome-wide TI predictions, and they are bounded by 21%.
39Conclusions
- The proposed PARE learns gene expression patterns; it can then predict similar genetic interactions using microarray data.
- The TPRs of PARE applied to the alpha (Elu) dataset are about 73% (77%) for inferring TC/TD interactions (TIs), respectively.
41Inferring genesis of obesity in human (joint work with Karine and Jean-Daniel)
- MGED from human adipocyte-derived cell lines
- Adipocytes: cells that primarily compose adipose tissue, specialized in storing energy as fat
- Time-course MGED
42PARE to infer genesis of obesity in human
- Training stage
- MGED of human adipocyte-derived cell lines
- 70 known transcriptional interactions (TIs) from iHOP
- Prediction results
- 40 pairs of TIs and some genetic interactions predicted
- Some are consistent with existing experimental results, some are novel
43Inferring TIs
- Data preparation
- Select significantly expressed genes
- P-value < 0.01
- Significantly expressed in at least 1 time point (5 time points in total)
- -> 36 genes with a function of interest
- Interact with 14 genes of interest (AP2, CCL2, CCL5, LEP, etc.) -> 504 gene pairs
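The data preparation steps can be sketched in Python as below. The function names, the toy p-values and the placeholder gene labels (`g0`, `t0`, ...) are assumptions; only the cutoff of 0.01, the 5 time points and the 36 x 14 = 504 pair count come from the slide.

```python
from itertools import product

def significant_genes(p_values, alpha=0.01):
    """Keep genes whose p-value is below alpha in at least one time point."""
    return [gene for gene, ps in p_values.items() if any(p < alpha for p in ps)]

def candidate_pairs(selected, genes_of_interest):
    """Pair every selected gene with every gene of interest."""
    return list(product(selected, genes_of_interest))

# Toy p-values over 5 time points (hypothetical values).
toy = {"geneX": [0.2, 0.005, 0.3, 0.4, 0.5], "geneY": [0.2] * 5}
selected = significant_genes(toy)          # only geneX passes the filter

# 36 selected genes x 14 genes of interest, with placeholder labels.
pairs = candidate_pairs([f"g{i}" for i in range(36)],
                        [f"t{j}" for j in range(14)])
```

The length of `pairs` is 504, matching the number of gene pairs on the slide.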
44WebPARE: web-computing service of PARE (Chuang, Wu, Cheng and Shieh, 2010, Bioinformatics)
- To provide a simple web interface for users to infer GIs/TIs using time-course gene expression data and existing knowledge, e.g. pre-stored validated TIs in yeast, mouse, human, etc. (TRANSFAC)
45Architecture of WebPARE
46- An example
- A list of genes involved in the cell cycle and a data set (e.g. Elu) were uploaded to WebPARE; the TIs of these pairs were of interest.
- Using integrated (pre-stored) pairs of TIs in yeast, PARE correctly predicted 118 out of 176 TIs, mTPR = 67%
- e.g. the significant predicted network from 66 pairs
47WebPARE: www.stat.sinica.edu.tw/WebPARE
48Demo
- WebPARE can be accessed at http://www.stat.sinica.edu.tw/WebPARE
49Acknowledgement
Dr. Ting-Fang Wang and Da-Yow Huang, Inst. of Biological Chemistry, Academia Sinica
Drs. Karine Clement and J-D. Zucker, INSERM IRD, France
Cheng-Long Chuang, Chin-Yuan Guo, Chia-Chang Wang, Dr. Shi-Fong Guo, Yu-Bin Wang, Jia-Hung Wu, Inst. of Statistical Science
50- Thank you for your attention!
51 Wanted
- PhD students
- Research assistants
- to work at Shieh lab.
52Parameter estimation
Next, we estimate the parameters via the particle swarm optimization (PSO) algorithm (Kennedy and Eberhart, 1995), a stochastic optimization technique that simulates the behavior of a flock of birds.
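A minimal PSO sketch in Python is given below to show the idea. It is the common inertia-weight variant rather than necessarily the version used for PARE, and the swarm size, iteration count, coefficients `w`, `c1`, `c2`, the bounds and the toy objective are all assumptions.

```python
import random

def pso_minimize(objective, dim, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box with a basic particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # each particle's best position
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # best position of the swarm
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward own best + pull toward swarm best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Toy objective (sum of squares), standing in for a model-fitting error.
best_x, best_val = pso_minimize(lambda x: sum(v * v for v in x),
                                dim=2, bounds=(-5.0, 5.0))
```

For parameter estimation, the objective would be replaced by the prediction error of the model on the training pairs, with one dimension per parameter.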
53Example (finding largest gradient): Evolutionary Process of PSO
54Gene expression of Activator-Target (AT) gene
pairs
55Gene expression of Repressor-Target (RT) gene
pairs