1. Bayesian inference in differential expression experiments
Sylvia Richardson, Natalia Bochkina, Alex Lewin
Centre for Biostatistics, Imperial College, London
Biological Atlas of Insulin Resistance
www.bgx.org.uk
BBSRC
2. Background
- Investigating changes of gene expression under different conditions is one of the key questions in many biological experiments
- Specificity of the context:
  - High dimensional data (tens of thousands of genes) and few samples
  - Need to borrow information
  - Many sources of variability
- Important to adopt a flexible modelling framework
Bayesian hierarchical modelling captures important features of the data while maintaining the generalisability of the tools and techniques developed
3. Modelling differential expression
[Diagram: modelling steps]
- Start with given point estimates of expression (Condition 1, Condition 2)
- Hierarchical model of replicate variability and array effect (one for each condition)
- Differential expression parameter
- Posterior distribution (flat prior)
- Mixture modelling for classification
4. Outline
- Background
- Bayesian hierarchical models for differential expression experiments
- Decision rules based on tail posterior probabilities
- Comparison with existing approaches
- FDR estimation for tail posterior probabilities
- Extension of tail posterior probabilities to analysing multiclass experiments
- Illustration
- Discussion and further work
5. I -- Bayesian hierarchical model for differential expression (Lewin et al, Biometrics, 2006)
- Data: $y_{gcr}$ = log gene expression for gene g, replicate r, condition c
- $\alpha_g$ = gene effect
- $\delta_g$ = differential effect for gene g between 2 conditions
- $\beta_{r(g)c}$ = array effect, modelled as a smooth (spline) function of $\alpha_g$
- $\sigma_{gc}^2$ = gene specific variance
- 1st level:
  - $y_{g1r} \sim N(\alpha_g - \tfrac{1}{2}\delta_g + \beta_{r(g)1},\ \sigma_{g1}^2)$
  - $y_{g2r} \sim N(\alpha_g + \tfrac{1}{2}\delta_g + \beta_{r(g)2},\ \sigma_{g2}^2)$
  - $\sum_r \beta_{r(g)c} = 0$, with $\beta_{r(g)c}$ a function of $\alpha_g$ with parameters c, d
- 2nd level:
  - Flat priors for $\alpha_g$, $\delta_g$, c, d
  - $\sigma_{gc}^2$ follows a lognormal or inverse-gamma distribution with parameters $(a_c, b_c)$
Exchangeable variances
6. Joint modelling of array effects and differential expression
- Performs normalisation simultaneously with estimation
- Gives fewer false positives than plug-in
- The BHM set-up allows some of the modelling assumptions to be checked using mixed posterior predictive checks:
  - the need for gene specific variances
  - their 2nd level distribution
Found that a lognormal or a 2 parameter inverse gamma distribution for the variances gave similar model checks
7. Selecting genes that are differentially expressed
- Interested in testing the null hypothesis $H_0: \delta_g = 0$
- Two broad approaches have been used:

               P value type    Mixture $P(H_0 \mid y_{gcr})$
  Under H0     U[0,1]          close to 1
  Under H1     close to 0      close to 0

References: Baldi and Long; Smyth 2004 (moderated t statistic); Lonnstedt and Speed 02; Newton, Kendziorski 01, 03; Lonnstedt and Britton 05; Gottardo 06.
8. Bayesian mixtures
- Relies on specification of a prior model for $\delta_g$
- Choice of model for the alternative (see the poster by Alex Lewin)
  - Could influence the performance of the classification
  - Checking how well the alternative fits the data is non-standard
Investigate properties of Bayesian selection rules based on a non informative prior for $\delta_g$
9. II -- Bayesian selection rules for pairwise comparisons
- 1st level (no array effect)
- Hierarchical model
- Extend the p value approach to consider the tail probabilities of an appropriate function of the parameters
10. Posterior distributions
- Define the Bayesian T statistic
- The following conditional distributions hold
11. Tail posterior probabilities 1 (N. Bochkina and SR, 2006)
- Use selection rules of the form
- What statistic to choose?
- How to define its percentiles?
  - we suppose that we could have observed data
  - with (its expected value of under the null)
  - work out the percentiles using posterior distributions conditional on
Summarise the distribution of the $T_g$ by a tail area
12. Tail posterior probabilities 2
Recall
- Corresponding distribution function involves numerical integration → computationally expensive
- But
  - Distribution function of
  - does not involve gene specific parameters
→ The percentile is easy to calculate
→ Consider the tail probability
13. Key point: F0 is gene independent (conjugate case)
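As a rough illustration of the tail area idea, the sketch below approximates a tail posterior probability for one gene from posterior draws. The form of the statistic, $T_g = \delta_g / (\sigma_g \sqrt{w})$ with $w = 1/n_1 + 1/n_2$, the two-sided tail, the function name and the toy numbers are all assumptions made for this example and are not taken from the slides. The percentile `t_alpha` is passed in as a number because, F0 being gene independent, it would only need to be worked out once.

```python
import math

def tail_posterior_probability(delta_draws, sigma_draws, w, t_alpha):
    """Approximate P(|T_g| > t_alpha | data) for one gene from posterior draws.

    Assumes T_g = delta_g / (sigma_g * sqrt(w)); this form and the two-sided
    tail are illustrative assumptions, not the definition used on the slides.
    """
    if len(delta_draws) != len(sigma_draws) or not delta_draws:
        raise ValueError("need equally many (and at least one) draws")
    exceed = 0
    for delta, sigma in zip(delta_draws, sigma_draws):
        t = delta / (sigma * math.sqrt(w))
        if abs(t) > t_alpha:
            exceed += 1
    # Monte Carlo estimate: share of draws falling in the tail
    return exceed / len(delta_draws)

# Toy draws for a single gene (made-up numbers, not real posterior output).
delta_draws = [0.9, 1.1, 1.0, 0.2, 1.3]
sigma_draws = [0.5, 0.6, 0.5, 0.5, 0.4]
w = 2.0 / 3.0      # 1/n1 + 1/n2 with 3 replicates per condition
t_alpha = 2.0      # hypothetical percentile under the null, shared by all genes
tpp = tail_posterior_probability(delta_draws, sigma_draws, w, t_alpha)
print(tpp)
```

In practice the draws would come from the MCMC output of the hierarchical model, and the same `t_alpha` would be reused for every gene.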
14. Another Bayesian rule
- A natural idea is to compare the parameter $\delta_g$ to 0, i.e. to consider the posterior probability $p(\delta_g, 0) = P(\delta_g > 0 \mid y)$, or its complement, or the 2-sided alternative
- It turns out that this Bayesian selection rule behaves like a p-value
  - The distribution of $p(\delta_g, 0)$ is uniform under H0
  - There is an equivalence with frequentist testing based on the marginal distribution under the null, in the spirit of the moderated t statistic introduced by Smyth 2004
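The p-value-like behaviour can be checked numerically in a deliberately simplified setting. The sketch below assumes a single gene-level summary with known standard error and a flat prior on the differential effect, so that the posterior is Normal; this is much simpler than the hierarchical model on the slides (no unknown variances, no borrowing of information), and the function name and the numbers are chosen for the example only. Under the null, the posterior probability that the effect is positive should then be spread evenly over (0, 1).

```python
import random
from statistics import NormalDist

def posterior_prob_positive(ybar, se):
    """P(delta > 0 | ybar) when delta | ybar ~ N(ybar, se^2).

    This posterior arises from a flat prior on delta with a known standard
    error, a simplification of the model discussed in the slides.
    """
    return 1.0 - NormalDist(mu=ybar, sigma=se).cdf(0.0)

rng = random.Random(2006)
se = 0.3            # assumed known standard error of the observed difference
n_genes = 20000

# Simulate observed differences under H0 (true delta = 0) and score each one.
probs = [posterior_prob_positive(rng.gauss(0.0, se), se) for _ in range(n_genes)]

# Share of genes in each of ten equal-width bins on [0, 1].
counts = [0] * 10
for p in probs:
    counts[min(int(p * 10), 9)] += 1
fractions = [c / n_genes for c in counts]
print([round(f, 3) for f in fractions])
```

Each bin should hold roughly a tenth of the genes, which is what a uniform distribution under H0 would give.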
15. Link between $p(\delta_g, 0)$ and the moderated t statistic
Moderated t statistic
16. Histograms of measures of differential expression: simulated data
17. Tail posterior probabilities 3
- Investigate the performance of selection rules based on
- In particular:
  - what is the FDR associated with each value of $p_{cut}$?
  - In the conjugate case
- How does this rule compare to rules based on
Use F0
Use Storey
Use observed proportion
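One way the three ingredients listed above could fit together is sketched below: the null tail probability comes from F0, the proportion of true nulls from a Storey-type estimate, and the denominator from the observed proportion of selected genes. This is an assumed reading of the slide, not the authors' formula. The function names and toy numbers are made up, the value of `null_tail_prob` is simply supplied (evaluating F0 requires numerical integration that is not shown here), and applying the Storey-type estimate to p-values is also an assumption.

```python
def storey_pi0(pvalues, lam=0.5):
    """Storey-type estimate of the proportion of true nulls from p-values."""
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    n_above = sum(1 for p in pvalues if p > lam)
    return min(1.0, n_above / (len(pvalues) * (1.0 - lam)))

def estimated_fdr(tail_probs, p_cut, null_tail_prob, pi0):
    """Estimated FDR for the rule 'select gene if tail probability > p_cut'.

    null_tail_prob stands for 1 - F0(p_cut), the chance that a null gene
    exceeds the cut-off; it is treated as a given input in this sketch.
    """
    selected = sum(1 for p in tail_probs if p > p_cut)
    if selected == 0:
        return 0.0
    observed = selected / len(tail_probs)   # observed proportion selected
    return min(1.0, pi0 * null_tail_prob / observed)

# Toy inputs (made up for illustration).
toy_pvalues = [0.01, 0.02, 0.03, 0.04, 0.30, 0.45, 0.55, 0.71, 0.80, 0.97]
toy_tail_probs = [0.99, 0.97, 0.95, 0.93, 0.60, 0.40, 0.35, 0.20, 0.10, 0.05]

pi0_hat = storey_pi0(toy_pvalues)
fdr_hat = estimated_fdr(toy_tail_probs, p_cut=0.9, null_tail_prob=0.02, pi0=pi0_hat)
print(pi0_hat, fdr_hat)
```

Evaluating `estimated_fdr` over a grid of cut-offs would give an estimated FDR curve of the kind compared with the true FDR on simulated data.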
18. Comparison of estimated (solid line) and true FDR (dashed line) on simulated data
Panels: p0 = 0.90, p0 = 0.70, p0 = 0.95
19. III -- Data Sets and Biological questions
- Biological Questions
  - Understand the mechanisms of insulin resistance
- Cell line experiments in which the reaction of a mouse muscle cell line to treatment by insulin or metformin (an insulin replacement drug) is observed after 2 and 12 hours
- Questions of interest relate to simple and compound comparisons
- 3 replicates for each condition, Affymetrix MOE430A chip, 22690 genes per chip
- Data pre-processed by RMA and normalised using intensity dependent LOESS normalisation
20. Volcano plots for muscle cell data: change between insulin and control at 2 hours
$p(t_g, t^{(\alpha)})$, $\alpha = 0.05$
$2\max\{p(\delta_g, 0),\ 1 - p(\delta_g, 0)\} - 1$
Cut-off 0.925
Peaked around zero; varies steeply as a function of
Less peaked around zero; allows better separation
21. Insulin versus control
p0 = 0.61
p0 = 0.98
22. Metformin versus control
Tail posterior probabilities
2 hours
12 hours
p0 = 0.56
p0 = 0.79
Estimated FDR
72 selected (FDR 0.5)
1854 selected (FDR 0.5)
23. IV -- Extension to the analysis of multi class data
- In our case study, 3 groups (control c0, insulin c1, metformin c2) and 3 time points t0, t1 (2 hours), t2 (12 hours), each replicated 3 times
- ANOVA like model formulation suited to the analysis of such multifactorial experiments
Global variance parametrisation (borrowing information)
24. Joint tail posterior probabilities
- Interest is in testing a compound null hypothesis, i.e. one involving several differential parameters
  - e.g. testing jointly for the effect of insulin and metformin at 2 hours
- In this case, we are interested in a specific alternative
- Note: rejecting the null hypothesis in an ANOVA setting corresponds to a different alternative
- Define joint tail posterior probabilities
- where is the Bayesian T statistic for each treatment
25. Benefits of joint posterior probabilities
- Takes into account the correlation of the differential expression measures between the conditions, induced by sharing the same variance parameter
- Usual practice is to:
  - Carry out pairwise comparisons
  - Select genes for each comparison using the same cut-off on the pp
  - Intersect the lists and find genes common to both lists
- Joint pp shown to lead to fewer false positives in this case of positive correlation (simulation study)
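The difference between the joint rule and the usual intersection of lists can be seen on a toy example. The sketch below assumes that posterior draws of the Bayesian T statistic are available for the same gene under two treatments, and uses a two-sided exceedance of a common threshold; the function name, the threshold and the draws are made up for illustration. The joint probability can never exceed either marginal probability, so a gene can pass both pairwise cut-offs and still fail the joint one.

```python
def joint_and_marginal_tpp(t1_draws, t2_draws, t_alpha):
    """Joint and marginal tail posterior probabilities for one gene.

    t1_draws and t2_draws are paired posterior draws of the T statistic for
    two treatments; pairing matters because the draws share the same variance.
    """
    if len(t1_draws) != len(t2_draws) or not t1_draws:
        raise ValueError("need equally many (and at least one) paired draws")
    n = len(t1_draws)
    both = sum(1 for a, b in zip(t1_draws, t2_draws)
               if abs(a) > t_alpha and abs(b) > t_alpha)
    only1 = sum(1 for a in t1_draws if abs(a) > t_alpha)
    only2 = sum(1 for b in t2_draws if abs(b) > t_alpha)
    return both / n, only1 / n, only2 / n

# Toy paired draws for one gene (made-up numbers).
t1 = [3.0, 3.0, 3.0, 0.5]
t2 = [3.0, 3.0, 0.5, 3.0]
joint, marg1, marg2 = joint_and_marginal_tpp(t1, t2, t_alpha=2.0)

p_cut = 0.7
selected_by_intersection = marg1 > p_cut and marg2 > p_cut
selected_by_joint = joint > p_cut
print(joint, marg1, marg2, selected_by_intersection, selected_by_joint)
```

Here both marginal probabilities are 0.75 while the joint probability is 0.5, so with a cut-off of 0.7 the gene appears in both pairwise lists but is not selected by the joint rule.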
26. Correlation of DE parameters and Bayesian T statistic for insulin and metformin (2 hours)
- With joint tail posterior probabilities and a cut-off of $p_{cut} = 0.92$, 280 genes are selected as jointly perturbed at 2 hours
- Applying pairwise comparisons and combining the lists adds another 47 genes to the list
27. Discussion 1
- Tail posterior probabilities (Tpp) are a generic tool that can be used in any situation where a large number of hypotheses related in a hierarchical fashion are to be tested
- We have derived the distribution of the Tpp under the null and proposed a corresponding estimate of FDR
- This distribution requires numerical integration but is gene independent (conjugate case), so it only needs to be evaluated once
- Tpp is a smooth function of the amount of DE, with a gradient that spreads the genes, so genes can be chosen with a desired level of uncertainty about their DE
- Interesting connection between Bayesian and frequentist inference for the differential expression parameter
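28. Discussion 2
- Interesting to compare performance of Tpp with that of mixture models
- E.g. Gamma mixtures (see poster by Alex Lewin)
  - $\delta_g \sim p_0 \delta_0 + p_1 G(-x \mid 1.5, \lambda_1) + p_2 G(x \mid 1.5, \lambda_2)$
  - ($p_0$ component: H0; $p_1$ and $p_2$ components: H1)
  - Dirichlet distribution for $(p_0, p_1, p_2)$
  - Exp(1) hyper prior for $\lambda_1$ and $\lambda_2$
- Also Normal and t mixtures have been considered
  - $\delta_g \sim p_0 \delta_0 + (1 - p_0)\, T(\nu, \mu, \tau)$, with $\mu \sim 1$ and $\tau, \nu^{-1} \sim \mathrm{Exp}(1)$
  - $\delta_g \sim p_0 \delta_0 + (1 - p_0)\, N(\mu, \tau)$, with $\mu \sim 1$ and $\tau \sim \mathrm{Exp}(1)$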
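To make the three-component Gamma mixture concrete, the sketch below draws differential effects from a prior of that shape: a point mass at zero, a Gamma on the negative side and a Gamma on the positive side, both with shape 1.5. Treating the second Gamma parameter as a rate, fixing the weights rather than drawing them from a Dirichlet, and the function and argument names are all assumptions made for this example.

```python
import random

def draw_delta(rng, weights, rate_neg, rate_pos, shape=1.5):
    """Draw one differential effect from a point mass plus two Gamma components.

    weights = (p0, p1, p2) for the zero, negative and positive components.
    The Gamma parameters are read as rates here, which is an assumption;
    random.gammavariate takes a scale, hence the 1 / rate.
    """
    p0, p1, p2 = weights
    if abs(p0 + p1 + p2 - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    u = rng.random()
    if u < p0:
        return 0.0                                          # null component
    if u < p0 + p1:
        return -rng.gammavariate(shape, 1.0 / rate_neg)     # under-expressed
    return rng.gammavariate(shape, 1.0 / rate_pos)          # over-expressed

rng = random.Random(1)
weights = (0.85, 0.10, 0.05)     # hypothetical mixture weights
deltas = [draw_delta(rng, weights, rate_neg=2.0, rate_pos=2.0) for _ in range(5000)]
share_null = sum(1 for d in deltas if d == 0.0) / len(deltas)
print(round(share_null, 3))
```

In the fully Bayesian model the weights and the Gamma parameters would themselves be given priors and estimated, rather than fixed as they are here.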
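29. Simulated data
- 3000 variables, 6 replicates, 2 conditions
- $y_{g1r} \sim N(\alpha_g, \sigma_g^2)$
- $y_{g2r} \sim N(\alpha_g + \delta_g, \sigma_g^2)$
- $\sigma_g^2 \sim 0.03 + \mathrm{LogNorm}(-3.85, 0.82)$
- $\alpha_g \sim \mathrm{Norm}(7, 25)$
- $\delta_g$ slightly asymmetric:
  - 5%: $\delta_g \mid \delta_g > 0 \sim h(\delta_g)$
  - 10%: $\delta_g \mid \delta_g < 0 \sim h(-\delta_g)$
  - 85%: $\delta_g \sim N(0, 0.01)$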
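A data set of this shape could be generated along the following lines. Several details are not given on the slide and are assumed here: 6 replicates per condition, second arguments of Norm and N read as variances, 0.82 read as the standard deviation on the log scale, and the density h replaced by a hypothetical shifted Gamma (`draw_h`). The names are illustrative only.

```python
import math
import random

def draw_h(rng):
    """Hypothetical stand-in for the unspecified density h (a shifted Gamma)."""
    return 0.3 + rng.gammavariate(2.0, 0.25)

def simulate(n_genes=3000, n_rep=6, seed=0):
    """Simulate two-condition log expression data, one dict per gene."""
    rng = random.Random(seed)
    genes = []
    for _ in range(n_genes):
        alpha = rng.gauss(7.0, 5.0)                        # Norm(7, 25), 25 read as variance
        sigma2 = 0.03 + rng.lognormvariate(-3.85, 0.82)    # 0.82 read as log-scale sd
        u = rng.random()
        if u < 0.05:
            delta, label = draw_h(rng), 1                  # 5% over-expressed
        elif u < 0.15:
            delta, label = -draw_h(rng), -1                # 10% under-expressed
        else:
            delta, label = rng.gauss(0.0, 0.1), 0          # 85%: N(0, 0.01), variance 0.01
        sd = math.sqrt(sigma2)
        y1 = [rng.gauss(alpha, sd) for _ in range(n_rep)]
        y2 = [rng.gauss(alpha + delta, sd) for _ in range(n_rep)]
        genes.append({"label": label, "delta": delta, "sigma2": sigma2,
                      "y1": y1, "y2": y2})
    return genes

sim = simulate()
n_null = sum(1 for g in sim if g["label"] == 0)
print(len(sim), round(n_null / len(sim), 3))
```

The `label` field records which component each gene came from, which is what would be needed to compute a true FDR for any selection rule applied to the simulated data.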
30. Comparison of mixture and tail pp
- Fit 3 mixture models (Gamma, Normal, t alternative) and the flat model
- Classification: mixtures use P(H1 | data); the flat model uses the tail posterior probability
Comparable performance, with a slight edge for the Gamma and Normal mixtures
31. Thanks
BBSRC Exploiting Genomics grant
Wellcome Trust BAIR consortium
Colleagues in the Biostatistics group: Marta Blangiardo, Anne Mette Hein, Maria de Iorio
Colleagues in the Biology group at Imperial: Tim Aitman, Ulrika Andersson, Dave Carling
Papers and technical reports: www.bgx.org.uk/
For the tail probability paper: www.bgx.org.uk/Natalia/Bochkina.ps