Introducing Bayesian Approaches to Twin Data Analysis

Author: Lindon Eaves. Created: 3/6/2001 3:24:37 PM.


1
Introducing Bayesian Approaches to Twin Data
Analysis
  • Lindon Eaves,
  • VIPBG, Richmond.
  • Boulder, March 2001

2
Outline
  • Why use a Bayesian approach?
  • Basic concepts
  • BUGS
  • Live Demo of simple application
  • Applications to twin data

3
Why Use a Bayesian Approach?
  • Intellectually satisfying
  • Get more information out of existing problems
    (distributions of model parameters,
    individual genetic scores)
  • Tackle problems other methods find difficult
    (non-linear mixed models, growth curves, GxE
    interaction)

4
Some references
  • Gilks, W.R., Richardson, S., Spiegelhalter, D.J.
    (1996) Markov Chain Monte Carlo in Practice.
    Chapman & Hall, London.
  • Spiegelhalter, D., Thomas, A., Best, N. (2000)
    WinBUGS Version 1.3 User Manual. MRC BUGS
    Project, Cambridge.
  • Eaves, L.J., Erkanli, A. (In preparation) Markov
    Chain Monte Carlo Approaches to Analysis of
    Genetic and Environmental Components of Human
    Developmental Change and GxE Interaction. (For
    Behavior Genetics.)

5
The Traditional Approach Via Likelihood
  • Given data D and parameters $\theta$
  • The likelihood function, $l$, is
  • $l = P(D \mid \theta)$.
  • We find the $\theta$ that maximizes $l$.

6
Typically
  • Maximize likelihood numerically
  • Fairly easy for linear models and normal
    variables (LISREL)
  • Mx works well (best!)
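
To make "maximize likelihood numerically" concrete, here is a minimal sketch, not the method Mx or LISREL uses. It assumes a normal model with a known standard deviation (an assumption made only to keep the search one-dimensional), uses the first five data values that appear on the mean-variance data slide, and the function names are hypothetical. A golden-section search finds the mean that maximizes the log-likelihood, which can be checked against the sample mean.

```python
import math

def log_likelihood(mu, data, sigma=1.0):
    """Log of P(D | mu) for independent normal observations (sigma assumed known)."""
    n = len(data)
    ss = sum((y - mu) ** 2 for y in data)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - ss / (2 * sigma ** 2)

def maximize(f, lo, hi, tol=1e-8):
    """Golden-section search for the maximum of a unimodal function on [lo, hi]."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)  # left interior point
        d = a + ratio * (b - a)  # right interior point
        if f(c) > f(d):
            b = d                # the maximum lies in [a, d]
        else:
            a = c                # the maximum lies in [c, b]
    return (a + b) / 2

# First five values from the mean-variance data slide (the rest are elided there).
data = [14.1110, 9.5125, 13.2752, 10.5952, 8.6699]

mu_hat = maximize(lambda mu: log_likelihood(mu, data), 0.0, 20.0)
sample_mean = sum(data) / len(data)
print(mu_hat, sample_mean)  # the numerical maximum should agree with the sample mean
```

For this simple model the answer is available in closed form, so the search is only a stand-in for what an optimizer does when no closed form exists.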

7
BUT.
  • Some things don't work so well

8
For example
  • Getting confidence intervals etc.
  • Non-linear models (require integration over
    latent variables; hard for a large number of
    parameters)
  • Estimating large numbers of latent variables
    (e.g. genetic factor scores)

9
Markov Chain Monte Carlo Methods (MCMC)
  • Allow more general models
  • Obtain confidence intervals and other summary
    statistics
  • Estimate missing values
  • Estimate latent trait values
  • All as part of the model-fitting process

10
Bayesian approach
  • ML works with
  • $l = P(D \mid \theta)$.
  • Bayesian approach seeks the distribution of
    parameters given data
  • $B = P(\theta \mid D)$.

11
How do we get $P(\theta \mid D)$?
  • Use Bayes' theorem
  • $P(\theta \mid D) = P(\theta, D) / P(D)$
  • $= P(D \mid \theta)\, P(\theta) / P(D)$

12
A couple of problems
  • We don't know $P(\theta)$
  • What is P(D)?

13
$P(\theta)$
  • Prior distribution not known but
  • may know (guess?) its form, e.g.,
  • Means may be normal
  • Variances may be gamma

14
P(D)
  • $P(D) = \int P(D \mid \theta)\, P(\theta)\, d\theta$
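
For a single parameter, Bayes' theorem and the integral for P(D) can be worked out by brute force on a grid. The sketch below is illustrative only. It assumes a normal likelihood with a known standard deviation of 2 and a normal prior with mean 10 and standard deviation 5 (both invented for the example), and it reuses the first five values from the mean-variance data slide.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

data = [14.1110, 9.5125, 13.2752, 10.5952, 8.6699]
sigma = 2.0  # assumed known, for illustration only

def likelihood(theta):
    """P(D | theta): product of normal densities of the observations."""
    return math.prod(normal_pdf(y, theta, sigma) for y in data)

step = 0.01
grid = [i * step for i in range(2001)]            # theta from 0 to 20
prior = [normal_pdf(t, 10.0, 5.0) for t in grid]  # assumed prior P(theta)

# Numerator of Bayes' theorem at each grid point: P(D | theta) * P(theta)
unnormalized = [likelihood(t) * p for t, p in zip(grid, prior)]

# P(D): Riemann-sum approximation of the integral over theta
p_data = sum(unnormalized) * step

# Posterior density P(theta | D) on the grid
posterior = [u / p_data for u in unnormalized]
post_mean = sum(t * p for t, p in zip(grid, posterior)) * step
print(p_data, post_mean)
```

A grid like this becomes impractical as the number of parameters grows, which is the motivation for the sampling methods that follow.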

15
How do we get the integral?
  • If we knew $P(\theta)$, we could sample $\theta$ many times and
    evaluate $P(D \mid \theta)$ at each draw. The integral is approximated to
    the desired accuracy by the mean over k (large) samples
  • (Monte Carlo integration)
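
A minimal sketch of Monte Carlo integration for P(D), under invented assumptions: a normal likelihood with known standard deviation 2, a normal prior with mean 10 and standard deviation 5, and the first five values from the mean-variance data slide. The parameter is drawn k times from the prior and the likelihood is averaged over the draws.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def likelihood(theta, data, sigma):
    """P(D | theta) for independent normal observations."""
    return math.prod(normal_pdf(y, theta, sigma) for y in data)

data = [14.1110, 9.5125, 13.2752, 10.5952, 8.6699]
sigma = 2.0                      # assumed known, for illustration only
prior_mean, prior_sd = 10.0, 5.0  # assumed prior, for illustration only

random.seed(1)
k = 20000
draws = [random.gauss(prior_mean, prior_sd) for _ in range(k)]

# Monte Carlo estimate of P(D): mean of P(D | theta) over draws from P(theta)
p_data_mc = sum(likelihood(t, data, sigma) for t in draws) / k
print(p_data_mc)
```

The estimate improves as k grows, with error shrinking roughly in proportion to 1/sqrt(k).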

16
We still have a problem...
  • We don't know $P(\theta)$
  • We only know its shape

17
Markov Chain Monte Carlo
  • Simulate a sequence of samples of $\theta$ that
    ultimately converges to (non-independent) samples
    from the desired distribution, $P(\theta \mid D)$.

18
If we succeed
  • When the sequence has converged (stationary
    distribution, after burn-in from a trial value of $\theta$), we
    may construct $P(\theta \mid D)$ from the sequence of samples.
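
The idea of simulating a chain, discarding a burn-in, and building the distribution from what remains can be sketched with a random-walk Metropolis sampler. This is one simple MCMC algorithm chosen for illustration and is not claimed to be what BUGS runs. The model (normal likelihood with known standard deviation 2, normal prior with mean 10 and standard deviation 5), the step size, and the function names are assumptions. Only the shape of the target is needed, that is, the likelihood times the prior without P(D).

```python
import math
import random

data = [14.1110, 9.5125, 13.2752, 10.5952, 8.6699]  # first five values from the data slide
sigma = 2.0  # assumed known, for illustration only

def log_unnormalized_posterior(theta):
    """log[P(D | theta) * P(theta)] up to an additive constant; P(D) is not needed."""
    log_prior = -0.5 * ((theta - 10.0) / 5.0) ** 2
    log_lik = sum(-0.5 * ((y - theta) / sigma) ** 2 for y in data)
    return log_prior + log_lik

def metropolis(log_target, start, n_iter, step_sd, rng):
    """Random-walk Metropolis: returns the full chain, including burn-in."""
    chain = []
    current = start
    current_lp = log_target(current)
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step_sd)
        proposal_lp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current))
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain

rng = random.Random(2001)
chain = metropolis(log_unnormalized_posterior, start=0.0, n_iter=6000, step_sd=1.0, rng=rng)

burn_in = 1000
kept = chain[burn_in:]  # discard the samples generated before convergence
post_mean = sum(kept) / len(kept)
print(post_mean)
```

The chain starts deliberately far from the bulk of the distribution so that the early iterations show why a burn-in is discarded.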

19
One algorithm that can generate chains in a large
number of cases
  • The Gibbs Sampler, hence
  • Bayesian Inference Using Gibbs Sampling
  • BUGS for short
  • Spiegelhalter, D., Thomas, A., Best, N. (2000)
    WinBUGS Version 1.3 User Manual. MRC BUGS
    Project, Cambridge.
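
A Gibbs sampler draws each parameter in turn from its distribution conditional on the current values of the others. The sketch below does this for a normal mean and precision. It is not the code BUGS generates. It assumes conjugate priors (normal for the mean, gamma for the precision) with vague hyperparameters chosen for the example, and it uses simulated data as a stand-in because the full data set is elided on the slides. The starting values mu = 10 and tau = 0.2 follow the initial values shown on the data slide.

```python
import random

rng = random.Random(42)

# Simulated stand-in data (assumption): 50 draws from a normal with mean 10, SD 2
n = 50
y = [rng.gauss(10.0, 2.0) for _ in range(n)]

# Assumed vague priors: mu ~ Normal(m0, precision p0), tau ~ Gamma(shape a, rate b)
m0, p0 = 0.0, 1.0e-4
a, b = 1.0e-3, 1.0e-3

def gibbs(y, n_iter, mu, tau, rng):
    """Alternate draws of mu and tau from their full conditional distributions."""
    n = len(y)
    total = sum(y)
    samples = []
    for _ in range(n_iter):
        # mu | tau, y is normal with precision p0 + n * tau
        prec = p0 + n * tau
        mu = rng.gauss((p0 * m0 + tau * total) / prec, prec ** -0.5)
        # tau | mu, y is gamma with shape a + n/2 and rate b + SS/2
        ss = sum((v - mu) ** 2 for v in y)
        tau = rng.gammavariate(a + n / 2, 1.0 / (b + 0.5 * ss))  # second argument is a scale
        samples.append((mu, 1.0 / tau))  # store the mean and the variance
    return samples

samples = gibbs(y, 6000, mu=10.0, tau=0.2, rng=rng)
kept = samples[1000:]  # 5000 iterations after a 1000-iteration burn-in
mu_mean = sum(s[0] for s in kept) / len(kept)
var_mean = sum(s[1] for s in kept) / len(kept)
print(mu_mean, var_mean)
```

Gibbs sampling needs no tuning of a proposal step, but it does require that each conditional distribution can be sampled directly, which conjugate priors make possible here.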

20
Obtaining WinBUGS
  • Find the MRC BUGS project on the web (search for WinBUGS)
  • Download educational version (free)
  • Register by email (at site)
  • Install educational version (Instructions at
    site)
  • Follow instructions in reply email to convert to
    production version (free)

21
Preview of example: Using BUGS to estimate a
mean and variance
22
Data and Initial Values for Mean-Variance Problem
list(n=50)
y[]
14.1110 9.5125 13.2752 10.5952 8.6699 ...
10.0673 10.7618 8.2337 9.2170 7.3803 8.9194 4.9589
list(mu=10, tau=0.2)
23
Doodle for mean and variance model
24
BUGS Code for Mean and Variance
25
Values of Mean (mu): First 200 iterations of MCMC
Algorithm
26
Values of Variance (Sigma2): First 200 iterations
of MCMC Algorithm
27
MCMC Estimates of Mean and Variance: 5000
iterations after 1000-iteration burn-in.
28
Application to Twin Data
  • Fitting the AE model to bivariate twin data
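
For reference, the covariance structure implied by an AE model can be written out as below. This assumes the standard parameterization (additive genetic effects shared fully by MZ twins and half, on average, by DZ twins, with no shared environment), which the slides do not spell out. Here $i$ indexes pairs, $j$ indexes twins within a pair, and $G$ and $E$ are the 2 x 2 genetic and environmental covariance matrices for the two variables.

```latex
\begin{aligned}
\mathbf{y}_{ij} &= \boldsymbol{\mu} + \mathbf{g}_{ij} + \mathbf{e}_{ij}, \\
\operatorname{Cov}(\mathbf{y}_{ij}) &= G + E, \\
\operatorname{Cov}(\mathbf{y}_{i1}, \mathbf{y}_{i2}) &= G \quad \text{(MZ pairs)}, \\
\operatorname{Cov}(\mathbf{y}_{i1}, \mathbf{y}_{i2}) &= \tfrac{1}{2}\, G \quad \text{(DZ pairs)}.
\end{aligned}
```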

29
Table 1: Population parameter values used in
simulation of bivariate twin data and values
realized using Mx for ML estimation (N = 100 MZ
and 100 DZ pairs).
 
30
Doodle for Multivariate AE Model
31
Start of Data for Bivariate Twin Example
list(N=2, nmz=100, ndz=100, mean=c(0,0),
  precis=structure(.Data=c(0.0001,0,0,0.0001), .Dim=c(2,2)),
  omega.g=structure(.Data=c(0.0001,0,0,0.0001), .Dim=c(2,2)),
  omega.e=structure(.Data=c(0.0001,0,0,0.0001), .Dim=c(2,2)))
ymz[,1,1]  ymz[,1,2]  ymz[,2,1]  ymz[,2,2]
9.9648     9.4397     10.1008    9.6549
8.9251     10.5722    9.5299     10.5583
10.7032    9.9130     11.1373    10.2855
10.8231    11.5187    11.0396    10.7342
11.3261    12.4088    11.4504    11.9600
9.4009     10.7828    9.5653     11.8201
32
Iteration history for estimates of means
33
Iteration History for Genetic Covariances
34
Summary statistics for 5000 MCMC iterations of
bivariate AE model after 2000 iteration "burn in"
node      mean     sd       MC error  2.5%     median   97.5%
deviance  985.7    67.7     2.91      856.7    985.4    1118.0
mu[1]     9.991    0.05906  0.001452  9.877    9.992    10.11
mu[2]     10.05    0.06184  0.001871  9.924    10.05    10.17
g[1,1]    0.705    0.08122  0.003302  0.5567   0.7018   0.8731
g[1,2]    0.3719   0.06418  0.0024    0.252    0.3711   0.5065
g[2,2]    0.7394   0.08337  0.002786  0.5885   0.7367   0.9113
e[1,1]    0.197    0.02872  0.001223  0.1476   0.1943   0.2606
e[1,2]    0.09904  0.02398  9.144E-4  0.05537  0.09788  0.1486
e[2,2]    0.2581   0.03528  0.00123   0.1977   0.2555   0.3337
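
Most columns of a summary like this are simple functions of the retained samples for each node. The sketch below shows how the mean, sd, and percentile columns could be computed from one chain. The chain here is simulated (an assumption, since the real samples are not in the slides), the function name is hypothetical, and the MC error column is left out because it needs a batch-based calculation.

```python
import random
import statistics

def summarize(chain):
    """Posterior mean, sd, and 2.5%, 50%, 97.5% points from retained MCMC samples."""
    cuts = statistics.quantiles(chain, n=40)  # 39 cut points at 2.5%, 5%, ..., 97.5%
    return {
        "mean": statistics.mean(chain),
        "sd": statistics.stdev(chain),
        "2.5%": cuts[0],
        "median": cuts[19],
        "97.5%": cuts[38],
    }

# Simulated stand-in for 5000 retained samples of one node (not the real chain)
rng = random.Random(7)
chain = [rng.gauss(10.0, 0.06) for _ in range(5000)]

summary = summarize(chain)
for name, value in summary.items():
    print(name, round(value, 4))
```

The interval between the 2.5% and 97.5% points is the Bayesian counterpart of the confidence interval that is harder to obtain with ML.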
35
Comparison of ML and MCMC Estimates for Bivariate
AE model
Parameter ML MCMC
Mu(1) 10.00 10.00
Mu(2) 10.05 10.05
G(1,1) 0.704 0.705
G(1,2) 0.371 0.372
G(2,2) 0.741 0.739
E(1,1) 0.194 0.197
E(1,2) 0.098 0.099
E(2,2) 0.254 0.258
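
One common use of such estimates, not shown on the slides, is to derive standardized quantities. The sketch below takes the MCMC column of the comparison table and computes, under the usual AE definitions (an assumption here), the heritability of each variable as G/(G+E) and the genetic and environmental correlations between the two variables. Variable names are hypothetical.

```python
import math

# MCMC estimates from the comparison table
G = {(1, 1): 0.705, (1, 2): 0.372, (2, 2): 0.739}  # genetic (co)variances
E = {(1, 1): 0.197, (1, 2): 0.099, (2, 2): 0.258}  # environmental (co)variances

# Heritability: genetic share of the total variance of each variable
h2_var1 = G[(1, 1)] / (G[(1, 1)] + E[(1, 1)])
h2_var2 = G[(2, 2)] / (G[(2, 2)] + E[(2, 2)])

# Correlations: covariance scaled by the two standard deviations
r_g = G[(1, 2)] / math.sqrt(G[(1, 1)] * G[(2, 2)])
r_e = E[(1, 2)] / math.sqrt(E[(1, 1)] * E[(2, 2)])

print(round(h2_var1, 3), round(h2_var2, 3), round(r_g, 3), round(r_e, 3))
```

With MCMC the same calculation can be applied to every retained sample, which gives a full distribution for each derived quantity instead of a single point estimate.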
36
Illustrative MCMC estimates of genetic effects:
first two DZ twin pairs on two variables
Observation  Est    S.e.    MC error  2.5%   Median  97.5%
g1dz[1,1,1]  11.53  0.3756  0.007243  10.77  11.53   12.27
g1dz[1,1,2]  10.94  0.4263  0.007021  10.1   10.94   11.78
g1dz[1,2,1]  11.44  0.3763  0.006501  10.7   11.43   12.18
g1dz[1,2,2]  10.93  0.4286  0.007602  10.08  10.95   11.76
g1dz[2,1,1]  9.231  0.3776  0.007866  8.499  9.228   9.992
g1dz[2,1,2]  10.25  0.4248  0.007331  9.409  10.25   11.09
g1dz[2,2,1]  9.331  0.373   0.005969  8.606  9.329   10.07
g1dz[2,2,2]  9.505  0.4188  0.007975  8.694  9.5     10.31