Title: Continuous heterogeneity
1Continuous heterogeneity
- Shaun Purcell
- Boulder Twin Workshop
- March 2004
2Raw data VS summary statistics
- Zyg T1 T2
- 1 1.2 0.8
- 1 -1.3 -2.2
- 2 0.7 1.9
- 2 0.2 -0.8
- .. ... ...
3Raw data VS summary statistics
- Zyg T1 T2
- 1 1.2 0.8
- 1 -1.3 -2.2
- 2 0.7 1.9
- 2 0.2 -0.8
- .. ... ...
4Raw data VS summary statistics
- Zyg T1 T2 age
- 1 1.2 0.8 12.3
- 1 -1.3 -2.2 10.3
- 2 0.7 1.9 8.7
- 2 0.2 -0.8 14.5
- .. ... ... ...
5Bivariate normal distribution
6Introducing Definition variables
- Zygosity as a definition variable
- Rectangular file data.raw
1 1 0.361769 -0.35641
2 1 0.888986 1.46342
3 1 0.535161 0.636073
...
1 2 0.234099 0.0848318
2 2 -0.547252 -0.22976
3 2 -0.307926 -0.253692
...
7- ! Using definition variables
- Group1 Defines Matrices
- Data Calc NGroups=2
- Begin Matrices;
- X Lower 1 1 free
- Y Lower 1 1 free
- Z Lower 1 1 free
- M full 1 1 free
- H Full 1 1
- End Matrices;
- Begin Algebra;
- A = X*X' ; C = Y*Y' ; E = Z*Z' ;
- End Algebra;
- Ma X 0
- Ma Y 0
- Ma Z 1
- Ma M 0
- Options MX%P=rawfit.txt
- End
8Output from zyg.mx
- RE FILE=DATA.RAW
- Rectangular continuous data read initiated
-
- NOTE Rectangular file contained 500 records with data
- that contained a total of 2000 observations
-
- LABELS ID ZYG T1 T2
- SELECT T1 T2 ZYG /
- DEFINITION ZYG /
-
- NOTE Selection yields 500 data vectors for analysis
- NOTE Vectors contain a total of 1500 observations
-
- NOTE Definition yields 500 data vectors for analysis
- NOTE Vectors contain a total of 1000 observations
9Output from zyg.mx
- Summary of VL file data for group 2
-
- ZYG T1 T2
- Code -1.0000 1.0000 2.0000
- Number 500.0000 500.0000 500.0000
- Mean 1.5000 -0.0140 0.0240
- Variance 0.2500 0.5601 0.5211
- Minimum 1.0000 -2.1941 -1.9823
- Maximum 2.0000 2.1218 2.7670
10Output from zyg.mx
- MATRIX H
- This is a FULL matrix of order 1 by 1
- 1
- 1 -1
-
- MATRIX M
- This is a FULL matrix of order 1 by 1
- 1
- 1 4
-
- MATRIX X
- This is a LOWER TRIANGULAR matrix of order 1 by 1
- 1
- 1 1
-
- MATRIX Y
- This is a LOWER TRIANGULAR matrix of order 1 by 1
- 1
- 1 2
Specify H -1
11Output from zyg.mx
- Your model has 4 estimated parameters and 1000 Observed statistics
-
- -2 times log-likelihood of data >>> 2134.998
- Degrees of freedom >>>>>>>>>>>>>>>> 996
- Fixing X to zero
- Your model has 3 estimated parameters and 1000 Observed statistics
-
- -2 times log-likelihood of data >>> 2154.626
- Degrees of freedom >>>>>>>>>>>>>>>> 997
-
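The two fits above differ by one parameter (X fixed to zero), so they can be compared with a likelihood-ratio test. A minimal Python sketch follows, assuming the difference in -2LL is referred to a chi-square distribution with 1 df; the function name lrt_1df is hypothetical and is not part of Mx.

```python
import math

def lrt_1df(minus2ll_full, minus2ll_reduced):
    """Likelihood-ratio test for one dropped parameter (1 df).

    Returns the chi-square statistic and its p-value. For 1 df the
    chi-square upper tail probability equals erfc(sqrt(x / 2)).
    """
    stat = minus2ll_reduced - minus2ll_full
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# -2LL values copied from the zyg.mx output above
stat, p = lrt_1df(2134.998, 2154.626)
print(f"chi-square(1) = {stat:.3f}, p = {p:.2g}")
```

The difference in degrees of freedom (997 - 996 = 1) matches the single parameter that was fixed.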
12Continuous moderators
- Traits often best defined continuously
- Many environmental moderators also likely to be continuous in nature
- Age
- Gestational age
- Socio-economic status
- Educational level
- Consumption of food / alcohol / drugs
- How to test for G x E interaction in this case?
13Continuous moderators
[Figure: heritability (0 to 100) plotted against age (4, 6, 8, 10 yrs)]
- Problems?
- Stratification of sample leads to reduced sample size
- Modelling proportions of variance
- implicitly assumes equality of variance w.r.t. moderator
- Logical to assume a linear G × E interaction
- linearity at the level of effect, not variance
- No obvious statistical test for heterogeneity
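To illustrate the point about proportions of variance, here is a small Python sketch in which only the E path changes with the moderator. The parameter values and the function name variance_components are illustrative assumptions, not taken from the slides. The raw A variance stays constant, yet the proportion h2 changes because the total variance changes.

```python
def variance_components(a, c, e, beta_e, m):
    """Raw A, C, E variances and heritability at moderator value m.

    Only the E path is moderated here (e + beta_e * m); this is an
    illustrative assumption chosen to isolate the effect on h2.
    """
    va = a ** 2
    vc = c ** 2
    ve = (e + beta_e * m) ** 2
    total = va + vc + ve
    return {"A": va, "C": vc, "E": ve, "total": total, "h2": va / total}

for m in (-1.0, 0.0, 1.0):
    out = variance_components(a=1.0, c=0.5, e=1.0, beta_e=0.5, m=m)
    print(f"M={m:+.0f}  A={out['A']:.2f}  total={out['total']:.2f}  h2={out['h2']:.2f}")
```

Comparing h2 across strata would suggest a G × E effect here even though the genetic variance itself does not depend on M.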
14Biometrical G × E model
- At a hypothetical single locus
- additive genetic value a
- allele frequency p
- QTL variance 2p(1-p)a^2
- Assuming a linear interaction
- additive genetic value a + βM
- allele frequency p
- QTL variance 2p(1-p)(a + βM)^2
15Biometrical G × E model
[Figure: genotypic values (a, 0, -a) of genotypes AA, Aa and aa plotted against the moderator M; panel shown: No interaction]
16Model-fitting approach to GxE
[Path diagram: latent factors A, C and E for each of Twin 1 and Twin 2, with path coefficients a, c and e]
17Model-fitting approach to GxE
[Path diagram: latent factors A, C and E for each of Twin 1 and Twin 2, with path coefficients a + β_X M, c and e]
- Continuous moderator variable M
- Can be coded 0 / 1 in the dichotomous case
18Individual specific moderators
[Path diagram: latent factors A, C and E for each twin; paths a + β_X M1, c, e for Twin 1 and a + β_X M2, c, e for Twin 2]
19E x E interactions
[Path diagram: latent factors A, C and E for each twin; paths a + β_X M1, c + β_Y M1, e + β_Z M1 for Twin 1 and a + β_X M2, c + β_Y M2, e + β_Z M2 for Twin 2]
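20ACE - XYZ - M
[Path diagram: latent factors A, C and E for each twin; paths a + β_X M1, c + β_Y M1, e + β_Z M1 for Twin 1 and a + β_X M2, c + β_Y M2, e + β_Z M2 for Twin 2; the moderator M also enters each twin's mean as m + β_M M1 and m + β_M M2]
- Main effects and moderating effects statistically and conceptually distinct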
21Model-fitting approach to GxE
[Figure: components of variance A, C and E plotted against the moderator variable]
22Turkheimer et al (2003)
- 320 twin pairs recruited at birth from urban hospitals
- G: additive genetic variance
- E: SES
- parental education, occupation, income
- X: IQ
- Wechsler Verbal, Performance, Full
23
[Figure panels: Full scale IQ, Verbal IQ, Non-Verbal IQ; each showing A, C and E]
24Standard model
- Means vector
- Covariance matrix
25Allowing for a main effect of X
- Means vector
- Covariance matrix
26- ! Basic model + main effect of a definition variable
- G1 Define Matrices
- Data Calc NGroups=3
- Begin Matrices;
- A full 1 1 free
- C full 1 1 free
- E full 1 1 free
- M full 1 1 free ! grand mean
- B full 1 1 free ! moderator-linked means model
- H full 1 1
- R full 1 1 ! twin 1 moderator (definition variable)
- S full 1 1 ! twin 2 moderator (definition variable)
- End Matrices;
- Ma M 0
- Ma B 0
- Ma A 1
- Ma C 1
- Ma E 1
- Matrix H .5
27- G2 MZ
- Data NInput_vars=6 NObservations=0
- Missing=-999
- RE File=f1.dat
- Labels id zyg p1 p2 m1 m2
- Select if zyg = 1 /
- Select p1 p2 m1 m2 /
- Definition m1 m2 /
- Matrices = Group 1
- Means M + B*R | M + B*S /
- Covariance
- A*A' + C*C' + E*E' | A*A' + C*C' _
- A*A' + C*C' | A*A' + C*C' + E*E' /
- ! twin 1 moderator variable
- Specify R -1
- ! twin 2 moderator variable
- Specify S -2
28- G3 DZ
- Data NInput_vars=6 NObservations=0
- Missing=-999
- RE File=f1.dat
- Labels id zyg p1 p2 m1 m2
- Select if zyg = 2 /
- Select p1 p2 m1 m2 /
- Definition m1 m2 /
- Matrices = Group 1
- Means M + B*R | M + B*S /
- Covariance
- A*A' + C*C' + E*E' | H@A*A' + C*C' _
- H@A*A' + C*C' | A*A' + C*C' + E*E' /
- ! twin 1 moderator variable
- Specify R -1
- ! twin 2 moderator variable
- Specify S -2
29- MATRIX A
- This is a FULL matrix of order 1 by 1
- 1
- 1 1.3228
-
- MATRIX B
- This is a FULL matrix of order 1 by 1
- 1
- 1 0.3381
-
- MATRIX C
- This is a FULL matrix of order 1 by 1
- 1
- 1 1.1051
-
- MATRIX E
- This is a FULL matrix of order 1 by 1
- 1
- 1 0.9728
30- MATRIX A
- This is a FULL matrix of order 1 by 1
- 1
- 1 1.3078
-
- MATRIX B
- This is a FULL matrix of order 1 by 1
- 1
- 1 0.0000
-
- MATRIX C
- This is a FULL matrix of order 1 by 1
- 1
- 1 1.1733
-
- MATRIX E
- This is a FULL matrix of order 1 by 1
- 1
- 1 0.9749
31Continuous heterogeneity model
- Means vector
- Covariance matrix
32- ! GxE - Basic model
- G1 Define Matrices
- Data Calc NGroups=3
- Begin Matrices;
- A full 1 1 free
- C full 1 1 free
- E full 1 1 free
- T full 1 1 free ! moderator-linked A component
- U full 1 1 free ! moderator-linked C component
- V full 1 1 free ! moderator-linked E component
- M full 1 1 free ! grand mean
- B full 1 1 free ! moderator-linked means model
- H full 1 1
- R full 1 1 ! twin 1 moderator (definition variable)
- S full 1 1 ! twin 2 moderator (definition variable)
- End Matrices;
- Ma T 0
- Ma U 0
- Ma V 0
33- G2 MZ
- Data NInput_vars=6 NObservations=0
- Missing=-999
- RE File=f1.dat
- Labels id zyg p1 p2 m1 m2
- Select if zyg = 1 /
- Select p1 p2 m1 m2 /
- Definition m1 m2 /
- Matrices = Group 1
- Means M + B*R | M + B*S /
- Covariance
- (A+T*R)*(A+T*R) + (C+U*R)*(C+U*R) + (E+V*R)*(E+V*R) |
- (A+T*R)*(A+T*S) + (C+U*R)*(C+U*S) _
- (A+T*S)*(A+T*R) + (C+U*S)*(C+U*R) |
- (A+T*S)*(A+T*S) + (C+U*S)*(C+U*S) + (E+V*S)*(E+V*S) /
- ! twin 1 moderator variable
- Specify R -1
34- G3 DZ
- Data NInput_vars=6 NObservations=0
- Missing=-999
- RE File=f1.dat
- Labels id zyg p1 p2 m1 m2
- Select if zyg = 2 /
- Select p1 p2 m1 m2 /
- Definition m1 m2 /
- Matrices = Group 1
- Means M + B*R | M + B*S /
- Covariance
- (A+T*R)*(A+T*R) + (C+U*R)*(C+U*R) + (E+V*R)*(E+V*R) |
- H@(A+T*R)*(A+T*S) + (C+U*R)*(C+U*S) _
- H@(A+T*S)*(A+T*R) + (C+U*S)*(C+U*R) |
- (A+T*S)*(A+T*S) + (C+U*S)*(C+U*S) + (E+V*S)*(E+V*S) /
- ! twin 1 moderator variable
- Specify R -1
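The expected means and covariances that the MZ and DZ groups specify can be sketched in Python for a single twin pair. This is a minimal illustration under the assumption that each path is moderated linearly, as in the script; the function name expected_moments is hypothetical, and the lower-case arguments mirror the Mx matrices (a, c, e, t, u, v, b; mean for M; m1 and m2 for the definition variables R and S).

```python
def expected_moments(a, c, e, t, u, v, mean, b, m1, m2, zyg):
    """Expected means and 2x2 covariance matrix for one twin pair.

    Paths are moderated linearly: (a + t*M), (c + u*M), (e + v*M).
    zyg is 1 for MZ pairs and 2 for DZ pairs; DZ pairs share half
    of the additive genetic covariance (the role of H = .5).
    """
    r_a = 1.0 if zyg == 1 else 0.5
    a1, a2 = a + t * m1, a + t * m2
    c1, c2 = c + u * m1, c + u * m2
    e1, e2 = e + v * m1, e + v * m2
    var1 = a1 * a1 + c1 * c1 + e1 * e1
    var2 = a2 * a2 + c2 * c2 + e2 * e2
    cov12 = r_a * a1 * a2 + c1 * c2
    means = [mean + b * m1, mean + b * m2]
    cov = [[var1, cov12], [cov12, var2]]
    return means, cov

# With t = u = v = 0 the model reduces to the standard ACE model.
means, cov = expected_moments(1, 1, 1, 0, 0, 0, 0.0, 0.3, -1.0, 2.0, zyg=2)
print(means, cov)
```

Because m1 and m2 differ between pairs, every pair has its own expected means and covariance matrix, which is why the model is fitted to raw data rather than to summary statistics.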
35Practical 1
- The script mod.mx
- The data f1.dat
- ID zygosity trait_twin_1 trait_twin_2 mod_twin_1 mod_twin_2
- Any evidence for G × E for this trait?
- i.e. does the A latent variable show heterogeneity with respect to the moderator variable?
- If so, in what way?
- i.e. how would you interpret/describe the effect?
36Practical 1 f1.dat
[Figure panels: moderator distribution; MZ pairs (trait); DZ pairs (trait); all twin 1s vs. moderator]
37nomod.mx
- a = 1.3078, a^2 = 1.7
- c = 1.1733, c^2 = 1.4
- e = 0.9749, e^2 = 0.95
- a^2 + c^2 + e^2 = 4.05
- i.e. the A, C and E shares of variance are 42%, 35% and 23%
38Parameter estimates mod.mx
39Plotting VCs
- For the additive genetic VC, for example
- Given a, β and a range of values for the moderator variable, plot (a + βM)^2
- For example,
- a = 0.5, β = -0.2 and M ranges from -2 to 2
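The example values on this slide can be tabulated with a few lines of Python. This is a minimal sketch: the function name moderated_vc and the step size of 0.5 are assumptions made for illustration, and beta stands for the moderation coefficient (-0.2 in the slide's example).

```python
def moderated_vc(a, beta, m):
    """Additive genetic variance component at moderator value m."""
    return (a + beta * m) ** 2

a, beta = 0.5, -0.2
# Moderator values from -2 to 2 in steps of 0.5 (step size is an assumption)
moderator_values = [-2 + 0.5 * i for i in range(9)]
curve = [(m, moderated_vc(a, beta, m)) for m in moderator_values]

for m, vc in curve:
    print(f"M = {m:+.1f}  VC_A = {vc:.4f}")
```

The resulting pairs can be passed to any plotting tool. With these values the component falls from about 0.81 at M = -2 to about 0.01 at M = 2.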
41Specific test of G × E
42Other tests
All made against the full model ACE-XYZ-M, -2LL = 3024.689
43Confidence intervals
- Easy to get CIs for individual parameters
- Additionally, CIs on the moderated VCs are useful for interpretation
- e.g. a 95% CI for (a + βM)^2, for a specific M
44- Define two extra vectors in Group 1
- P full 1 13
- O Unit 1 13
- Matrix P -3 -2.5 -2 -1.5 -1 -0.5 0 0.5 1 1.5 2 2.5 3
- Add a 4th group to calculate the CIs
- CIs
- Calc
- Matrices = Group 1
- Begin Algebra;
- F = ( A@O + T@P ) . ( A@O + T@P ) /
- G = ( C@O + U@P ) . ( C@O + U@P ) /
- I = ( E@O + V@P ) . ( E@O + V@P ) /
- End Algebra;
- Interval @ 95 F 1 1 to F 1 13
- Interval @ 95 G 1 1 to G 1 13
- Interval @ 95 I 1 1 to I 1 13
45Calculation of CIs
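- F = ( A@O + T@P ) . ( A@O + T@P ) /
- E.g. for a row vector P of moderator values, A@O repeats A once per column and T@P multiplies each element of P by T
- so ( A@O + T@P ) is the row vector with elements A + T*P_i
- Finally, the dot-product squares all elements to give (A + T*P_i)^2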
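The algebra for F can be mirrored step by step in plain Python, with a Kronecker product standing in for @ and an element-wise product standing in for the dot operator. This is a sketch only: the helper names are hypothetical, and the values used for A and T (0.5 and -0.2) are illustrative, not estimates from mod.mx.

```python
def kron(x, y):
    """Kronecker product of two matrices given as lists of rows."""
    return [[xv * yv for xv in x_row for yv in y_row]
            for x_row in x for y_row in y]

def add(x, y):
    """Element-wise sum of two equally sized matrices."""
    return [[p + q for p, q in zip(xr, yr)] for xr, yr in zip(x, y)]

def dot_product(x, y):
    """Element-wise product of two equally sized matrices."""
    return [[p * q for p, q in zip(xr, yr)] for xr, yr in zip(x, y)]

P = [[-3 + 0.5 * i for i in range(13)]]   # moderator values, 1 x 13
O = [[1.0] * 13]                          # unit vector, 1 x 13
A = [[0.5]]                               # illustrative value
T = [[-0.2]]                              # illustrative value

paths = add(kron(A, O), kron(T, P))       # elements A + T * P_i
F = dot_product(paths, paths)             # elements (A + T * P_i)^2
print([round(f, 4) for f in F[0]])
```

Each of the 13 elements of F is the moderated A variance at one moderator value, which is what the Interval statements then place confidence limits on.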
46Confidence intervals on VCs
[Figure panels: A, C, E]
47Other considerations
- Simple approach to test for heterogeneity
- easily adapted, e.g. for ordinal data models
- Extensions / things to watch for
- scalar vs. qualitative heterogeneity
- v. low power
- the environment may show shared genetic influence with the trait
- nonlinear effects in both mediation and moderation
48
[Diagram: E and X]
50Turkheimer et al, 2003
[Figure panels: IQ against SES; V(IQ) against SES]
51Simulated twin data
A 3 df test of any moderating effect
Standard analysis: linear means model (in HA and H0)
Quadratic analysis: linear and quadratic means model (in HA and H0)
[Figure: E(Trait) against Moderator, with curves labelled Quadratic and Standard]
52More complex G × E interaction
[Figure: Trait, P(disease), against E-risk]
53Include E-risk in means model
[Figure: Residual Trait, P(disease | E-risk), against E-risk]
54Biometrical model
[Figure: additive genetic effect, in quadratic form, for genotypes AA, Aa and aa plotted against E-risk]
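55ACE - XYZ - X^2Y^2Z^2 - M
[Path diagram: latent factors A, C and E for each twin; A paths a + β_X M1 + β_X2 M1^2 for Twin 1 and a + β_X M2 + β_X2 M2^2 for Twin 2; paths c and e as drawn]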