Missing Heritability

Genome-wide association studies (GWAS) have uncovered thousands of genetic variants associated with hundreds of diseases. However, the variants that reach statistical significance typically explain only a small fraction of the heritability, so heritability estimates obtained from GWAS are much lower than those from traditional quantitative methods. This has been called the “missing heritability problem”.

1
ICAR-Indian Agricultural Statistical Research
Institute
Missing Heritability
Shashank kshandakar Ph.D.(Agricultural
Statistics), Roll No.10684
2
Contents
  • Introduction
  • Dissolving the Missing Heritability Problem
  • Estimation of Heritability from Common Variants
  • Haseman-Elston Regression Method
  • Heritability from Case-control Study
  • Liability Threshold Model
  • Illustration
  • Conclusions
  • References

3
Introduction
  • Heritability is a key genetic parameter that can
    help to understand the genetic architecture of
    complex traits.
  • Heritability (in the narrow sense) is defined as the
    proportion of total phenotypic variation that is
    due to additive genetic factors.
  • In GWAS, the statistically significant variants
    explain only a small fraction of the heritability,
    so the heritability estimates obtained from GWAS
    are much lower than those from traditional
    quantitative methods.

4
Introduction
  • Estimation and knowledge of missing heritability
    is important because disease susceptibility is
    known to be due to genetic factors, and
    understanding this genetic variation may
    contribute to better prevention, diagnosis and
    treatment of disease.
  • Knowledge of missing heritability helps in
    planning research strategies to uncover the
    genetic risk factors (Maher, 2008)

5
Heritability and number of loci for several traits (Manolio et al., 2009)
Disease                  Number of loci   Heritability explained by GWAS (%)
ARMD                     5                50
Crohn's disease          32               20
Fasting glucose          4                1.5
HDL cholesterol          7                5.2
Height                   40               5
Myocardial infarction    9                2.8
Type 2 diabetes          18               6
6
Probable Reasons for Missing heritability
  • Nadeau (2009) pointed out that missing
    heritability may be due to epigenetic factors
  • Epigenetics is the study of heritable changes in
    gene expression that are not caused by changes in
    DNA sequence
  • Epigenetic modifications are influenced by
    environmental factors such as smoking, stress
    and nutrients
  • Trans-generational epigenetic inheritance
    that contributes to disease risk would not be
    detectable in GWAS, but may contribute to average
    risk and to similarities among relatives
    (Richards, 2006)

7
Probable Reasons for Missing heritability
  • Manolio et al. (2009) termed missing heritability
    the "dark matter" of GWAS: dark matter in the
    sense that one is sure it exists and can detect its
    influence, but simply cannot see it (yet)
  • Kong (2009) suggests that rare variants
    contribute to missing heritability in two ways:
    first, they are more difficult to discover, and
    second, even if discovered, their contribution to
    heritability would be underestimated when
    evaluated under models that do not take parental
    origin into account

8
Probable Reasons for Missing heritability
  • Eichler et al. (2010) explored the impact of
    large variants (deletions, duplications and
    inversions) that are individually rare but
    collectively common in the population
  • Rare variants, which can potentially affect many
    genes and various biological pathways in an
    individual organism, are inaccessible to most
    existing genotyping and sequencing technologies
    (Zuk et al. 2014)

9
Dissolving the Missing Heritability Problem by
Traditional Method
  • Heritability estimation from family study
  • Twin study
  • Parents-offspring regression
  • Heritability estimation from GWAS study

10
Estimation of Missing Heritability
  • In quantitative genetics, we assume the absence of
    gene-environment interaction and correlation
    (Falconer and Mackay, 1996)
  • $V_P = V_G + V_E$ and $V_G = V_A + V_D + V_I$
  • $H^2 = V_G / V_P$ and $h^2 = V_A / V_P$
  • where $V_P$ is the phenotypic variance of a
    population
  • $V_G$ is the genotypic variance
  • $V_E$ is the environmental variance
  • $V_A$ is the additive genetic variance
  • $V_D$ is the dominance variance
  • $V_I$ is the epistatic variance

11
Estimation
  • $Cov(P_1, P_2) = Cov(A_1 + D_1 + I_1 + E_1,\ A_2 + D_2 + I_2 + E_2)$
  • $= Cov(A_1, A_2) + Cov(D_1, D_2) + Cov(I_1, I_2) + Cov(E_1, E_2)$
  • Twin studies are the most commonly used
    traditional method for estimating heritability
    (Pierrick and Lu, 2016).
  • Twin studies are a special type of
    epidemiological study designed to measure the
    contribution of genes, as opposed to the
    environment, to a given trait.
  • Monozygotic and dizygotic twins share almost 100%
    and, on average, 50% of their genetic material
    respectively.

12
Estimation
  • The environment is typically divided into
  • Shared Environment (C) - the part of the
    environment that affects both twins in the same
    way
  • Unique Environment (U) - the part of the
    environment that affects one twin but not the
    other
  • (Silventoinen et al. 2003)
  • In the absence of interaction and correlation
    between C and U, we have
  • $E = C + U$

13
Estimation
  • Assuming epistatic effects to be negligible
    (an assumption in twin studies),
  • $Cov(P_{T1}, P_{T2}) = Cov(A_{T1} + D_{T1} + C_{T1} + U_{T1},\ A_{T2} + D_{T2} + C_{T2} + U_{T2})$
  • $= Cov(A_{T1}, A_{T2}) + Cov(D_{T1}, D_{T2}) + Cov(C_{T1}, C_{T2}) + Cov(U_{T1}, U_{T2})$
  • where the indices T1 and T2 represent the two
    twins of each twin pair studied
  • $Cov(U_{T1}, U_{T2})$ is zero for both monozygotic and
    dizygotic twins, as each twin's unique environment
    is by definition independent of that of the other
    twin.

14
Estimation
  • Variance is a special case of covariance in which the
    two variables are identical, and for
    monozygotic twins $A_{T1}$, $D_{T1}$ and $C_{T1}$ equal $A_{T2}$,
    $D_{T2}$ and $C_{T2}$ respectively, so
  • $Cov_{MT}(P_{T1}, P_{T2}) = V_A + V_D + V_C$
    $Cov_{DT}(P_{T1}, P_{T2}) = \frac{1}{2} V_A + \frac{1}{4} V_D + V_C$
  • $H^2_{TS} = \frac{2\,Cov_{MT}(P_{T1}, P_{T2}) - 2\,Cov_{DT}(P_{T1}, P_{T2})}{V_P} = \frac{V_A + \frac{3}{2} V_D}{V_P}$
  • where $H^2_{TS}$ is termed the broad-sense heritability
    from twin studies; the resulting estimate is an
    accurate estimate of neither $H^2$ nor $h^2$,
    although it is closer to $H^2$ than to $h^2$
    (Falconer and Mackay, 1996).
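
The twin estimator can be checked numerically. The R lines below are a minimal sketch: the variance components are made-up values chosen only for illustration, and the variable names are not from the slides. They build the monozygotic and dizygotic covariances and show that twice their difference, divided by the phenotypic variance, equals (V_A + 3/2 V_D)/V_P.

```r
# Hypothetical variance components, chosen only for illustration
VA <- 0.4; VD <- 0.1; VC <- 0.2; VU <- 0.3
VP <- VA + VD + VC + VU               # phenotypic variance (epistasis ignored)

cov_mt <- VA + VD + VC                # covariance between monozygotic twins
cov_dt <- VA / 2 + VD / 4 + VC        # covariance between dizygotic twins

H2_ts <- 2 * (cov_mt - cov_dt) / VP   # twin-study estimate
h2    <- VA / VP                      # narrow-sense heritability
H2    <- (VA + VD) / VP               # broad-sense heritability (no epistasis)

print(c(H2_ts = H2_ts, h2 = h2, H2 = H2))
```

With dominance variance present, the twin estimate matches neither h2 nor H2, which is the point made on the slide.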

15
Estimation
  • The covariance between the trait values of parents
    (one parent or the mean of both) and the mean of
    their offspring is (Falconer and Mackay, 1996)
  • $Cov(P_P, P_O) = Cov(A_P + D_P + I_P + E_P,\ A_O + D_O + I_O + E_O)$
  • $= Cov(A_P, A_O) + Cov(D_P, D_O) + Cov(I_P, I_O) + Cov(E_P, E_O)$
  • Doolittle (2012) assumes that $Cov(D_P, D_O)$ and
    $Cov(E_P, E_O)$ are zero
  • Environments experienced by individuals are
    likely to be more similar within a family line,
    so $Cov(E_P, E_O)$ may be non-zero (Guo et al.
    2014)

16
Estimation
  • The covariance between parents and their offspring
    equals half of the additive genetic variance plus a
    variance term representing effects due to
    dominance and similarities between environments
  • $Cov(P_P, P_O) = Cov(A_P, A_O) + Cov(D_P, D_O) + Cov(E_P, E_O) = \frac{1}{2} V_A + V_{DEC}$
  • $H^2_{PO} = \frac{2\,Cov(P_P, P_O)}{V_P} = \frac{V_A + 2 V_{DEC}}{V_P}$
  • Heritability estimates from both twin studies and
    parent-offspring regression include an extra term
    compared with $h^2$, but they do not correspond
    to $H^2$
  • $H^2 = h^2 + h^2_{other}$
  • where $h^2_{other}$ is the part of heritability
    contributed by the extra component(s)
    representing non-additive variance.

17
Estimation
  • Some epigenetic factors can lead to additive
    genetic effects (Pierrick et al. 2017). Their
    additive variance ($V_A^{epi}$) should be added to
    the additive variance of DNA sequences
    ($V_A^{DNA}$) to obtain $V_A$. Assuming there is no
    interaction between $V_A^{DNA}$ and $V_A^{epi}$,
  • $V_A = V_A^{DNA} + V_A^{epi}$
  • $h^2 = \frac{V_A^{DNA}}{V_P} + \frac{V_A^{epi}}{V_P}$
  • $= h^2_{DNA} + h^2_{epi}$
  • $h^2_{epi} = h^2 - h^2_{DNA}$

18
Estimation
  • Missing heritability (MH) equals the difference
    between the estimates obtained by traditional
    quantitative methods ($H^2$) and the estimates
    obtained by GWAS ($h^2_{DNA}$). Thus,
  • $MH = H^2 - h^2_{DNA}$
  • $= h^2_{other} + h^2_{epi}$
  • Missing heritability results from the part of
    heritability originating from epigenetic factors
    stably transmitted across generations, plus the
    part of heritability originating from
    non-additive factors.
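
The decomposition can be written out step by step. This is a sketch under two assumptions taken from the preceding slides: that the traditional estimate splits as H^2 = h^2 + h^2_other, and that the GWAS estimate captures only the DNA-sequence part of h^2 = h^2_DNA + h^2_epi. The subscript labels are reconstructed and may differ from the original slides.

```latex
\begin{align}
MH &= H^2 - h^2_{DNA} \\
   &= \left(h^2 + h^2_{other}\right) - h^2_{DNA} \\
   &= \left(h^2_{DNA} + h^2_{epi} + h^2_{other}\right) - h^2_{DNA} \\
   &= h^2_{other} + h^2_{epi}
\end{align}
```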

19
Estimation of Heritability from Common Variants
20
Estimation of heritability from common variants
  • SNPs identified by GWAS explain only a small
    fraction of the heritability
  • Genome-wide complex trait analysis (GCTA; Yang et
    al. 2011) addresses the missing heritability
    problem by estimating the variance explained by
    all the SNPs together.
  • The basic idea behind GCTA is to fit the
    effects of all the SNPs as random effects, using
    a linear mixed model (Hayes et al. 2009) and the
    H-E regression method (Haseman and Elston, 1972)

21
Haseman-Elston regression method
  • An unbiased estimator of $\sigma_g^2$ is provided
    by the Haseman-Elston regression method
  • Let $Y_j = (x_{1j} - x_{2j})^2$ be the squared pair
    difference for the jth sib pair
  • $x_{1j} = \mu + g_{1j} + e_{1j}$, $x_{2j} = \mu + g_{2j} + e_{2j}$
  • $g_{ij} = a,\ d,\ -a$ for BB, Bb and bb individuals
  • $\pi_j$ (0, 1/2 or 1) is the proportion of
    genes IBD for the jth sib pair
  • Conditional expectation of the squared pair
    differences
  • $E(Y_j \mid \pi_j) = (\sigma_e^2 + 2\sigma_g^2) - 2\sigma_g^2\,\pi_j = \alpha + \beta\pi_j$
  • where $\alpha = \sigma_e^2 + 2\sigma_g^2$, $\beta = -2\sigma_g^2$, so
    $\sigma_g^2 = -\beta/2$
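
The regression can be tried on simulated sib pairs. The sketch below is illustrative only: the sample size, variances and IBD probabilities (1/4, 1/2, 1/4) are assumptions, and genetic values are drawn as normal variables with correlation equal to the IBD proportion rather than from an explicit BB/Bb/bb locus. It regresses squared pair differences on the IBD proportion and recovers the genetic variance as minus half the slope.

```r
set.seed(1)
n        <- 20000     # number of sib pairs (assumed)
sigma_g2 <- 0.6       # genetic variance to be recovered (assumed)
sigma_e2 <- 0.4       # residual variance per individual (assumed)

# Proportion of genes shared IBD by each pair: 0, 1/2 or 1
pi_j <- sample(c(0, 0.5, 1), n, replace = TRUE, prob = c(0.25, 0.5, 0.25))

# Genetic values with variance sigma_g2 and covariance pi_j * sigma_g2
shared <- rnorm(n); own1 <- rnorm(n); own2 <- rnorm(n)
g1 <- sqrt(sigma_g2) * (sqrt(pi_j) * shared + sqrt(1 - pi_j) * own1)
g2 <- sqrt(sigma_g2) * (sqrt(pi_j) * shared + sqrt(1 - pi_j) * own2)

x1 <- g1 + rnorm(n, sd = sqrt(sigma_e2))
x2 <- g2 + rnorm(n, sd = sqrt(sigma_e2))

y   <- (x1 - x2)^2                    # squared pair difference
fit <- lm(y ~ pi_j)                   # E(Y | pi) = alpha + beta * pi
sigma_g2_hat <- -coef(fit)[["pi_j"]] / 2
print(sigma_g2_hat)
```

In this simulation the intercept estimates 2 * sigma_e2 + 2 * sigma_g2, because the residual term of the pair difference has twice the per-individual residual variance.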

22
Estimation of heritability from common variants
  • In GWAS, the association between an individual SNP
    and the trait is represented by the following
    simple regression model
  • $y_j = \mu + x_{ij} a_i + e_j$
  • $e_j \sim N(0, \sigma_e^2)$
  • where $y_j$ is the phenotypic value of the jth
    individual; $\mu$ is the general mean; $a_i$ is the
    allele substitution effect of the ith SNP; $x_{ij}$ is
    an indicator variable that takes the value 0, 1 or
    2 if the genotype of the jth individual at the ith
    SNP is bb, Bb or BB respectively; and $e_j$ is the
    residual effect.

23
Estimation of heritability from common variants
  • If m causal variants are genotyped, the
    model is
  • $y_j = \mu + g_j + e_j$ and $g_j = \sum_{i=1}^{m} z_{ij} u_i$
  • where $g_j$ is the total genetic effect of the jth
    individual; m is the number of causal loci; $u_i$ is
    the additive effect of the ith causal variant; and
    $z_{ij}$ is the element of the design matrix relating
    the causal allele to the trait, which depends on the
    genotype of the jth individual at the ith locus:
  • $z_{ij} = -2p_i / \sqrt{2p_i(1-p_i)}$ for qq
  • $z_{ij} = (1-2p_i) / \sqrt{2p_i(1-p_i)}$ for Qq
  • $z_{ij} = 2(1-p_i) / \sqrt{2p_i(1-p_i)}$ for QQ

24
Estimation of heritability from common variants
  • In matrix notation,
  • $y = \mu 1 + g + e$ and $g = Zu$
  • The variance-covariance matrix of y (the vector of
    observations) can be expressed as
  • $var(y) = ZZ'\sigma_u^2 + I\sigma_e^2 = \frac{ZZ'}{m}\sigma_g^2 + I\sigma_e^2 = G\sigma_g^2 + I\sigma_e^2$
  • $u \sim N(0, I\sigma_u^2)$ and $g_j \sim N(0, \sigma_g^2)$, with $\sigma_g^2 = m\sigma_u^2$
  • where $\sigma_u^2$ is the variance of the causal
    (random) effects; $\sigma_g^2$ is the variance of
    total additive genetic effects; I is an n x n
    identity matrix; and $G = ZZ'/m$ is the genetic
    relationship matrix between pairs of individuals
    at the causal loci.

25
Estimation of heritability from common variants
  • The number and positions of the causal variants
    are not known exactly, so the G matrix cannot be
    obtained directly
  • The genome-wide relationship matrix (A) is
    obtained from a genome-wide sample of SNPs
  • $A = \frac{WW'}{N}$, with $w_{ij} = \frac{x_{ij} - 2p_i}{\sqrt{2p_i(1-p_i)}}$
  • where W is the standardized genotype matrix with
    ijth element $w_{ij}$, $x_{ij}$ is the number of copies
    of the allele for the ith SNP of the jth individual,
    $p_i$ is the frequency of the allele and N is the
    number of SNPs

26
Estimation of heritability from common variants
  • The genome-wide relationship matrix (A) between
    individuals j and k can be estimated by the
    following equations
  • $A_{jk} = \frac{1}{N}\sum_{i=1}^{N} \frac{(x_{ij} - 2p_i)(x_{ik} - 2p_i)}{2p_i(1-p_i)}$
    when $j \neq k$
  • $A_{jk} = 1 + \frac{1}{N}\sum_{i=1}^{N} \frac{x_{ij}^2 - (1 + 2p_i)x_{ij} + 2p_i^2}{2p_i(1-p_i)}$
    when $j = k$
  • $G_{jk}$ is not known, so to fit the model and estimate
    the genetic variance ($\sigma_g^2$), A is used
    as the estimate of the relationship matrix at the
    causal loci.
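
The relationship matrix can be computed directly from a genotype matrix. The R sketch below is hedged: the genotypes are simulated, the allele frequencies are assumed known, and all object names are hypothetical. Off-diagonal elements use the cross-product of standardized genotypes, and the diagonal is then overwritten with the separate j = k expression.

```r
set.seed(1)
n_ind <- 50           # individuals (assumed)
n_snp <- 200          # SNPs, N in the slide notation (assumed)
p <- runif(n_snp, 0.1, 0.5)                      # assumed allele frequencies

# x[i, j]: number of copies (0, 1 or 2) of the allele at SNP i in individual j
x <- matrix(rbinom(n_snp * n_ind, 2, rep(p, times = n_ind)), n_snp, n_ind)

# Standardize each SNP (row): w_ij = (x_ij - 2 p_i) / sqrt(2 p_i (1 - p_i))
w <- (x - 2 * p) / sqrt(2 * p * (1 - p))

# Off-diagonals: A_jk = (1/N) * sum_i w_ij * w_ik
A <- crossprod(w) / n_snp

# Diagonal: A_jj = 1 + (1/N) * sum_i (x^2 - (1 + 2p) x + 2p^2) / (2p(1 - p))
diag(A) <- 1 + colMeans((x^2 - (1 + 2 * p) * x + 2 * p^2) / (2 * p * (1 - p)))

print(round(A[1:3, 1:3], 3))
```

For unrelated, non-inbred individuals the diagonal should average close to 1 and the off-diagonals close to 0, which gives a quick sanity check on the coding.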

27
Estimation of heritability from common variants
  • Randomly sample 2N SNPs from all the SNPs across
    the genome and randomly split them into two
    groups (N SNPs in each group).
  • Calculate $A_{jk}$ using all the SNPs in the first
    group.
  • Calculate $G_{jk}$ using the SNPs in the second group
    whose MAF is at or below a chosen threshold $\theta$
  • Regress $G_{jk}$ on $A_{jk}$ (using $G_{jk} - 1$ and
    $A_{jk} - 1$ when $j = k$). The regression coefficient is
  • $\beta = \frac{cov(G_{jk}, A_{jk})}{var(A_{jk})}$
  • Repeat the procedure using different numbers of
    SNPs

28
Estimation of heritability from common variants
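  • $Y_{jk} = (z_j - z_k)^2$ is the squared z-score
    difference between individuals j and k
  • $G_{jk}$ is not known, so we replace it by an estimate
    $A_{jk}$ such that
  • $E(G_{jk} \mid A_{jk}) = A_{jk}$
  • $E(Y_{jk}) = E(\alpha + \beta G_{jk}) = \alpha + \beta A_{jk}$
  • $Y_{jk}$ is plotted against $A_{jk}$; the regression
    of $Y_{jk}$ on $A_{jk}$ has slope $\beta = -2\sigma_g^2$
  • $\sigma_g^2 = -\beta/2$
  • The relationship at causal loci is predicted with
    error by the observed SNPs, and the error is
    $c + 1/N$, so the regression coefficient of $G_{jk}$
    on $A_{jk}$ is
  • $\beta = 1 - \frac{c + 1/N}{var(A_{jk})}$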
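
The same regression can be run on unrelated individuals using the genome-wide relationship matrix. The R sketch below is illustrative and rests on simplifying assumptions: genotypes are simulated as independent SNPs, every SNP is causal (so A coincides with G and the prediction-error term does not arise), and the true heritability is fixed at 0.5. Squared differences of standardized phenotypes are regressed on the off-diagonal relationships, and the heritability is estimated as minus half the slope.

```r
set.seed(2)
n <- 1000; m <- 500; h2_true <- 0.5              # assumed sizes and heritability
p <- runif(m, 0.1, 0.5)                          # assumed allele frequencies

# Genotypes: individuals in rows, SNPs in columns
x <- matrix(rbinom(n * m, 2, rep(p, each = n)), n, m)
w <- sweep(sweep(x, 2, 2 * p), 2, sqrt(2 * p * (1 - p)), "/")

# Phenotype: genetic value rescaled to variance h2_true, plus residual
g <- as.vector(w %*% rnorm(m))
g <- g / sd(g) * sqrt(h2_true)
y <- g + rnorm(n, sd = sqrt(1 - h2_true))
z <- as.vector(scale(y))                         # z-scores

A    <- tcrossprod(w) / m                        # relationship matrix
idx  <- which(upper.tri(A), arr.ind = TRUE)      # all pairs with j < k
a_jk <- A[idx]
Y_jk <- (z[idx[, 1]] - z[idx[, 2]])^2            # squared z-score differences

fit    <- lm(Y_jk ~ a_jk)
h2_hat <- -coef(fit)[["a_jk"]] / 2               # sigma_g^2 on the z-score scale
print(h2_hat)
```

Because relationships among unrelated individuals vary very little, the estimate is noisy at this sample size, which is why real analyses use thousands of individuals.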

29
Estimation of Heritability from Case-control
Study
30
Liability Threshold Model
  • Liability describes all the genetic and
    environmental factors that contribute to the
    development of a multi-factorial disorder
  • The level of liability at which the population
    is divided into cases and controls is referred
    to as the threshold.

31
Liability Threshold Model
  • Liability is best represented by a standard
    normal distribution, since every individual,
    affected or unaffected, possesses some
    degree of liability
  • $l_i = g_i + e_i$
  • where $l_i$ is the unknown liability of the ith
    individual, and a person is assumed to be a case
    if their liability exceeds a threshold t
  • $g_i$ is a genetic random effect, which can be
    correlated across individuals
  • $e_i$ is the environmental random effect; these are
    assumed to be independent of each other and of
    the genetic effects.

32
Estimation of heritability from case-control study
  • The advantage of working on the liability
    scale is that population parameters
    such as variance components and heritability are
    independent of prevalence
  • $l = \mu 1_N + g + e$
  • where $l \sim N(0, 1)$ and $g \sim N(0, \sigma_g^2)$
  • The mean of the distribution of liability is zero
    ($\mu = 0$) when there is no ascertainment
  • The total phenotypic variance ($\sigma_p^2$) on the
    liability scale is by definition equal to 1
  • The heritability on the liability scale is
  • $h_l^2 = \sigma_g^2$

33
Estimation of heritability from case-control
study
  • Applying the properties of truncated normal
    distributions, the mean liability is
  • $i = E(l \mid y = 1) = z/K$ for cases
    and
  • $i_2 = E(l \mid y = 0) = -z/(1-K)$
    for controls
  • Mean squared liability
  • $E(l^2 \mid y = 1) = 1 + it$ for cases
  • $E(l^2 \mid y = 0) = 1 + i_2 t$ for
    controls
  • The covariance between y (unaffected/affected
    status) and l (liability) describes the
    relationship between the phenotypes on the two
    scales
  • $Cov(y, l) = E(yl) - E(y)E(l) = K \cdot 1 \cdot i + (1-K) \cdot 0 \cdot i_2 = Ki = z$
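
These truncated-normal results are easy to verify numerically. The following R sketch assumes a prevalence of K = 0.1 (the value used later in the illustration) and compares the closed-form case and control means, and the mean squared liability of cases, with direct numerical integration over the tails of the standard normal density. The object names are illustrative.

```r
K   <- 0.1
thr <- qnorm(1 - K)        # threshold t on the liability scale
z   <- dnorm(thr)          # height of the normal density at t
i   <- z / K               # closed-form mean liability of cases
i2  <- -z / (1 - K)        # closed-form mean liability of controls

# Numerical integration of l * phi(l) and l^2 * phi(l) over each tail
mean_case    <- integrate(function(l) l * dnorm(l), thr, Inf)$value / K
mean_control <- integrate(function(l) l * dnorm(l), -Inf, thr)$value / (1 - K)
m2_case      <- integrate(function(l) l^2 * dnorm(l), thr, Inf)$value / K

print(c(i = i, mean_case = mean_case,
        i2 = i2, mean_control = mean_control,
        one_plus_it = 1 + i * thr, m2_case = m2_case))
```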

34
Estimation of heritability from case-control
study
  • The genetic value on the observed 0-1 risk scale
    for an individual (u) is defined as
  • $u = c + bg$
  • where c is a constant and b is the linear
    regression coefficient that links the two scales,
    obtained from the regression of the phenotype on
    the observed scale (y) on the additive genetic
    effect on the liability scale (g)
  • $b = \frac{cov(y, g)}{\sigma_g^2} = \frac{E(yg) - E(y)E(g)}{h_l^2}$
  • $= \frac{K i h_l^2}{h_l^2}$
  • $= z$
  • $u = c + bg = c + zg$
  • $\sigma_u^2 = var(zg) = z^2 \sigma_g^2$

35
Estimation of heritability from case-control
study
  • The heritability on the observed scale is the
    genetic variance as a proportion of the total
    variance of the 0-1 observations, which is the
    Bernoulli variance K(1 - K):
  • $h_o^2 = \frac{\sigma_u^2}{K(1-K)} = \frac{\sigma_g^2 \left[cov(y, g)/\sigma_g^2\right]^2}{K(1-K)}$
  • $= \frac{\sigma_g^2 b^2}{K(1-K)} = \frac{h_l^2 z^2}{K(1-K)}$
  • $h_l^2 = \frac{h_o^2 K(1-K)}{z^2}$
  • The mean of the estimated genetic values is
  • $E(\hat{u} \mid y = 1) = z i \sigma_g^2$ for cases
  • $E(\hat{u} \mid y = 0) = z i_2 \sigma_g^2$ for controls
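
The scale transformation h_l^2 = h_o^2 K(1-K)/z^2 for a randomly sampled population can be wrapped in a small helper. This is a sketch only: the function name `obs_to_liab` and the input values are hypothetical, not taken from the slides.

```r
# Convert observed-scale heritability to the liability scale
# (no ascertainment: the sample prevalence equals the population prevalence K)
obs_to_liab <- function(h2_obs, K) {
  thr <- qnorm(1 - K)      # liability threshold
  z   <- dnorm(thr)        # density height at the threshold
  h2_obs * K * (1 - K) / z^2
}

# Hypothetical example: observed-scale h2 of 0.05 for a disease with K = 0.1
h2_liab <- obs_to_liab(h2_obs = 0.05, K = 0.1)
print(h2_liab)
```

For a given prevalence, the liability-scale value exceeds the observed-scale value because z^2 is smaller than K(1-K).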

36
Selection probabilities
  • When the study is observational, the probability
    of being included in the study is independent of
    the phenotype.
  • In a case-control study, cases are usually
    strongly over-sampled relative to their
    frequency in the population
  • $\frac{K\,P_{case}}{(1-K)\,P_{control}} = \frac{P}{1-P}$, so
    $P_{control} = \frac{K(1-P)}{P(1-K)}\,P_{case}$
  • where $P_{case}$ and $P_{control}$ are the probabilities
    that a case and a control would be selected for
    the study respectively
  • K is the prevalence of the condition in the
    population
  • P is the proportion of cases in the study

37
Non-normality of the liability
  • In a case-control study, the cases and controls
    are not a random sample from the population.
  • The mean and variance for case-control
    disease status ($y_{cc}$), disease liability ($l_{cc}$)
    and genetic liability ($g_{cc}$) are
  • $E(y_{cc}) = P$
  • (usually, $P = 1/2$)
  • $var(y_{cc}) = P(1-P)$, which is the phenotypic
    variance on the observed scale in the
    case-control sample
  • where P is the proportion of cases in the
    case-control study sample

38
Non-normality of the liability
  • $E(l_{cc}) = Pi + (1-P)i_2 = \frac{i(P-K)}{1-K}$
  • $= i\lambda$, where $\lambda = \frac{P-K}{1-K}$
  • $var(l_{cc}) = \sigma^2_{lcc} = E(l_{cc}^2) - E(l_{cc})^2$
  • $= P(1 + it) + (1-P)(1 + i_2 t) - i^2\lambda^2$
  • $= 1 + Pit - \frac{(1-P)\,t\,iK}{1-K} - i^2\lambda^2$
  • $= 1 + i\lambda(t - i\lambda)$
  • $= 1 + \theta$
  • where $\theta = i\lambda(t - i\lambda)$
  • $var(l_{cc}) > 1$ in a case-control study because
    individuals from the tails of the distribution of
    liability have been selected.
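
The inflation of the liability variance under ascertainment can be checked by simulation. The R sketch below assumes K = 0.1 and P = 0.5, draws a large population of standard normal liabilities, samples equal numbers of cases and controls, and compares the sample variance with the closed form 1 + i*lambda*(t - i*lambda). The population size and sample sizes are arbitrary choices.

```r
set.seed(42)
K <- 0.1; P <- 0.5
thr    <- qnorm(1 - K)                 # liability threshold t
z      <- dnorm(thr)
i      <- z / K                        # mean liability of cases
lambda <- (P - K) / (1 - K)
theta  <- i * lambda * (thr - i * lambda)

# Population liabilities, then an ascertained sample with a fraction P of cases
l        <- rnorm(1e6)
cases    <- l[l > thr]
controls <- l[l <= thr]
n_cc     <- 50000
l_cc     <- c(sample(cases, n_cc), sample(controls, n_cc))

print(c(theory_mean = i * lambda,  simulated_mean = mean(l_cc),
        theory_var  = 1 + theta,   simulated_var  = var(l_cc)))
```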

39
Non-normality of the liability
  • The mean of genetic liability depends on the mean
    liability phenotype of the case-control sample and
    the heritability of liability
  • $E(g_{cc}) = h_l^2 E(l_{cc}) = h_l^2\left[Pi + (1-P)i_2\right]$
  • $= h_l^2\, i\lambda$
  • The variance of genetic liability is
  • $var(g_{cc}) = \sigma^2_{gcc} = E(g_{cc}^2) - E(g_{cc})^2$
  • $= h_l^2(1 - h_l^2) + h_l^4 E(l_{cc}^2) - h_l^4 E(l_{cc})^2$
  • $= h_l^2(1 - h_l^2) + h_l^4\left[P(1 + it) + (1-P)(1 + i_2 t)\right] - h_l^4 i^2\lambda^2$
  • $= h_l^2\left(1 + h_l^2\theta\right)$

40
Non-normality of the liability
  • The regression of the phenotype on the observed
    risk scale on genetic liability in the
    case-control study is
  • $b_{cc} = \frac{cov(y_{cc}, g_{cc})}{var(g_{cc})} = \frac{E(y_{cc} g_{cc}) - E(y_{cc})E(g_{cc})}{var(g_{cc})}$
  • $= \frac{P h_l^2 i - P h_l^2 i\lambda}{\sigma^2_{gcc}} = \frac{P h_l^2 i(1-\lambda)}{\sigma^2_{gcc}}$
  • $= z\,\frac{P(1-P)\,\sigma_g^2}{K(1-K)\,\sigma^2_{gcc}} = z\omega$, where
    $\omega = \frac{P(1-P)\,\sigma_g^2}{K(1-K)\,\sigma^2_{gcc}}$
  • where $\omega$ quantifies the change in the regression
    coefficient due to ascertainment in a regression
    of the phenotype on the observed risk scale onto
    genetic factors on the liability scale

41
Non-normality of the liability
  • The genetic value on the observed scale ($u_{cc}$) for
    an individual in a case-control study is
  • $u_{cc} = c + b_{cc}\,g_{cc}$
  • $= c + z\omega\,g_{cc}$
  • $= c + z\,\frac{P(1-P)\,\sigma_g^2}{K(1-K)\,\sigma^2_{gcc}}\,g_{cc}$
  • and,
  • $var(u_{cc}) = \sigma^2_{ucc} = b_{cc}^2\,\sigma^2_{gcc}$
  • $= \left[z\,\frac{P(1-P)\,\sigma_g^2}{K(1-K)\,\sigma^2_{gcc}}\right]^2 \sigma^2_{gcc}$

42
Non-normality of the liability
  • The mean of the estimated genetic values, when
    samples are ascertained, is
  • $E(\hat{u}_{cc} \mid y_{cc} = 1) \approx \frac{P(1-P)(1-P)}{K(1-K)(1-K)}\,E(\hat{u} \mid y = 1)$ for cases
  • $E(\hat{u}_{cc} \mid y_{cc} = 0) \approx \frac{P(1-P)P}{K(1-K)K}\,E(\hat{u} \mid y = 0)$ for controls
  • ?? ???? 2 is a squared regression coefficient
    that transform the estimate of genetic factor on
    the observed risk scale to liability scale
  • ?? ?? 2 ??(1-??) ??(1-??) ?? 2 ??
    ?????? 2 ??(1-??) ??(1-??) ?? ???? 2
    ?? ???? 2 ?? 2 ?? ?????? 2 ?? ?? 2
    ?? 2 ??(1-??) ??(1-??)

43
Non-normality of the liability
  • The mean genetic liability for cases transformed
    the observed scale by
  • bcci ?? ?????? 2 bcci ??(1-??)
    ??(1-??) ?? ?? 2
  • bcci2 ?? ?????? 2 bcci2 ??(1-??)
    ??(1-??) ?? ?? 2
  • ?? ?????? 2 ??(1-??) ??(1-??) ?? ?? 2
  • i (1-??) (1-??) ?? ???? ?? and
  • i2 i2 ?? ?? - ?? ???? 1-??

44
Non-normality of the liability
  • $h_l^2 = h^2_{occ}\,\frac{K(1-K)}{z^2}\,\frac{K(1-K)}{P(1-P)}$
  • $\hat{h}_l^2 = \hat{h}^2_{occ}\,\frac{[K(1-K)]^2}{z^2\,P(1-P)}$
  • $var(\hat{h}_l^2) = var(\hat{h}^2_{occ})\left\{\frac{[K(1-K)]^2}{z^2\,P(1-P)}\right\}^2$
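
The transformation with ascertainment can be sketched as a helper that multiplies the observed-scale case-control estimate by [K(1-K)]^2 / (z^2 P(1-P)) and scales its standard error by the same factor. The function name `liab_h2_cc` and the input values below are hypothetical; they are not the numbers from the illustration slides.

```r
# Liability-scale heritability and standard error from an observed-scale
# estimate obtained in an ascertained case-control sample
liab_h2_cc <- function(h2_occ, var_h2_occ, K, P) {
  z <- dnorm(qnorm(1 - K))                      # density height at the threshold
  f <- (K * (1 - K))^2 / (z^2 * P * (1 - P))    # transformation factor
  c(h2_l = h2_occ * f, se_h2_l = sqrt(var_h2_occ) * f, factor = f)
}

# Hypothetical inputs: K = 0.1 in the population, P = 0.5 in the sample
res <- liab_h2_cc(h2_occ = 0.30, var_h2_occ = 0.001, K = 0.1, P = 0.5)
print(res)
```

Setting P equal to K removes the ascertainment correction, and the factor reduces to K(1-K)/z^2, the transformation for a random population sample.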

45
Illustration
  • 1. Estimate the heritability of a trait from a GWAS
    when fitting only the significant SNPs and when
    fitting all SNPs simultaneously.
  • Solution: Phenotypic and marker data were simulated
    in R; the phenotypic data are 200 x 1 and the
    marker data are 500 x 200. The model is
  • $y = \mu 1 + g + e$
  • rrBLUP is used to estimate the random additive
    genetic variance from the SNP information.
    From the 500 SNPs, the 10 most significant SNPs
    are selected by LASSO. The heritability estimated
    when fitting the significant SNPs was 0.2998, and
    when fitting all SNPs it was 0.3755.

46
Illustration
  • 2. Estimate the heritability on the liability
    scale, along with its standard error, from
    ascertained case-control data (K = 0.1, case if
    $l > 1.282\,\sigma_l$)
  • Solution: Phenotypic data for 2500 cases and 2500
    controls were simulated (using the mvrnorm function
    in the R package MASS). The heritability estimated
    on the observed scale is 0.1021, and the
    heritability estimated on the liability scale is
  • $\hat{h}_l^2 = \hat{\sigma}_g^2 = 0.1032$
  • $var(\hat{h}_l^2) = 0.00017$

47
Conclusions
  • Heritability is not missing but hidden
  • In the form of common variants of small effect
    scattered across the genome
  • In the form of low frequency variants only
    partially tagged by common variants
  • Estimates of heritability from traditional methods
    are inflated
  • If there is physical material (epigenetic
    factors), other than DNA sequence, that can affect
    the phenotype and be transmitted stably across
    generations, then it should also be regarded as
    contributing to additive genetic effects.

48
Conclusions
  • Many characters of biological or economic
    interest vary in a discontinuous manner but are
    not inherited in a simple Mendelian manner. For
    traits of this type, estimation of heritability
    based on the liability threshold model provides
    an unbiased estimate.
  • The missing heritability of complex traits can be
    resolved by estimating the heritability explained
    by all genotyped SNPs.
  • The general framework for heritability
    estimation, GCTA, based on the HE regression
    method, provides unbiased estimates of
    heritability

49
References
  • Doolittle, D. P. (2012). Population Genetics:
    Basic Principles (Vol. 16). Springer Science &
    Business Media.
  • Eichler, E. E., Flint, J., Gibson, G., Kong, A.,
    Leal, S. M., Moore, J. H. and Nadeau, J. H.
    (2010). Missing Heritability and Strategies for
    Finding the Underlying Causes of Complex
    Disease. Nature Reviews Genetics. 11(6):
    446-450.
  • Falconer, D. S. and Mackay, T. F. C. (1996).
    Introduction to Quantitative Genetics
    (4th Edn.). Addison Wesley Longman Ltd.
  • Golan, D., Lander, E. S. and Rosset, S.
    (2014). Measuring Missing Heritability: Inferring
    the Contribution of Common Variants. Proceedings
    of the National Academy of Sciences. 111(49):
    E5272-E5281.
  • Guo, G., Wang, L., Liu, H. and Randall, T.
    (2014). Genomic Assortative Mating in Marriages
    in the United States. PLoS One. 9(11): e112322.

50
References
  • Gusev, A., Bhatia, G., Zaitlen, N., Vilhjalmsson,
    B. J., Diogo, D., Stahl, E. A. and Plenge, R. M.
    (2013). Quantifying Missing Heritability at Known
    GWAS Loci. PLoS Genetics. 9(12): e1003993.
  • Haseman, J. K. and Elston, R. C. (1972). The
    Investigation of Linkage Between a Quantitative
    Trait and a Marker Locus. Behavior Genetics.
    2(1): 3-19.
  • Hayes, B. J., Visscher, P. M. and Goddard, M. E.
    (2009). Increased Accuracy of Artificial
    Selection by Using the Realized Relationship
    Matrix. Genetics Research. 91(1): 47-60.
  • Lee, S. H., Wray, N. R., Goddard, M. E. and
    Visscher, P. M. (2011). Estimating Missing
    Heritability for Disease from GWAS. The American
    Journal of Human Genetics. 88(3): 294-305.
  • Maher, B. S. (2008). The Case of the Missing
    Heritability. Nature. 456: 18-21.

51
References
  • Manolio, T. A., Collins, F. S., Cox, N. J.,
    Goldstein, D. B., Hindorff, L. A., Hunter, D. J.,
    McCarthy, M. I., Ramos, E. M., Cardon, L. R. and
    Chakravarti, A. (2009). Finding the
    Missing Heritability of Complex Diseases.
    Nature. 461(7265): 747-753.
  • Moore, J. H. and Bush, W. S. (2012). Genome-Wide
    Association Studies. PLoS Computational Biology.
    8(12): e1002822.
  • Pierrick, B. and Lu, Q. (2016). Dissolving the
    Missing Heritability Problem.
  • Richards, E. J. (2006). Inherited Epigenetic
    Variation: Revisiting Soft Inheritance. Nature
    Reviews Genetics. 7: 395-401.
  • Silventoinen, K., Sammalisto, S., Perola, M.,
    Boomsma, D. I., Cornes, B. K., Davis, C., Dunkel, L.,
    de Lange, M., Harris, J. R. and Hjelmborg, J. V. B.
    (2003). Heritability of Adult Body Height: A
    Comparative Study of Twin Cohorts in Eight
    Countries. Twin Research. 6(5): 399-408.

52
References
  • Visscher, P. M., et al. (2006). Assumption-Free
    Estimation of Heritability from Genome-Wide
    Identity-by-Descent Sharing Between Full
    Siblings. PLoS Genetics. 2: e41.
  • Yang, J., Benyamin, B., McEvoy, B. P., Gordon, S.,
    Henders, A. K., Nyholt, D. R. and Madden, P. A.
    (2010). Common SNPs Explain a Large Proportion of
    the Heritability for Human Height. Nature
    Genetics. 42(7): 565-569.
  • Yang, J., Lee, S. H., Goddard, M. E. and Visscher,
    P. M. (2011). GCTA: A Tool for Genome-Wide Complex
    Trait Analysis. American Journal of Human
    Genetics. 88(1): 76-82.
  • Zuk, O., Hechter, E., Sunyaev, S. R. and Lander, E. S.
    (2012). The Mystery of Missing Heritability:
    Genetic Interactions Create Phantom Heritability.
    Proceedings of the National Academy of
    Sciences. 109(4): 1193-1198.
  • Zuk, O., Schaffner, S. F., Samocha, K., Do, R.,
    Hechter, E., Kathiresan, S. and Lander, E. S.
    (2014). Searching for Missing Heritability:
    Designing Rare Variant Association Studies.
    Proceedings of the National Academy of Sciences.
    111(4): E455-E464.

54
LASSO
  /* SAS: LASSO selection of SNPs; the data lines after CARDS were not shown on the slide */
  data train;
    input y x1-x500;
    cards;
  ;
  data test;
    input y x1-x500;
    cards;
  ;
  proc glmselect data=train valdata=test plots=coefficients;
    model y = x1-x500 / selection=lasso(steps=10 choose=validate);
  run;

55
Estimation of heritability
  # R: heritability when fitting all SNPs
  library(rrBLUP)
  x <- as.matrix(read.csv(file.choose()))   # marker file "g1" (all SNPs)
  y <- as.matrix(read.csv(file.choose()))   # phenotype file "p1"
  ans <- mixed.solve(y, x)                  # SNP effects fitted as random
  beta <- ans$u                             # estimated SNP effects
  g <- x %*% beta                           # estimated genetic values
  sigma2g <- var(g)
  sigma2p <- var(y)
  h2 <- sigma2g / sigma2p
  h2
  # 0.3755903

  # R: heritability when fitting only the LASSO-selected SNPs
  x <- as.matrix(read.csv(file.choose()))   # marker file "g2" (selected SNPs)
  y <- as.matrix(read.csv(file.choose()))   # phenotype file "p1"
  ans <- mixed.solve(y, x)
  beta <- ans$u
  g <- x %*% beta
  sigma2g <- var(g)
  sigma2p <- var(y)
  h2 <- sigma2g / sigma2p
  h2
  # 0.2998782

56
Illustration No. 2 (code)
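  library(MASS)   # mvrnorm
  library(pps)    # stratsrs
  sigmag <- 1
  sigmae <- 3
  # Covariance matrix for 100 individuals: 1 on the diagonal, 0.05 elsewhere
  m1 <- matrix(sigmag, 100, 100)
  m2 <- diag(sigmag, 100, 100)
  m1 <- 0.05 * (m1 - m2)
  sigma <- m1 + m2
  mu <- rep(0, 100)
  # Genetic values for 50 groups of 100 individuals, stacked into one column
  gv <- mvrnorm(n = 50, mu, sigma, tol = 1e-6,
                empirical = FALSE, EISPACK = FALSE)
  gv <- t(matrix(gv, nrow = 1, byrow = TRUE))
  eveff <- rnorm(5000, 0, sigmae)           # environmental effects
  l <- gv + eveff                           # liability
  sigmal <- sd(l)
  thres <- 1.283 * sigmal                   # threshold for K = 0.1
  cc1 <- ifelse(l > thres, 1, 0)            # case (1) or control (0)
  id <- rep(c(1:50), each = 100)
  cc <- as.data.frame(cbind(id, cc1))
  names(cc) <- c("id", "cc1")
  csum <- tapply(cc[, 2], cc[, 1], sum)     # number of cases per group
  s1 <- cc[cc[, 2] == 1, ]                  # cases
  s2 <- cc[cc[, 2] == 0, ]                  # controls
  s3 <- stratsrs(s2[, 1], csum)             # sample controls within each group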

57
If y follows a standard normal distribution with
a truncation point at t, with t > 0, so that the
fraction of y that is larger than t is K, then
the mean value of y above the truncation point
is $E(y \mid y > t) = i = z/K$ and the mean below it is
$E(y \mid y < t) = i_2 = -iK/(1-K)$. The variances are
$Var(y \mid y > t) = 1 - i(i - t)$ and
$Var(y \mid y < t) = 1 - i_2(i_2 - t)$, where z is the height
of the normal curve at point t
Fig. 2
58
Fig. 3
$var(l_{cc}) > 1$ in a case-control study because
individuals from the tails of the distribution of
liability have been selected.