Genetic analysis of complex traits - PowerPoint PPT Presentation

Provided by: estherv

1
Genetic analysis of complex traits
  • Studying inheritance of traits that show no clear
    Mendelian inheritance but cluster in families
  • Suggests there is a genetic component
  • Usually a mix of genetic and environmental
    factors
  • Several models
  • Single gene modified by environment
  • Several genes each with significant contributions
    (oligogenes)
  • Many genes each making a small contribution
    (polygenes)
  • The last two models are forms of multifactorial
    inheritance

2
Problems of complex diseases
  • Identification of involved genes hampered by
  • genetic heterogeneity
  • Alleles at more than one locus can trigger a
    specific disease
  • reduced penetrance
  • Individuals with predisposing genotype are
    unaffected
  • phenocopy
  • Disease is triggered by environmental factors in
    the absence of a predisposing genotype

3
Mapping using a parametric model
  • In monogenic disorders, mapping was achieved
    through pedigree analysis in which linkage is
    sought between markers and the disease gene using
    LOD score analysis to combine data from multiple
    families
  • In complex diseases, often there is a major locus
    that segregates in a Mendelian fashion (often
    with incomplete penetrance)
  • In some families, the disease shows such an
    inheritance and can be mapped with LOD score
    analysis

4
LOD (Log ratio of odds) scores
  • Used to overcome not having large numbers of
    progeny to accurately measure recombination
    frequency
  • Tests a genetic model which states that the
    disease locus and a genetic marker are linked
    with a recombination fraction of θ and which
    requires that parameters are specified
  • Dominant or recessive
  • Degree of penetrance
  • Allele frequencies
  • Logarithm of the ratio of odds (Z) of the
    observed outcome if two loci are linked with a
    recombination fraction of θ to the odds of the
    observed outcome if they are unlinked (θ = 0.5)
  • Z(x) = (odds of observed results if θ = x) /
    (odds of observed results if θ = 0.5)
  • LOD(θ = x) = log10 Z(x)

5
LOD scores
  • Calculation can be repeated using different
    values of θ (ie. different distances between gene
    and marker)
  • Result is a numerical value that measures the
    odds that two loci are linked at a given
    recombination fraction (ie. distance), compared
    with the odds that they are unlinked
  • Threshold for declaring linkage is a LOD score of
    3 or more (ie. Z = 1000:1) (translation- the odds
    are 1000:1 that the locus is linked at a certain
    distance (θ) versus being unlinked)
  • Unlikely that one family would give a LOD of 3
    or more; can combine data from a number of families
  • MLS = maximum likelihood score
  • LOD score for the most likely of a series of
    alternatives
  • Model with highest LOD score is judged most
    likely to be correct
  • Since parameters have to be specified in these
    calculations, analysis is called parametric
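
Repeating the calculation over a range of θ values can be sketched as follows. This is a minimal illustration, not the slides' own procedure: it assumes phase-known meioses, so that the likelihood is θ^r (1 - θ)^(n - r) for r recombinants among n scored meioses. The function names and the counts (1 recombinant in 10 meioses) are hypothetical.

```python
import math

def lod(theta, recombinants, meioses):
    """LOD score at recombination fraction theta for phase-known meioses."""
    non_recombinants = meioses - recombinants
    linked = theta ** recombinants * (1.0 - theta) ** non_recombinants
    unlinked = 0.5 ** meioses  # likelihood when theta = 0.5
    return math.log10(linked / unlinked)

def max_lod(recombinants, meioses, thetas):
    """Return (theta, LOD) for the grid value with the highest LOD."""
    return max(((t, lod(t, recombinants, meioses)) for t in thetas),
               key=lambda pair: pair[1])

# Hypothetical data: 1 recombinant among 10 phase-known meioses.
# The grid starts at 0.01 because theta = 0 gives a likelihood of zero
# (and log10 fails) as soon as one recombinant has been observed.
grid = [i / 100 for i in range(1, 51)]
best_theta, best_lod = max_lod(1, 10, grid)
print(f"highest LOD {best_lod:.2f} at theta = {best_theta:.2f}")
```

The θ with the highest LOD plays the role of the most likely alternative; a real analysis would also need the dominance, penetrance and allele-frequency parameters listed earlier.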

6
example of LOD score analysis
  • Are A and B linked?
  • All children inherited aB from II-2
  • Other chromosome came from II-1
  • If θ = 0 (totally linked), then for each child,
    chance of observed result is 0.5 (since it gets
    one of two chromosomes from II-1)
  • For the 3 children, odds of observed results are
    (0.5)^3 = 0.125
  • If θ = 0.5 (loci unlinked), then there is 0.25
    chance of the observed genotype (independent
    assortment); odds are (0.25)^3 = 0.0156

7
LOD score analysis
  • Z(x) = (odds of observed results if θ = x) /
    (odds of observed results if θ = 0.5)
  • Z(θ = 0) = 0.125 / 0.0156 = 8
  • LOD(θ = 0) = log10 Z(0) = 0.9
  • try again using a different θ = 0.25 (linked with
    RF of 0.25)
  • Now odds of observed results are
  • 0.5 x 0.75 = 0.375 per child; for family
    (0.375)^3 = 0.053
  • Why 0.75? The chance of inheriting a
    non-recombinant chromosome (as is seen in
    progeny) is 1 - θ = 0.75; the factor of 0.5 is
    there because there are two possible
    chromosomes that could be inherited
  • Z(θ = 0.25) = 0.053 / 0.0156 ≈ 3.4
  • LOD(θ = 0.25) = log10 Z(0.25) = 0.53
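
The arithmetic on this slide can be checked with a short script. It is a sketch of the slide's own simplified model (each child receives one specific non-recombinant chromosome with probability 0.5 x (1 - θ)); the function names are illustrative only.

```python
import math

def family_odds(theta, n_children=3):
    """Chance of the observed family under the slide's simplified model."""
    per_child = 0.5 * (1.0 - theta)  # one of two chromosomes, non-recombinant
    return per_child ** n_children

def lod(theta, n_children=3):
    """log10 of the odds at theta relative to the odds at theta = 0.5."""
    z = family_odds(theta, n_children) / family_odds(0.5, n_children)
    return math.log10(z)

for theta in (0.0, 0.25):
    print(f"theta = {theta:.2f}: LOD = {lod(theta):.2f}")
```

Working from unrounded intermediate values gives Z = 3.375 at θ = 0.25, which still rounds to a LOD of 0.53.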

8
LOD score analysis
  • So if we compare the two scenarios, with θ = 0,
    which means the loci are totally linked, we get
    a LOD score of 0.9
  • With θ = 0.25, which means the loci are
    separated by recombination 25% of the time, we
    get a LOD score of 0.53
  • Since θ = 0 gives the higher score, this is the
    MLS, meaning loci are most likely linked, although
    LOD < 3 makes it not statistically significant
  • Since LOD scores are logarithmic, they can be
    added, so 4 similar families would combine to
    give a score of 3.6 at θ = 0.
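
The additivity of LOD scores can be illustrated in a few lines. This sketch assumes four independent families that each give the same odds ratio of 8 at θ = 0, as in the example; the helper name is hypothetical.

```python
import math

LINKAGE_THRESHOLD = 3.0  # LOD threshold quoted in the slides

def combined_lod(family_lods):
    """LOD scores from independent families add at the same theta."""
    return sum(family_lods)

single_family = math.log10(8)          # LOD at theta = 0 from the example
total = combined_lod([single_family] * 4)
combined_odds = 8 ** 4                 # multiplying odds = adding logs

print(f"combined LOD = {total:.2f}, odds = {combined_odds}:1")
print("linkage declared" if total >= LINKAGE_THRESHOLD else "not significant")
```

Adding the four logarithms is the same as multiplying the four odds ratios, which is why pooling families can cross the threshold when no single family does.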

9
Parametric models have mapped disease genes
  • BRCA1
  • Analyzed 23 families that showed clustering,
    indicating a familial mode of inheritance
  • Linkage to a specific marker with a LOD of 2.35
  • When age of onset was considered
  • analysis was restricted to 7 families with age of
    onset < 45
  • Gave LOD score of 5.98
  • Alzheimer disease
  • Hereditary non-polyposis colorectal cancer
  • Familial adenomatous polyposis
  • Non-insulin dependent diabetes

10
Psychiatric disorders
  • Schizophrenia and manic depression show
    clustering but not Mendelian inheritance
  • Attempts to map traits have not been successful
  • Many reports that could not be replicated
  • problems encountered
  • Assessment of phenotype is variable
  • Genetic heterogeneity
  • LOD scores are very sensitive to status of a few
    key individuals in families- if their phenotype
    changes, affects outcome of analysis greatly

11
Multifactorial inheritance
  • Those that involve two or more genes and a strong
    environmental influence
  • Continuous variation
  • Controlled by multiple genes (polygenic)
  • each involved gene contributes additively to
    phenotype
  • Phenotypic expression of multifactorial traits
    varies widely due to gene interactions and
    environmental factors
  • Polygenic traits
  • Height
  • Weight
  • Skin colour

12
Multifactorial inheritance
  • Other polygenic traits
  • Neural tube defects
  • Cleft palate
  • Clubfoot
  • Diabetes
  • Hypertension
  • Behavioural disorders
  • Distribution of phenotypes in the F2 varies
    depending on whether 2, 3 or 4 genes are involved
  • As the number of loci increases, the number of
    phenotypic classes increases and the phenotypic
    difference between adjacent classes decreases
  • Environmental influence smooths out variation
    between genotypes even more
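
How the number of F2 classes grows with the number of loci can be sketched as below. The sketch assumes an F1 that is heterozygous at every locus, unlinked loci, equal additive effects and no environmental influence; the function name is illustrative only.

```python
from math import comb

def f2_class_ratio(n_loci):
    """Relative F2 frequencies of 0..2n contributing alleles."""
    n_alleles = 2 * n_loci
    return [comb(n_alleles, k) for k in range(n_alleles + 1)]

for n_loci in (2, 3, 4):
    ratio = f2_class_ratio(n_loci)
    print(n_loci, "loci:", len(ratio), "classes, ratio",
          ":".join(map(str, ratio)))
```

Under these assumptions n loci give 2n + 1 classes, so the steps between neighbouring classes shrink as loci are added.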

13
Model for inheritance of height
  • Trait controlled by 3 genes, each with 2 alleles
    (A/a, B/b, C/c)
  • Each dominant allele contributes equally to
    phenotype and recessive alleles make no
    contribution
  • The effect of each dominant allele is additive
  • Genes are not linked
  • Assume
  • base height of 5 feet
  • Each dominant allele adds 3 inches
  • aabbcc individual is 5'0"; AABBCC individual is
    6'6"
  • If environment affects all people the same, the
    individuals with 3 dominant alleles (the most
    frequent class) would be the average height of
    5'9"
  • Genotype reflects genetic potential for height-
    environment (nutrition) will affect the full
    expression of the genotype
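
The height model can be written out as a small table generator. The base height and the 3 inches per dominant allele come from the slide; the class frequencies additionally assume offspring of an AaBbCc x AaBbCc cross, which is an assumption made here for illustration.

```python
from math import comb

BASE_INCHES = 60        # base height of 5 feet
INCHES_PER_ALLELE = 3   # each dominant allele adds 3 inches
N_ALLELES = 6           # 3 loci x 2 alleles

def height_inches(dominant_alleles):
    """Genetic potential for height given the number of dominant alleles."""
    return BASE_INCHES + INCHES_PER_ALLELE * dominant_alleles

def as_feet_inches(inches):
    feet, rest = divmod(inches, 12)
    return f"{feet}'{rest}\""

def f2_fraction(dominant_alleles):
    """Fraction of AaBbCc x AaBbCc offspring with this many dominant alleles."""
    return comb(N_ALLELES, dominant_alleles) / 2 ** N_ALLELES

for k in range(N_ALLELES + 1):
    print(k, as_feet_inches(height_inches(k)), f"{f2_fraction(k):.3f}")
```

The class with 3 dominant alleles is the most frequent (20 of 64) and sits at the midpoint of the range.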

14
Environmental effects
  • 3 genes give rise to 7 phenotypic classes (0 to 6
    dominant alleles)
  • Environment blurs the distinction- actually see
    continuous variation

15
Liability
  • take height example and apply to disease state
  • If genetic risk is modified by environment-
    susceptibility (or liability) is normally
    distributed
  • How many predisposing alleles are present? If
    three bi-allelic loci are involved, then 0 to 6
  • Some traits do not show continuous variation in
    phenotypes: individuals are either affected or not
  • depends on whether liability exceeds a threshold
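
A threshold on a normally distributed liability can be sketched as follows. The threshold of 2 standard deviations and the 0.5 shift for a group carrying more predisposing alleles are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist

def fraction_affected(mean_liability, threshold, sd=1.0):
    """Share of a normally distributed liability lying above the threshold."""
    return 1.0 - NormalDist(mean_liability, sd).cdf(threshold)

THRESHOLD = 2.0  # hypothetical, in standard-deviation units

general = fraction_affected(0.0, THRESHOLD)
# Hypothetical group whose mean liability is shifted upward because its
# members carry more predisposing alleles.
higher_risk = fraction_affected(0.5, THRESHOLD)
print(f"general: {general:.3f}, higher-risk group: {higher_risk:.3f}")
```

A modest shift in mean liability changes the affected fraction disproportionately, which is how a continuous liability can produce an all-or-none phenotype.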

16
Linkage and association
  • Linkage studies use individual families where
    members are affected and attempt to demonstrate
    linkage between the occurrence of the disease and
    genetic markers (creates associations within
    families, but not among unrelated people)
  • Association studies are based on populations and
    attempt to show an association between a
    particular allele and susceptibility to disease
    (a statistical statement about the co-occurrence
    of alleles or phenotypes)

17
Non-parametric linkage analysis
  • parametric methods require a genetic model, and
    are only useful for complex traits with a single
    genetic component
  • model-free or non-parametric linkage analysis
  • ignores unaffected people
  • look for alleles or chromosomal segments that are
    shared between affected people (within families,
    or in whole populations as in association studies)
  • if a gene contributes to the disease, then the
    genomic region will be co-inherited from a common
    ancestor by members of an affected pedigree more
    often than would be expected by chance

18
Genome scan
  • follows the inheritance of polymorphic
    micro-satellite markers in members of a pedigree
  • if affected members co-inherit the same allele
    more often than expected by chance, then that
    genomic region may contain a gene that
    contributes to susceptibility
  • usually based on affected sibling pairs
  • involves genotyping markers evenly spread
    throughout genome
  • sibs are expected to share 0, 1 or 2 alleles at
    an expected ratio of 25%, 50% and 25%

19
IBS and IBD
  • need to distinguish DNA segments that are
    identical by descent (IBD) versus identical by
    state (IBS)
  • IBS alleles look the same but are not derived
    from a common known ancestor
  • IBD are copies of the same ancestral (usually
    parental) allele
  • analysis most informative if there are multiple
    alleles or if there is a multi-locus multi-allele
    haplotype
  • IBD studies require parental samples

20
IBS versus IBD
[Figure: both sibs share allele A1. Whether the sharing is IBS or IBD is
only obvious if the parental genotype is known]

21
sib pair analysis
[Figure: four sib pairs, sharing 2, 1, 1 and 0 parental haplotypes]
  • share 2 (1/4), 1 (1/2) or 0 (1/4) parental
    haplotypes by random segregation
  • pairs of sibs affected by dominant condition
    share 1 or 2 haplotypes
  • pairs of sibs affected by recessive condition
    share both parental haplotypes for affected region
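
One simple way to compare observed sharing with the 1/4 : 1/2 : 1/4 expectation is a chi-square goodness-of-fit test. This is a swapped-in illustration rather than the method the slides use, and the sib-pair counts are hypothetical.

```python
import math

EXPECTED_SHARE = (0.25, 0.50, 0.25)  # sharing 0, 1, 2 haplotypes IBD
CRITICAL_5PCT_DF2 = 5.991            # chi-square critical value, 2 df

def sharing_chi_square(counts):
    """Goodness-of-fit statistic for observed IBD sharing counts (0, 1, 2)."""
    total = sum(counts)
    return sum((obs - total * p) ** 2 / (total * p)
               for obs, p in zip(counts, EXPECTED_SHARE))

def p_value_df2(chi_square):
    """Upper-tail probability of a chi-square variable with 2 df."""
    return math.exp(-chi_square / 2.0)

observed = (10, 45, 45)  # hypothetical affected sib pairs sharing 0, 1, 2
stat = sharing_chi_square(observed)
print(f"chi-square = {stat:.1f}, p = {p_value_df2(stat):.2g}")
print("deviates from 25:50:25" if stat > CRITICAL_5PCT_DF2
      else "consistent with random segregation")
```

The test only flags a departure from random segregation; whether the departure is excess sharing has to be read from the counts themselves.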

22
Genome scanning to identify commonly inherited areas
  • assemble families containing 2 affected sibs
  • DNA samples from sibs and parents are collected
  • microsatellites are genotyped using PCR with
    fluorescent tagged primers

23
Genome scanning cont
  • Can analyze 18 loci in one lane and up to 500 on
    the same gel
  • initial scan will cover whole genome in 10 cM
    intervals
  • each locus will have two alleles distinguished by
    length of PCR product
  • can determine parental and sib genotypes
  • determine which alleles are IBD and how these
    vary from what is predicted based on allele
    frequencies
  • big question- how do you decide whether excessive
    allele sharing is statistically significant?
  • use LOD score analysis with a lower MLS threshold
    of 1 as a starting point
  • try linkage disequilibrium studies

24
Association studies
  • statistical statement about the co-occurrence of
    alleles or phenotypes
  • allele A is associated with disease D if people
    who have D also have A more (or less) often than
    would be predicted from individual frequencies of
    A and D in the population
  • eg. HLA-DR4 is found in 36% of the UK population,
    but in 78% of people with rheumatoid arthritis
  • population associations depend on population
    history
  • in the UK, two unrelated people share a common
    ancestor about 22 generations ago (500 years),
    so 44 meioses separate them
  • if they inherit a disease susceptibility allele
    from their common ancestor, then during the many
    meioses, recombination will have reduced the
    shared segment to a small region. Only tightly
    linked loci will still be shared
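
The strength of the HLA-DR4 association can be summarised as an odds ratio. This sketch uses the two percentages from the slide and assumes the general-population frequency can stand in for a control group; the function names are illustrative only.

```python
def odds(frequency):
    """Convert a carrier frequency into odds."""
    return frequency / (1.0 - frequency)

def odds_ratio(freq_in_cases, freq_in_reference):
    return odds(freq_in_cases) / odds(freq_in_reference)

DR4_IN_PATIENTS = 0.78    # rheumatoid arthritis patients (from the slide)
DR4_IN_POPULATION = 0.36  # UK population (from the slide)

result = odds_ratio(DR4_IN_PATIENTS, DR4_IN_POPULATION)
print(f"odds ratio = {result:.1f}")
```

An odds ratio of 1 would mean no association; values above 1 mean the allele is found more often in patients than its population frequency predicts.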

25
Linkage disequilibrium
  • non-random association in a population of
    alleles at two closely linked loci (so one allele
    closely linked to disease)
  • based on having a common ancestor
  • alleles that are closely linked will be commonly
    inherited
  • but, in time, disequilibrium will disappear due
    to recombination (ie. alleles at the two loci
    become randomly associated)
  • if two alleles 1 Mb apart are in disequilibrium,
    then in 70 generations the disequilibrium will
    decay by 50%
  • more distantly located alleles will decay faster-
    get a gradient of disequilibrium, with highest
    value being closest to gene
  • L.D. influenced by populations- effect is
    greatest in small, homogeneous populations
    (greatest chance of founder effects) (Finland),
    smallest in large heterogeneous populations (USA)
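
The 70-generation figure follows from the usual decay rule, in which disequilibrium shrinks by a factor of (1 - θ) each generation. The sketch assumes that 1 Mb corresponds to roughly 1% recombination, an approximation that is not stated in the slides.

```python
import math

def ld_remaining(theta, generations):
    """Fraction of the initial disequilibrium left after some generations."""
    return (1.0 - theta) ** generations

def half_life(theta):
    """Generations needed for disequilibrium to fall to half."""
    return math.log(0.5) / math.log(1.0 - theta)

THETA_1MB = 0.01  # assumption: 1 Mb corresponds to roughly 1% recombination

print(f"left after 70 generations: {ld_remaining(THETA_1MB, 70):.2f}")
print(f"half-life: {half_life(THETA_1MB):.0f} generations")
```

Larger θ gives a shorter half-life, which is the gradient of disequilibrium described above.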

26
Other reasons for associations
  • direct causation
  • having allele A makes you susceptible to disease
    D (increases the likelihood)
  • expect to see same allele A associated with
    disease in any population (bypasses common
    ancestor)
  • natural selection
  • people with disease may have a competitive
    advantage if they also carry allele A
  • these are unlikely if the associated DNA is a
    variant in non-coding DNA
  • studies in ethnically diverse populations are
    useful to distinguish between these causes and
    L.D.

27
Transmission disequilibrium test (TDT)
  • tests done to check the results of an association
    study
  • confirm whether a parent heterozygous for an
    associated and a non-associated allele transmits
    the associated allele more often to affected
    offspring
  • starts with couples with more than one affected
    offspring
  • select parent that is heterozygous for marker M1
  • test compares number of such parents who transmit
    the M1 allele to their affected offspring versus
    transmitting other allele
  • can be used if only one parent is available
  • fundamentally a test of association, not linkage
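
The comparison of transmitted and non-transmitted alleles is commonly summarised with a McNemar-type chi-square, (b - c)^2 / (b + c), on 1 degree of freedom. The counts below are hypothetical and the function name is illustrative only.

```python
CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value, 1 df

def tdt_statistic(transmitted, not_transmitted):
    """McNemar-type chi-square (1 df) on transmissions from heterozygous parents."""
    total = transmitted + not_transmitted
    return (transmitted - not_transmitted) ** 2 / total

# Hypothetical counts: heterozygous parents who passed M1, or the other
# allele, to an affected child.
m1_passed, other_passed = 60, 40
stat = tdt_statistic(m1_passed, other_passed)
print(f"TDT chi-square = {stat:.2f}")
print("M1 over-transmitted"
      if stat > CRITICAL_5PCT_DF1 and m1_passed > other_passed
      else "no evidence of distorted transmission")
```

Under the null hypothesis a heterozygous parent transmits either allele with probability 0.5, so the two counts should be about equal.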