INTRODUCTION TO ASSOCIATION MAPPING - PowerPoint PPT Presentation

Slides: 36
Provided by: marco293
Transcript and Presenter's Notes

1
INTRODUCTION TO ASSOCIATION MAPPING
2
  • We have a set of inbred lines or varieties
  • We have genotyped them with a large set of
    markers
  • We also have phenotypic data of the lines for
    several traits
  • And now what?

3
  • We will take advantage of linkage
    disequilibrium (LD) to identify genetic regions
    associated with our trait of interest
  • Association mapping is also called linkage
    disequilibrium mapping

4
Identify associations between markers and
phenotypes without the need to develop specific
populations
Marker Distance Line 1 Line 2 Line 3 Line 4 Line 5 Line 6 Line 7 Line 8 Line 9 Line 10 Line 11 Line 12 Line 13 Line 14 Line 15 Line 16
_3_0363_ 0 A B B A A A B A B B A B B B B B
_1_1061_ 0.8 A B B A A A B A B B A A A B B A
_3_0703_ 1.5 B A A B B B A B A A B B B B B B
_1_1505_ 1.5 B A A B B B A B A B B B B B B B
_1_0498_ 1.5 B B B B B B B B B B B B B B B A
_2_1005_ 3.8 A B B A A A B A B A A B B B B B
_1_1054_ 3.8 A A A A A A A A A B A A A A A A
_2_0674_ 6 A B B A A A B A B A A A A A A B
_1_0297_ 8.8 A A B B B B B A A A A A A A A B
_1_0638_ 10.7 A A B B B B B A A B A A A A A A
_1_1302_ 11.4 B A A A B B A A A B A B B B B A
_1_0422_ 11.4 B A A A B B A A A B A B B B B A
_2_0929_ 15.3 A B B B A A B B B A B A A A A B
_3_1474_ 15.4 A B B B A A B B B A B A A A A A
_1_1522_ 17.3 A B B B A A B B B A B A A A A A
_2_1388_ 17.3 A A A A A A A A A A A A A A A A
_3_0259_ 18.1 B B B B B B B B B B B A A A A A
_1_0325_ 18.1 B B B B B B B B B B B A A A A A
_2_0602_ 20.8 A A B A A A A B A B A A A A A A
_1_0733_ 23.9 B B B B B B B B B B B A A A A A
_2_0729 23.9 B B B B B B B B B B B A A A A A
_1_1272_ 23.9 A B B B A A B B B B B B B B B B
_2_0891_ 26.1 A A A A A A A A A B A A A A A A
_2_0748_ 26.6 B B B B B B B B B A B B B B B B
_3_0251_ 27.4 A B A A A B A A A B A A A B A A
_1_0997_ 35.5 B B A A A B B B B B B B B B B B
_1_1133_ 41.8 B B A A A B B B B A B A A A A A
_2_0500_ 42.5 A A A A A A A A A B A B B B B B
_3_0634_ 43.3 B B B B B B B B B A B A A A A A
(Figure: disease severity, axis from 0 to 10)
5
  • The definition of linkage disequilibrium is very
    simple:
  • it is the non-random association of alleles at
    different loci

(Figure: panels labelled Equilibrium and Disequilibrium)
6
(Figure: two panels, Equilibrium and Disequilibrium, each showing Locus 1 to Locus 5)
  • Equilibrium: random-mating population with loci
    segregating independently
  • Disequilibrium: non-random-mating population; LD due to
    selection, mutation, drift/sampling, population structure

7
How do we measure LD?
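  • LD is measured with a parameter called D.
  • If alleles at different loci are not inherited
    independently, then
  • $P_{AB} \neq P_A \times P_B$ and
    $D_{AB} = P_{AB} - P_A \times P_B$
  • ($P_A$ and $P_B$ are allele frequencies and $P_{AB}$ is the
    haplotype frequency)
  • Standardized measures of LD: $D'$ and $r^2$

$D' = |D_{AB}| / D_{max}$, where
$D_{max} = \min(P_A P_B, P_a P_b)$ for $D < 0$
$D_{max} = \min(P_A P_b, P_a P_B)$ for $D > 0$
$r^2 = D_{AB}^2 / (P_A P_a P_B P_b)$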
8
Line Locus 1 Locus 2
1 a b
2 A B
3 a B
4 a b
5 a b
6 a b
7 A B
8 a b
9 a B
10 a b
11 a b
12 a B
13 A B
14 a b
15 a B
16 A B
17 a b
18 A B
19 a b
20 A B
21 A b
22 a b
23 a b
24 a b
25 a b
26 a B
27 A B
28 a B
29 A B
30 A B
Allele frequencies: $P_A = 10/30$, $P_a = 20/30$, $P_B = 15/30$, $P_b = 15/30$
Haplotype frequencies: $P_{AB} = 9/30$, $P_{aB} = 6/30$, $P_{Ab} = 1/30$, $P_{ab} = 14/30$
$D_{AB} = P_{AB} - P_A \times P_B = 9/30 - (10/30 \times 15/30) = 0.13$
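
The slide's worked example can be checked numerically. The sketch below recomputes D from the four haplotype frequencies and also derives the two standardized measures, assuming the usual definitions D' = |D| / D_max and r2 = D^2 / (PA x Pa x PB x Pb). The function name `ld_stats` is illustrative and does not come from the presentation.

```python
from fractions import Fraction

def ld_stats(p_AB, p_Ab, p_aB, p_ab):
    """Return (D, D', r2) for two biallelic loci from haplotype frequencies."""
    p_A = p_AB + p_Ab            # allele frequencies follow from the haplotypes
    p_B = p_AB + p_aB
    p_a, p_b = 1 - p_A, 1 - p_B
    d = p_AB - p_A * p_B
    # D_max depends on the sign of D (assumed standard definition)
    if d < 0:
        d_max = min(p_A * p_B, p_a * p_b)
    else:
        d_max = min(p_A * p_b, p_a * p_B)
    d_prime = abs(d) / d_max if d_max else Fraction(0)
    # r2 is undefined if either locus is monomorphic; not handled in this sketch
    r2 = d * d / (p_A * p_a * p_B * p_b)
    return d, d_prime, r2

# Haplotype counts from the 30-line example: AB = 9, Ab = 1, aB = 6, ab = 14
d, d_prime, r2 = ld_stats(Fraction(9, 30), Fraction(1, 30),
                          Fraction(6, 30), Fraction(14, 30))
print(float(d), float(d_prime), float(r2))
```

With these inputs D is about 0.13, as on the slide, while D' is 0.8 and r2 is 0.32, which shows that the two standardized measures can differ noticeably for the same pair of loci.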
9
(Figure: spring barley, two-row, chromosome 5H; $r^2$ plotted against distance in bp)
10
(No Transcript)
11
Extent of LD
Species       Extent of LD                                                               Mating system
Humans        80 kb (Europeans), 5 kb (Nigerians)                                        Outcrossing
Cattle        > 10 cM                                                                    Outcrossing
Arabidopsis   250 kb                                                                     Selfing
Maize         1 kb (diverse maize), 1.5 kb (diverse inbred lines), > 100 kb (elite lines)  Outcrossing
Barley        Up to 100 kb                                                               Selfing
Flint-Garcia et al., Annu. Rev. Plant Biol. 2003. 54:357-74
12
  • Factors that increase LD
      • mutation
      • mating system (self-pollination)
      • population structure
      • admixture
      • relatedness (kinship)
      • small founder population size or genetic drift
      • selection (natural, artificial, and balancing)
  • Factors that decrease LD
      • high recombination and mutation rate
      • recurrent mutations
      • outcrossing

13
Mutation provides the original material for
producing polymorphisms that will be in LD
Allele b appears on a gamete carrying A; A and b
will then appear together
14
  • Mating system
  • Generally LD decays more rapidly in outcrossing
    species compared to selfing, where individuals
    are likely to be homozygous
  • In selfing species, most recombination occurs
    between identical haplotypes, as a result of high
    individual homozygosity, and thus these events do
    not reduce LD
  • Selfing reduces the rate at which LD breaks down
  • When loci are closely linked in a selfing
    population they remain in high LD for many
    generations

(Figure: curves labelled "Selfing, little or no recombination" and "Outcrossing, high recombination". Selfing rate: outcrossing 0.00, selfing 0.99. Recombination: little 0.05, high 0.5)
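
The contrast between the two mating systems can be sketched numerically. The block below assumes the textbook decay D_t = D_0 (1 - c)^t and, as an approximation that is not stated in the slides, an effective recombination fraction c_eff = c (1 - F) with inbreeding coefficient F = s / (2 - s) for selfing rate s. The selfing rates (0.00, 0.99) and recombination fractions (0.05, 0.5) are the values shown on the slide; the function name `ld_after` is hypothetical.

```python
def ld_after(d0, c, selfing_rate, generations):
    """LD remaining after a number of generations.

    Assumes D_t = D_0 * (1 - c_eff)**t, where the effective recombination
    fraction is c_eff = c * (1 - F) and F = s / (2 - s) under partial selfing.
    """
    f = selfing_rate / (2 - selfing_rate)
    c_eff = c * (1 - f)
    return d0 * (1 - c_eff) ** generations

# The two extreme cases named on the slide
cases = [
    ("outcrossing, high recombination", 0.5, 0.00),
    ("selfing, little recombination", 0.05, 0.99),
]
for label, c, s in cases:
    print(label, [round(ld_after(1.0, c, s, t), 3) for t in (0, 1, 5, 10)])
```

Under these assumptions LD between unlinked loci in an outcrosser halves every generation, whereas closely linked loci in a near-complete selfer keep almost all of their initial LD after ten generations.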
15
  • Drift / Sampling
      • In small populations, genetic drift results in
        the loss of rare allelic combinations, which
        increases LD.
      • Sampling increases or reduces certain allelic
        combinations by chance
  • Selection
      • Strong selection at a locus is expected to reduce
        diversity and increase LD in the surrounding
        region
      • Selection operating on a gene will increase LD
        and reduce diversity in the vicinity of that
        gene. Alleles flanking the selected gene will be
        fixed.
      • Selection can also cause LD between unlinked loci,
        typically as a result of co-selection of loci
        during breeding for multiple traits

16
(No Transcript)
17
LOD
18
(No Transcript)
19
  • What information do we need for the association
    mapping analysis?
  • Genotypic
      • Linkage disequilibrium decay
      • Number of markers and marker density
      • Quality of the data: missing values, minor allele
        frequency
  • Phenotypic
      • Quantitative or qualitative traits
      • Heritability of the trait, repeatability
  • Population
      • Structure
      • Kinship

20
  • Genotypic Information
  • Linkage disequilibrium decay.
  • The power of detection is highly influenced by
    the LD between the QTL and the marker

(Figure: two panels plotting $r^2$ against physical distance, marked at 10 kb and 100 kb)
21
  • Marker density
  • The extent of LD shows the expected $r^2$ at a given
    distance
  • Accordingly, it is important to choose an
    adequate marker density to increase the power of
    detection

(Figure: two panels plotting $r^2$ against physical distance, marked at 10 kb and 100 kb)
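
As a rough illustration of how LD extent drives marker density, the sketch below computes a lower bound on the number of markers needed so that neighbouring markers are no further apart than the distance over which LD persists. It assumes evenly spaced markers and uniform LD along the region, which real genomes do not satisfy. The region length and the function name `min_markers` are hypothetical; the 10 kb and 100 kb extents are the two cases on the slide.

```python
import math

def min_markers(region_bp, ld_extent_bp):
    """Rough lower bound on marker number: one marker per LD block."""
    return math.ceil(region_bp / ld_extent_bp)

region = 500_000_000  # hypothetical 500 Mb chromosome, not from the slides
for extent in (10_000, 100_000):  # LD extending over 10 kb and 100 kb
    print(f"LD extent {extent} bp: at least {min_markers(region, extent)} markers")
```

A tenfold shorter LD extent calls for roughly ten times as many markers to cover the same region.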
22
  • Quality of the data
  • Number of individuals: with small sample sizes,
    the probability of a significant association
    between marker and QTL arising by chance is high.

Marker Distance Line 1 Line 2 Line 3 Line 4 Line 5 Line 6 Line 7 Line 8 Line 9 Line 10 Line 11 Line 12 Line 13 Line 14 Line 15 Line 16
_3_0363_ 0 A B B A A A B A B B A B B B B B
_1_1061_ 0.8 A B B A A A B A B B A A A B B A
_3_0703_ 1.5 B A A B B B A B A A B B B B B B
_1_1505_ 1.5 B A A B B B A B A B B B B B B B
_1_0498_ 1.5 B B B B B B B B B B B B B B B A
_2_1005_ 3.8 A B B A A A B A B A A B B B B B
_1_1054_ 3.8 A A A A A A A A A B A A A A A A
_2_0674_ 6 A B B A A A B A B A A A A A A B
_1_0297_ 8.8 A A B B B B B A A A A A A A A B
_1_0638_ 10.7 A A B B B B B A A B A A A A A A
_1_1302_ 11.4 B A A A B B A A A B A B B B B A
_1_0422_ 11.4 B A A A B B A A A B A B B B B A
_2_0929_ 15.3 A B B B A A B B B A B A A A A B
_3_1474_ 15.4 A B B B A A B B B A B A A A A A
_1_1522_ 17.3 A B B B A A B B B A B A A A A A
_2_1388_ 17.3 A A A A A A A A A A A A A A A A
_3_0259_ 18.1 B B B B B B B B B B B A A A A A
_1_0325_ 18.1 B B B B B B B B B B B A A A A A
_2_0602_ 20.8 A A B A A A A B A B A A A A A A
_1_0733_ 23.9 B B B B B B B B B B B A A A A A
_2_0729 23.9 B B B B B B B B B B B A A A A A
_1_1272_ 23.9 A B B B A A B B B B B B B B B B
_2_0891_ 26.1 A A A A A A A A A B A A A A A A
_2_0748_ 26.6 B B B B B B B B B A B B B B B B
(Figure: disease severity, axis from 0 to 10)
23
  • Quality of the data
  • Number of individuals: with small sample sizes,
    the probability of a significant association
    between marker and QTL arising by chance is high.

Marker Distance Line 1 Line 2 Line 3 Line 4 Line 5 Line 6 Line 7 Line 8
_3_0363_ 0 A B B A A A B A
_1_1061_ 0.8 A B B A A A B A
_3_0703_ 1.5 B A A B B B A B
_1_1505_ 1.5 B A A B B B A B
_1_0498_ 1.5 B B B B B B B B
_2_1005_ 3.8 A B B A A A B A
_1_1054_ 3.8 A A A A A A A A
_2_0674_ 6 A B B A A A B A
_1_0297_ 8.8 A A B B B B B A
_1_0638_ 10.7 A A B B B B B A
_1_1302_ 11.4 B A A A B B A A
_1_0422_ 11.4 B A A A B B A A
_2_0929_ 15.3 A B B B A A B B
_3_1474_ 15.4 A B B B A A B B
_1_1522_ 17.3 A B B B A A B B
_2_1388_ 17.3 A A A A A A A A
_3_0259_ 18.1 B B B B B B B B
_1_0325_ 18.1 B B B B B B B B
_2_0602_ 20.8 A A B A A A A B
_1_0733_ 23.9 B B B B B B B B
_2_0729 23.9 B B B B B B B B
_1_1272_ 23.9 A B B B A A B B
_2_0891_ 26.1 A A A A A A A A
_2_0748_ 26.6 B B B B B B B B
(Figure: disease severity, axis from 0 to 10)
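
One way to see the sample-size problem in the two marker tables: with fewer lines, markers at different map positions can carry exactly the same score pattern, so a phenotype that matches one of them matches all of them. The sketch below copies four markers from the table (distances 0 to 6) and counts how many distinct patterns remain when only the first 8 lines are used. The helper name `distinct_patterns` is illustrative.

```python
# Scores for Line 1 to Line 16, copied from the marker table
MARKERS = {
    "_3_0363_": "ABBAAABABBABBBBB",
    "_1_1061_": "ABBAAABABBAAABBA",
    "_2_1005_": "ABBAAABABAABBBBB",
    "_2_0674_": "ABBAAABABAAAAAAB",
}

def distinct_patterns(markers, n_lines):
    """Number of different score patterns among markers on the first n lines."""
    return len({scores[:n_lines] for scores in markers.values()})

for n in (16, 8):
    print(f"{n} lines: {distinct_patterns(MARKERS, n)} distinct patterns")
```

With 16 lines the four markers are all distinguishable; with 8 lines they collapse into a single pattern, so an association could not be assigned to any one of them.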
24
Quality of the data: minor allele frequency
Line Locus 1 Locus 2
1 a b
2 a b
3 a b
4 a b
5 a b
6 a b
7 a b
8 a b
9 a b
10 a b
11 a b
12 a B
13 a b
14 a b
15 a b
16 A b
17 a b
18 a b
19 a b
20 a b
21 a b
Two loci can be completely unlinked and still
show high LD
25
Quality of the data: missing data
Marker scores across lines (- = missing): b - b b B b B b b B - b b - b B b b - b - b b b b b
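
The two data-quality checks named on these slides, missing values and minor allele frequency, can be computed per marker as sketched below. The score string is the marker column from this slide; the filtering thresholds (at most 10% missing, minor allele frequency of at least 0.05) and the function names are assumptions for illustration, not values given in the presentation.

```python
from collections import Counter

def missing_rate(scores, missing="-"):
    """Fraction of lines with no score for this marker."""
    return scores.count(missing) / len(scores)

def minor_allele_frequency(scores, missing="-"):
    """Frequency of the rarer allele among the scored lines (biallelic marker)."""
    counts = Counter(s for s in scores if s != missing)
    if len(counts) < 2:
        return 0.0  # monomorphic among scored lines
    return min(counts.values()) / sum(counts.values())

def passes_qc(scores, max_missing=0.10, min_maf=0.05):
    """Hypothetical filter; both thresholds are illustrative assumptions."""
    return (missing_rate(scores) <= max_missing
            and minor_allele_frequency(scores) >= min_maf)

# Marker column from the slide: 26 lines, "-" marks a missing score
scores = "b-bbBbBbbB-bb-bBbb-b-bbbbb"
print(round(missing_rate(scores), 3), round(minor_allele_frequency(scores), 3))
print(passes_qc(scores))
```

For this marker 5 of 26 scores are missing and allele B occurs in 4 of the 21 scored lines, so it would fail the assumed missing-data threshold while passing the assumed frequency threshold.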
26
  • What information do we need for the association
    mapping analysis?
  • Genotypic
      • Linkage disequilibrium decay
      • Number of markers and marker density
      • Quality of the data: missing values, minor allele
        frequency
  • Phenotypic
      • Quantitative or qualitative traits
      • Heritability of the trait, repeatability
  • Population
      • Structure
      • Kinship

27
  • Phenotypic
  • Quantitative or qualitative traits
  • One or more QTL involved
  • The larger the effect of the QTL, the higher the
    power of detection
  • Quantitative traits: usually many genes of small
    effect are involved
  • The problem of epistatic traits

Heritability of the trait, repeatability
$h^2 = V_{genotypic} / V_{phenotypic}$
28
The problem of epistatic traits
VRN1 and VRN2 are located on different chromosomes
Line Heading date VRN1 VRN2
1 62 a c
2 152 A c
3 59 a c
4 58 a c
5 60 A D
6 60 a c
7 57 a D
8 64 a c
9 151 A c
10 59 a D
11 58 a D
12 152 a c
13 60 a c
14 151 A c
15 58 a c
16 149 A c
17 64 A D
18 58 a c
19 154 A c
20 58 a c
21 63 a D
22 60 a c
23 153 A c
24 58 a c
25 57 a c
26 64 a c
No association between individual genes (VRN1 or
VRN2) and heading date
However, late heading date occurs only when haplotype Ac
is present
29
  • What information do we need for the association
    mapping analysis?
  • Genotypic
      • Linkage disequilibrium decay
      • Number of markers and marker density
      • Quality of the data: missing values, minor allele
        frequency
  • Phenotypic
      • Quantitative or qualitative traits
      • Heritability of the trait, repeatability
  • Population
      • Structure
      • Kinship

30
Population Structure
The classic example of confounding by
population structure
  • Study of type 2 diabetes in two tribes of Native
    Americans from Arizona
  • A correlation was found between a haplotype at the
    immunoglobulin G locus and reduced diabetes
  • However, on further analysis it was found that
    those with diabetes had a lower proportion of
    European ancestry
  • And that the haplotype associated with reduced
    diabetes was more prevalent in Europeans
  • When the analysis was restricted to individuals
    with similar European ancestry, the association
    was no longer detected.

Knowler WC, et al. 1988. Am. J. Hum. Genet.
43:520-26
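
The mechanism behind this kind of spurious association can be reproduced with a small worked example. The counts below are entirely hypothetical (they are not data from Knowler et al.) and are chosen so that, within each ancestry group, carrying the haplotype has no relation to disease, while the groups differ in both haplotype frequency and disease rate. The function name `odds_ratio` is illustrative.

```python
def odds_ratio(carriers_affected, carriers_healthy, others_affected, others_healthy):
    """Odds ratio of disease for haplotype carriers versus non-carriers."""
    return (carriers_affected * others_healthy) / (carriers_healthy * others_affected)

# Hypothetical counts: (carriers affected, carriers healthy,
#                       non-carriers affected, non-carriers healthy)
group_1 = (4, 6, 36, 54)   # haplotype rare (10%), disease common (40%)
group_2 = (6, 54, 4, 36)   # haplotype common (60%), disease rare (10%)
pooled = tuple(a + b for a, b in zip(group_1, group_2))

for name, counts in [("group 1", group_1), ("group 2", group_2), ("pooled", pooled)]:
    print(name, odds_ratio(*counts))
```

Within each group the odds ratio is 1.0 (no association), yet pooling the groups gives 0.375, which would make the haplotype look protective. Restricting the analysis to individuals of similar ancestry removes the effect, as in the study described on the slide.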
31
  • Population Structure
  • Similar structure exists in plants
  • Breeding history of many important crop species
    and limited gene flow have created complex
    stratification within the germplasm.
  • Different geographic origin of the germplasm
    causes population structure (usually natural
    selection tends to fix alleles at many loci
    related to adaptation).
  • The end use of the crop, growth habit, and
    certain morphological traits also create structure.
  • This is a common cause of spurious associations

32
  • How can we allocate individuals to
    sub-populations?
  • First, we need to know in advance how many
    sub-populations there are.
  • If unknown, this can be estimated
  • The allocation process is repeated for different
    possible numbers and the best-fitting one is selected.

33
  • The computer program STRUCTURE
  • Uses computationally intensive methods to
    partition individuals into populations.
  • Many individuals or lines will not belong
    uniquely to one, but will be the descendants of
    crosses between two or more ancestral
    populations.
  • STRUCTURE also estimates the proportion of
    ancestry attributable to each population.

34
(No Transcript)
35
The effect of kinship
$y = X\beta + Qv + Zu + e$
$X\beta$ includes all fixed effects: population means,
environments, and marker allele effects
$Q$ is a subpopulation incidence matrix; $v$ are
estimates of subpopulation mean effects
There is a degree of relatedness not captured by
population structure: $u$ is the polygenic effect
generated by other loci unlinked to the one being
tested
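
To make the roles of the terms concrete, the sketch below assembles y = Xβ + Qv + Zu + e for four hypothetical lines using plain Python lists. All numbers, the identity choice for Z, and the helper `matvec` are assumptions for illustration. The block only shows how the components add up; it does not fit the model, since estimating the effects requires variance-component methods that are beyond this example.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# Four hypothetical lines; every value below is illustrative
X = [[1, 0], [1, 1], [1, 0], [1, 1]]      # intercept and marker allele (0/1)
beta = [10.0, 2.0]                        # overall mean and marker allele effect
Q = [[1, 0], [1, 0], [0, 1], [0, 1]]      # subpopulation incidence matrix
v = [0.0, 3.0]                            # subpopulation mean effects
Z = [[1, 0, 0, 0], [0, 1, 0, 0],
     [0, 0, 1, 0], [0, 0, 0, 1]]          # each line has its own polygenic effect
u = [0.5, -0.5, 0.2, -0.2]                # polygenic effects
e = [0.1, -0.1, 0.0, 0.0]                 # residuals

y = [sum(parts) for parts in zip(matvec(X, beta), matvec(Q, v), matvec(Z, u), e)]
print([round(value, 2) for value in y])
```

Lines 3 and 4 have higher phenotypes partly because of their subpopulation (the Qv term), which is exactly the contribution that would be attributed to the marker if Q were left out of the model.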