Title: Molecular population genetics of adaptation from recurrent beneficial mutation
1 Molecular population genetics of adaptation from recurrent beneficial mutation
- Joachim Hermisson and Pleuni Pennings
- LMU Munich
2 How can genetic variation be maintained in a population in the face of positive selection?
3 Selective sweep with recombination
8 Recurrent mutation
- Classical view: adaptive substitutions occur from a single mutational origin
- What happens if the same beneficial allele occurs recurrently in a population?
10 Soft sweep from recurrent mutation
[Figure: axes labeled frequency and time]
15 Is recurrent mutation relevant?
- What is the probability of a soft sweep under recurrent mutation?
- What is the impact on patterns of neutral polymorphism?
16 Model
- Haploid population of constant size N_e
- At the selected locus: recurrent mutation at rate u to a beneficial allele (or a class of equivalent alleles) with selective advantage s
- Scaled values: q = 2 N_e u, a = 2 N_e s, R = 2 N_e r
- Generation update: Wright-Fisher model (fitness-weighted multinomial sampling)
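The generation update of this model can be sketched in a few lines of Python. This is a minimal illustration, not the simulation program used for the results: the function name `wright_fisher_path`, the order of the steps (selection first, then mutation, then sampling), and the parameter values in the usage example are assumptions. With only two allelic classes, the multinomial sampling step reduces to binomial sampling.

```python
import random

def wright_fisher_path(n_e, u, s, rng, max_gen=100_000):
    """Frequency path x_t of the beneficial allele, from x = 0 until fixation."""
    x = 0.0
    path = [x]
    for _ in range(max_gen):
        # Selection (assumed order): beneficial allele has relative fitness 1 + s.
        p = x * (1 + s) / (x * (1 + s) + (1 - x))
        # Recurrent mutation from the ancestral to the beneficial allele at rate u.
        p = p + (1 - p) * u
        # Fitness-weighted sampling of the next generation (binomial for two classes).
        count = sum(1 for _ in range(n_e) if rng.random() < p)
        x = count / n_e
        path.append(x)
        if count == n_e:  # beneficial allele is fixed
            break
    return path

# Hypothetical parameter values: q = 2 N_e u = 2, a = 2 N_e s = 40.
path = wright_fisher_path(n_e=200, u=0.005, s=0.1, rng=random.Random(1))
print(len(path) - 1, "generations until fixation")
```

The pure-Python sampling loop keeps the sketch free of dependencies; it is only practical for small N_e.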
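17 Coalescent view: Genealogy of a sample from a linked locus
- What can happen one generation back in time?
[Figure: n lines followed one generation back in time; the beneficial class has frequency x_t, the ancestral class 1 - x_t]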
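18 Coalescent view: Coalescence of two lines
[Figure: one generation back in time, classes of frequency x_t and 1 - x_t]
19 Coalescent view: Recombination
[Figure: one generation back in time, classes of frequency x_t and 1 - x_t]
20 Coalescent view: New mutation at selected site
[Figure: one generation back in time, classes of frequency x_t and 1 - x_t]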
21 Coalescent view
- Problem: the rates for coalescence, recombination, and beneficial mutation depend on the frequency x of the selected allele
- This frequency follows a stochastic path
22 Coalescent view: Classic case, coalescence and recombination
- Probability for multiple haplotypes in a sample after a sweep due to recombination (higher orders: Etheridge, Pfaffelhuber, Wakolbinger)
- Small for large a (strong selection makes broad sweep patterns)
23 Coalescent view: Coalescence and mutation, sample of size 2
- Probability for coalescence before mutation (single haplotype)
28 Coalescent view: Coalescence and mutation, sample of size 2
- Probability for single or multiple haplotypes
- T_1: average time to the first coalescence or mutation event
29 Coalescent view: Coalescence and mutation, sample of size 2
- Sampling at time of fixation: 0 < T_1 < T_fix
30 Coalescent view: Coalescence and mutation, sample of size 2
- General: sampling T_obs generations after fixation
- Extra factor can be ignored for T_obs << N_e
31 Coalescent view: Coalescence and mutation, sample of size 2
- Sampling at time of fixation: 0 < T_1 < T_fix
- T_fix / N_e ≈ 4 log(a) / a, with a = 2 N_e s (scaled selection strength)
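To get a feeling for the time scale, the approximation T_fix / N_e ≈ 4 log(a) / a can be evaluated for a few selection strengths. The sketch below assumes the natural logarithm; the function name `scaled_fixation_time` and the chosen values of a are illustrative only.

```python
import math

def scaled_fixation_time(a):
    """T_fix / N_e from the approximation 4 log(a) / a (natural log assumed)."""
    return 4.0 * math.log(a) / a

# Illustrative values of the scaled selection strength a = 2 N_e s.
for a in (100, 500, 1000, 10000):
    print(f"a = {a:>5}: T_fix / N_e = {scaled_fixation_time(a):.4f}")
```

For a = 500 the fixation time is already about 5% of N_e generations, and it shrinks further as selection gets stronger.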
32 Coalescent view: Coalescence and mutation, sample of size 2
[Figure: simulation results (? 0.4)]
33 Coalescent view: Coalescence and mutation, sample of size 2
- For a > 500: T_fix / N_e << 1, thus
- Corresponds to approximation
36 Coalescent view: Coalescence and mutation, sample of size n
- Continuous time and time rescaling
- Neutral coalescent!
39 Coalescent view: Coalescence and mutation, sample of size n
- Problem is independent of the path x_t and of all selection parameters
- Coalescent of the infinite alleles model
- Forward in time: Hoppe urn or Yule process with immigration
- The sampling distribution of ancestral haplotypes can be approximated by the distribution of family sizes in a Hoppe urn or a Yule process with immigration
- Solved problem
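The Hoppe urn mentioned above can be simulated directly. This is a minimal sketch under the assumption that the urn parameter equals the scaled mutation rate q = 2 N_e u (called `theta` in the code); the function name `hoppe_urn` is hypothetical. Each new line either founds a new family (a new mutational origin) or joins an existing family with probability proportional to its size.

```python
import random

def hoppe_urn(n, theta, rng):
    """Return the family sizes (haplotype counts) after n draws from a Hoppe urn."""
    families = []
    for i in range(n):
        # With probability theta / (theta + i) the new line founds a new family.
        if rng.random() < theta / (theta + i):
            families.append(1)
        else:
            # Otherwise it joins an existing family, chosen proportionally to size.
            target = rng.randrange(i)
            for j, size in enumerate(families):
                if target < size:
                    families[j] += 1
                    break
                target -= size
    return families

# Illustrative run: sample of size 20, theta = 0.4.
print(hoppe_urn(20, 0.4, random.Random(0)))
```

Repeating the draw many times gives an empirical distribution of the number and sizes of ancestral haplotypes in a sample.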
40 Results: Ewens sampling formula
- Probability for k haplotypes, occurring n_1, ..., n_k times, in a sample of size n
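The formula itself is not preserved in this transcript. As a sketch, the standard form of the Ewens sampling formula for an ordered tuple of haplotype counts (n_1, ..., n_k) can be coded as follows. The identification of the formula's parameter with the scaled mutation rate q (called `theta` here) and the function name `ewens_probability` are assumptions.

```python
from math import factorial, prod

def ewens_probability(counts, theta):
    """Ewens probability of the haplotype counts (n_1, ..., n_k), taken in this order.

    Standard form: n! / (k! n_1 ... n_k) * theta^k / (theta (theta+1) ... (theta+n-1)).
    The probabilities sum to one over all ordered tuples of positive counts adding to n.
    """
    n, k = sum(counts), len(counts)
    rising = prod(theta + i for i in range(n))  # theta (theta+1) ... (theta+n-1)
    return factorial(n) / (factorial(k) * prod(counts)) * theta**k / rising

# Sample of size 2: a single haplotype has probability 1 / (1 + theta).
print(ewens_probability((2,), 0.4))
```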
41 Results: Ewens sampling formula
- Probability for more than one ancestral haplotype in a sample (soft sweep)
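Under the Ewens sampling formula, the probability of a single haplotype in a sample of size n is the product of i / (i + theta) for i = 1, ..., n - 1, so the soft sweep probability is one minus this product. The sketch below assumes that `theta` stands for the scaled mutation rate q; the function names are hypothetical.

```python
from math import prod

def prob_single_haplotype(n, theta):
    """Probability that all n sampled lines trace back to one mutational origin."""
    return prod(i / (i + theta) for i in range(1, n))

def prob_soft_sweep(n, theta):
    """Probability of more than one ancestral haplotype in a sample of size n."""
    return 1.0 - prob_single_haplotype(n, theta)

# Illustrative values for a sample of size 20.
for theta in (0.004, 0.04, 0.4, 1.0, 4.0):
    print(f"theta = {theta}: P(soft) = {prob_soft_sweep(20, theta):.3f}")
```

For a sample of size 2 this reduces to theta / (1 + theta), the probability that mutation occurs before coalescence.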
42 Results: Probability of a soft sweep
[Figure: simulation (2 N_e s = 10 000, n = 20); percentage of runs with 1, 2, 3, 4, and >4 haplotypes for q = 0.004, 0.04, 0.4, 1, and 4]
- Probability for multiple haplotypes: > 5% for q > 0.01, > 95% for q > 1
45 Results: Frequency of major haplotype
[Figure: distribution of the frequency of the major haplotype (5/10 to 9/10) in a sample of size 10; simulations for a = 100, a = 1000, and a = 10000 compared with the prediction]
46 When should we expect soft sweeps? Multiple haplotypes due to recurrent beneficial mutations
- Strong dependence on the mutation rate
- More than 5% for q > 0.01
- E.g. African D. melanogaster: q ≈ 0.05 (Li / Stephan 2006)
- About 16% of all single-site adaptations soft
- Particularly relevant for large populations (e.g. bacteria)
- Particularly relevant for adaptive (partial) loss-of-function mutations
47 Soft sweeps in data?
- Drosophila: Schlenke and Begun (Genetics 2005), LD pattern at 3 immunity receptor genes in Californian D. simulans
- Humans: multiple origin of FY-0 Duffy allele (loss of function)
- Plasmodium: multiple origins of pyrimethamine resistance mutations
49 Generality of the result: migration instead of mutation
- Beneficial alleles enter by recurrent migration at rate M = 2 N_e m from a genetically diverged source population
- Coalescent analysis with migration rate
- Directly proportional to coalescence rate (no factor 1 - x_t)
- Approximation holds exactly in this case
50 Generality of the result: migration instead of mutation
[Figure: results for M = 0.4 and q = 0.4, plotted against a]
51 Generality of the result: time- or frequency-dependent selection
- Results are independent of the stochastic path x_t of the frequency of the beneficial allele
- Independent of any form of time or frequency dependence of the selection strength
- In particular: independent of the level of dominance
- In particular: holds also for adaptation from standing genetic variation (number of independent origins)
52 Generality of the result: variance in selection coefficients
- If the beneficial allele corresponds to a class of alleles, some fitness differences among variants are likely
- Assume 2 classes of alleles with selective advantage (D: coefficient of variation)
53 Generality of the result: variance in selection coefficients
[Figure: number of haplotypes; percentage of runs with 1, 2, 3, 4, and >4 haplotypes for D = 0, 0.01, 0.05, 0.1, and 0.2, in three panels labeled 0.01, 0.1, and 1]
54 Generality of the result: variance in selection coefficients
[Figure: frequency of the major haplotype (5/10 to 9/10) for q = 0.1 and D = 0, 0.01, 0.05, 0.1, and 0.2]
55 Footprint of selection: Frequency spectrum of polymorphic sites
- Ewens: neutral coalescent prior to the sweep
- Derive frequency distribution of ancestral variation that survives the sweep
- Skew toward intermediate allele frequencies (singleton frequency lower than neutral)
- In contrast: recombination haplotypes are most likely at low frequency
56 Footprint of selection: Frequency spectrum of polymorphic sites
[Figure: probability of event for recombination, coalescence, and mutation; axis labels x and 1 - x]
58 Footprint of selection: Including recombination
- Analytical results
- E.g. probability for a single haplotype in a sample of two
- General: marked Yule process with immigration
- For now: simulation results
- Add recurrent mutation to simulation program by Yuseob Kim
59 Footprint of selection: Power of Tajima's D test at the selected gene
- Neutral locus at recombination distance R to selected site
- Recombination width of the neutral locus: R_n = 10
- Neutral mutational input: q_n = 10
- a = 2 N_e s = 10000
- Sample size 20
- Power of Tajima's D for various recombination distances and sampling times after fixation of the beneficial allele
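For reference, Tajima's D compares the mean number of pairwise differences with the number of segregating sites. The sketch below uses the standard constants of the statistic; the function name `tajimas_d`, the input format (one 0/1 string per sampled sequence), and the convention of returning 0.0 when no site segregates are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(haplotypes):
    """Tajima's D for a list of equal-length 0/1 strings (one per sampled sequence)."""
    n = len(haplotypes)
    # Number of segregating sites S.
    seg = sum(1 for col in zip(*haplotypes) if len(set(col)) > 1)
    if seg == 0:
        return 0.0  # assumed convention: D is undefined without segregating sites
    # Mean number of pairwise differences.
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs) / len(pairs)
    # Standard constants of the test statistic.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - seg / a1) / sqrt(e1 * seg + e2 * seg * (seg - 1))

# A singleton gives negative D, an intermediate-frequency variant positive D.
print(tajimas_d(["1", "0", "0", "0"]), tajimas_d(["1", "1", "0", "0"]))
```

The power values reported for the test additionally require a neutral null distribution of D, which is not part of this sketch.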
60 Footprint of selection: Power of Tajima's D test, single origin
[Figure: power by time since fixation (0, 0.01, 0.05, 0.1, 0.2, 0.5, 1, in units of 2 N_e generations) and by distance from the selected site (0, 10, 20, 200, 600, in units of R = 2 N_e r)]
61 Footprint of selection: Power of Tajima's D test, q = 0.1
[Figure: same axes as for the single-origin case]
62 Footprint of selection: Power of Tajima's D test, q = 0.4
[Figure: same axes as for the single-origin case]
63 Footprint of selection: Power of Tajima's D test, q = 1
[Figure: same axes as for the single-origin case]
64 Footprint of selection: Condition on soft sweeps, negative D
[Figure: q = 0.1; power by time since fixation (0 to 1, in units of 2 N_e generations) and by distance from the selected site (0 to 600, in units of R = 2 N_e r)]
65 Footprint of selection: Condition on soft sweeps, positive D
[Figure: q = 0.1; same axes]
66 Footprint of selection: Tests based on linkage disequilibrium
- E.g. number-of-haplotypes test (K-test) by Depaulis and Veuille
- Conditioned on number of segregating sites
- Zero recombination assumed for neutral comparison
- Other values as before
- Power of K for various recombination distances and sampling times after fixation of the beneficial allele
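The two quantities that enter the K-test, the number of distinct haplotypes K and the number of segregating sites S, are simple to compute from a sample. This is only a sketch of the summary statistics; the function name `haplotype_summary` and the 0/1 string input format are assumptions, and the neutral distribution of K given S (needed for the actual test) is not shown.

```python
def haplotype_summary(haplotypes):
    """Return (K, S): number of distinct haplotypes and number of segregating sites."""
    k = len(set(haplotypes))
    s = sum(1 for col in zip(*haplotypes) if len(set(col)) > 1)
    return k, s

# Made-up sample of five sequences with four sites.
sample = ["0011", "0011", "1100", "1100", "1101"]
print(haplotype_summary(sample))
```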
67 Footprint of selection: Power of haplotype test, single origin
[Figure: power by time since fixation (0, 0.01, 0.05, 0.1, 0.2, 0.5, 1, in units of 2 N_e generations) and by distance from the selected site (0, 10, 20, 200, 600, in units of R = 2 N_e r)]
68 Footprint of selection: Power of haplotype test, q = 0.1
[Figure: same axes as for the single-origin case]
69 Footprint of selection: Power of haplotype test, q = 0.4
[Figure: same axes as for the single-origin case]
70 Footprint of selection: Power of haplotype test, q = 1
[Figure: same axes as for the single-origin case]
71 Footprint of selection: Power of haplotype test, q = 4
[Figure: same axes as for the single-origin case]
72 Footprint of selection: Condition on soft sweeps, number of haplotypes
[Figure: q = 0.1; power by time since fixation (0 to 1, in units of 2 N_e generations) and by distance from the selected site (0 to 600, in units of R = 2 N_e r)]
73 Footprint of selection: Tests based on linkage disequilibrium
- Can we extend high power to a longer time after fixation?
- Idea: use only ancestral variation
- E.g. local adaptation to an island: use only shared polymorphisms with the continental founder population
- Adapt neutral standard of the test accordingly
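The idea of using only ancestral variation can be sketched as a filtering step before any haplotype statistic is computed. The sketch assumes that the island and continental samples are aligned 0/1 strings with the same allele coding; the function names `shared_polymorphic_sites` and `ancestral_haplotypes` are hypothetical, and the adapted neutral standard of the test is not shown.

```python
def shared_polymorphic_sites(island, continent):
    """Indices of sites that are polymorphic in both samples."""
    def polymorphic(sample):
        return {i for i, col in enumerate(zip(*sample)) if len(set(col)) > 1}
    return sorted(polymorphic(island) & polymorphic(continent))

def ancestral_haplotypes(island, continent):
    """Island haplotypes restricted to the shared (ancestral) polymorphisms."""
    keep = shared_polymorphic_sites(island, continent)
    return ["".join(h[i] for i in keep) for h in island]

# Made-up data: site 2 is polymorphic only on the island and is dropped.
island = ["0111", "0100", "1101"]
continent = ["0001", "1100", "1000"]
print(ancestral_haplotypes(island, continent))
```

Sites that segregate only on the island are treated as recent mutations and removed, so the remaining haplotypes reflect variation that predates the sweep.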
74 Footprint of selection: Condition on soft sweeps, ancestral haplotypes
[Figure: q = 0.1; power by time since fixation (0 to 1, in units of 2 N_e generations) and by distance from the selected site (0 to 600, in units of R = 2 N_e r)]
75 Footprint of selection: Condition on soft sweeps, ancestral ZnS
[Figure: q = 0.1; same axes]
76 Summary
- Soft sweeps from recurrent mutation are likely for biologically realistic parameter values
- Pattern described by the Ewens sampling distribution
- Result very stable with respect to the selection scenario
- May be detected by LD tests, in particular if recent mutations can be sieved out
77 Open Issues
- Unified Yule process (?) theory of coalescence, recombination, and mutation
- Description of LD patterns after soft (or hard) sweeps: which aspect lasts the longest?