1

Drosophila Population Genetics
Brian Charlesworth
Institute of Evolutionary Biology
School of Biological Sciences
University of Edinburgh
2
Why is intra-specific variability interesting?
  • "A high degree of variability is obviously
    favourable, as freely giving the materials for
    selection to work on." Charles Darwin, The Origin
    of Species, Chap. 1.
  • Darwin was the first person to recognize clearly
    that evolutionary change over time is the result
    of processes acting on genetically controlled
    variability among individuals within a
    population, which eventually cause differences
    between ancestral and descendant populations.
  • Knowledge of the nature and causes of this
    variability is crucial for an understanding of
    the mechanisms of evolution, animal and plant
    breeding, and human genetic diseases.

3
Classical and quantitative genetic studies of
variation
  • Classical genetics reveals the existence of
    discrete polymorphisms in natural populations,
    but is necessarily limited either to chromosomal
    rearrangements such as inversions that can be
    detected cytologically, or to conspicuous
    phenotypes such as eye colour or body colour
    (flies carrying certain eye-colour mutations such
    as cardinal can be found in natural populations).
  • Within a given species, only a handful of such
    polymorphisms can easily be detected. Relatively
    few cases of discrete polymorphisms affecting
    morphological traits are known.

4
The classic polymorphism of Drosophila
pseudoobscura
A human inversion polymorphism
5
  • Quantitative genetics reveals the existence of
    ubiquitous genetic variation in metrical and
    meristic traits.
  • Most metric traits have a coefficient of
    variation (the ratio of the standard deviation to
    the mean) of 5-10%.
  • Measurements of the resemblances between
    relatives show that 20-80% of the variance in
    such traits is typically due to genetic factors.
  • This type of variation is of great evolutionary,
    medical and economic significance, but measuring
    it does not tell us anything about the details of
    its genetic control (numbers of loci involved,
    frequencies of variant alleles, etc.).

6
  • Studies of concealed variability (revealed by
    inbreeding) indicate the existence of
    low-frequency recessive alleles, usually with
    deleterious effects, that are not normally
    detectable in a large random-mating population.
  • The results of close inbreeding (e.g. by
    brother-sister matings) are:
  • 1. Reduced mean performance of a set of inbred
    lines, with respect to traits like survival,
    fertility and growth rate.
  • 2. Increased variability among lines,
    sometimes involving abnormalities caused by
    single gene mutations.

7
  • While amply validating Darwin's view that there
    is plenty of variation available for evolution to
    utilize, this evidence leaves two important
    questions unanswered:
  • (a) How much variation within a natural
    population is there at an average locus?
    Classical genetics provides no means of sampling
    loci at random from the genome, without respect
    to their functional importance or level of
    natural variability.
  • (b) To what extent does natural selection, as
    opposed to mutation and/or genetic drift, control
    the frequencies of allelic variants within
    populations? The bias of classical genetics
    towards genes with conspicuous phenotypic effects
    means that strong selective forces are likely to
    be operating on the genes it detects. Such genes
    might well be unrepresentative of the global
    picture.

8
Molecular genetics to the rescue
  • The solution to question (a) is to use the fact
    that genes correspond to stretches of DNA that
    code for proteins.
  • If either the protein sequence corresponding to
    a gene, or its DNA sequence, can be studied
    directly, then we can look at variation within
    the population without having to follow visible
    mutations, i.e. there is no need for prior
    knowledge of the existence of variation.
  • We can also look at variation in non-coding
    sequences.

9
Electrophoretic variation
  • The first steps were taken in the mid-1960s by
    Lewontin and Hubby, working in Chicago on the
    fruitfly Drosophila pseudoobscura, and by Harris
    in London, working on humans.
  • They used the technique of gel electrophoresis
    of proteins to screen populations for variants in
    a large number of soluble proteins controlled by
    independent loci, mostly enzymes with
    well-established metabolic roles. The proteins
    were chosen purely because they could be studied
    easily.

10
  • The results of the early electrophoretic surveys
    were startling: a large fraction (as high as 40%)
    of loci were found to be polymorphic (i.e. they
    exhibited one or more minority alleles with
    frequencies greater than 1%).
  • An average D. pseudoobscura individual was
    estimated to be heterozygous at 13% of the 24
    protein loci that had been studied by 1974, i.e.
    a random individual sampled from the population
    would be expected to have distinct maternal and
    paternal alleles at 13% of its protein-coding
    loci.
  • Much lower levels of heterozygosity (or gene
    diversity: the chance that two randomly chosen
    copies of a gene are different) were found in
    mammals, and much higher levels in bacteria.
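
Gene diversity, as defined above, can be computed from the allele frequencies at a locus as one minus the chance of drawing the same allele twice. The sketch below assumes this standard formula; the allele frequencies are hypothetical and are not taken from the surveys described in the slides.

```python
def gene_diversity(allele_freqs):
    """Chance that two randomly chosen copies of a gene carry different alleles."""
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Probability of drawing the same allele twice is the sum of squared frequencies.
    return 1.0 - sum(p * p for p in allele_freqs)


# Hypothetical locus: one common allele and two minority alleles above 1%.
example_freqs = [0.90, 0.07, 0.03]
h = gene_diversity(example_freqs)
print(f"gene diversity = {h:.4f}")
```

Under random mating this quantity is also the expected fraction of individuals that are heterozygous at the locus, which is why the two terms are used interchangeably above.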

11
  • This work conclusively refuted the view that
    loci are only rarely polymorphic.
  • However, it raised more questions than it
    answered. In particular, there were several
    biases in the data. Only soluble proteins could
    easily be studied, and amino-acid changes that do
    not affect the mobility of proteins on gels are
    not detected by electrophoresis.
  • Similarly, any changes in the DNA that do not
    affect the protein sequence go undetected.

12
DNA sequence variation
  • The advent in the late 1970s of methods for
    cloning and sequencing of DNA meant that studies
    of natural variation could be carried out at the
    DNA level. This eliminates virtually all the
    possible biases in quantifying variability.
  • With the advent of PCR amplification for
    isolating specific regions of the DNA, and with
    relatively cheap automated sequencing, this is
    now the method most commonly used in surveys of
    variation.
  • Efforts are currently under way in D.
    melanogaster to scale this resequencing up to
    the whole genome level.

13
  • The pioneering work on directly comparing
    homologous DNA sequences sampled within a species
    was carried out by Martin Kreitman in Lewontin's
    lab at Harvard in the early 1980s.
  • Kreitman sequenced 11 independent copies
    (alleles) of the Adh (alcohol dehydrogenase) gene
    of D. melanogaster, isolated from collections
    made around the world. He sequenced 2379 bases
    from each of these alleles, an heroic effort in
    those days.

14
  • His work succeeded in:
  • Demonstrating a high level of variability at the
    level of individual nucleotide sites, a factor of
    ten or so higher than would have been expected
    from the typical level of heterozygosity for
    protein polymorphisms
  • Showing that nearly all of this variability
    involved silent changes that did not affect
    protein sequences, i.e. the changes were either
    in regions that did not code for amino-acids or
    involved synonymous changes in codons.
  • The only amino-acid polymorphism detected was
    that already known to cause the difference
    between the fast (F) and slow (S) electrophoretic
    alleles of Adh.

15
Kreitman's Adh Results
                                Intron 1   Coding Region   3' Non-Transcr.
  Silent sites segregating (%)    1.7          6.7             0.6
  No. sites                       654          765             767
  • No non-silent substitutions found (other than
    F/S); 39 are expected if variability were the
    same as for silent sites.

16
  • These results demonstrate that the protein
    sequence is highly constrained by selection, i.e.
    most mutations affecting the amino-acid sequence
    of a protein cause selectively disadvantageous
    changes to its functioning, and are eliminated
    rapidly from the population.
  • Most variation that is detected in coding
    sequences (typically over 85% in Drosophila) thus
    involves synonymous variants. Non-coding region
    variation shows a similar level to synonymous
    variation.
  • These results suggest that most variation and
    evolution at the DNA level may be due to neutral
    or nearly neutral mutations, whose fate is
    controlled by genetic drift rather than
    selection, especially as much of the genome is
    non-coding, even in Drosophila.

17
How to measure DNA sequence variation
Allele 1
ATGCTTAGCGTTGGCATCCTAGCGATCGAG
Allele 2
ATGCTTGGCGTTGGCATCCTAGCGATCGAG
Allele 3
ATACTTAGCGTTGGCATCCTCGCGATTGAG
18
  • The nucleotide site diversity (π) for a given
    set of alleles sampled from a population is the
    frequency with which a randomly chosen pair of
    alleles differ at a given site.
  • It can be calculated from data on a sample of
    homologous DNA sequences, by determining the sum
    of the numbers of differences between all
    possible pairs of sequences.
  • The result is divided by the product of the
    number of pairs of sequences that were compared
    (this equals n(n-1)/2, if there are n independent
    alleles), and the number of bases studied.

19
  • In the example, n = 3, so n(n-1)/2 = 3.
  • The total number of pairwise differences between
    all 3 combinations of sequences is 1 + 3 + 4 = 8.
  • To get the pairwise diversity per site, we
    divide this by 3 times the number of sites, so
    that
  • π = 8/(3 x 30) = 0.089
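
The pairwise-diversity calculation can be sketched in a few lines of Python. The three 30-base alleles below are written so that they give the 1, 3 and 4 pairwise differences used in the worked example; the function names are illustrative only.

```python
from itertools import combinations

# Three aligned 30-base alleles matching the worked example.
alleles = [
    "ATGCTTAGCGTTGGCATCCTAGCGATCGAG",
    "ATGCTTGGCGTTGGCATCCTAGCGATCGAG",
    "ATACTTAGCGTTGGCATCCTCGCGATTGAG",
]


def pairwise_differences(seq_a, seq_b):
    """Number of aligned sites at which two sequences differ."""
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def nucleotide_diversity(seqs):
    """Mean number of differences per site over all pairs of sequences."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))  # n(n-1)/2 pairs
    total = sum(pairwise_differences(a, b) for a, b in pairs)
    return total / (len(pairs) * length)


pair_diffs = [pairwise_differences(a, b) for a, b in combinations(alleles, 2)]
pi_hat = nucleotide_diversity(alleles)
print(pair_diffs, round(pi_hat, 3))
```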

20
  • An alternative method of measuring variation is
    simply to count the number of sites that are
    segregating in the sample, S.
  • By dividing S by the product of the number of
    bases in the sequence and the sum
  • a = 1 + 1/2 + 1/3 + ... + 1/(n-1)
  • we obtain a statistic called Watterson's θw.

21
  • If the population is at equilibrium and there is
    no selection, θw is expected to be similar in
    value to π.
  • In the example, we have S = 4, and a = 1 + 1/2 =
    1.5.
  • Hence
  • θw = 4/(30 x 1.5) = 0.089
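
Watterson's estimate can be sketched the same way: count the segregating sites, then divide by the harmonic sum and the sequence length. The alleles are the three 30-base sequences of the worked example, and the function names are illustrative.

```python
alleles = [
    "ATGCTTAGCGTTGGCATCCTAGCGATCGAG",
    "ATGCTTGGCGTTGGCATCCTAGCGATCGAG",
    "ATACTTAGCGTTGGCATCCTCGCGATTGAG",
]


def segregating_sites(seqs):
    """Number of aligned columns showing more than one base."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def harmonic_sum(n):
    """a = 1 + 1/2 + ... + 1/(n-1) for a sample of n sequences."""
    return sum(1.0 / i for i in range(1, n))


def watterson_theta(seqs):
    """Watterson's estimator per site: S / (a * number of bases)."""
    s = segregating_sites(seqs)
    return s / (harmonic_sum(len(seqs)) * len(seqs[0]))


S = segregating_sites(alleles)
a = harmonic_sum(len(alleles))
theta_w = watterson_theta(alleles)
print(S, a, round(theta_w, 3))
```

With only three sequences every segregating site is necessarily a singleton, so the two estimators always agree in this toy example; they can differ only for samples of four or more sequences.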

22
  • Under the neutral theory of evolution,
    variability in DNA sequences reflects the balance
    between the input of new variants by mutation and
    their loss by random fluctuations in frequencies
    caused by finite population size (genetic drift).

23
  • Under this model, variant frequencies at a locus
    are always shifting around, but a statistical
    equilibrium will eventually be reached if
    population size stays constant.
  • The expected value of the pairwise diversity in
    the population is then given by
  • θ = 4Neμ
  • where μ is the neutral mutation rate per site,
    and Ne is the effective population size, which
    controls the rate of genetic drift.
  • The expected values of both π and θw are equal
    to θ.

24
  • Estimates of θ have now been obtained from many
    different kinds of organisms, by sampling sets of
    homologous genes from natural populations and
    sequencing them.
  • Rough average values over many genes for silent
    nucleotide sites are as follows:
  • Escherichia coli (bacterium): 0.05
  • Drosophila melanogaster (African): 0.02
  • Homo sapiens: 0.001

25
  • Knowledge of μ enables us to estimate Ne from θ.
  • For example, with μ = 4 x 10^-9, and θ = 0.02, we
    obtain Ne = 1.25 x 10^6.
  • Drosophila effective population sizes are
    therefore very large.
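
The arithmetic behind this estimate is a rearrangement of θ = 4Neμ. A minimal sketch, using the values quoted above (the function name is illustrative):

```python
def effective_population_size(theta, mu):
    """Rearrange theta = 4 * Ne * mu to give Ne."""
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return theta / (4.0 * mu)


# Values quoted in the slides: silent-site diversity and mutation rate per site.
ne = effective_population_size(theta=0.02, mu=4e-9)
print(f"Ne = {ne:.3g}")
```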

26
Detecting Selection
  • One of the major goals of evolutionary genetics
    is to understand to what extent selection, as
    opposed to neutral forces of mutation and genetic
    drift, controls variation and evolution in DNA
    and protein sequences.
  • The methods for doing this often involve
    combining data on sequence divergence between
    species with data on polymorphism within species.

27
Forms of selection
  • Purifying selection, which acts to prevent the
    spread of deleterious mutations, e.g. those
    affecting the amino-acid sequences of proteins.
  • Positive directional selection, which causes an
    adaptive mutation to spread through a species
  • Balancing selection, which maintains alternative
    variants in the population
  • Directional and balancing selection are often
    collectively referred to as positive selection.

28
  • Use of sequence divergence data
  • The simplest situation is when we have two
    homologous (aligned) DNA sequences from a pair of
    related species.
  • For the purpose of discussion, assume that all
    evolutionary change occurs by nucleotide
    substitutions, i.e. the sequence differences are
    caused entirely by one nucleotide base changing
    into another by mutation.
  • This is usually the case for coding sequences,
    since insertions or deletions cause disruption of
    functionality.

29
The total time separating a pair of sequences
from the two species is 2T
30
Neutral sequence evolution
  • Under neutral evolution, K (the divergence per
    site between the two sequences) is expected to be
    equal to the mutation rate (μ) times the total
    time separating the two sequences, i.e.
  • K = 2μT
  • The simplest way to understand this is to note
    that, under neutral evolution, the expected
    number of mutations that distinguish a pair of
    sequences is equal to the time separating them
    (2T) times the rate of mutation per unit time
    (μ).

31
  • We compare K values for nucleotide sites where
    mutations can reasonably be assumed to be neutral
    or nearly neutral with K for sites where we wish
    to test for selection: larger than neutral K
    values indicate directional selection, and
    smaller than neutral K values indicate purifying
    selection.
  • Nonsynonymous sites are usually used as the
    candidates for selection, but there is increasing
    use of defined types of non-coding sequences.

32
Evidence for pervasive purifying selection
This comes from the fact that both K and θ for
nonsynonymous variants are nearly always
much smaller than for synonymous and noncoding
sites.
33
Statistics on diversity and divergence in D.
miranda (species 1: 18 loci) and D.
pseudoobscura (species 2: 14 loci)
All values are percentages. Divergence (K) is
measured between D. miranda and D. affinis (KS
between miranda and pseudoobscura is 3.5). L.
Loewe et al. 2006 Genetics 172: 1079-1092.
34
Divergence of mel-sim introns
P. Haddrill et al. 2005 Genome Biol. 6 R67. 1-8.
35
Effects of deleterious mutations on fitness
  • There are clearly a lot of deleterious mutations
    entering the population each generation, most of
    which will eventually be eliminated by selection
  • While the mean level of variability is much lower
    for nonsynonymous than synonymous mutations, this
    could simply mean that all the deleterious ones
    are rapidly removed by selection, so that the
    amino-acid variants that we see segregating are
    in fact selectively neutral.

36
  • It is a topic of current research to try and
    estimate the distribution of selection
    coefficients on deleterious amino-acid and silent
    variants in natural populations
  • Estimates for amino-acid variants indicate a wide
    distribution, such that the mean selection
    coefficient against a heterozygous non-synonymous
    variant is of the order of 10^-5
  • Values for synonymous or silent variants are much
    smaller, of the order of 10^-6.

37
Faster divergence in coding than non-coding
sequences suggests positive selection
Positive directional selection
  • In the OdsH gene of three Drosophila species,
    divergence in the homeodomain is highly
    significantly accelerated
  • This directly suggests selection

C. Ting et al. 1998 Science 282: 1501-1504
38
The McDonald-Kreitman test
  • Compares non-synonymous and synonymous site
    divergence between species, and non-synonymous
    and synonymous site diversity within species, in
    the same gene
  • If variants at both kinds of sites were neutral,
    the numbers of substitutions at the two kinds of
    sites between two species should be in the same
    ratio as the polymorphism within either species,
    assuming equilibrium between drift and mutation
  • Neutral divergence = 2Tμ
  • Neutral diversity = 4Neμ

39
  • If the ratio of non-synonymous variants to
    synonymous variants for differences between
    species is greater than the ratio for
    within-species variation, this suggests positive
    directional selection
  • If the opposite is the case, either purifying
    selection or balancing selection is acting

40
Centromeric histone protein evolution
  • Alignment of the Cid proteins of five
    melanogaster subgroup species with histone H3
    proteins from D. melanogaster (2.3 million years
    divergence) with E. histolytica (> 1 billion
    years divergence)
  • The most divergent histone H3 sequences have >75%
    identity to each other, whereas centromeric
    H3-like proteins are much more diverged (35-50%
    identical to histone H3).

41
Sliding window analysis of Cid
[Figure: sliding-window plot, 50-nucleotide (nt)
window in steps of 10 nt, using all sites.
Vertical axis: π or K. Curves: intraspecific
polymorphism within D. simulans (π) and
interspecific divergence (K). Regions marked:
N-terminal tail (mostly non-synonymous
substitutions) and C-terminal core (mostly
synonymous substitutions).]
42
Evidence for adaptive evolution in D.
melanogaster and D. simulans Cid
  • Polymorphism was studied in D. melanogaster (15
    strains) and D. simulans (8 strains), and
    divergence between them
  • Non-synonymous:synonymous (N:S) ratios differ
    significantly (P < 0.0025)
  • For divergence between the species: 18:10
  • For pooled polymorphic sites within the two
    species: 9:28
  • McDonald-Kreitman test for the D. melanogaster
    lineage (box)
  • P < 0.006

H. Malik & S. Henikoff 2001 Genetics 157:
1293-1298
              Fixed diffs   Polymorphic sites
Non-syn            8               0
Synonymous         4               9
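
A McDonald-Kreitman table of this kind is often assessed with Fisher's exact test on the 2 x 2 counts. The slides do not say which test produced the quoted P value, so using Fisher's test here is an assumption; the sketch below is a plain-Python two-sided version applied to the boxed counts (8, 0, 4, 9).

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more probable than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )


# Rows: non-synonymous, synonymous. Columns: fixed differences, polymorphic sites.
p_value = fisher_exact_two_sided(8, 0, 4, 9)
print(f"two-sided P = {p_value:.4f}")
```

On these counts the result falls below the P < 0.006 bound quoted for the D. melanogaster lineage.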
43
  • Using data on many different genes, methods have
    been developed to use the McDonald-Kreitman
    approach to estimate what fraction of amino-acid
    differences between D. melanogaster and D.
    simulans are caused by directional selection.
  • This fraction is of the order of 25%, a
    surprisingly high value.
  • N. Bierne & A. Eyre-Walker 2004 Mol. Biol. Evol.
    21: 1350-1360.

44
Indirect evidence for selection: selective sweeps
  • After an advantageous mutation has spread through
    a population, the level of polymorphism will be
    reduced across the region (i.e. at closely linked
    neutral sites)
  • This is because a unique selectively favourable
    mutation may arise at a site in a DNA sequence
    that is completely linked to a polymorphic
    variant segregating in a population
  • J. Maynard Smith & J. Haigh 1974
    Genet. Res. 23: 23-35.

45
A selective sweep fixes variants linked to the
selected site. It is a form of hitch-hiking
  • as the black (advantageous) variant increases in
    frequency in a population, it causes low
    diversity at closely linked sites in a sequence
    (white circles)

46
A recent selective sweep is detectable if the
time since selective substitution is sufficiently
small (around 0.25Ne generations), but there is a
lot of noise
47
Indirect evidence for selection: statistics of
variant frequency distributions
  • It is also possible to work out the frequencies
    at which variants are expected to be found in
    equilibrium populations, under both neutrality
    and selection
  • Under neutrality, most variants are expected to
    be quite rare
  • If selection is operating on the sequence, it
    will affect the frequencies of variants in the
    sample
  • This forms the basis for some tests for
    selection, and methods for estimating the
    intensity of selection.

48
  • Assuming neutrality and equilibrium, the expected
    value of both π and θw = 4Neμ
  • If π ≠ θw, it suggests the possibility of
    selection
  • If there are excess rare variants, compared with
    what is expected under neutrality, this suggests
    purifying selection
  • Excess high frequency variants might suggest
    balancing selection or the presence of
    advantageous mutations spreading in the
    population
  • BUT there are two problems:
  • We have to test whether the difference could be
    produced by chance
  • The population may not have been constant in
    size, as assumed in the model, and so its
    demographic history may cause π ≠ θw

49
Statistical tests must be used!
  • Things we estimate from a sample may look very
    different from the average that is expected
  • Statistical tests are necessary to decide whether
    a sample could not have arisen by a process of
    neutral mutation and drift. Only if we can say
    this, can we conclude that something such as
    selection has affected the sequences.
  • Neutrality is used as a null hypothesis

50
The spread of an advantageous mutation affects
diversity very much like a bottleneck, but only
on the region around the gene
Extreme bottleneck: one haplotype present, then
new neutral variants occur; π < θw, negative
Tajima's D
Fixed advantageous mutation: one haplotype
selected, then new neutral variants occur; π <
θw, Tajima's D < 0
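
Tajima's D standardises the difference between the two diversity estimates, so a negative value corresponds to π < θw as in both scenarios above. The sketch below uses the standard normalising constants of Tajima's statistic and works with per-sequence totals rather than per-site values. The sample values (n = 10, S = 16, mean pairwise differences 3.89) are purely illustrative and are not data from the slides.

```python
from math import sqrt


def tajimas_d(n, num_segregating, mean_pairwise_diffs):
    """Tajima's D from sample size, segregating sites and mean pairwise differences.

    Both diversity measures are totals over the whole sequence, not per site.
    """
    if n < 4:
        # With 2 or 3 sequences the two estimates always coincide.
        raise ValueError("Tajima's D needs at least 4 sequences")
    if num_segregating < 1:
        raise ValueError("no segregating sites: D is undefined")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    s = num_segregating
    variance = e1 * s + e2 * s * (s - 1)
    # Numerator: pairwise diversity minus Watterson's estimate (both per sequence).
    return (mean_pairwise_diffs - s / a1) / sqrt(variance)


d_example = tajimas_d(n=10, num_segregating=16, mean_pairwise_diffs=3.888889)
print(f"Tajima's D = {d_example:.3f}")
```

A value of D on its own says nothing about significance; as the preceding slides stress, it has to be compared with what neutrality and the assumed demography would produce by chance.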
51
Evidence for a selective sweep on the neo-X
chromosome of D. miranda (D. Bachtrog 2003 Nat.
Genet. 34: 215-219)
52
Genome scans for selective sweeps
  • There is currently a lot of interest in using
    scans of variability across the genome, to look
    for patterns that suggest a recent selective
    sweep.
  • The hope is that this will lead to
    identification of the mutations that have been
    favoured by selection.

53
  • One subject of study is non-African populations
    of D. melanogaster and D. simulans, which are
    believed to have originated relatively recently
    (10,000 years ago??) from ancestral African
    populations.
  • They must have adapted to their new
    environments. It should be possible to see which
    regions of the genome show evidence of selective
    sweeps.
  • The problem is that they have also gone through
    bottlenecks of small population size, which have
    similar effects to sweeps, but are distributed
    over the whole genome.

54
Relative values of microsatellite (A) and
sequence diversity (B) in non-African and
African populations of D. melanogaster
B. Harr et al. (2002) Proc. Natl. Acad. Sci. USA
99, 12949-12954
55
Scan of 250 non-coding sequences of approximately
500 bp each across the X chromosome of D.
melanogaster (L. Ometto et al. 2005 Mol. Biol.
Evol. 22: 2119-2130)
Q is the probability of getting as many as the
observed number of polymorphisms in the European
sample on a bottleneck model. Empty and filled
circles indicate significantly negative or
positive Tajima's D.
56
  • Some recent research problems in my lab
  • What is the typical magnitude of selection on
    mutations that alter codon usage?
  • Are non-coding sequences evolving neutrally?

57
  • The genetic code is degenerate: there are at
    least two codons for each amino-acid except
    methionine and tryptophan
  • The 3rd coding position is often redundant, so
    that at least some changes in it frequently
    result in no change in the protein sequence

59
  • It might be thought that synonymous changes would
    have no effect on fitness, so that such changes
    could be treated as selectively neutral
  • If this is so, the frequency with which codons
    corresponding to a particular amino-acid are used
    should correspond to the frequencies with which
    they would be expected to be produced by randomly
    combining their constituent nucleotides
  • It quickly became apparent in the early days of
    DNA sequencing that this was not the case, and
    that there is considerable codon usage bias in
    many species

60
  • The proportion of codons in a gene that are
    preferred (major codons) provides an index of
    overall codon bias (major codon usage or MCU)
  • A variant of this method has become popular with
    the advent of databases of levels of gene
    expression to identify codons that are more
    frequently used in genes with high levels of
    expression
  • These are often called optimal codons, and the
    frequency of optimal codons in a gene is known as
    Fop. This term is now often used for MCU

61
  • An important observation is that there is a
    general tendency for patterns of codon usage to
    be fairly consistent across different genes in
    the genome, i.e. the same codons are preferred in
    different genes, although the level of bias
    varies considerably, and there are differences
    between species in the nature of the preferred
    codons
  • General levels of codon usage are well-conserved
    evolutionarily

63
  • These facts suggest that the forces affecting
    the use of preferred codons mainly operate across
    the whole genome, rather than being specific for
    individual genes, although the magnitude of these
    forces varies considerably.

64
The evolution of codon usage bias
  • In most species there is substantial variation at
    synonymous nucleotide sites, even in genes with
    high levels of codon usage bias (of the order of
    1-2 per cent diversity per site in many
    Drosophila species)
  • This means that any selection on codon usage must
    be weak in relation to other evolutionary
    factors, such as genetic drift and mutation.
  • In order to understand codon bias, we need
    population genetic models that take all three
    factors into account

65
Modelling codon usage evolution (the Li-Bulmer
model)
  • The simplest model that can be made is for a
    random-mating population with a large number of
    independently evolving sites
  • Each site has two alternatives: preferred and
    unpreferred codons (A versus a)

66
Evolutionary forces
  • Selection for preferred over unpreferred codons
  • Mutation in either direction (preferred to
    unpreferred, and vice-versa).
  • Genetic drift (random sampling of allele
    frequencies). Its effectiveness is inversely
    related to the effective population size (Ne )

67
  • Selection is less effective at preventing
    deleterious mutations becoming polymorphic than
    at preventing them spreading to fixation.
  • It was suggested in 1995 by Hiroshi Akashi that
    this result could be used to test for present-day
    selection on codon usage
  • This requires a species in which synonymous
    single nucleotide polymorphisms at numerous
    codons exist, and in which the ancestral state of
    each SNP can be inferred

68
  • Polymorphic mutations can then be classified as
    preferred (P) to unpreferred (U), i.e. P → U, or
    as U → P
  • In addition, we need to identify fixed
    differences from a related species as P → U or U
    → P, to check whether codon bias is in
    evolutionary equilibrium.
  • These differences are assumed to have
    accumulated in the two focal species since the
    split between them

69
  • If codon usage is in equilibrium, the numbers of
    fixations in the two directions must be equal
  • Since selection has less of an effect on
    polymorphic mutations than fixations, we thus
    expect a deficiency of U → P polymorphisms, and
    an excess of P → U polymorphisms
  • Mutational bias and mutation rates do not affect
    these statistics, if codon usage is in
    equilibrium

70
The species of choice
  • We have been using three Drosophila species for
    this purpose
  • D. miranda is used for the polymorphism study
  • D. pseudoobscura is a very close relative (less
    than 4% silent site divergence from miranda)
  • D. affinis is a more distant outgroup species
    (about 23% silent site divergence from the other
    two)
  • Codons were classified as preferred (P) versus
    unpreferred (U), using Akashi's codon usage
    table for D. pseudoobscura.

77
Polymorphism/divergence for codon usage changes
for 18 X and autosomal genes
                 P → U    U → P
  Fixed            19       12
  Polymorphic      37        6
  rpd             1.95     0.50
  • Ratio of rpd values: 3.9

C. Bartolomé et al. 2005 Genetics 169: 1495-1507
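
The rpd values in this table are the ratios of polymorphic to fixed changes in each direction. A short sketch of the arithmetic, using the counts from the table (the dictionary layout and names are illustrative):

```python
# Counts from the table: fixed differences and polymorphisms by direction of change.
counts = {
    "P->U": {"fixed": 19, "polymorphic": 37},
    "U->P": {"fixed": 12, "polymorphic": 6},
}

# rpd: ratio of polymorphism to divergence for each class of change.
rpd = {
    direction: c["polymorphic"] / c["fixed"]
    for direction, c in counts.items()
}
rpd_ratio = rpd["P->U"] / rpd["U->P"]

for direction, value in rpd.items():
    print(f"{direction}: rpd = {value:.2f}")
print(f"ratio of rpd values = {rpd_ratio:.1f}")
```

An rpd ratio well above 1 is the pattern expected if unpreferred codons are weakly selected against: they appear as polymorphisms more readily than they reach fixation.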
78
  • For a sample of n homologous sequences from the
    population, the expected fraction of P → U
    polymorphisms among both P → U and U → P
    polymorphisms is
  • upI0 / (upI0 + v(1-p)I1)
  • where
  • I0 is the probability that a P → U
    polymorphism is detected in the sample
  • I1 is the probability of detecting a U → P
    polymorphism
  • p is the proportion of P codons in the
    sequence
  • u and v are the mutation rates for P → U and U
    → P changes

80
  • If the Li-Bulmer formula for equilibrium p is
    substituted into this equation, we get the simple
    relation
  • fraction of P → U polymorphisms = I0 / (I0 + I1
    e^(-γ))
  • i.e. the proportion of P → U polymorphisms
    depends only on γ = 4Nes.
  • This allows us to use maximum likelihood to
    estimate the value of γ and its approximate 95%
    confidence limits.
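
The simplification can be checked numerically. The sketch below assumes the Li-Bulmer equilibrium takes its usual two-state form, p = v e^γ / (u + v e^γ), with γ the scaled selection parameter. The detection probabilities I0 and I1 are supplied as hypothetical numbers here; in a real analysis they would be computed from the sample size and γ. The mutation rates are also hypothetical.

```python
from math import exp


def li_bulmer_equilibrium_p(u, v, gamma):
    """Equilibrium proportion of preferred codons (assumed standard form)."""
    return v * exp(gamma) / (u + v * exp(gamma))


def fraction_pu_general(u, v, p, i0, i1):
    """Expected fraction of P->U polymorphisms, general expression."""
    return u * p * i0 / (u * p * i0 + v * (1.0 - p) * i1)


def fraction_pu_simple(gamma, i0, i1):
    """The same fraction after substituting the equilibrium value of p."""
    return i0 / (i0 + i1 * exp(-gamma))


gamma = 2.5          # scaled selection, chosen for illustration
i0, i1 = 0.4, 0.6    # hypothetical detection probabilities

# Two different (hypothetical) mutational biases give the same answer.
for u, v in [(3e-9, 1e-9), (1e-9, 1e-9)]:
    p = li_bulmer_equilibrium_p(u, v, gamma)
    general = fraction_pu_general(u, v, p, i0, i1)
    simple = fraction_pu_simple(gamma, i0, i1)
    print(f"u/v = {u / v:.0f}: general = {general:.6f}, simple = {simple:.6f}")
```

The mutation rates cancel once p is at equilibrium, which is why the estimate of the selection parameter does not require knowledge of the mutational bias.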

81
  • For all 18 genes together, the maximum likelihood
    estimate of γ was 2.5 (2-unit support limits 1.5
    - 3.8).
  • This value is not significantly different from
    those obtained after dividing the dataset into
    two groups of genes with low bias (Fop < 0.60, γ
    = 2.6) and high bias (Fop > 0.63, γ = 2.2).
  • This lack of an apparent difference may reflect
    the limited range of Fop values: the average
    Fop values for the low and high bias groups were
    0.50 ± 0.024 and 0.66 ± 0.009, respectively.

82
  • These results suggest that Nes for mutations
    changing codon usage in D. miranda is between
    0.38 and 0.96, with an ML value of 0.62
  • Silent polymorphism data suggest an Ne of about
    800,000 for miranda. The selection coefficient s
    is thus about 8 x 10^-7
  • This is much lower than previous estimates of Nes
    by Akashi and coworkers for simulans and
    pseudoobscura (around 1 or more)
  • It agrees well with an estimate using the same
    approach for americana
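
The step from the scaled estimate to the selection coefficient is simple arithmetic: divide the estimate of 4Nes by four, then divide by Ne. A sketch using the values quoted in the slides (function names are illustrative):

```python
def scaled_selection(four_ne_s):
    """Convert an estimate of 4*Ne*s into Ne*s."""
    return four_ne_s / 4.0


def selection_coefficient(ne_s, ne):
    """Selection coefficient s implied by Ne*s and an estimate of Ne."""
    return ne_s / ne


# Maximum likelihood estimate and support limits for 4*Ne*s.
ne_s_ml = scaled_selection(2.5)
ne_s_limits = (scaled_selection(1.5), scaled_selection(3.8))

# Ne of about 800,000 suggested by silent polymorphism data.
s_ml = selection_coefficient(ne_s_ml, 800_000)

print(f"Ne*s = {ne_s_ml:.3f}, limits {ne_s_limits[0]:.3f} to {ne_s_limits[1]:.3f}")
print(f"s = {s_ml:.1e}")
```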

83
GC to AT changes
84
  • Similar methods to those applied to P and U
    codons can be applied to GC content at 3rd coding
    positions (GC3). To explain the observed mean
    value of 69% with the estimated level of
    selection requires a mutational bias of over
    3-fold in favour of GC to AT mutations
  • This predicts a GC content of 23% for non-coding
    sequences, if these are evolving neutrally, as
    opposed to an observed value of around 36%
  • The implication is that non-coding sequences are
    subject to non-neutral evolution, despite our
    failure to detect it.
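
The link between mutational bias and neutral GC content can be illustrated with a two-state model of the Li-Bulmer type. The sketch assumes equilibrium GC content equals 1 / (1 + κ e^(-γ)), where κ is the ratio of the GC to AT mutation rate to the AT to GC rate and γ is scaled selection in favour of GC (zero for neutral sites). Both the model form and the back-calculated γ are assumptions of this sketch, not values reported in the slides.

```python
from math import exp, log


def equilibrium_gc(kappa, gamma=0.0):
    """Equilibrium GC content for mutational bias kappa and scaled selection gamma."""
    return 1.0 / (1.0 + kappa * exp(-gamma))


def bias_from_neutral_gc(gc_neutral):
    """Mutational bias implied by a neutral equilibrium GC content."""
    return (1.0 - gc_neutral) / gc_neutral


# A predicted neutral GC content of 23% corresponds to a bias of just over 3-fold.
kappa = bias_from_neutral_gc(0.23)

# Scaled selection that would lift GC3 to 69% under the same bias (back-calculated).
implied_gamma = log(kappa * 0.69 / (1.0 - 0.69))

print(f"mutational bias = {kappa:.2f}")
print(f"neutral GC = {equilibrium_gc(kappa):.2f}")
print(f"GC3 with selection = {equilibrium_gc(kappa, implied_gamma):.2f}")
```

The observed non-coding GC content of around 36% lies above the neutral prediction from this kind of model, which is the discrepancy the slide points to.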

85
Formation of a neo-Y chromosome
86
  • The two autosomal copies in males segregate with
    the sex chromosomes in the first division of
    meiosis, in such a way that one always
    accompanies the X into a sperm, and the other
    accompanies the Y.
  • The lack of crossing over in male Drosophila
    means that the neo-Y chromosome is immediately
    placed in a genetic environment that is identical
    to that of the true Y chromosome.

88
From Bachtrog & Charlesworth (2002) Nature 416:
323-326.
89
Relaxed selection on codon usage
  • Fixations were assigned to the neo-X and neo-Y
    branches, subsequent to the neo-X/neo-Y split
             Neo-X   Neo-Y
  P → U        15      47     p = 0.014
  U → P         7       4

Bartolomé and Charlesworth 2006 Genetics 174:
2033-2044
90
Polymorphisms on the neo-X versus the neo-Y
  • On a Mantel-Haenszel test, there is a
    significant excess (p < 0.001) of non-synonymous
    relative to silent polymorphisms on the neo-Y
    compared with the neo-X, indicating a relaxation
    of purifying selection on the neo-Y.

91
ACKNOWLEDGEMENTS
  • THE HARD EXPERIMENTAL WORK: Doris Bachtrog,
    Carolina Bartolomé, and Soojin Yi
  • HELP WITH FLY-COLLECTING: Deborah Charlesworth
  • PROVISION OF LAB FACILITIES ON COLLECTING TRIP:
    Dan Barbash, Chuck Langley
  • IDENTIFICATION OF MIRANDA STRAINS: Doris Bachtrog
  • TECHNICAL ASSISTANCE: Helen Borthwick and Helen
    Cowan
  • MONEY: BBSRC, Royal Society
  • THEODOSIUS DOBZHANSKY: for discovering D. miranda
    71 years ago, and for the posthumous loan of his
    field microscope