By Mireya Diaz - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
The Coalescent Theory
  • By Mireya Diaz
  • Department of Epidemiology and Biostatistics, for
    EECS 458

2
Agenda
  • Basic concepts of population genetics
  • The coalescent theory
  • Coalescent process of two sequences
  • Coalescent time
  • Statistical inference
  • Applications: reconstruction of human
    evolutionary history
  • Future venues

3
Basic Concepts in Population Genetics
Mutation
Random genetic drift
Selection
4
Basic Concepts in Population Genetics
  • Mutation: limited role in evolution due to its
    slow effect; however, it contributes to the
    maintenance of alleles in the population
  • Locus with 2 alleles: A1 (frequency p(n)) and A2
    (frequency q(n) = 1 - p(n))
  • Non-overlapping generations
  • A1 -> A2 at rate u and A2 -> A1 at rate v (u, v
    ~ 10^-5, 10^-6)
  • An allele can mutate at most once per generation
  • If the initial gene frequency of A1 is p(0):

    p(n) = v/(u+v) + [p(0) - v/(u+v)] (1 - u - v)^n

As n -> ∞, p(n) -> v/(u+v)
(equilibrium)
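The mutation model on this slide can be checked numerically. The sketch below is not from the slides; it assumes the standard one-generation recurrence p(n+1) = (1 - u) p(n) + v (1 - p(n)) for two-way mutation, and the function names are illustrative only. It compares iterating that recurrence with the closed form, whose limit is v/(u+v).

```python
def iterate_mutation(p0, u, v, n):
    """Iterate the assumed recurrence p' = (1 - u) p + v (1 - p) for n generations."""
    p = p0
    for _ in range(n):
        p = (1.0 - u) * p + v * (1.0 - p)
    return p


def mutation_frequency(p0, u, v, n):
    """Closed-form frequency of A1 after n generations (solution of the recurrence above)."""
    p_eq = v / (u + v)  # equilibrium frequency of A1
    return p_eq + (p0 - p_eq) * (1.0 - u - v) ** n


if __name__ == "__main__":
    # Rates of the order quoted on the slide; the starting frequency is arbitrary.
    print(iterate_mutation(0.9, 1e-5, 1e-6, 1000))
    print(mutation_frequency(0.9, 1e-5, 1e-6, 1000))
```

With rates this small, the approach to equilibrium takes on the order of 1/(u+v) generations, which is the "slow effect" the slide refers to.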
5
Basic Concepts in Population Genetics
  • Random genetic drift: change in gene frequency
    due to random sampling of gametes from a finite
    population. Important for small populations
  • Each generation, 2N gametes are sampled at random
    from the parent generation
  • y(n): number of gametes of type A1 in generation n,
    in the absence of mutation and selection

Wright-Fisher model
  • One allele will eventually be lost
6
Basic Concepts in Population Genetics
  • Selection can act at different stages of the
    life of an organism (e.g. differential fecundity,
    viability)
  • Locus with 2 alleles, A1 and A2
  • Three genotypes: A1A1 (w11), A1A2 (w12), A2A2
    (w22)
  • with fitness wij, the relative survival chance of
    zygotes of genotype AiAj
  • Under Hardy-Weinberg equilibrium

If w11 > w12 > w22 -> A1 becomes fixed
If w11 < w12 < w22 -> A2 becomes fixed
If w11, w22 < w12 -> overdominance, stable polymorphism
If w12 < w11, w22 -> underdominance, unstable
polymorphism, A1 or A2 becomes fixed
f(0)
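The four fitness cases above can be explored with the standard one-locus viability selection recursion. The sketch below assumes Hardy-Weinberg genotype proportions before selection, so that p' = p (p w11 + q w12) / w̄ with mean fitness w̄ = p² w11 + 2pq w12 + q² w22; the slide's own equation did not survive in the transcript, so this form and the function names are assumptions.

```python
def next_frequency(p, w11, w12, w22):
    """One generation of viability selection on the frequency p of allele A1."""
    q = 1.0 - p
    mean_fitness = p * p * w11 + 2.0 * p * q * w12 + q * q * w22
    return p * (p * w11 + q * w12) / mean_fitness


def iterate_selection(p0, w11, w12, w22, generations):
    """Apply the recursion repeatedly, starting from frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, w11, w12, w22)
    return p


if __name__ == "__main__":
    # Illustrative fitness values only; they are not taken from the slides.
    print("directional:   ", iterate_selection(0.1, 1.0, 0.9, 0.8, 2000))
    print("overdominance: ", iterate_selection(0.1, 0.9, 1.0, 0.8, 2000))
    print("underdominance:", iterate_selection(0.6, 1.0, 0.8, 1.0, 2000))
```

Under overdominance the frequency settles at an interior value regardless of the starting point, while under underdominance the outcome depends on which side of the unstable equilibrium the population starts.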
7
The Coalescent Theory
  • Stochastic process: continuous-time Markov
    process
  • Large-population approximation of the Wright-Fisher
    model and other neutral models
  • Probability model for the genealogical tree of a random
    sample of n genes from a large population
  • Most significant progress in theoretical
    population genetics (past 2 decades). Cornerstone
    for rigorous statistical analysis of molecular
    data from populations
  • Need to infer the past from samples taken
    from the present population
  • Seminal work: Kingman, J Appl Prob 19A:27, 1982

8
The Coalescent Theory: Key Idea
  • Start with a sample and trace backwards in time
    to identify EVENTS in the past since the Most
    Recent Common Ancestor (MRCA) of the sample
  • Consider a sample of n sequences of a DNA region
    from a population
  • Assume no recombination between sequences
  • The n sequences are connected by a single
    phylogenetic tree (genealogy) whose root is the MRCA

9
The Coalescent Theory: Usefulness
  • Sample-based theory
  • By-product: development of highly efficient
    algorithms for simulation of samples under
    various population genetics models
  • Particularly suitable for molecular data
  • Estimates parameters of evolutionary models (vs.
    history of a specific locus: phylogenetics)

10
The Coalescent Process of Two Sequences
  • Consider diploid organisms
  • Wright-Fisher model
  • Sequences in a population at a given generation: a random
    sample with replacement from those in the
    previous generation
  • Mutations at the locus of interest are selectively
    neutral (do not affect reproductive success; all
    individuals equally likely to reproduce, all lineages
    equally likely to coalesce)
  • P(coalescence at previous generation)?
  • P = 1/(2N), N = effective population size
  • For haploid structures, use N rather than 2N
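The 1/(2N) coalescence probability can be checked by simulation. In this sketch, which is an illustration rather than anything from the slides, two sequences each pick a parent uniformly at random among the 2N sequences of the previous generation, and they coalesce when they pick the same one. The waiting time is then geometric with mean 2N generations. Function names and the replicate count are hypothetical.

```python
import random


def pairwise_coalescence_time(two_n, rng):
    """Generations back until two lineages pick the same parent sequence."""
    generations = 0
    while True:
        generations += 1
        # Each lineage chooses its parent independently and uniformly.
        if rng.randrange(two_n) == rng.randrange(two_n):
            return generations


def mean_pairwise_time(two_n, replicates, seed=0):
    """Average coalescence time over independent replicates."""
    rng = random.Random(seed)
    total = sum(pairwise_coalescence_time(two_n, rng) for _ in range(replicates))
    return total / replicates


if __name__ == "__main__":
    print(mean_pairwise_time(two_n=20, replicates=5000))  # should be close to 2N = 20
```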

11
The Coalescent Tree
Genealogical relationship of a sample of genes
  • Topology is independent of branch lengths
  • Branch lengths are independent, exponential random variables
    (waiting times between coalescent events)
  • Topology is generated by randomly picking
    lineages to coalesce -> all topologies are
    equally likely
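The two ingredients listed above (exponential waiting times and random choice of the pair to merge) are enough to simulate a genealogy. The following is a minimal sketch, assuming time is measured in units of 2N generations so that the waiting time while k lineages remain has rate k(k-1)/2; the event representation and function name are illustrative choices.

```python
import random


def simulate_coalescent(n, rng):
    """Simulate one genealogy for n sampled sequences.

    Returns a list of events (k, waiting_time, merged_labels), where k is the
    number of lineages before the event.
    """
    lineages = [(i,) for i in range(n)]  # each lineage lists the samples below it
    events = []
    while len(lineages) > 1:
        k = len(lineages)
        wait = rng.expovariate(k * (k - 1) / 2.0)  # mean 2 / (k (k - 1))
        i, j = rng.sample(range(k), 2)  # every pair is equally likely to coalesce
        merged = lineages[i] + lineages[j]
        lineages = [lin for idx, lin in enumerate(lineages) if idx not in (i, j)]
        lineages.append(merged)
        events.append((k, wait, merged))
    return events


if __name__ == "__main__":
    for k, wait, merged in simulate_coalescent(5, random.Random(3)):
        print(k, round(wait, 4), sorted(merged))
```

Because the pair is drawn independently of the waiting time, the topology and the branch lengths are generated independently, as the slide states.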

12
The Coalescent Time
  • Assume the number of mutations in a given period is Poisson
  • Mean time to coalescence of two sequences: 2N generations
  • Mean number of mutations between two sequences:
    θ = 4Nμ (μ = mutation rate per sequence per generation)
  • Underlying assumption: random mating
    (organisms with high mobility)
  • Coalescent time: time between two successive
    coalescent events
  • Exponential variable with mean 2/(k(k-1)), in units of 2N generations
  • k = number of ancestral sequences between the two events
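Summing the mean coalescent times over k = n, n-1, ..., 2 gives the expected time back to the MRCA of a sample of n sequences. The sketch below assumes time in units of 2N generations, so that the mean for k lineages is 2/(k(k-1)); the sum telescopes to 2(1 - 1/n). Function names are illustrative.

```python
def expected_waiting_time(k):
    """Mean time (in units of 2N generations) while k ancestral lineages remain."""
    return 2.0 / (k * (k - 1))


def expected_tmrca(n):
    """Expected time to the MRCA of n sequences, in units of 2N generations."""
    return sum(expected_waiting_time(k) for k in range(2, n + 1))


if __name__ == "__main__":
    for n in (2, 5, 10, 100):
        print(n, expected_tmrca(n), 2.0 * (1.0 - 1.0 / n))
```

For n = 2 the value is one unit, that is 2N generations, matching the pairwise mean on the slide. Larger samples add little: the expectation never exceeds two units.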

13
Coalescent Tree Parameters
Expected total branch length of the tree
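The formula for this slide is missing from the transcript. As a hedged reconstruction: while k lineages remain, each contributes to the total length, so the expected total branch length is the sum over k of k times the mean coalescent time 2/(k(k-1)), which simplifies to 2(1 + 1/2 + ... + 1/(n-1)) in units of 2N generations. The sketch below computes both forms; the function names are illustrative.

```python
def expected_total_branch_length(n):
    """Sum over k = 2..n of k lineages times the mean waiting time 2 / (k (k - 1))."""
    return sum(k * 2.0 / (k * (k - 1)) for k in range(2, n + 1))


def harmonic_form(n):
    """Equivalent closed form: twice the (n-1)th harmonic number."""
    return 2.0 * sum(1.0 / j for j in range(1, n))


if __name__ == "__main__":
    for n in (2, 4, 10, 50):
        print(n, expected_total_branch_length(n), harmonic_form(n))
```

The total length grows only logarithmically with sample size, so adding sequences to a large sample adds mostly short, recent branches.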
14
The Coalescent Theory: Statistical Inference
  • Mutation rate
  • Age of MRCA
  • Recombination rate
  • Ancestral population size
  • Migration rate

15
Reconstruction of Human Evolutionary History
  • Goal: estimate times of evolutionary events
    (major migrations) and demographic history
    (population bottlenecks, expansions)
  • Haploid sequences: mtDNA, Y chromosome
  • Case study: recent common ancestry of the human Y
    chromosome
  • Source: Thomson et al., PNAS 2000; 97:7360-5
  • Estimates: expected time to MRCA and ages of
    certain mutations
  • Data: 53-70 chromosomes, sequence variation at
    three genes (SMCY, DBY, DFFRY) on the Y chromosome

16
Recent common ancestry of Y chromosome
  • Ages of major events require a mutation rate
    estimate (single-nucleotide substitutions)
  • Substitutions between chimpanzee and human
    sequences
  • Mutation rate per site per year = No.
    subst. / (2 Tsplit L)
  • Tsplit: time since the chimp and human split (5M
    years ago)
  • Assumption: selective neutrality of all changes
    on Y since divergence
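The rate formula above is simple arithmetic, sketched here with made-up inputs. The factor 2 reflects that substitutions accumulate along both the human and the chimpanzee lineage since the split. Reading L as the number of sites compared is an assumption, since the slide does not define it, and the substitution count and sequence length below are placeholders, not values from Thomson et al.

```python
def mutation_rate_per_site_per_year(n_substitutions, t_split_years, n_sites):
    """Substitutions / (2 * Tsplit * L), with L taken to be the number of sites."""
    return n_substitutions / (2.0 * t_split_years * n_sites)


if __name__ == "__main__":
    # Placeholder inputs for illustration only.
    rate = mutation_rate_per_site_per_year(
        n_substitutions=100,      # hypothetical count of human-chimp differences
        t_split_years=5_000_000,  # split time quoted on the slide
        n_sites=10_000,           # hypothetical sequence length
    )
    print(rate)
```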

Summary of gene characteristics from sample (table not reproduced)
Source: Table 1 from the article. Number of
polymorphisms counted after removal of length variants,
repeat sequences, and indels
17
GENETREE Analysis
  • Software: www.stats.ox.ac.uk/stephens/group/software.html
  • Estimates the mean number of mutations, θ = 2Neμ
  • Ne: effective number of Y chromosomes in the
    population
  • μ: mutation rate per gene per generation
  • Also estimates expected ages of mutations and time since the MRCA
  • Assumptions: coalescent process,
    infinitely-many-sites mutation (mutation rate low
    enough -> each mutation occurs at a new site)
  • Four insertions, three deletions, two repeat
    mutations (different rates from single-nucleotide substitutions)
  • Only one segregating site in SMCY appeared to
    have mutated more than once -> data fit the infinitely-many-sites
    model
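Given an estimate of θ and an independent estimate of μ, the relation θ = 2Neμ can be inverted to obtain Ne. The sketch below shows only that arithmetic. It does not call GENETREE or reproduce its likelihood method, and the numbers are placeholders rather than values from the study.

```python
def effective_size_from_theta(theta, mu_per_gene_per_generation):
    """Solve theta = 2 * Ne * mu for Ne (effective number of Y chromosomes)."""
    return theta / (2.0 * mu_per_gene_per_generation)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(effective_size_from_theta(theta=4.0, mu_per_gene_per_generation=1e-4))
```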

18
Recent common ancestry of Y chromosome
MRCA distribution under constant population size
MRCA distribution under exponential population
growth
1 Expected age in Ne generations. 2 Value in years
(Ne × 25)
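Converting an age expressed in units of Ne generations into years requires Ne and a generation time. The sketch below assumes the footnote means a generation time of 25 years, which is a reading of the garbled text rather than a confirmed value; the function name and the example inputs are hypothetical.

```python
def scaled_age_to_years(age_in_ne_generations, ne, years_per_generation=25.0):
    """years = (age in units of Ne generations) * Ne * (years per generation)."""
    return age_in_ne_generations * ne * years_per_generation


if __name__ == "__main__":
    # Placeholder inputs for illustration only; not taken from the study's tables.
    print(scaled_age_to_years(age_in_ne_generations=1.2, ne=5000))
```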
19
GENETREE Analysis
Expected ages of mutations in tree
  • Mutation 1: 47,000 (35,000-89,000), male movement out of
    Africa
  • Mutation 2: 40,000 (31,000-79,000),
    beginning of global expansion
20
Future Venues
  • Population genetics models: incorporation of
    migration, population growth, recombination,
    natural selection
  • Longitudinal analysis
  • Evolutionary analysis of quantitative trait loci
    (QTL)
  • Properties of coalescent theory (CT)
  • Accuracy of the coalescent approximation under
    combinations of population size, sample size,
    mutation rate
  • Properties of estimators under MCMC

21
References
  • Handbook of Statistical Genetics, 2nd edition,
    Vol. 2
  • Nature Reviews Genetics 2002; 3:380-390
  • Theoretical Population Biology 1999; 56:1-10