Genetic Variation in Populations - PowerPoint PPT Presentation

Provided by: yixua
1
Genetic Variation in Populations
  • Xiaolin Yin
  • 11/16/2007

2
Outline
  • The Biological Problem
  • Variation in Human Populations
  • Gene mapping
  • ----linkage analysis
  • ----association analysis
  • Modeling Gene Frequencies in Populations
  • ----Wright-Fisher model
  • ----Coalescent model

3
The Biological Problem
  • Gene mapping
  • Inferring evolutionary relationships between
    organisms
  • ----based on genetic variation (a kind of
    character that can be described by allele
    frequencies)
  • ----consider the parameters and statistics
    required to describe the genetic characters and
    genetic changes in populations

4
The Concept of Population
  • A population is a localized collection of
    individuals of a species that are capable of
    exchanging the genes that characterize that
    species.
  • 1. San people of southern Africa
  • 2. Native American people (Indian)
  • 3. Brown pelicans living on Anacapa Island, off
    the coast of Southern California

5
Variation in Human Populations
  • Frequencies of the eight alleles at the locus
    D12S2070 in seven geographic regions of the
    world

6
Describing Variation
  • The heterozygosity H for any locus within a
    population is defined as
    $H = 1 - \sum_{i=1}^{K} p_i^2$
  • Here $p_i$ is the frequency of allele i, and K is
    the number of alleles at the locus
  • Note: H is a function of K and the $p_i$
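As a minimal sketch of this definition, the function below computes H = 1 - sum of p_i^2 from a list of allele frequencies. The function name and the two example populations are illustrative assumptions, not taken from the slides.

```python
def heterozygosity(freqs):
    """Return H = 1 - sum(p_i^2) for allele frequencies that sum to 1."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)

# Hypothetical two-allele populations, chosen only for illustration.
h_uneven = heterozygosity([0.8, 0.2])  # one common allele, one rare allele
h_even = heterozygosity([0.5, 0.5])    # two equally common alleles

print(round(h_uneven, 4), round(h_even, 4))
```

For a fixed number of alleles K, H is largest when all alleles are equally frequent, which is why the evenly split population has the higher value.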

7
A Simple Example
  • $H_1 = 0.32$
  • $H_2 = 0.5$

8
A Simple Example
  • $H_1 = H_2 = 0.32$

9
Population Structure
  • Stratified (forward, subpopulations)
  • Hierarchical (backward, groupings)
  • (local population → regional population
    → world population)
  • Relationships between heterozygosities calculated
    as averages of subpopulation data and
    heterozygosities calculated from pooled data

10
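  • Suppose a total population is stratified into B
    subpopulations.
  • Consider one locus having K alleles.
  • Let $\bar{p}_i$ be the average fraction of allele i in
    the total population
  • Let $p_{bi}$ be the fraction of allele i in
    subpopulation b, so that
    $\bar{p}_i = \frac{1}{B} \sum_{b=1}^{B} p_{bi}$
    (subpopulations weighted equally)
  • The total population heterozygosity is
    $H_T = 1 - \sum_{i=1}^{K} \bar{p}_i^2$

11
  • The heterozygosity of subpopulation b is
    $H_b = 1 - \sum_{i=1}^{K} p_{bi}^2$
  • The average heterozygosity for the
    subpopulations is
    $\bar{H}_S = \frac{1}{B} \sum_{b=1}^{B} H_b$
  • Thus $\bar{H}_S \le H_T$, with equality only when all
    subpopulations have the same allele frequencies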
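The comparison between averaged and pooled heterozygosity can be sketched as follows. This assumes equally weighted subpopulations and uses two hypothetical subpopulations with mirrored allele frequencies; the function names are illustrative.

```python
def heterozygosity(freqs):
    """Return 1 - sum(p_i^2) for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)

def pooled_and_average_heterozygosity(subpops):
    """subpops: list of B allele-frequency lists, each of length K.

    Returns (H_T, mean H_S), weighting every subpopulation equally
    (an assumption of this sketch).
    """
    B = len(subpops)
    K = len(subpops[0])
    # Average frequency of each allele across the B subpopulations.
    mean_freqs = [sum(sp[i] for sp in subpops) / B for i in range(K)]
    h_total = heterozygosity(mean_freqs)
    h_sub_avg = sum(heterozygosity(sp) for sp in subpops) / B
    return h_total, h_sub_avg

# Hypothetical data: two subpopulations with opposite allele frequencies.
h_total, h_sub_avg = pooled_and_average_heterozygosity([[0.8, 0.2], [0.2, 0.8]])
print(round(h_total, 4), round(h_sub_avg, 4))
```

Each subpopulation on its own looks only moderately variable, yet the pooled data look maximally variable for two alleles. The gap between the two values reflects the differentiation between subpopulations.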

12
Gene mapping
  • Recombination: chromosome pairs usually recombine
    during gamete formation

13
Linkage Analysis
  • Family data containing affected individuals
  • Recombination rate (r): the probability that
    alleles at two loci on a chromatid come from
    different parental chromosomes
  • Genetic map distance (m): the expected number of
    crossovers between the two loci

14
Relation between r and m
Assume
  • Define
  • Then

Note: $m \approx r$ when r is very small
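One commonly used relation that agrees with this note is Haldane's map function. It assumes the number of crossovers between the two loci is Poisson with mean m, and that recombination is observed when that number is odd, giving r = (1 - e^(-2m)) / 2. Whether the slide used exactly this model is an assumption; the sketch below only illustrates how such a map function behaves.

```python
import math

def haldane_r_from_m(m):
    """Recombination rate r for a map distance m (in Morgans)."""
    return 0.5 * (1.0 - math.exp(-2.0 * m))

def haldane_m_from_r(r):
    """Map distance m (in Morgans) for a recombination rate 0 <= r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must lie in [0, 0.5)")
    return -0.5 * math.log(1.0 - 2.0 * r)

# For small r the two quantities nearly coincide; they diverge as r grows.
for r in (0.001, 0.01, 0.1, 0.3):
    print(f"r = {r:<5}  m = {haldane_m_from_r(r):.4f}")
```

Under this model r can never exceed 0.5, while m grows without bound, so map distances are additive along a chromosome but recombination rates are not.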
15
Association Analysis
  • Population data
  • Linkage disequilibrium (LD)
  • --nonrandom association of alleles in
    haplotypes
  • Two-allele, two-locus LD
  • --locus 1, locus 2

16
Some properties of LD
  • Equivalent statements
  • Upper and lower bounds
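A minimal sketch of these properties, assuming the usual two-locus definition D = p_AB - p_A p_B, where p_AB is the frequency of the haplotype carrying allele A at locus 1 and allele B at locus 2. The allele labels and the example frequencies are assumptions made for illustration.

```python
def ld_coefficient(p_AB, p_A, p_B):
    """D = p_AB - p_A * p_B; D == 0 means the loci are in linkage equilibrium."""
    return p_AB - p_A * p_B

def ld_bounds(p_A, p_B):
    """Range of D allowed by the allele frequencies.

    The bounds follow from requiring all four haplotype frequencies
    to be non-negative.
    """
    lower = max(-p_A * p_B, -(1 - p_A) * (1 - p_B))
    upper = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    return lower, upper

# Hypothetical frequencies.
p_A, p_B, p_AB = 0.6, 0.7, 0.5
D = ld_coefficient(p_AB, p_A, p_B)
lower, upper = ld_bounds(p_A, p_B)
print(round(D, 4), round(lower, 4), round(upper, 4))
```

Because the attainable range of D depends on the allele frequencies, raw D values are hard to compare between pairs of loci.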

17
Regularized LD
  • 1
  • 2

$r^2$ is the square of the Pearson product-moment
correlation coefficient
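The two measures most often used to put LD on a common scale are D' (D divided by its largest attainable magnitude) and r^2 (D^2 divided by the product of the four allele frequencies). Treating these as the two numbered items on this slide is an assumption. A sketch under that assumption, with illustrative names:

```python
def normalized_ld(p_AB, p_A, p_B):
    """Return (D', r^2) for two biallelic loci.

    Assumes both loci are polymorphic (0 < p_A < 1 and 0 < p_B < 1).
    """
    D = p_AB - p_A * p_B
    if D >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = D / d_max if d_max > 0 else 0.0
    r_squared = D * D / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d_prime, r_squared

# Hypothetical frequencies: partial association, then complete association.
partial = normalized_ld(0.5, 0.6, 0.7)
complete = normalized_ld(0.5, 0.5, 0.5)
print(partial, complete)
```

The sign convention for D' varies between authors; this sketch keeps the sign of D.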
18
LD Decay Property
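A standard statement of LD decay, assumed here because the slide body is not in the transcript, is that under random mating with recombination rate r between the two loci, the LD coefficient shrinks by a factor (1 - r) each generation: D_t = (1 - r)^t D_0. A short sketch with illustrative function names:

```python
import math

def ld_after(d0, r, t):
    """LD coefficient after t generations of random mating."""
    return d0 * (1.0 - r) ** t

def ld_half_life(r):
    """Generations needed for D to halve (requires 0 < r < 1)."""
    return math.log(0.5) / math.log(1.0 - r)

# Unlinked loci (r = 0.5) lose half their LD every generation,
# while tightly linked loci (r = 0.01) keep it far longer.
print(ld_after(0.25, 0.5, 1), round(ld_half_life(0.01), 1))
```

This slow decay for tightly linked loci is what lets association analysis use nearby markers as proxies for an unobserved causal variant.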
19
Factors Affecting LD
  • Recombination rate
  • Mutation
  • Genetic drift
  • Natural selection
  • Migration of population

20
Modeling Gene Frequencies in Populations
  • The Wright-Fisher Model
  • The population size N is constant from
    generation to generation.
  • Organisms are diploid (so there are 2N copies
    of each gene).
  • All members of each generation reproduce
    simultaneously; generations do not overlap.
  • Mating among individuals is random.
  • There is no mutation, migration, or selection.

21
How to form an offspring
  • Choose an individual at random from the
    population,
  • then, choose one of its gametes at random
  • Return the chosen individual to the population
  • Repeat the experiment
  • This results in two gametes that form an
    individual in the next generation
  • Repeated N times to form the next generation
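The sampling procedure above can be sketched directly. Individuals are represented as pairs of allele labels, which is a modeling choice of this sketch, and the starting population is hypothetical.

```python
import random

def next_generation(population, rng):
    """Form N offspring from N diploid parents by Wright-Fisher sampling.

    population: list of N individuals, each a 2-tuple of alleles.
    """
    offspring = []
    for _ in range(len(population)):        # repeated N times
        gametes = []
        for _ in range(2):                  # two gametes per offspring
            parent = rng.choice(population)     # parent is "returned" afterwards
            gametes.append(rng.choice(parent))  # one of its two gene copies
        offspring.append(tuple(gametes))
    return offspring

rng = random.Random(0)
population = [("A", "a")] * 5 + [("A", "A")] * 3 + [("a", "a")] * 2
new_population = next_generation(population, rng)
print(new_population)
```

Because every gamete is drawn independently and with replacement, the procedure is equivalent to drawing 2N gene copies at random from the 2N copies in the parental generation.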

23
Genetic Drift
  • More precisely: allelic drift
  • --A statistical effect that results from the
    influence that chance has on the survival of
    alleles

24
The Wright-Fisher Model as a Markov Chain
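  • Denote the number of A alleles in generation n by
    $X_n$, with possible values 0, 1, 2, . . . , 2N.
  • The sequence $X_0, X_1, \ldots$ is a Markov chain; the
    transition matrix of the chain is
    $p_{ij} = P(X_{n+1} = j \mid X_n = i) = \binom{2N}{j} \left(\frac{i}{2N}\right)^{j} \left(1 - \frac{i}{2N}\right)^{2N-j}$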
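Under Wright-Fisher sampling, the number of A alleles in the next generation is binomial with 2N trials and success probability i/2N, where i is the current count. A small sketch that builds the resulting transition matrix (the function name is illustrative):

```python
from math import comb

def wright_fisher_transition_matrix(N):
    """Return the (2N+1) x (2N+1) matrix P with P[i][j] = P(X_{n+1}=j | X_n=i)."""
    M = 2 * N
    P = []
    for i in range(M + 1):
        p = i / M  # current frequency of allele A
        P.append([comb(M, j) * p**j * (1 - p)**(M - j) for j in range(M + 1)])
    return P

P = wright_fisher_transition_matrix(3)
for row in P:
    print(" ".join(f"{x:.3f}" for x in row))
```

The first and last rows show that the states 0 and 2N are absorbing: once allele A is lost or fixed, it stays that way because the model has no mutation.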

25
h(n) is the expected heterozygosity in the
population in generation n. This is the
probability that two genes chosen at random (with
replacement) are different alleles
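With this with-replacement definition, the standard Wright-Fisher result is that expected heterozygosity shrinks by a factor (1 - 1/(2N)) each generation, so h(n) = (1 - 1/(2N))^n h(0). The formula is assumed here rather than read from the slide. A sketch:

```python
def expected_heterozygosity(h0, N, n):
    """h(n) = (1 - 1/(2N))^n * h(0) for a diploid population of size N."""
    return h0 * (1.0 - 1.0 / (2 * N)) ** n

# Small populations lose variation much faster than large ones.
for N in (10, 100, 1000):
    print(N, round(expected_heterozygosity(0.5, N, 100), 4))
```

The loss is purely a consequence of random sampling; there is no selection in the model.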
26
Coalescent
  • Look backward through the generations
  • Given a population as it exists now, we may
    want to make inferences about how it reached its
    current state.
  • The coalescent (Kingman, 1982) is a very useful
    stochastic process that allows us to model the
    ancestry of genes in the population

27
Coalescence Model
28
  • Allow each gene in generation g - 1 to
    choose its ancestor from among the 2N gene
    copies that existed in generation g
    (generations are counted backward from the
    present).
  • Some of the gene copies in generation g may be
    chosen multiple times, and others may be chosen
    not at all.
  • The process is repeated, going back from
    generation g to generation g + 1.
  • Because some gene copies are not chosen in each
    generation going back, the number of ancestors
    becomes smaller and smaller until the lineages
    coalesce to a single ancestor some number of
    generations ago
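The backward process described above can be simulated in a few lines. This sketch tracks only how many distinct ancestral lineages remain, not which gene is which; the sample size, population size, and seed are arbitrary choices.

```python
import random

def generations_to_common_ancestor(n, N, rng):
    """Trace n sampled gene copies back until one ancestral lineage remains."""
    lineages = n
    generations = 0
    while lineages > 1:
        # Each remaining lineage picks its parent uniformly among 2N copies;
        # lineages that pick the same parent coalesce.
        parents = {rng.randrange(2 * N) for _ in range(lineages)}
        lineages = len(parents)
        generations += 1
    return generations

rng = random.Random(1)
times = [generations_to_common_ancestor(5, 50, rng) for _ in range(200)]
mean_time = sum(times) / len(times)
print(mean_time)
```

Individual runs vary widely, which is typical of coalescent times: most of the waiting happens while only two lineages remain.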

29
Coalescence for Pairs of Genes
  • The time to the most recent common ancestor
    (TMRCA) for two gene sequences from the sampled
    population
  • The expected number of pairwise differences
    between a pair of sequences

30
TMRCA for two gene sequences
2N gene copies exist in each generation
When N is large, let $t = g/2N$
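A sketch of the calculation this slide appears to summarize, assuming each of the two genes picks its parent independently and uniformly among the 2N copies. The two lineages then coalesce in any given generation with probability 1/2N, so the TMRCA is geometric; the notation $T_{MRCA}$ is an assumption of this sketch.

```latex
P(T_{MRCA} = g) = \left(1 - \frac{1}{2N}\right)^{g-1} \frac{1}{2N},
\qquad
E(T_{MRCA}) = 2N \text{ generations},

P(T_{MRCA} > g) = \left(1 - \frac{1}{2N}\right)^{g}
\approx e^{-g/2N} = e^{-t}
\quad \text{for large } N .
```

Measured in units of 2N generations, the coalescence time of a pair is therefore approximately exponential with mean 1.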
31
The expected number of pairwise differences
between a pair of sequences
  • X: the number of mutations that occur along a
    sequence of length g generations

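Let $\theta = 4N\mu$, where $\mu$ is the mutation rate per sequence per generation;
then the expected number of pairwise differences is $2\mu \cdot 2N = \theta$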
32
Coalescence in multiple genes
  • n genes taken from 2N genes
  • The probability that the n genes have distinct
    ancestors in the previous generation is
    $\prod_{j=1}^{n-1} \left(1 - \frac{j}{2N}\right) \approx 1 - \binom{n}{2} \frac{1}{2N}$
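A sketch of this probability: the first gene's ancestor is arbitrary, and each later gene must avoid the ancestors already taken, so the (j+1)-th gene succeeds with probability 1 - j/2N. The second function is the usual large-N approximation; both names are illustrative.

```python
from math import comb

def prob_distinct_ancestors(n, N):
    """Exact probability that n genes have n distinct parents among 2N copies."""
    M = 2 * N
    prob = 1.0
    for j in range(1, n):
        prob *= 1.0 - j / M
    return prob

def prob_distinct_ancestors_approx(n, N):
    """First-order approximation, valid when n is much smaller than 2N."""
    return 1.0 - comb(n, 2) / (2 * N)

print(prob_distinct_ancestors(5, 1000), prob_distinct_ancestors_approx(5, 1000))
```

The approximation says that, to first order, a coalescence happens in a given generation with probability C(n, 2)/2N, one chance per pair of lineages.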

33
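  • $T_n$: the time taken for the first coalescence
    event of the n genes
  • The expected time to the most recent common
    ancestor of the sample of n genes is
    $E(T_{MRCA}) = \sum_{k=2}^{n} \frac{2}{k(k-1)} = 2\left(1 - \frac{1}{n}\right)$,
    in units of 2N generations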
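In the large-N approximation, with time measured in units of 2N generations, the waiting time while k lineages remain has mean 2/(k(k-1)), and the expected TMRCA is the sum of these means for k = n down to 2. A sketch under that assumption:

```python
def expected_tmrca(n):
    """Expected TMRCA of n genes, in units of 2N generations."""
    return sum(2.0 / (k * (k - 1)) for k in range(2, n + 1))

# The sum telescopes to 2 * (1 - 1/n), so it never exceeds 2.
for n in (2, 5, 10, 100):
    print(n, round(expected_tmrca(n), 4))
```

A pair of genes already accounts for an expected time of 1, so adding more genes to the sample at most doubles the expected depth of the genealogy.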

34
THANKS!