Population Genetics Introduction - PowerPoint PPT Presentation

Provided by: nathanjoh
Transcript and Presenter's Notes

1
Population Genetics Introduction
  • Lecture 24 November 29, 2005
  • Algorithms in Biosequence Analysis
  • Nathan Edwards - Fall, 2005

2
Genomic Polymorphism
  • We all have a slightly different genome
  • DNA replication isn't perfect!
  • Two humans (today) have about 1 difference per
    1000 base-pairs on average.
  • Insertion, deletion, substitution
  • Most of these don't matter, but some do!
  • Color-blindness is a result of a single
    nucleotide change!

3
Population Genetics
  • Population genetics uses statistics to understand
    how genetics affects large numbers of
    individuals.
  • Well established field
  • (Mostly) based on easily observed phenotypes
  • Changing due to biotechnologies that can
    determine SNPs in individuals.
  • Actual nucleotide at a particular position.

4
Terminology
  • We are diploid organisms
  • Two copies of each chromosome
  • The two copies are not exactly the same.
  • Each copy is called a haplotype
  • Each pair of copies is called a genotype
  • Each position on a chromosome is a locus
  • Each variant at a locus is an allele
  • A genotype with two copies of one allele is
    homozygous.
  • A genotype with two different alleles is heterozygous.

5
Haplotypes
  • 1: ACGACTCAGATCACTACGTACGACT
  • 1: ACGACTCAGATAACTACGGACGACT
  • 2: ACGACTCAGATCACTACGTACGACT
  • 2: ACGACTCAGATCACTACGTACGACT
  • 3: ACGAGTCAGATCACTACGTACGACT
  • 3: ACGAGTCAGATAACTACGGACGACT

6
Haplotypes
  • 1: ACGACTCAGATCACTACGTACGACT
  • 1: ACGACTCAGATAACTACGGACGACT
  • 2: ACGACTCAGATCACTACGTACGACT
  • 2: ACGACTCAGATCACTACGTACGACT
  • 3: ACGAGTCAGATCACTACGTACGACT
  • 3: ACGAGTCAGATAACTACGGACGACT

7
Genotypes
  • 1: ACGA[C/C]TCAGAT[C/A]ACTACG[T/G]ACGACT
  • 2: ACGA[C/C]TCAGAT[C/C]ACTACG[T/T]ACGACT
  • 3: ACGA[G/G]TCAGAT[C/A]ACTACG[T/G]ACGACT

8
Genotyping Technology
  • Can only tell us about a particular locus, so we
    lose information about an individual's haplotypes
  • 1: {C}, {A,C}, {G,T}
  • 2: {C}, {C}, {T}
  • 3: {G}, {A,C}, {G,T}

9
Genotyping Technology
  • Can only tell us about a particular locus, so we
    lose information about an individual's haplotypes
  • Numeric coding: 0 and 1 are the two homozygous
    genotypes at a site, 2 is heterozygous
  • 1: 0, 2, 2
  • 2: 0, 1, 1
  • 3: 1, 2, 2
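
The numeric coding on this slide can be reproduced from the three individuals' haplotype pairs. The following is a minimal sketch, not the lecture's own code: the function name `encode_genotype`, the 0-based site positions, and the choice of which allele is labelled 0 at each site (`zero_alleles`) are assumptions picked so that the output matches the slide.

```python
def encode_genotype(hap1, hap2, sites, zero_alleles):
    """Encode a diploid individual at the given SNP sites.

    0 = homozygous for the allele labelled 0 (assumed labelling),
    1 = homozygous for the other allele,
    2 = heterozygous (phase is lost).
    """
    codes = []
    for pos, zero in zip(sites, zero_alleles):
        a, b = hap1[pos], hap2[pos]
        if a != b:
            codes.append(2)
        elif a == zero:
            codes.append(0)
        else:
            codes.append(1)
    return codes


# Haplotype pairs of individuals 1-3 from the Haplotypes slide.
individuals = {
    1: ("ACGACTCAGATCACTACGTACGACT", "ACGACTCAGATAACTACGGACGACT"),
    2: ("ACGACTCAGATCACTACGTACGACT", "ACGACTCAGATCACTACGTACGACT"),
    3: ("ACGAGTCAGATCACTACGTACGACT", "ACGAGTCAGATAACTACGGACGACT"),
}
sites = [4, 11, 18]             # 0-based positions of the three SNPs
zero_alleles = ["C", "A", "G"]  # assumed: allele coded as 0 at each site

for name, (h1, h2) in individuals.items():
    print(name, encode_genotype(h1, h2, sites, zero_alleles))
```

Note that swapping the two haplotypes of an individual gives the same code, which is exactly the information loss the slide describes.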

10
Haplotypes
  • Haplotypes are important because they, not
    genotypes, are what is inherited
  • We inherit one copy of each chromosome from each
    parent
  • One haplotype from each parent
  • If no mutation or recombination, we receive one
    of each parent's two haplotypes
  • and so on, back to founder population

11
Haplotypes
  • How do new haplotypes enter the population?
  • Mutation and recombination, when germ cells are
    produced in each parent
  • Mutation isn't that rare, but two mutations at a
    particular locus are very rare
  • Chance of about $3 \times 10^{-9}$ that any
    particular site is chosen once
  • Infinite sites model assumes that each DNA
    position can change at most once.

12
Infinite sites model
  • What are the implications of the infinite sites
    model?
  • We never revert at a locus
  • Shared alleles must be from a common ancestor.
  • At most 2 alleles at each polymorphic locus
  • If no recombination, we could unambiguously trace
    our lineage back through time.

13
Recombination
  • In addition to DNA mutations, new haplotypes are
    introduced by recombination during germ cell
    production.
  • Recombination merges parental haplotypes.
  • New haplotypes are passed to offspring.

14
Recombination

15
Recombination
  • Consider two loci, each with two alleles.
  • A and a, B and b.
  • Paternal haplotype has A and B,
  • Maternal haplotype has a and b
  • Without recombination, child gets one haplotype
    with either A and B, or a and b.
  • Due to recombination, child might (with some
    probability) get one haplotype with A and b, or B
    and a.
  • What is the chance of a recombination event
    between these loci?
  • Decreases with the distance between two loci.
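
The two-locus example can be made concrete with a small calculation. This is a sketch under a standard simplifying assumption that is not stated on the slide: a single number `r` gives the probability that the haplotype passed on is a recombinant between the two loci. The function name `gamete_frequencies` is hypothetical.

```python
def gamete_frequencies(hap1, hap2, r):
    """Frequencies of the haplotypes a parent can pass on at two loci.

    hap1, hap2: the parent's two haplotypes, e.g. "AB" and "ab".
    r: assumed probability of recombination between the two loci.
    """
    recombinant1 = hap1[0] + hap2[1]  # e.g. "Ab"
    recombinant2 = hap2[0] + hap1[1]  # e.g. "aB"
    freqs = {}
    for hap, prob in [
        (hap1, (1 - r) / 2),
        (hap2, (1 - r) / 2),
        (recombinant1, r / 2),
        (recombinant2, r / 2),
    ]:
        # Accumulate, since recombinants can coincide with the parental
        # haplotypes when the parent is homozygous at a locus.
        freqs[hap] = freqs.get(hap, 0.0) + prob
    return freqs


freqs = gamete_frequencies("AB", "ab", r=0.1)
for hap in sorted(freqs):
    print(hap, round(freqs[hap], 3))
```

With `r = 0` only AB and ab are passed on; as `r` grows toward 0.5 (loci far apart) all four haplotypes become equally likely.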

16
Recombination
  • Two point mutations close to one another on one
    chromosome are more likely to be inherited
    together
  • Two point mutations on different chromosomes or
    far from each other will be observed independently

17
Hardy-Weinberg Equilibrium
  • Given a locus with 2 alleles, A and a
  • p is the frequency of A in the (haplotype) population
  • q = 1 - p is the frequency of a in the population
  • Three genotypes are possible: AA, Aa, aa
  • If various assumptions are satisfied (random
    mating, no natural selection, etc.)
  • $P(AA) = p^2$, $P(Aa) = 2pq$, $P(aa) = q^2$
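
The genotype probabilities on this slide are easy to check numerically. A minimal sketch follows; the function name `genotype_frequencies` and the example value `p = 0.7` are illustrative assumptions, not from the lecture.

```python
def genotype_frequencies(p):
    """Expected genotype frequencies at a bi-allelic locus under HWE.

    p is the frequency of allele A; q = 1 - p is the frequency of a.
    """
    q = 1.0 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


freqs = genotype_frequencies(0.7)
for genotype, freq in freqs.items():
    print(genotype, round(freq, 4))
```

The three frequencies always sum to 1, since $p^2 + 2pq + q^2 = (p + q)^2$.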

18
Hardy-Weinberg Equilibrium
  • Assumptions include
  • Diploid
  • Sexual reproduction
  • Random mating
  • Bi-allelic sites
  • Large population size
  • Basic model: each individual randomly picks its
    two haplotypes from the population

19
Linkage (Dis)-equilibrium
  • HWE provides a model for observed genotypes at a
    single locus.
  • What about alleles on a single haplotype at two
    loci?
  • If extensive recombination:
  • $P_{A,B}(0,0) = 0.375$
  • $P_{A,B}(0,1) = 0.125$
  • $P_{A,B}(1,0) = 0.375$
  • $P_{A,B}(1,1) = 0.125$
  • Linkage equilibrium

[Figure: table of example haplotypes with 0/1 alleles at loci A and B]

20
Linkage (Dis)-equilibrium
  • HWE provides a model for observed genotypes at a
    single locus.
  • What about alleles on a single haplotype at two
    loci?
  • If no recombination:
  • $P_{A,B}(0,0) = 0.25$
  • $P_{A,B}(0,1) = 0.25$
  • $P_{A,B}(1,0) = 0.5$
  • $P_{A,B}(1,1) = 0.0$
  • Linkage disequilibrium (LD)

[Figure: table of example haplotypes with 0/1 alleles at loci A and B]

21
Measures of LD
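  • Two bi-allelic sites A and B with 0,1 alleles
  • Let $P_{00} = P(A=0, B=0)$, $P_{0\cdot} = P(A=0)$,
    $P_{\cdot 0} = P(B=0)$
  • If $P_{00} = P_{0\cdot} P_{\cdot 0}$, then linkage equilibrium
  • $D = |P_{00} - P_{0\cdot} P_{\cdot 0}| = |P_{01} - P_{0\cdot} P_{\cdot 1}| = \dots$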
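
As a worked check, D can be computed for the two haplotype distributions given on the Linkage (Dis)-equilibrium slides. This is a minimal sketch; the function name `ld_d` and the dictionary layout are assumptions made for illustration.

```python
def ld_d(hap_freqs):
    """D = |P00 - P(A=0) * P(B=0)| for two bi-allelic sites.

    hap_freqs maps (allele at A, allele at B) to haplotype frequency.
    """
    p00 = hap_freqs[(0, 0)]
    p_a0 = hap_freqs[(0, 0)] + hap_freqs[(0, 1)]  # marginal P(A = 0)
    p_b0 = hap_freqs[(0, 0)] + hap_freqs[(1, 0)]  # marginal P(B = 0)
    return abs(p00 - p_a0 * p_b0)


# Frequencies from the "extensive recombination" slide.
extensive = {(0, 0): 0.375, (0, 1): 0.125, (1, 0): 0.375, (1, 1): 0.125}
# Frequencies from the "no recombination" slide.
no_recomb = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.5, (1, 1): 0.0}

print(ld_d(extensive))
print(ld_d(no_recomb))
```

Both tables have the same marginals, P(A=0) = 0.5 and P(B=0) = 0.75, so the first gives D = 0 (linkage equilibrium) and the second gives D = 0.125.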

22
LD over time
  • With random mating, and fixed recombination rate
    (r) between sites, LD will disappear
  • Let $D^{(t)}$ be the LD at time t
  • $P^{(t)}_{00} = (1-r) P^{(t-1)}_{00} + r P^{(t-1)}_{0\cdot} P^{(t-1)}_{\cdot 0}$
  • $D^{(t)} = P^{(t)}_{00} - P^{(t)}_{0\cdot} P^{(t)}_{\cdot 0} = P^{(t)}_{00} - P^{(t-1)}_{0\cdot} P^{(t-1)}_{\cdot 0}$ (HWE)
  • $D^{(t)} = (1-r) D^{(t-1)} = (1-r)^t D^{(0)}$
  • LD decays exponentially with time.
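
The recurrence can be iterated directly and compared with the closed form. This is a sketch under the slide's assumptions (allele frequencies stay constant, fixed r); the function name `simulate_ld` and the parameter values are illustrative, with the starting frequencies taken from the "no recombination" example. D is kept signed here, as in the recurrence.

```python
def simulate_ld(p00, p_a0, p_b0, r, generations):
    """Iterate P00(t) = (1 - r) * P00(t-1) + r * P(A=0) * P(B=0).

    Returns the list of signed D(t) = P00(t) - P(A=0) * P(B=0)
    for t = 0 .. generations. Allele frequencies are held fixed.
    """
    d_values = [p00 - p_a0 * p_b0]
    for _ in range(generations):
        p00 = (1 - r) * p00 + r * p_a0 * p_b0
        d_values.append(p00 - p_a0 * p_b0)
    return d_values


r = 0.1
d_values = simulate_ld(p00=0.25, p_a0=0.5, p_b0=0.75, r=r, generations=20)
for t in (0, 1, 5, 20):
    closed_form = (1 - r) ** t * d_values[0]
    print(t, round(d_values[t], 6), round(closed_form, 6))
```

The two columns agree at every generation, and the magnitude of D shrinks by a factor of (1 - r) each step.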

23
Haplotype phasing
  • Current technology doesn't provide information
    about haplotypes; we get genotypes instead.
  • Haplotype phasing is the resolution of a genotype
    into two haplotypes.

24
Genotyping Technology
[Figure labels: probes for each allele, genomic DNA, SNP site]

25
Genotyping Technology
[Figure labels: genomic DNA, SNP site]

26
Genotyping Technology
[Figure labels: genomic DNA, SNP site]

27
Haplotype phasing
  • When is phasing easy?
  • g1 0 1 0 0 0 1 1 0 0 1 1
  • g2 1 0 0 1 1 0 1 0 0 1 0
  • g3 0 1 1 0 1 1 0 1 0 1 1
  • g4 0 1 0 1 0 1 2 0 0 0 1

28
Haplotype phasing
  • When is phasing easy?
  • h11 0 1 0 0 0 1 1 0 0 1 1
  • h12 0 1 0 0 0 1 1 0 0 1 1
  • ...
  • h41 0 1 0 1 0 1 0 0 0 0 1
  • h42 0 1 0 1 0 1 1 0 0 0 1

29
Haplotype Phasing
  • What happens if we have more than one ambiguous
    site?
  • g5 0 2 0 2 0 1 1 0 2 1 1 (s = 3 ambiguous sites)
  • h51 0 0 0 0 0 1 1 0 0 1 1
  • h52 0 1 0 1 0 1 1 0 1 1 1
  • 3 other phasings are possible ($2^{s-1} = 4$ in total).
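
The count of possible phasings can be verified by enumeration. The sketch below is illustrative only (the function name `phasings` is an assumption); it fixes the first ambiguous site of the first haplotype to 0 so that each unordered haplotype pair is listed once.

```python
from itertools import product


def phasings(genotype):
    """List every unordered haplotype pair consistent with a genotype.

    genotype uses the 0/1/2 coding, where 2 marks an ambiguous
    (heterozygous) site.
    """
    ambiguous = [i for i, g in enumerate(genotype) if g == 2]
    if not ambiguous:
        hap = tuple(genotype)
        return [(hap, hap)]
    pairs = []
    # The first ambiguous site is fixed to 0 in h1; the remaining
    # s - 1 sites are free, giving 2^(s-1) distinct pairs.
    for bits in product([0, 1], repeat=len(ambiguous) - 1):
        h1, h2 = list(genotype), list(genotype)
        for site, bit in zip(ambiguous, (0,) + bits):
            h1[site] = bit
            h2[site] = 1 - bit
        pairs.append((tuple(h1), tuple(h2)))
    return pairs


g5 = [0, 2, 0, 2, 0, 1, 1, 0, 2, 1, 1]
pairs = phasings(g5)
print(len(pairs))
for h1, h2 in pairs:
    print(h1, h2)
```

For g5 this lists four pairs, one of which is the (h51, h52) phasing shown on the slide.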

30
Haplotype phasing
  • Rely on linkage disequilibrium
  • Many fewer haplotypes than $2^{s-1}$
  • Assume (haplotype) population consists of a small
    number of distinct haplotypes
  • We tend to share haplotypes
  • Or at least chunks of haplotypes
  • Clark's rule (1990) is a common-sense
    approach.

31
Clark's Rule
  • Find unambiguous individuals
  • (at most 1 ambiguous locus)
  • Form initial list of known haplotypes
  • Resolve ambiguous individuals
  • If possible, use two known haplotypes
  • Otherwise, use one known haplotype and add new
    haplotype to list.
  • If unphased individuals remain
  • Assign phase randomly for one individual
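
The steps above can be sketched in code. This is a simplified illustration, not Clark's published algorithm in full: all names (`complement`, `clark_phase`) are hypothetical, known haplotypes are tried in the order they were discovered, and the final step (assigning a random phase when stuck) is left out, so stuck individuals simply remain unphased.

```python
def complement(genotype, hap):
    """Return the haplotype that pairs with hap to give genotype, or None."""
    other = []
    for g, h in zip(genotype, hap):
        if g == 2:
            other.append(1 - h)   # heterozygous: partner has the other allele
        elif g == h:
            other.append(h)       # homozygous and compatible
        else:
            return None           # hap is incompatible with this genotype
    return tuple(other)


def clark_phase(genotypes):
    """Phase 0/1/2-coded genotypes greedily from a list of known haplotypes."""
    known, phased = [], {}
    # Step 1: individuals with at most one ambiguous site are unambiguous.
    for i, g in enumerate(genotypes):
        if sum(1 for x in g if x == 2) <= 1:
            h1 = tuple(0 if x == 2 else x for x in g)
            h2 = tuple(1 if x == 2 else x for x in g)
            phased[i] = (h1, h2)
            for h in (h1, h2):
                if h not in known:
                    known.append(h)
    # Step 2: resolve the rest using known haplotypes until no progress.
    progress = True
    while progress:
        progress = False
        for i, g in enumerate(genotypes):
            if i in phased:
                continue
            options = [(h, complement(g, h)) for h in known]
            options = [(h, c) for h, c in options if c is not None]
            if not options:
                continue
            # Prefer a pair in which both haplotypes are already known.
            options.sort(key=lambda hc: hc[1] not in known)
            h, c = options[0]
            phased[i] = (h, c)
            if c not in known:
                known.append(c)
            progress = True
    return phased, known


genotypes = [[0, 1, 0, 0], [1, 0, 1, 0], [2, 2, 2, 0], [2, 1, 0, 2]]
phased, known = clark_phase(genotypes)
for i in sorted(phased):
    print(i, phased[i])
```

In this toy input the third genotype is resolved with two known haplotypes, while the fourth needs one known haplotype plus a new one, which is then added to the list.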

32
Clark's Rule
  • Basic principle
  • Rely on known haplotypes
  • Create as few haplotypes as possible
  • What can go wrong?
  • How to start if no unambiguous individuals?
  • What happens when we get stuck?
  • An unambiguous 00 forces 22 to be phased as 00 and
    11, when 01 and 10 might be more likely.
  • Maybe 00 is a rare event?
  • Often works pretty well, if unambiguous
    individuals sample common haplotypes
    thoroughly.
  • Fast!

33
Clark's Rule
  • We can enhance Clark's rule by selecting
    genotypes to phase and haplotypes to use, based
    on their frequency.
  • Many different models, objectives are plausible,
    and they give different results!

34
HapMap Consortium
  • International consortium measuring genotypes and
    comparing SNP and haplotype information across
    the human genome
  • 90 Africans (30 parent-child trios)
  • 90 Americans (30 parent-child trios)
  • 45 Chinese
  • 44 Japanese

35
HapMap Data

36
HapMap Data