Title: Quantitative Genetics: The Basics
1Quantitative Genetics: The Basics
- Barbara Karczeski, MS
- DNA Diagnostic Laboratory
- Johns Hopkins University
2Outline
- This week
- Basic Probability
- Hardy-Weinberg Equilibrium
- Bayesian Analysis
- Linkage Analysis
- Next Week
- Problems
3Basic Probability
- The chance of two independent events both happening is the product
of the chance that each will happen.
- Probability = P(A) x P(B)
4Basic Probability
- The chance of either one event OR another
happening (when both cannot happen together) is the sum
of the chance that each will occur.
- Probability = P(A) + P(B)
5An Example
- In a family of 6 children
- The youngest child has been diagnosed with MSUD.
- The parents wish to know the chance that all of
their other children are MSUD carriers.
- What's the chance that at least one is a carrier?
6- First
- You need to know what the chance is that all 5
other children are MSUD carriers.
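7The chance each unaffected child is a carrier is 2/3 (MSUD is
autosomal recessive; of the three equally likely genotypes an
unaffected sibling can have, two are carrier genotypes). The
chance all 5 children are carriers is
2/3 x 2/3 x 2/3 x 2/3 x 2/3 = 32/243
About 13%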
8- Second
- You need to know the chance that at least 1 of
the 5 children is an MSUD carrier.
- Note: Sometimes it's easier to figure out the
opposite of what you're looking for, and then
subtract this number from 1.
9The chance any child is NOT a carrier is
1/3. The chance all 5 children are NOT carriers
is 1/3 x 1/3 x 1/3 x 1/3 x 1/3 = 1/243
1 - 1/243 = 242/243, or about 99.6%
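The AND rule and the "subtract from 1" shortcut can be checked with exact fractions. This is a minimal Python sketch; the variable names are illustrative and not from the slides, and it assumes each sibling's carrier status is independent of the others.

```python
from fractions import Fraction

# Chance that an unaffected sibling of an affected child is a carrier (from the slides).
p_carrier = Fraction(2, 3)
n_siblings = 5  # the five older children

# AND rule: multiply the chances of independent events.
p_all_carriers = p_carrier ** n_siblings

# Complement shortcut: P(at least one) = 1 - P(none).
p_no_carriers = (1 - p_carrier) ** n_siblings
p_at_least_one = 1 - p_no_carriers

print(p_all_carriers, float(p_all_carriers))   # 32/243, about 0.13
print(p_at_least_one, float(p_at_least_one))   # 242/243, about 0.996
```

Using `Fraction` rather than floats keeps the results in the same form as the slide arithmetic.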
10 11Hardy-Weinberg Law
- Genotypes are distributed in proportion to the
frequencies of the individual alleles in a
population and remain constant from generation to
generation if the population is at equilibrium
12Assumptions of Hardy-Weinberg law
- 1. Random mating
- 2. Population is infinitely large
- 3. No selection
- 4. No new mutations
- 5. No migration
13For a 2 allele, diploid system, the likelihood of an individual's
genotype is equal to the likelihood of inheriting
allele 1 (p) or allele 2 (q) from one parent and
allele 1 (p) or allele 2 (q) from the other
parent.
P (allele 1 OR allele 2) AND P (allele 1 OR
allele 2)
(p + q) x (p + q) = (p + q)^2 = 1 (i.e. all
possible genotypes)
14- In a 2 allele system, the fraction of the alleles
that are p + the fraction of alleles that are
q = 1.
- p + q = 1
- But we are diploid, so we must take into account
2 events happening (one allele for each gene
copy)
15- If p + q = 1 for one allele, then for 2 alleles
- (p + q) x (p + q) = 1
- or
- p^2 + 2pq + q^2 = 1
16- p^2 and q^2 represent the homozygotes
- and
- 2pq represents the heterozygotes
p and q are allele frequencies (GENES). p^2, 2pq
and q^2 are genotype frequencies (PEOPLE).
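The split between allele frequencies and genotype frequencies can be sketched in a few lines of Python. The function name `genotype_frequencies` and the example value q = 1/10 are illustrative assumptions, not from the slides.

```python
from fractions import Fraction

def genotype_frequencies(q):
    """Return (p^2, 2pq, q^2) for a two-allele system at Hardy-Weinberg equilibrium."""
    p = 1 - q  # allele frequencies must sum to 1
    return p * p, 2 * p * q, q * q

# q = 1/10 is an arbitrary example value.
homo_p, hetero, homo_q = genotype_frequencies(Fraction(1, 10))
print(homo_p, hetero, homo_q)      # 81/100 9/50 1/100
print(homo_p + hetero + homo_q)    # 1
```

The final line confirms that the three genotype frequencies account for everyone in the population.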
17The HW equation can be used to calculate gene
frequencies and heterozygote (carrier)
frequencies when the incidence of the genetic
trait (a genotype frequency) is known (or vice
versa).
18Autosomal Dominant Trait
- Using p and q for wild type and mutated alleles,
which genotypes would cause Huntington Disease?
19Autosomal Dominant Trait
- Using p and q for wild type and mutated alleles,
which genotypes would cause Huntington Disease?
- qq and pq
20Autosomal dominant trait
- Because homozygous patients for a dominant
disease (q^2) are usually rare, this number is
essentially 0 and can be ignored. Therefore the
disease incidence is equal to 2pq (the
heterozygotes).
- Since 1/2 of the heterozygotes' alleles are the
HD gene and 1/2 are the normal gene, the
frequency of the HD gene, q, is equal to 1/2
x (2pq).
21The incidence of Huntington disease is 1/10,000.
What is the frequency of the HD gene?
- 2pq + q^2 = 1/10,000
- Essentially, 2pq = 1/10,000
- Since 1/2 of the heterozygotes' alleles are the
HD gene and 1/2 are the normal gene, the
frequency of the HD gene, q, is equal to 1/2
x (2pq), or 1/20,000.
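As a rough check on how little is lost by ignoring q^2, the shortcut can be compared with the exact Hardy-Weinberg solution. This is a sketch under the slide's assumptions; the variable names are illustrative, and the exact form simply solves 1 - (1 - q)^2 = incidence for q.

```python
from fractions import Fraction

incidence = Fraction(1, 10_000)  # HD incidence from the slide

# Shortcut: incidence is taken as 2pq, and half of the heterozygotes'
# alleles are the HD allele.
q_approx = incidence / 2
print(q_approx)  # 1/20000

# Without the shortcut: affected = 2pq + q^2 = 1 - (1 - q)^2,
# so q = 1 - sqrt(1 - incidence).
q_exact = 1 - (1 - float(incidence)) ** 0.5
print(q_exact)
```

The two values agree closely, which is why the slide treats q^2 as essentially 0.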
22Autosomal recessive trait
- For an AR trait, q^2 is equal to the incidence of
the trait because only those homozygous for a
mutation are affected.
- Therefore the frequency of the recessive gene (q) =
the square root of the disease incidence (sqrt(q^2))
23Autosomal recessive trait
Cystic fibrosis affects
approximately 1 in 2,500 non-Jewish Caucasians
in the US. What is the frequency of the CF gene
and what is the frequency of heterozygotes (CF
carriers)?
- If q^2 is 1/2500, q = 1/50
- so p = 49/50
24- p = 49/50, q = 1/50
- The carrier frequency (2pq) equals
- 2 x 49/50 x 1/50 ≈ 1/25
- For rare recessive traits, p, the frequency of the normal allele, is very
close to one and the carrier frequency is approximately 2q.
25An example
- The incidence of the recessive disorder, Ataxia
Telangiectasia, in Finland is 1/90,000. What's
the carrier frequency?
- q^2 = 1/90,000 (incidence)
- q = 1/300 (frequency of mutated gene)
- 2pq ≈ 2 x 1 x 1/300
- 2pq = carriers ≈ 1/150
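The recessive calculations for cystic fibrosis and ataxia telangiectasia follow the same three steps, so they can be sketched as one small function. The name `carrier_frequency` is an illustrative assumption; it returns the exact 2pq alongside the 2q shortcut so the two can be compared.

```python
import math

def carrier_frequency(incidence):
    """Return (q, exact 2pq, approximate 2q) for an autosomal recessive trait."""
    q = math.sqrt(incidence)  # q^2 = incidence
    p = 1 - q
    return q, 2 * p * q, 2 * q

examples = [("Cystic fibrosis", 1 / 2500), ("Ataxia telangiectasia", 1 / 90_000)]
for name, incidence in examples:
    q, exact, approx = carrier_frequency(incidence)
    print(f"{name}: q = 1/{1 / q:.0f}, 2pq = 1/{1 / exact:.1f}, 2q = 1/{1 / approx:.0f}")
```

For both disorders the exact carrier frequency differs only slightly from the 2q shortcut, consistent with treating p as 1 for rare traits.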
26Let's go backwards now
- The carrier frequency of B-thalassemia on a small
island in Malaysia is 1/60. What's the disease
incidence?
- 2pq = 1/60 (carriers)
- q ≈ 1/60 x 1/2 x 1 = 1/120 (gene)
- q^2 = disease incidence = 1/14,400
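Going from carrier frequency back to incidence reverses the same steps. The sketch below reproduces the slide arithmetic with exact fractions, then solves 2q(1 - q) = carrier frequency without setting p to 1; the variable names are illustrative and the exact solution is shown only for comparison.

```python
import math
from fractions import Fraction

carrier_freq = Fraction(1, 60)  # 2pq, from the slide

# Shortcut: with p close to 1, 2pq is about 2q, so q is half the carrier frequency.
q = carrier_freq / 2
incidence = q ** 2
print(q, incidence)  # 1/120 1/14400

# Without the shortcut: 2q(1 - q) = c gives q = (1 - sqrt(1 - 2c)) / 2.
c = float(carrier_freq)
q_exact = (1 - math.sqrt(1 - 2 * c)) / 2
print(q_exact, q_exact ** 2)
```

Because 1/60 is a fairly high carrier frequency, the exact values differ a little more from the shortcut here than in the rarer examples.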
27 28Bayes' Theorem
29The principle of Bayesian probability
- ... To adjust a person's prior risk, most often of
having a mutant gene (dominant, recessive, or
X-linked)...
- by taking into account further information such
as unaffected children, negative laboratory
tests, etc. (the conditional risk)
- to provide a modified or posterior risk.
30The principle of Bayesian probability
- Based on family history, my patient has a risk of
1/3 to be a DMD carrier (Prior).
- But I know that she has 3 healthy sons AND had a
negative CPK, which both lower the risk that
someone would be a carrier (Conditions).
- What is her modified risk taking into account
this other information (Posterior)?
31The two sided coin
- In Bayesian analysis you must look at your
concern (the probability my patient is a
carrier) as well as the opposite event (the
probability she is NOT a carrier) to take into
account all of the possibilities
32The Equation
Posterior Probability (X) = Joint P(X) /
[Joint P(X) + Joint P(Y)]
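The equation can be written as a small helper, where X is the event of concern and Y is the opposite event. This is a sketch only: the function name `posterior` and the example numbers are hypothetical and do not come from the slides.

```python
from fractions import Fraction

def posterior(prior_x, conditional_x, conditional_y):
    """Posterior probability of X when Y (not X) is the only alternative.

    prior_x: prior probability of X; the prior of Y is 1 - prior_x.
    conditional_x, conditional_y: chance of the observed information
    if X is true and if Y is true, respectively.
    """
    joint_x = prior_x * conditional_x
    joint_y = (1 - prior_x) * conditional_y
    return joint_x / (joint_x + joint_y)

# Hypothetical numbers: prior of 1/2, and the observation is seen
# 1/4 of the time under X and always under Y.
print(posterior(Fraction(1, 2), Fraction(1, 4), Fraction(1)))  # 1/5
```

Dividing by the sum of both joint probabilities is what makes the two posterior risks add up to 1.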
33Bayesian Probability: Example 1
Retinoblastoma (Rb), a malignant tumor of the
eye, may be inherited as an autosomal dominant
trait. Penetrance is about 4/5 (80%). Michael's
paternal grandmother, father and brother had Rb,
but Michael is unaffected. What is the risk that
Michael's child will develop retinoblastoma?
34Example 1
Step 1: Draw pedigree.
[Pedigree figure; Michael is labeled]
35Example 1
Step 2: Do Bayesian calculation for Michael's
risk to carry the Rb gene.
              Carrier   Non-Carrier
Prior
Conditional
Joint
Posterior
36Example 1
Step 2: Do Bayesian calculation for Michael's
risk to carry the Rb gene.
              Carrier   Non-Carrier
Prior         1/2       1/2
Conditional
Joint
Posterior
Prior: based on pedigree information.
37Example 1
Step 2: Do Bayesian calculation for Michael's
risk to carry the Rb gene.
              Carrier   Non-Carrier
Prior         1/2       1/2
Conditional   1/5       5/5
Joint
Posterior
Conditional: Michael is unaffected; 80% penetrance.
38Example 1
Step 2: Do Bayesian calculation for Michael's
risk to carry the Rb gene.
              Carrier   Non-Carrier
Prior         1/2       1/2
Conditional   1/5       5/5
Joint         1/10      5/10
Posterior
Joint probability = prior x conditional
39Example 1
Step 2: Do Bayesian calculation for Michael's
risk to carry the Rb gene.
              Carrier                    Non-Carrier
Joint         1/10                       5/10
Posterior     (1/10) / (1/10 + 5/10)     (5/10) / (1/10 + 5/10)
              = 1/6                      = 5/6
Posterior = joint probability over sum of joint probabilities.
40Example 1
Congratulations! You did it. But that's not the
question you were asked.
              Carrier   Non-Carrier
Prior         1/2       1/2
Conditional   1/5       5/5
Joint         1/10      5/10
Posterior     1/6       5/6
Michael's risk to carry the Rb gene is
1/6
41Answer Example 1
Step 3: Calculate Michael's child's chance to
develop Rb.
1/6 (Michael is gene carrier) x 1/2 (passes on gene) x 4/5 (gene is penetrant) = 1/15
The chance that Michael's child will develop Rb is
1/15.
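The whole of Example 1 can be reproduced as a short script, which is a handy way to check each row of the Bayesian table. This is a sketch; the dictionary keys and variable names are illustrative choices, not part of the slides.

```python
from fractions import Fraction

penetrance = Fraction(4, 5)

# Bayesian table for Michael; one entry per column.
prior = {"carrier": Fraction(1, 2), "non_carrier": Fraction(1, 2)}
# Chance of being unaffected under each hypothesis.
conditional = {"carrier": 1 - penetrance, "non_carrier": Fraction(1)}

joint = {k: prior[k] * conditional[k] for k in prior}
total = sum(joint.values())
post = {k: joint[k] / total for k in joint}

# Michael is a carrier AND passes the gene on AND the gene is penetrant in the child.
child_risk = post["carrier"] * Fraction(1, 2) * penetrance

print(post["carrier"], post["non_carrier"])  # 1/6 5/6
print(child_risk)                            # 1/15
```

Changing `penetrance` shows how strongly Michael's posterior risk depends on it: at full penetrance an unaffected Michael could not be a carrier at all.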
42Bayesian Probability: Example 2
- Jake's sister has cystic fibrosis. DNA reveals
that she has one ΔF508 mutation but her other
mutation is unidentified. Jake's DNA testing
does not reveal any mutation. His wife has no
family history of CF, and her DNA testing was
also negative. The population carrier frequency
is 1/25. Assume mutation analysis detects 90% of
CF mutations.
- What is Jake's chance to have a child with CF?
43Example 2
Step 1: Draw pedigree.
44Example 2
Step 2: Calculate Jake's risk. Don't need
Bayes here!
[Pedigree figure: Jake; his sister is ?/ΔF508]
Jake's parents are obligate carriers. One
carries the ΔF508 mutation and the other the
unidentified mutation.
45Example 2
Step 2: Calculate Jake's carrier risk.
          N         ΔF508
N         N/N       N/ΔF508
?         N/?       ?/ΔF508
Since Jake is unaffected, his carrier risk is
2/3...
46Example 2
Step 2: Calculate Jake's carrier risk.
          N         ΔF508
N         N/N       N/ΔF508
?         N/?       ?/ΔF508
...but negative mutation analysis reduces his risk
to 1/2
47Example 2
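Step 2: Calculate Jake's carrier risk.
[Pedigree figure: Jake is N/N or N/? (carrier risk 1/2); his sister is ?/ΔF508]
Jake's carrier risk goes from 2/3 to 1/2. It cannot
be reduced any further, although mutation analysis
detects 90% of mutations, because the family's second
mutation is one the analysis did not identify.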
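Jake's risk can be found by listing the equally likely genotypes and crossing off the ones ruled out. The sketch below assumes the DNA test always detects ΔF508 and never detects the family's unidentified mutation; the label `dF508` and the list names are illustrative.

```python
from fractions import Fraction

# Four equally likely offspring genotypes from N/dF508 x N/? parents.
# "dF508" stands for the deltaF508 mutation; "?" is the unidentified mutation.
offspring = [("N", "N"), ("N", "dF508"), ("N", "?"), ("dF508", "?")]

# Jake is unaffected, so he does not carry two mutations.
unaffected = [g for g in offspring if "N" in g]
carriers = [g for g in unaffected if g != ("N", "N")]
print(Fraction(len(carriers), len(unaffected)))  # 2/3

# Assumption: the test would have found dF508, so a negative result removes N/dF508.
# It says nothing about "?", which the test already missed in his sister.
test_negative = [g for g in unaffected if "dF508" not in g]
carriers_after = [g for g in test_negative if g != ("N", "N")]
print(Fraction(len(carriers_after), len(test_negative)))  # 1/2
```

Counting works here because every remaining genotype is equally likely, which is why no Bayesian table is needed for Jake.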
48Example 2
Step 3: Calculate Jake's wife's carrier risk
using Bayes.
              Carrier   Non-Carrier
Prior
Conditional
Joint
Posterior
49Example 2
Step 3: Calculate Jake's wife's carrier risk.
              Carrier   Non-Carrier
Prior         1/25      1 - 1/25 = 24/25
Conditional
Joint
Posterior
Prior: based on population carrier frequency of CF
gene.
50Example 2
Step 3: Calculate Jake's wife's carrier risk.
              Carrier   Non-Carrier
Prior         1/25      24/25
Conditional   1/10      10/10
Joint
Posterior
Conditional: mutation analysis negative.
51Example 2
Step 3: Calculate Jake's wife's carrier risk.
              Carrier   Non-Carrier
Prior         1/25      24/25
Conditional   1/10      10/10
Joint         1/250     240/250
Posterior
Joint = prior x conditional
52Example 2
Step 3: Calculate Jake's wife's carrier risk.
              Carrier                        Non-Carrier
Joint         1/250                          240/250
Posterior     (1/250) / (1/250 + 240/250)    (240/250) / (1/250 + 240/250)
              = 1/241                        = 240/241
Posterior = joint probability over sum of joint
probabilities.
53Example 2
Step 3: Calculate Jake's wife's carrier risk.
              Carrier   Non-Carrier
Prior         1/25      24/25
Conditional   1/10      10/10
Joint         1/250     240/250
Posterior     1/241     240/241
Wife's carrier risk is
reduced to 1/241 due to negative DNA analysis.
54Answer Example 2
Step 4: Calculate their chance to have a child
with CF.
1/2 (Jake is a carrier) x 1/241 (wife is a carrier) x 1/4 (both pass on CF gene) = 1/1928
Their chance to have a child with CF is
1/1928.
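Steps 3 and 4 of Example 2 can be checked with exact fractions. This is a sketch using the numbers given on the slides; the variable names are illustrative, and Jake's 1/2 carrier risk is taken as given from Step 2.

```python
from fractions import Fraction

carrier_freq = Fraction(1, 25)  # population carrier frequency
detection = Fraction(9, 10)     # fraction of CF mutations the analysis detects

# Bayesian table for Jake's wife after a negative test.
joint_carrier = carrier_freq * (1 - detection)  # carrier whose mutation was missed
joint_non_carrier = (1 - carrier_freq) * 1      # non-carriers always test negative
wife_risk = joint_carrier / (joint_carrier + joint_non_carrier)

jake_risk = Fraction(1, 2)  # from Step 2
# Both are carriers AND both pass on the CF gene (1/4).
child_risk = jake_risk * wife_risk * Fraction(1, 4)

print(wife_risk)   # 1/241
print(child_risk)  # 1/1928
```

Raising `detection` lowers the wife's posterior risk, while Jake's risk stays at 1/2 because his family's second mutation is not one the analysis identifies.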
55 56Linkage Analysis
- No one likes to do it, but it's still on the
boards.
57Linkage Analysis
- An indirect method of testing
- You don't identify the mutation that causes
disease in a patient
- You identify a set of polymorphisms that seem to
travel with the mutation in that family.
58So why do you use linkage?
- The gene has not been identified, but the locus
is known
- There is no direct test available
- The family doesn't have a mutation identifiable
by the direct test
- Someone in the family is now pregnant and there's
no time to undertake / transition to a direct test
59To do a linkage analysis, you need to
- Know who's who in the family
- Know who's affected, who's healthy and who's been
evaluated
- Develop a specific question to answer through
linkage
- Figure out which family members are necessary to
include in the test
60Linkage Vocabulary
- Marker: a specific polymorphism near or in the
gene of interest that you will assay
- Genotype: the patient's result at a specific marker
- Haplotype: the patient's pattern of genotypes
across all markers assayed
61Linkage Vocabulary
- Recombination: The chance that a crossover will
occur between a marker and your theoretical
mutation, disrupting the haplotype
- Informativeness: The presence of heterozygous
markers in a family that will be useful in
linkage analysis (allow us to determine which
allele segregates with disease, which alleles are
inherited from which parent, etc.)
62The Accuracy of Linkage
- Depends on several variables,
- Some are test specific
- Some family specific
- Some are patient specific
- Some are a combination
63The Accuracy of Linkage
- Recombination: detectable and undetectable
- Variation of the markers you've selected and the
family's informativeness
- Available family members and their phenotypes
64Types of Markers
- Single nucleotide polymorphisms (SNP)
- A person can only be A, C, T, or G at any marker
- Most markers have a two allele system: C or T,
A or C, G or T
- Only 3 genotypes possible.
65Types of Markers
- Restriction Fragment Length Polymorphisms (RFLP)
- A restriction enzyme cuts (+) or doesn't cut (-)
at a marker.
- Only 3 genotypes possible.
66Types of Markers
- Short Tandem Repeats (STR) or Variable Number
Tandem Repeat (VNTR)
- Copy number repeat polymorphisms with a wider
variability (differs by marker).
- At a particular dinucleotide polymorphism, a
person might have 6, 7, 8, 9, 10 or 11
dinucleotide repeats.
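The "only 3 genotypes" count for SNPs and RFLPs, and the wider variability of repeat markers, both come from pairing up the alleles available at a marker. A small Python sketch, with the helper name `possible_genotypes` chosen for illustration:

```python
from itertools import combinations_with_replacement

def possible_genotypes(alleles):
    """List every unordered pair of alleles a diploid person could carry at one marker."""
    return list(combinations_with_replacement(alleles, 2))

print(possible_genotypes(["C", "T"]))          # [('C', 'C'), ('C', 'T'), ('T', 'T')]
print(len(possible_genotypes(["+", "-"])))     # 3
print(len(possible_genotypes(range(6, 12))))   # 21
```

A repeat marker with alleles of 6 to 11 repeats allows 21 genotypes instead of 3, so family members are more likely to be heterozygous and distinguishable from one another at that marker.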
67An Example
- A family seeks linkage analysis for Marfan
Syndrome. FBN1 testing was negative and all
affected family members meet published diagnostic
criteria.
- The family wishes to know if their youngest
daughter, Amy (age 1), is affected.
68This linkage assay looks at three markers in the
FBN1 gene
- Marker 1: A dinucleotide repeat in intron 1
- Marker 2: A trinucleotide repeat in intron 21
- Marker 3: A C/T SNP in the last coding exon
69[Pedigree figure. Legend: Unaffected, Affected, Unknown. Family members: Sue, Bob, Jim, Beth, Bill, Jen, Amy, Josh]
70[Pedigree figure. Legend: Unaffected, Affected, Unknown. Family members: Sue, Bob, Jim, Beth, Bill, Jen, Amy, Josh]
79An example of linkage data
[Pedigree figure. Legend: Unaffected, Affected, Unknown. Haplotypes as labeled:]
Sue
Bob A/B
Beth A/C
Jim D/E
Bill
Jen C/E
Amy A/D
Josh A/E
80An example of linkage data
[Pedigree figure. Legend: Unaffected, Affected, Unknown. Haplotypes as labeled:]
Sue
Bob A/B
Beth A/C
Jim D/E
Bill
Jen C/E
Amy A/D
Josh A/E
Predicted to be Affected
81When good linkage goes bad
82Uninformative linkage data
[Pedigree figure. Legend: Unaffected, Affected, Unknown. Haplotypes as labeled:]
Sue
Bob A/B
Beth A/A
Jim D/E
Bill
Jen A/E
Amy A/D
Josh A/E
Affected ???
83A recombination
84- Questions?
- Thank you
- Do your homework!