Disease Gene Discovery Using Linkage Analysis

Provided by: jonathan61

1
Disease Gene Discovery Using Linkage Analysis
  • Scott Williams, Ph.D.
  • Center for Human Genetics
  • Research
  • Ph.D. Program in Human Genetics

2
Disease Gene Discovery is a Young, But Successful Science
  • 1966: MIM
    • 1487 known genetic traits
    • 0.7% mapped
  • 1973: HGM-I
    • Several chromosomes with NO genes at all!

3
Disease Gene Discovery is an Exploding Science
  • 2003: OMIM
    • 14,340 established genetic loci
    • 2205 disease genes mapped
    • 15% mapped
  • 2003: NCBI
    • 26,115 genes mapped to all chromosomes
    • Average 8.6 genes/Mb (3.22-26.73)
    • Average 116 kb/gene (37-310)

4
  • Haystack: 640 cubic yards (genome: 3,000 Mb)
  • Needle: 1/100 cubic inch (single base: 1 x 10^-6 Mb)
It really is like finding a needle in a
haystack! (and a very BIG haystack, at that)
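The analogy can be checked with a little arithmetic. The sketch below assumes the four figures on this slide pair up as haystack and needle volumes versus genome and single-base sizes in megabases; the variable names are made up for illustration.

```python
# Assumption: 640 cubic yards is the haystack, 1/100 cubic inch the needle,
# 3,000 Mb the genome, and 1e-6 Mb a single base.
CUBIC_INCHES_PER_CUBIC_YARD = 36 ** 3  # 36 inches per yard, cubed

haystack_cubic_inches = 640 * CUBIC_INCHES_PER_CUBIC_YARD
needle_cubic_inches = 1 / 100

genome_mb = 3_000
single_base_mb = 1e-6

# How many needles fit in the haystack, and how many bases in the genome?
haystack_to_needle = haystack_cubic_inches / needle_cubic_inches
genome_to_base = genome_mb / single_base_mb

print(f"haystack / needle = {haystack_to_needle:.3e}")
print(f"genome / base     = {genome_to_base:.3e}")
```

Under these assumptions both ratios come out at roughly three billion, which is why the comparison works.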
5
CLASSES OF HUMAN GENETIC DISEASE
  • Diseases of Simple Genetic Architecture
  • Can tell how the trait is passed in a family: it
    follows a recognizable pattern
  • One gene per family
  • Often called Mendelian disease
  • Causative gene

6
What is the pattern of inheritance in this family?


7
CLASSES OF HUMAN GENETIC DISEASE
  • Diseases of Complex Genetic Architecture
  • No clear pattern of inheritance
  • Moderate to strong evidence of being inherited
  • Common in the population: cancer, heart disease,
    dementia, asthma, etc.
  • Involves many genes, or genes and environment
  • Susceptibility genes

8
Designing a Genetic Study
  • Define the phenotype
  • Can sub-phenotypes be defined (genetic
    heterogeneity)?
  • e.g., early vs. late age of onset

9
  • Genetic heterogeneity
  • Locus: different genes give similar but possibly
    non-identical phenotypes
  • Allelic: different mutations within a gene give
    similar but possibly non-identical phenotypes

10
Prior to Designing a Genetic Study
  • Is the phenotype genetic?
  • Is there a measurable heritable component?
  • Mode of inheritance?
  • Is the disease dominant or recessive?
  • Autosomal or sex-linked?

11
Defining the Genetic Component
  • Twin studies: monozygotic vs. dizygotic twins
  • Adoption studies: sibs raised apart

12
  • Monozygotic twins vs. dizygotic twins
  • Monozygotic twins should be more similar than
    dizygotic twins
  • Twins raised apart share traits if genetic

13
If genetic: two approaches
  • Random screen: linkage approach
  • Candidate gene approach

15
Approaches to Disease Gene Discovery
  • Mendelian disease serves as the paradigm
  • Clean mode of inheritance: a single gene defect
    causes disease
  • First Successes with Mendelian Diseases
  • Huntington Disease (1984-1993)
  • Cystic Fibrosis (1986-1989)
  • Developed paradigm for future work
  • Initial localization by mapping to marker loci
    randomly distributed throughout the genome

20
Linkage Analysis
  • Linkage analysis in humans is based on counting
    recombinants and non-recombinants, similar to the
    process in experimental animals (e.g., mouse,
    fly). However, in humans, we face additional
    challenges:
  • long generation time
  • inability to control matings
  • inability to control study participation
  • inability to dictate key exposures and
    environmental conditions
  • small family size

21
Linkage Analysis in humans
  • What is the likelihood of obtaining the observed
    data given linkage vs. no linkage between the
    marker and the disease-causing locus?

22
  • Closer linkage: less likely to get a recombinant
  • Unlinked: 50% recombination
  • Null hypothesis: the marker is unlinked to the
    disease gene

23
Assigning Genotype From Phenotype
[Pedigree figure: each individual is labeled with an inferred disease-locus genotype, NA or NN]
A = abnormal allele, N = normal allele
24
Definitions
  • Linkage: the co-segregation of two or more loci
  • Marker locus with alleles 1 or 2, and dominant
    disease allele D

[Pedigree figure: individuals labeled with marker genotypes 12 or 22]
25
Phase of marker and disease locus
  • Phase: the pattern of inheritance of alleles at
    different loci from a parent
  • Can be known if the grandparents' genotypes are
    known

26
Definitions
  • Linkage: the co-segregation of two or more loci
  • Marker locus with alleles 1 or 2, and dominant
    disease allele D

[Pedigree figure: individuals labeled with marker genotypes 12 or 22; phase 1D/2d is marked]
27
No recombination
28
Recombination
Recombinant chromosomes
29
Phase of marker and disease locus
  • Phase: the pattern of inheritance of alleles at
    different loci from parents
  • If inherited together from one parent, the
    alleles are in phase
  • Depending on phase, can determine the probability
    or likelihood of the family structure given
    linkage or no linkage

30
Definitions
  • Depending on phase, can determine the probability
    or likelihood of the family structure

[Pedigree figure: marker genotypes 12 or 22, phase 1D/2d marked, with seven factors of $(1-\theta)$ labeled]
What is the chance that the marker and disease locus
are co-inherited?
Likelihood in this family is $(1-\theta)^7$. If $\theta = 0$,
then the likelihood is 1.
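A minimal sketch of this calculation, assuming phase is known and every scored meiosis is independent. The function name and its parameters are hypothetical, chosen only to illustrate the formula.

```python
def phase_known_likelihood(theta, n_nonrecombinant, n_recombinant=0):
    """Likelihood of a phase-known pedigree at recombination fraction theta.

    Each non-recombinant meiosis contributes (1 - theta) and each
    recombinant meiosis contributes theta.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0 and 0.5")
    return (theta ** n_recombinant) * ((1.0 - theta) ** n_nonrecombinant)


# The family on this slide: seven non-recombinants, no recombinants.
for theta in (0.0, 0.1, 0.5):
    print(theta, phase_known_likelihood(theta, n_nonrecombinant=7))
```

At theta = 0 the likelihood is 1, and at theta = 0.5 it falls to 0.5 to the seventh power, the value expected when the loci are unlinked.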
31
  • Requirement: the marker must be heterozygous in
    the individual with the KEY meiosis
  • For a rare dominant disease, the affected parent
    is the key individual

32
[Pedigree figure showing marker genotypes by row: 12, 22; 22, 12; 22, 22, 22, 22; 22, 22, 22]

33
  • Requirement: the marker must be heterozygous in
    the individual with the KEY meiosis
  • For a rare dominant disease, the affected parent
    is the key individual
  • For a recessive disease with one affected parent,
    the unaffected parent is key

35
Evidence for linkage
  • What is the likelihood of linkage vs. no linkage
    in a given pedigree?
  • Substitute the recombination fraction ($\theta$)
    into the likelihood equation, vs. 0.5 for no
    linkage

36
$$\text{ODDS} = \frac{\text{Likelihood of data if marker locus and disease locus are linked at recombination fraction } \theta}{\text{Likelihood of data if loci are unlinked (recombination fraction equals 0.50)}}$$
37
Likelihood Analysis
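The likelihood ratio (L.R.) is constructed as
$$\text{L.R.} = \frac{L(\text{pedigree} \mid \theta = x)}{L(\text{pedigree} \mid \theta = 0.50)}, \quad \text{where } x \le 0.49$$
In our example, $\text{L.R.} = (1-\theta)^7 / 0.5^7$.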
38
Assessing the chance of linkage between a marker
and disease locus
  • Keep the denominator constant and substitute in
    different values of $\theta$
  • The value of $\theta$ that gives the largest ratio
    is the best estimator of the genetic distance
    between marker and disease locus, because it has
    the best odds
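The search described above can be sketched as a simple grid scan. This is an illustration under the phase-known assumptions of the earlier example, not a substitute for a real linkage package; `likelihood_ratio`, `best_theta`, and the 0.01 step are all hypothetical choices.

```python
def likelihood_ratio(theta, n_nonrecombinant, n_recombinant=0):
    """Ratio of the likelihood at theta to the likelihood at theta = 0.5."""
    n_meioses = n_nonrecombinant + n_recombinant
    linked = (theta ** n_recombinant) * ((1.0 - theta) ** n_nonrecombinant)
    unlinked = 0.5 ** n_meioses  # denominator stays constant
    return linked / unlinked


def best_theta(n_nonrecombinant, n_recombinant=0, step=0.01):
    """Scan theta from 0 to 0.49 and return the value with the largest ratio."""
    n_points = int(round(0.49 / step)) + 1
    grid = [i * step for i in range(n_points)]
    return max(grid, key=lambda t: likelihood_ratio(t, n_nonrecombinant, n_recombinant))


# Seven non-recombinants and no recombinants: the ratio peaks at theta = 0.
theta_hat = best_theta(7)
print(theta_hat, likelihood_ratio(theta_hat, 7))
```

For the seven-child family the maximum is at theta = 0, where the ratio is 1 / 0.5^7 = 128, so the data are 128 times more likely under complete linkage than under no linkage.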

39
LOD (log of the odds) Analysis
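The likelihood ratio (L.R.) is constructed as
$$\text{L.R.} = \frac{L(\text{pedigree} \mid \theta = x)}{L(\text{pedigree} \mid \theta = 0.50)}, \quad \text{where } x \le 0.49$$
In our example, $\text{L.R.} = (1-\theta)^7 / 0.5^7$.
Take the log of this ratio: $z(\theta) = \log_{10}(\text{L.R.})$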
40
WHY LOGARITHMS?
  • Note that the numbers for the likelihoods can be
    very small
  • Inheritance of disease: 0.000015578
  • Inheritance of disease and marker: 0.000000009347
  • These are very small numbers and hard to look at
    (too many 0s!)
  • Logs eliminate that problem
  • Log10(0.000015578) = -4.81
  • Log10(0.000000009347) = -8.03
  • To include data from many pedigrees, add the logs
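The two logarithms quoted on this slide, and the rule of adding logs across pedigrees, can be reproduced in a few lines. The three likelihood ratios in `example_ratios` are invented for illustration and do not come from the presentation.

```python
import math

# Likelihood values quoted on the slide.
likelihood_disease = 0.000015578
likelihood_disease_and_marker = 0.000000009347

log_disease = math.log10(likelihood_disease)
log_both = math.log10(likelihood_disease_and_marker)
print(round(log_disease, 2), round(log_both, 2))


def combined_lod(likelihood_ratios):
    """Add log10 likelihood ratios from independent pedigrees."""
    return sum(math.log10(lr) for lr in likelihood_ratios)


# Hypothetical likelihood ratios from three independent pedigrees.
example_ratios = [128.0, 4.0, 0.5]
total = combined_lod(example_ratios)
print(round(total, 2))
```

Adding the logs is equivalent to multiplying the ratios themselves, which is the appropriate way to combine independent families.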

41
Linkage Phase Known-Dominant Disease Model
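[Pedigree figure, three generations]
  • I (individuals 1 and 2): marker genotypes 11 and 22
  • II (individuals 1 and 2): marker genotypes 12 and 22
  • III (individuals 1-10): marker genotypes 22, 22,
    12, 12, 12, 22, 22, 22, 22, 12
  • 2 and D are in phase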

42
Linkage Phase Known-Dominant Disease Model
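[Pedigree figure, three generations: I with marker genotypes 11 and 22; II with 12 and 22; III with ten children]
  • Generation III is scored as nine non-recombinants
    (NR), each contributing $1-\theta$, and one
    recombinant (R), contributing $\theta$
  • Likelihood $= \theta(1-\theta)^9$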

43
Linkage Phase Known-Dominant Disease Model
[Pedigree figure, three generations: generation III is scored as nine non-recombinants (NR) and one recombinant (R)]
Ratio is $\theta(1-\theta)^9 / (0.5)^{10}$
$z(\theta) = \log\theta + 9\log(1-\theta) - 10\log(0.50)$

$\theta$      0.01  0.05  0.10  0.15  0.20  0.30  0.40
$z(\theta)$   0.97  1.51  1.60  1.55  1.44  1.09  0.62
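The lod scores on this slide can be recomputed directly from the formula. The sketch assumes base-10 logarithms and the counts from this family (nine non-recombinants, one recombinant); `lod` and its parameter names are hypothetical.

```python
import math


def lod(theta, n_nonrecombinant=9, n_recombinant=1):
    """z(theta) = R*log10(theta) + NR*log10(1 - theta) - (R + NR)*log10(0.5)."""
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must be strictly between 0 and 1")
    n_meioses = n_nonrecombinant + n_recombinant
    return (n_recombinant * math.log10(theta)
            + n_nonrecombinant * math.log10(1.0 - theta)
            - n_meioses * math.log10(0.5))


thetas = [0.01, 0.05, 0.10, 0.15, 0.20, 0.30, 0.40]
table = {theta: round(lod(theta), 2) for theta in thetas}
for theta, z in table.items():
    print(f"theta = {theta:.2f}  z = {z:.2f}")

# The maximum on this grid is at theta = 0.10, i.e. 1 recombinant in 10 meioses.
best = max(thetas, key=lod)
```

The peak at 0.10 matches the intuitive estimate of one recombinant out of ten scored meioses.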
44
Evaluating Lod Scores
A) If $z(\theta) \ge 3.0$, then conclude significant
evidence for linkage ($10^3$, i.e., odds of 1000:1 in favor of linkage).
B) If $z(\theta) \le -2.0$, then conclude significant
evidence for non-linkage ($10^{-2}$, i.e., odds of 100:1 against linkage).
C) If $-2.0 < z(\theta) < 3.0$, then collect more data.
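These decision rules translate into a small helper. This is only a sketch of the thresholds stated on the slide; the function name and return strings are made up for illustration.

```python
def interpret_lod(z):
    """Classify a lod score using the +3.0 / -2.0 thresholds."""
    if z >= 3.0:
        return "significant evidence for linkage"
    if z <= -2.0:
        return "significant evidence for non-linkage"
    return "collect more data"


for z in (3.2, 1.60, -2.5):
    odds = 10 ** z  # a lod score is the log10 of the odds
    print(f"z = {z:5.2f}  odds = {odds:10.3f}  -> {interpret_lod(z)}")
```

A single family with a maximum lod score of 1.60 therefore falls in the middle band: suggestive, but more families are needed.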
46
Some Factors that can Affect Linkage Analysis
  • Misspecification of Parameters
  • Genetic model
  • Dominant vs. recessive

48
Factors that can Affect Linkage Analysis
  • Misspecification of Parameters
  • Genetic model
  • Frequency of sporadic cases (phenocopies)
  • Scoring Errors
  • Incorrect trait phenotype
  • Incorrect marker genotype
  • Incorrect family relationships
  • Linkage Heterogeneity

49
Role of Heterogeneity among families collected
  • Genetic
    • Different inheritance patterns for same trait
    • Retinitis pigmentosa: 6 X-linked loci, 12 AD
      loci, 8 AR loci
  • Locus
    • Different genes leading to same trait
    • Breast cancer
  • Allelic
    • Different alleles at same locus leading to
      different phenotypes
    • FGFR3: achondroplasia, thanatophoric dysplasia,
      Crouzon syndrome with acanthosis nigricans

50
Locus Heterogeneity
51
Locus Heterogeneity
52
Breast Cancer Mapping
  • BRCA1
  • BRCA2

55
MULTIPOINT LINKAGE ANALYSIS
  • Uses multiple markers together
  • Uses (or generates) multiple estimates of $\theta$
  • Can provide good estimates of location
  • Very sensitive to:
    • Incorrect specification of marker order
    • Genotyping errors
    • Locus heterogeneity

56
Multipoint Lod Score
[Figure: loci A, B, and C with recombination fractions $\theta_1$ and $\theta_2$]
57
Exercise
  • Analyze as a dominant disease
  • Analyze as a recessive disease
59
Association Studies of Disease Using Unrelated
Samples
  • Case-Control Studies

60
Goal of case-control association study
  • Identify genes and/or alleles that
    cause/predispose to disease

61
Association studies
  • Detected by differential distribution of markers
    in the case and control groups
  • Risk increasing allele/genotype will be more
    common in disease group
  • Risk decreasing allele/genotype will be more
    common in normal/control group

62
Causative Allele
  • Sickle Cell Disease
  • β chain: Glutamate-6-Valine

[Figure: sample population of individuals labeled SC or N]
63
Causative Allele
64
Causative Allele: Allelic Association
65
Causative Allele: Genotypic Association,
Recessive Model
66
Causative Allele: Genotypic Association,
Recessive Model
67
Test for Allelic Association
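  • $\chi^2$ test

$$\chi^2 = \frac{(AD - BC)^2 \, N}{(A+B)(C+D)(A+C)(B+D)}$$
where A, B, C, and D are the four cell counts of the 2 x 2 table and N = A + B + C + D.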
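A short sketch of this statistic for a 2 x 2 table of allele counts. It assumes A and B are the counts of the two alleles in cases and C and D the counts in controls; that layout and the example counts are hypothetical, not taken from the presentation.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square statistic (1 degree of freedom, no continuity correction).

    Assumed layout:
                 allele 1   allele 2
        cases       a          b
        controls    c          d
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column must have a non-zero total")
    return (a * d - b * c) ** 2 * n / denominator


# Hypothetical allele counts: 60/40 in cases versus 40/60 in controls.
statistic = chi_square_2x2(60, 40, 40, 60)
print(statistic)
```

With one degree of freedom, a statistic above about 3.84 corresponds to p < 0.05, so these made-up counts would indicate an allelic association.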
68
Susceptibility Alleles
  • True of most common diseases with genetic risk
  • Cancers, heart disease, asthma, diabetes, etc.
  • The association is not complete, even if you know
    the mode of inheritance (and you usually do not)
  • Penetrance of alleles is incomplete
  • May increase or decrease risk, but the effect is
    not absolute
  • e.g., HIV susceptibility and CCR5 variants

69
How to Study Genetics of Disease Using Association
  • As with linkage
  • Characterize Phenotype
  • Collect Samples and Clinical Data

70
Two Basic Approaches
  • Candidate Gene
    • Select genes on the basis of known biological
      function
    • e.g., angiotensinogen for blood pressure and
      hypertension
  • Genome Scan
    • Assume no knowledge and scan markers across the
      entire genome
    • 10,000,000 validated SNPs in human genome

71
Association studies
  • Whole genome association (Not hypothesis driven)
  • Use random markers throughout genome
  • Similar to linkage in that no bias assumed
  • Candidate gene association (Hypothesis driven)
  • Choose genes on the basis of known
    physiology/function

72
Genome Scan Association
  • Similar to linkage analysis scan
  • Do not use a priori knowledge but scan markers
    throughout the genome and look for significant
    associations
  • Can genotype thousands to millions of markers
  • Technology: 0.01 per genotype

75
Direct or Indirect association
  • Direct
  • Identify and study functional variant
  • Indirect
  • Study variant in linkage disequilibrium with
    functional variant

76
Why can't you be sure that an association
identifies your gene?
  • Linkage/Linkage Disequilibrium
  • What is LD?
  • Nonrandom association between markers
  • e.g., SNP1: a or c, f(a) = 0.5; SNP2: t or g,
    f(t) = 0.3
  • Expect alleles a and t together 15% of the time
    if in equilibrium (0.5 x 0.3 = 0.15)
  • What if SNP1 is next to the disease marker such
    that the presence of allele a marks the disease
    phenotype?
77
Linkage disequilibrium
78
Hispanic
Loehmueller et al In Press
79
261 kb
Haines et al 2005