Association Studies To Locate Human Disease Genes

1
Association Studies To Locate Human Disease Genes
  • Wentian Li, Ph.D
  • The Robert S Boas Center for Genomics and Human
    Genetics
  • North Shore LIJ Institute for Medical Research

March 08, 2005
2
GENE
PHENOTYPE/DISEASE
ENVIRONMENT
3
GENETIC MARKER
GENE
PHENOTYPE/DISEASE
ENVIRONMENT (controlled, fixed)
Linkage disequilibrium
4
Early history of association analysis (1921)
  • blood type (ABO) and disease association
  • JA Buchanan, ET Higley (1921), "The relationship
    of blood groups to disease", British Journal of
    Experimental Pathology, 2:247-255.

5
Early history of association analysis (1945)
  • The suggestion to use ABO blood type/secretor
    polymorphism to detect association with diseases
  • EB Ford (1945), "Polymorphism", Biological
    Reviews, 20:73-88.

6
(No Transcript)
7
Early history of association analysis (1953-54)
  • Ian Aird, HH Bentall, JA Fraser-Roberts (1953),
    "A relationship between cancer of stomach and the
    ABO blood groups", British Medical Journal,
    1:799-801.
  • I Aird, HH Bentall, JA Mehigan, JAF Roberts
    (1954), "The blood groups in relation to peptic
    ulceration and carcinoma of the colon, rectum,
    breast and bronchus: an association between the
    ABO groups and peptic ulceration", British
    Medical Journal, 2:315-321.

8
Early history of association analysis (1960s)
  • Polymorphism in the Human Leukocyte Antigen (HLA)
    system (also known as the Major Histocompatibility
    Complex (MHC)) and disease association
  • International Histocompatibility Workshop (first
    one in 1964)

9
Divergence between linkage and association
analysis for human disease gene detection
(1970s-1980s?)
  • Both are based on the same principle: the genetic
    polymorphism (which may itself have no function)
    and the disease gene (which is functional) lie
    close to each other on the chromosome.
  • Only the techniques are different
  • Association (and linkage disequilibrium) became
    mainly a topic in population genetics (with the
    exception of HLA-disease association analysis)

10
Differences between linkage analysis and
association analysis
  • Linkage analysis is based on pedigree data
  • Association analysis is based on population data
  • Linkage analyses rely on recombination events in
    action
  • Association analyses rely on ancestral
    recombinations
  • The statistic in linkage analysis is the count of
    recombinants and non-recombinants
  • The statistical method for association analysis
    is statistical correlation
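
As a rough illustration of how counts of recombinants and non-recombinants become a linkage statistic, the sketch below computes a LOD score for phase-known meioses. The function name `lod_score` and the counts are hypothetical and do not come from the slides; the LOD score is one standard choice of statistic, not necessarily the one the presenter had in mind.

```python
import math


def lod_score(recombinants, non_recombinants, theta):
    """LOD score for phase-known meioses at recombination fraction theta.

    theta must lie strictly between 0 and 1.
    """
    n = recombinants + non_recombinants
    # log10 likelihood under linkage with recombination fraction theta
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1.0 - theta))
    # log10 likelihood under free recombination (theta = 0.5)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical counts: 2 recombinants among 20 informative meioses.
r, nr = 2, 18
theta_hat = r / (r + nr)  # maximum-likelihood estimate of theta
print(theta_hat, round(lod_score(r, nr, theta_hat), 2))
```

Association analysis, by contrast, works from a table of marker counts in cases and controls rather than from observed meioses.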

11
The domination of linkage analysis (1980s?)
  • The ease of typing restriction fragment length
    polymorphisms (RFLPs) made linkage analysis
    popular again
  • Linkage analysis helped to locate chromosomal
    regions for dozens of rare Mendelian diseases (in
    1983, the first disease gene, for Huntington
    disease, was mapped)
  • An even denser and easier-to-type class of
    genetic marker: microsatellite markers

12
Association analysis was brought back to disease
mapping (1990s). I. Family-based association
  • The most often criticized aspect of association
    analysis, its inability to deal with population
    stratification, was thought to be solved by the
    family-based design
  • Genotype-based haplotype relative risk (Falk and
    Rubinstein, 1987)
  • Haplotype-based haplotype relative risk
    (Terwilliger and Ott, 1992)
  • McNemar test (Terwilliger and Ott, 1992),
    Transmission disequilibrium test (TDT) (Spielman,
    McGinnis, Ewens, 1993)
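
The TDT mentioned above reduces to a McNemar-type statistic on transmissions from heterozygous parents to affected children. A minimal sketch, assuming `transmitted` and `not_transmitted` are counts of the candidate allele being passed on or not by heterozygous parents (both the function name and the counts are hypothetical):

```python
def tdt_statistic(transmitted, not_transmitted):
    """McNemar-type TDT chi-square (1 df) from heterozygous-parent transmissions."""
    b, c = transmitted, not_transmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    # Under no linkage or no association, each allele is transmitted with
    # probability 1/2, so b and c should be roughly equal.
    return (b - c) ** 2 / (b + c)


# Hypothetical counts: allele transmitted 60 times, not transmitted 40 times.
print(tdt_statistic(60, 40))
```

Because each parent's non-transmitted allele serves as its own control, the comparison is made within families, which is why this design was thought to address population stratification.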

13
Association analysis was brought back to disease
mapping (1990s). II. Weaker signal in complex
diseases
  • TDT was shown to be more powerful than the
    affected-sib identical-by-descent sharing method
    (a nonparametric linkage analysis) for complex
    diseases (diseases with lower genotypic relative
    risk)
  • N Risch, K Merikangas (1996), "The future of
    genetic studies of complex human diseases",
    Science, 273:1516-1517

14
Statistical genetic methods for disease gene
identification
15
Association studies
  • Association between risk factor and disease: the
    risk factor is significantly more frequent among
    affected than among unaffected individuals
  • In genetic epidemiology
  • Risk factors: alleles/genotypes/haplotypes

16
Association studies
  • Candidate genes (functional or positional)
  • Fine mapping in linkage regions
  • Genome-wide screen

17
Candidate gene analysis
  • Direct analysis
  • Association studies between disease and
    functional SNPs (causative of disease) of
    candidate gene

18
Candidate gene analysis
  • Indirect analysis
  • Association studies between disease and random
    SNPs within or near candidate gene
  • Linkage Disequilibrium mapping

19
Case-control studies: χ² test
Risk factor
contingency table
Test of independence: χ² = Σ (O-E)² / E with
1 df
20
Case-control studies: χ² test
2x3 contingency table
Genotypes
            AA     Aa     aa     Total
Cases       nAA    nAa    naa    N
Controls    mAA    mAa    maa    M
Total       tAA    tAa    taa    N+M
Test of independence: χ² = Σ (O-E)² / E with
2 df
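
A minimal sketch of the test of independence on the genotype table, written in plain Python so that the expected counts E are visible. The function name `pearson_chi2` and the genotype counts are hypothetical, and the sketch assumes every expected count is non-zero.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of genotype and disease status
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical genotype counts in the order (AA, Aa, aa)
cases = [30, 50, 20]
controls = [20, 50, 30]
chi2, df = pearson_chi2([cases, controls])
print(chi2, df)
```

The statistic is then compared with a χ² distribution on the returned degrees of freedom (2 for the 2x3 genotype table).
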
21
Case-control studies: χ² test
2x2 contingency table
Alleles
            A      a      Total
Cases       nA     na     2N
Controls    mA     ma     2M
Total       tA     ta     2(N+M)
Test of independence: χ² = Σ (O-E)² / E with
1 df
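
The allele table can be derived from genotype counts, since each person contributes two alleles. The sketch below does that conversion and then applies the closed-form χ² for a 2x2 table, which is algebraically the same as Σ (O-E)² / E. The function names and counts are hypothetical. Counting alleles this way treats the two alleles of one person as independent observations, which is an assumption of the allele-based test.

```python
def allele_counts(n_AA, n_Aa, n_aa):
    """Return (count of A, count of a) from genotype counts."""
    return 2 * n_AA + n_Aa, 2 * n_aa + n_Aa


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical genotype counts (AA, Aa, aa) for cases and controls
case_A, case_a = allele_counts(30, 50, 20)
ctrl_A, ctrl_a = allele_counts(20, 50, 30)
print(chi2_2x2(case_A, case_a, ctrl_A, ctrl_a))
```
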
22
Hardy-Weinberg Equilibrium
Biallelic locus: A, a; genotypes: AA, Aa, aa
Allele frequencies:
  A: P(A) = p
  a: P(a) = q
Genotype frequencies are in HWE if:
  AA: P(AA) = p²
  Aa: P(Aa) = 2pq
  aa: P(aa) = q²
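
A minimal sketch of checking genotype counts against these HWE proportions: estimate p from the sample, form the expected counts N·p², N·2pq, N·q², and compare them with the observed counts by χ². The function name `hwe_chi2` and the counts are hypothetical, and the sketch assumes both alleles are present in the sample.

```python
def hwe_chi2(n_AA, n_Aa, n_aa):
    """Return (p, expected genotype counts, chi-square) for a biallelic locus."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, n * 2 * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    # Compared with a chi-square distribution on 1 df:
    # 3 genotype classes - 1 - 1 estimated allele frequency.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, expected, chi2


# Hypothetical genotype counts (AA, Aa, aa)
p, expected, chi2 = hwe_chi2(30, 50, 20)
print(p, expected, chi2)
```
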
23
Haplotypes
(Figure: GENOTYPES at Locus 1, Locus 2, ..., Locus N, followed by identification of phase.)
24
Statistical significance of a correlation versus
correlation strength
  • Statistical significance is usually measured by
    the p-value: the probability of observing the
    same amount of correlation or more if the true
    correlation is zero.
  • Correlation strength can be measured by many
    quantities: D, D', r²
  • Correlation strength between a marker and the
    disease status is usually measured by the odds
    ratio (OR)
  • The 95% confidence interval (CI) of the OR
    contains information on both strength and
    significance
  • When the sample size is increased, the p-value
    typically becomes more significant, whereas the
    OR usually stays the same (but the 95% CI of the
    OR becomes narrower).
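
The last point can be illustrated with a small sketch that computes the OR and an approximate 95% CI from a 2x2 table, using the standard error of log(OR). The function name `odds_ratio_ci` and all counts are hypothetical; the second table is the first with every cell multiplied by ten.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR and approximate 95% CI from a 2x2 table.

    a: cases with the risk factor,    b: cases without it
    c: controls with the risk factor, d: controls without it
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) (large-sample approximation)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


small = odds_ratio_ci(30, 70, 20, 80)      # hypothetical small study
large = odds_ratio_ci(300, 700, 200, 800)  # same proportions, 10x the sample
print(small)
print(large)
```

With these counts the OR is identical in both studies, but only the larger study has a CI that excludes 1.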

25
Graphic representation of LD
r²
D
GOLD
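
For reference, a minimal sketch of how the pairwise LD measures D, D' and r² are computed from haplotype and allele frequencies at two biallelic loci. The function name `ld_measures` and the frequencies are hypothetical, and the sketch assumes both loci are polymorphic, so no allele frequency is 0 or 1.

```python
def ld_measures(p_AB, p_A, p_B):
    """Return (D, D', r^2) given the A-B haplotype frequency and allele frequencies."""
    D = p_AB - p_A * p_B
    # Largest |D| attainable given the allele frequencies and the sign of D
    if D >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = D / d_max if d_max > 0 else 0.0  # signed D'
    r2 = D * D / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return D, d_prime, r2


# Hypothetical frequencies: allele A (20%) always occurs with allele B (50%)
print(ld_measures(0.2, 0.2, 0.5))
```

In this example D' is at its maximum while r² stays well below 1, because the two loci have different allele frequencies.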
26
Main Issues in Association Analysis
  • The association is typically detected between a
    non-functional marker and the disease, instead of
    between the disease gene itself and the disease
    status (indirect role of the disease gene in
    association analysis)
  • When the disease (case) group and the normal
    (control) group are both mixtures of
    subpopulations with different mixing
    proportions, even markers not associated with
    the disease will exhibit spurious association
    (heterogeneity)

27
Zondervan & Cardon, 2004
28
Solution to the first issue
  • Choose the marker, haplotype, ... so that its
    (allele, haplotype, ...) frequency matches that
    of the disease gene.
  • Whenever possible, type a marker that is also
    functional (e.g. coding SNP, functional SNP,
    regulatory SNP)

29
Association due to population stratification
Marchini et al., 2004
30
Well-known problem when case/control groups
consist of two different subpopulations with
different mixing proportions
  • Example: comparing people's height between two
    places: 1. prison, and 2. nursing school
  • In prison, maybe 80% are men
  • In nursing school, maybe 80% are women
  • Men are on average taller than women
  • People in prison are on average taller than
    people in nursing school
  • But this difference is caused by the different
    mixing proportions, not by prison making people
    taller
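
The arithmetic behind this example can be sketched as follows. The 80% proportions come from the slide; the mean heights are hypothetical placeholders chosen only to make the point, and `mixture_mean` is an illustrative helper.

```python
def mixture_mean(prop_men, mean_men, mean_women):
    """Average height of a group that is a mixture of men and women."""
    return prop_men * mean_men + (1 - prop_men) * mean_women


# Hypothetical mean heights in cm (not from the slides)
MEAN_MEN_CM, MEAN_WOMEN_CM = 175.0, 162.0

prison = mixture_mean(0.80, MEAN_MEN_CM, MEAN_WOMEN_CM)   # 80% men
nursing = mixture_mean(0.20, MEAN_MEN_CM, MEAN_WOMEN_CM)  # 80% women
print(prison, nursing, prison - nursing)
```

The two groups differ in average height even though "place" has no effect on anyone's height. A marker whose allele frequency differs between subpopulations behaves the same way when cases and controls are mixed in different proportions.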

31
Solution to the second issue
  • Try to use people from the same population in
    both the case and control groups.
  • Use neutral markers to test whether
    subpopulations exist
  • If possible use an isolated population (the extra
    benefit is to reduce the heterogeneity in the
    case group)
  • Use family-based association design (the
    disadvantage is that it is more costly, and
    parents of late-onset patients are hard to find)

32
Lee et al. Genes and Immunity (2005)
33
dis.e.qui.lib.ri.um, n. Loss or lack of stability
or equilibrium
link.age, n. (genetics) An association between
two or more genes such that the traits they
control tend to be inherited together.
as.so.ci.a.tion, n. 1. The act of associating or
the state of being associated.
cor.re.la.tion, n. (statistics) the simultaneous
change in value of two numerically valued random
variables
ASSOCIATION IS THE LEAST RIGOROUSLY DEFINED WORD!
34
Criswell et al. Am J Hum Genetics (2005)