Analytical challenges in genetic association studies David Meyre, Associate Professor, McMaster University (meyred_at_mcmaster.ca) HRM 728 Graduate Course: Genetic ... – PowerPoint PPT presentation

Slides: 72
Provided by: mey52
Transcript and Presenter's Notes

1
Analytical challenges in genetic association
studies
David Meyre, Associate Professor, McMaster
University (meyred_at_mcmaster.ca)
HRM 728 Graduate Course: Genetic Epidemiology
November 7th, 2014
2
Li & Meyre, Int J Obes 2013
3
The march of technology
1980
single variant (10^0 SNPs)
detailed study of individual genes (10^2 SNPs)
1990
2000
regional studies (10^4 SNPs)
2006
genome-wide association (5 x 10^5 SNPs)
3.5 x 10^6 SNPs (2007)
2011
whole-genome sequencing (3 x 10^7 SNPs)
A storm of data to deal with!
4
Analytical challenges in genetic association
studies
  • 426 positive findings in 127 genes, but...
  • only 22 genes associated with obesity-related
    phenotypes in > 5 studies
  • Replication is challenging in genetic
    epidemiology
  • Skepticism in the medical / scientific community

Rankinen et al., Obesity 2006
5
Analytical challenges in genetic association
studies
  1. Analytical challenges to find a true association
    in a discovery study (risk of false positive
    result)
  2. Analytical challenges to replicate a true
    positive association
  3. Guidelines for proper discovery and replication
    association study designs

6
Analytical challenges to find a true association
in a discovery study
Are you ready for Episode 1 of the saga?
7
Analytical challenges in genetic association
studies
  • Lack of replication may occur because the
    original study reports a false positive result
  • 1) The phenotype is not heritable

8
Obesity is a heritable disease
  • 2 obese parents: 10-fold increased risk for
    childhood obesity
  • Obesity has a strong genetic component:
    heritability of 50-85%

(Stunkard et al., NEJM 1986;
Wardle et al., AJCN 2008)
9
Analytical challenges in genetic association
studies
  • Lack of replication may occur because the
    original study reports a false positive result
  • The phenotype is not heritable
  • Insufficient sample size

10
Statistical power and sample size
Effect sizes for obesity-associated common
genetic variants are small (OR < 2)
11
Statistical power and sample size
            MAF in controls
Allelic OR     0.01     0.05      0.1      0.2      0.3      0.4
1.1         443,854   92,868   49,252   27,974   21,518   19,010
1.2         116,354   24,434   13,018    7,460    5,792    5,162
1.3          54,110   11,404    6,102    3,526    2,760    2,480
1.5          21,208    4,498    2,426    1,424    1,132    1,032
2.0           6,386    1,374      754      458      376      354
Table 1. Sample sizes needed in a case-control
design to detect significant association with a
power of 90% and a two-sided P-value of 0.001, by
odds ratio and allele frequency for the risk allele.
Calculations assume a multiplicative effect on
disease risk. Sample sizes presented are the total
number of cases and controls needed, assuming an
equal number of cases and controls.
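The kind of calculation behind Table 1 can be sketched as follows. This is a minimal illustration, not necessarily the method used for the slide: it assumes the standard normal-approximation formula for comparing two proportions, applied to allele counts (two alleles per individual), and the function names are hypothetical. Under these assumptions it gives totals in the same range as the table.

```python
from math import ceil, sqrt
from statistics import NormalDist


def risk_allele_freq_in_cases(maf_controls, odds_ratio):
    """Risk-allele frequency in cases implied by an allelic odds ratio."""
    odds_cases = odds_ratio * maf_controls / (1.0 - maf_controls)
    return odds_cases / (1.0 + odds_cases)


def total_sample_size(odds_ratio, maf_controls, alpha=0.001, power=0.90):
    """Total cases + controls (equal groups) for an allelic test.

    Assumption: two-sided test comparing allele frequencies between
    cases and controls with the usual normal approximation.
    """
    p0 = maf_controls
    p1 = risk_allele_freq_in_cases(p0, odds_ratio)
    p_bar = (p0 + p1) / 2.0
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * sqrt(2.0 * p_bar * (1.0 - p_bar))
        + z_beta * sqrt(p0 * (1.0 - p0) + p1 * (1.0 - p1))
    ) ** 2
    alleles_per_group = numerator / (p1 - p0) ** 2
    # Each individual contributes two alleles.
    individuals_per_group = ceil(alleles_per_group / 2.0)
    return 2 * individuals_per_group


if __name__ == "__main__":
    for odds_ratio in (1.1, 1.2, 1.3, 1.5, 2.0):
        row = [total_sample_size(odds_ratio, maf) for maf in (0.01, 0.1, 0.4)]
        print(odds_ratio, row)
```

The required sample size grows sharply as the odds ratio approaches 1 or the allele becomes rare, which is the point of the table.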
12
GAD2, or the importance of a well-powered study
  • Association of the GAD2 promoter gene variant
    -243 A>G with morbid obesity (OR = 1.05-1.58,
    P = 0.01) using 575 cases and 646 controls
  • No prior statistical power calculation in the
    original study
  • Lack of confirmation of the association of the
    GAD2 promoter gene variant -243 A>G with morbid
    obesity (OR = 0.90-1.36, P = 0.28) in a meta-analysis
    of 1,252 cases and 1,800 controls
Boutin et al., PLOS Biol 2003; Swarbrick et al.,
PLOS Biol 2005
13
Statistical power and rare variant analysis
We identified six highly correlated SNPs that
show strong and comparable associations with risk
of type 2 diabetes, but further refinement of
these associations will require large sample
sizes (>100,000) or studies in ethnically diverse
populations.
Fawcett et al., Diabetes 2010
14
Analytical challenges in genetic association
studies
  • Lack of replication may occur because the
    original study reports a false positive result
  • The phenotype is not heritable
  • Insufficient sample size
  • Lack of correction for multiple testing

15
Multiple testing in the post-GWAS era
1 million polymorphisms!
Bonferroni correction: P_corrected = 0.05 /
1,000,000 = 5 x 10^-8
2-SNP gene x gene interactions: P_corrected =
1 x 10^-13
16
Multiple testing in the whole-exome/genome
sequencing era
30 million polymorphisms, 20,000 genes
Bonferroni correction (SNPs): P_corrected = 0.05 /
30,000,000 = 1.7 x 10^-9 (about 10^-9)
Bonferroni correction (genes): P_corrected =
0.05 / 20,000 = 2.5 x 10^-6
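The Bonferroni thresholds quoted for GWAS and sequencing studies are simple divisions and can be checked with a short sketch. The function name is hypothetical, and counting gene x gene interactions as all unordered SNP pairs is an assumption about how the 10^-13 figure was obtained.

```python
from math import comb


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


if __name__ == "__main__":
    n_gwas_snps = 1_000_000
    print("GWAS SNPs:", bonferroni_threshold(n_gwas_snps))
    # All unordered pairs of SNPs for a 2-SNP interaction scan.
    print("SNP pairs:", bonferroni_threshold(comb(n_gwas_snps, 2)))
    print("Sequencing SNPs:", bonferroni_threshold(30_000_000))
    print("Genes:", bonferroni_threshold(20_000))
```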
17
INSIG2: a GWA false-positive association
  • Science, April 2006: the INSIG2 rs7566605 variant is
    associated with obesity (meta-analysis OR = 1.05-1.42,
    P = 0.008), far from the threshold of significance
    after multiple testing correction (P = 5 x 10^-7)
  • Science, January 2007: lack of association of INSIG2
    with obesity in 3 independent designs (N = 22,381)
18
Analytical challenges in genetic association
studies
  • Lack of replication may occur because the
    original study reports a false positive result
  • The phenotype is not heritable
  • Insufficient sample size
  • Lack of correction for multiple testing
  • Geographical population substructure

19
Lactase persistence and population substructure
LCT rs4988235 T allele frequency in UK
Davey-Smith et al., EJHG 2009
20
Rare variants and founder effects
  • common SNP associated with adiponectin level in
    Filipinos by GWAS
  • exon resequencing identified a rare coding variant
    (R221S) in LD with the common SNP, strongly
    associated with adiponectin level
  • the mutation is found exclusively in Filipinos
Croteau-Chonka et al., HMG 2012
21
Analytical challenges in genetic association
studies
  • Lack of replication may occur because the
    original study reports a false positive result
  • The phenotype is not heritable
  • Insufficient sample size
  • Lack of correction for multiple testing
  • Geographical population substructure
  • Technological biases, lack of quality control
    procedure

22
INS VNTR and association with childhood obesity:
a technological bias?
  • Lack of association of the INS VNTR variant
    with childhood obesity
  • Genotyping by TaqMan, a highly reliable method
  • Family-based design to enable a high-standard
    quality control procedure
  • Association of the INS VNTR variant with
    childhood obesity
  • Genotyping by RFLP, a highly subjective method
    (Peters et al., CCM 2003)
Le Stunff et al., Nat Genet 2000; Bouatia-Naji et
al., Obesity 2008
23
Next generation sequencing and false-positive
mutations
  • 10% of mutations are technological artifacts in
    next generation sequencing
  • The rate of false positive mutations is higher
    in old DNA libraries
  • Use of pedigrees, confirmation of mutations by
    Sanger resequencing
  • New methods (RainDance technology)

Bonnefond et al., PLOS One 2012
24
Analytical challenges in genetic association
studies
  • Lack of replication may occur because the
    original study reports a false positive result
  • The phenotype is not heritable
  • Insufficient sample size
  • Lack of correction for multiple testing
  • Geographical population substructure
  • Technological biases, lack of quality control
    procedure
  • Inappropriate statistical analysis

25
Association and adjustment for confounding
factors
  • Association between FTO intron 1 SNP and type 2
    diabetes (OR = 1.09-1.23, P = 5 x 10^-8) when
    adjusting for sex and age
  • Lack of association between FTO intron 1 SNP
    and type 2 diabetes (OR = 0.96-1.10, P = 0.44) when
    adjusting for sex, age and BMI
  • FTO is an obesity gene
  • Inappropriate adjustment (or lack of adjustment)
    can lead to wrong conclusions

Frayling et al., Science 2007
26
Analytical challenges in genetic association
studies
  • Lack of replication may occur because the
    original study reports a false positive result
  • The phenotype is not heritable
  • Insufficient sample size
  • Lack of correction for multiple testing
  • Geographical population substructure
  • Technological biases, lack of quality control
    procedure
  • Inappropriate statistical analysis

27
Analytical challenges to replicate a true
positive association
Now for Episode 2 of the saga!
28
Analytical challenges in genetic association
studies
  • II. Replication may be challenging even when the
    original result is a true positive association
  • Willingness to replicate the original study

29
Lactase persistence and BMI variation
Despite convincing initial evidence of
association between the LCT rs4988235 T variant
and BMI (P = 8 x 10^-5) in 31,720 European
individuals...
Kettunen et al., HMG 2009
30
Lactase persistence and BMI variation
Replication studies appeared only 2-4 years later
Corella et al., Obesity 2011
31
Analytical challenges in genetic association
studies
  • II. Replication may be challenging even when the
    original result is a true positive association
  • Willingness to replicate the original study
  • Winner's curse effect and sample size in
    follow-up studies

32
Obesity loci from GIANT and replication
  • Due to the small effect sizes of the SNPs on BMI
    variation, only a fraction of these associations
    replicate, for obvious statistical power reasons
(den Hoed et al., Diabetes 2010)
33
Analytical challenges in genetic association
studies
  • II. Replication may be challenging even when the
    original result is a true positive association
  • Willingness to replicate the original study
  • Winner's curse effect and sample size in
    follow-up studies
  • Gene x gene, gene x environment interactions

34
Interactions between FTO SNP and physical activity
  • The effect of the rs9939609 SNP on obesity risk
    is decreased by 27% in physically active adults
  • No genotype x physical activity interaction on
    obesity risk in children
Kilpelainen et al., PLOS Med 2012
35
Savage et al., Nat Genet 2002
36
Analytical challenges in genetic association
studies
  • II. Replication may be challenging even when the
    original result is a true positive association
  • Willingness to replicate the original study
  • Winner's curse effect and sample size in
    follow-up studies
  • Gene x gene, gene x environment interactions
  • Heterogeneity (ethnic heterogeneity, phenotype
    heterogeneity)

37
Ethnicity and linkage disequilibrium blocks
[Figure: LD blocks along a region (distance in kb, SNP1 to SNP5) in Icelandic, French, Asian and African populations; legend: disease-associated LD block, causal SNP, proxy SNP]
38
Ethnicity and SNP allele frequency
  • Intronic variation (rs2237892) in a new locus
    (KCNQ1) was strongly associated with T2D in Asians
    (OR = 1.26-1.42, 10^-40 < P-value < 10^-12)
  • The association with T2D was nominally replicated
    in European-descent populations (DIAGRAM, P = 0.01),
    with a similar OR but a lower risk allele frequency
    (5-7% in Europeans, 28-40% in Asians)
39
Obesity, waist and BMI have a partially
overlapping genetic architecture
40
Analytical challenges in genetic association
studies
  • II. Replication may be challenging even when the
    original result is a true positive association
  • Willingness to replicate the original study
  • Winner's curse effect and sample size in
    follow-up studies
  • Gene x gene, gene x environment interactions
  • Heterogeneity (ethnic heterogeneity, phenotype
    heterogeneity)
  • Inheritance model (parent of origin effects, de
    novo mutations)

41
Analytical challenges in genetic association
studies
  • II. Replication may be challenging even when the
    original result is a true positive association
  • Willingness to replicate the original study
  • Winner's curse effect and sample size in
    follow-up studies
  • Gene x gene, gene x environment interactions
  • Heterogeneity (ethnic heterogeneity, phenotype
    heterogeneity)
  • Inheritance model
  • Subjective interpretation of data

42
Subjective interpretation of data
Is this glass half-full or half-empty?
43
Analytical challenges in genetic association
studies
  • II. Replication may be challenging even when the
    original result is a true positive association
  • Willingness to replicate the original study
  • Winner's curse effect and sample size in
    follow-up studies
  • Gene x gene, gene x environment interactions
  • Heterogeneity (ethnic heterogeneity, phenotype
    heterogeneity)
  • Inheritance model
  • Subjective interpretation of data

44
Guidelines for proper discovery and replication
association study designs
Enough time for Episode 3 of the saga?
45
Analytical challenges in genetic association
studies
  • III. Guidelines for proper discovery and
    replication association study designs
  • Discovery
  • Study designs

46
Gene discovery study designs
[Figure: BMI distribution in the general population (N versus body mass index), with the lean and obese tails marked]
  • 1) Case-control studies from the extremes of the
    BMI tails
  • 2) Quantitative trait studies in the whole
    population
  • Correlation genotype / trait at a genetic locus
  • Best approach (GIANT / GIANT extreme): BMI study
    in the whole population plus analysis of the
    extremes of the BMI tails (genetic variance,
    effect size)

Berndt et al., Nat Genet 2013
47
Gene discovery study designs
  • 3) Family-based association studies: allele
    transmission from parents to affected offspring
    (imprinting, haplotypes...)
  • 4) Cohort studies: correlation of a genotype with
    an incident disease event (gold standard)
48
Gene discovery study designs
[Figure: BMI distribution in the general population (N versus body mass index), with lean, normal weight and obese groups marked]
  • 5) The case-control-case design: discovery of
    gene variants associated with leanness or with
    obesity (applications in drug design)
49
The gain-of-function V103I and I251L variants in
MC4R are associated with leanness
  • 16 cohorts: 5,964 controls and 6,370 obese
    patients; OR = 0.53, P = 4.26 x 10^-5
  • Meta-analysis in 39,879 subjects confirms an
    obesity-protective role of the V103I polymorphism
    (OR = 0.80, P = 0.002)
  • V103I and I251L are infrequent (0.41-2.24%) and
    induce a gain-of-function effect on the
    melanocortin 4 receptor (Xiang et al.,
    Biochemistry 2006)
Stutzmann et al., HMG 2007
50
Gene discovery study designs
  • 6) Clinical trials, interventional studies:
    correlation of a genotype with response to
    intervention or treatment (lifestyle
    intervention, drug, surgery, smoking cessation,
    antipsychotic drug administration...)
51
Analytical challenges in genetic association
studies
  • III. Guidelines for proper discovery and
    replication association study designs
  • Discovery
  • Study designs
  • Phenotype

52
How to choose a relevant obesity phenotype?
Heritability for BMI:
  • h² = 0.48 at age 4 years
  • h² = 0.78 at age 11 years
Heritability for type 2 diabetes:
  • h² = 0.69 (onset < 60 years)
  • h² = 0.31 (onset < 75 years)
Haworth et al., Obesity 2008; Almgren et al.,
Diabetologia 2011
53
How to choose a relevant obesity phenotype?
  • clinically and biologically relevant
  • easy and inexpensive to measure
  • relevant in diverse ethnicities
  • minimal measurement error
  • minimal misclassification and reporting biases
  • the value of BMI for estimating the degree of
    adiposity is questionable
  • body fat content and body adiposity index are
    more relevant

54
  • Genome-wide association study for fat mass in
    36,000 subjects, replication of the best hits in
    39,000 subjects
  • Three fat mass-associated loci: FTO, IRS1, SPRY2
  • Only one locus (FTO) out of three has been
    conclusively associated with body mass index (BMI)
    in the literature
Kilpelainen et al., Nat Genet 2011
55
Analytical challenges in genetic association
studies
  • III. Guidelines for proper discovery and
    replication association study designs
  • Discovery
  • Study designs
  • Phenotype
  • Gene identification strategies

56
Gene identification strategies
CANDIDATE GENE APPROACH
  • moderately successful
  • previously known mechanisms
  • strong selection criteria needed
  • biological relevance
AGNOSTIC APPROACH
  • highly successful
  • novel disease-causing mechanisms
  • significance thresholds
  • lack of biological relevance
HIGH-THROUGHPUT CANDIDATE GENE APPROACH
(pathway, expression, evolution)
57
Analytical challenges in genetic association
studies
  • III. Guidelines for proper discovery and
    replication association study designs
  • Discovery
  • Study designs
  • Phenotype
  • Gene identification strategies
  • 4) Genotyping methodology and quality control
    procedures

58
Genotyping methodology and quality control
  • exclusion of low-quality DNA (cases and controls)
  • highly reliable genotyping technology
  • genotyping call rate (> 95%)
  • Hardy-Weinberg equilibrium (P > 0.005)
  • double genotyping concordance rate (> 99%)
  • MAF comparison with public databases
  • confirmation by a second method
  • association of SNPs in linkage disequilibrium
  • accurate experiments / data management and
    reporting (bar coding, automated processes,
    internal controls, flow charts...)
  • sex inconsistencies, hidden relatedness, ethnic
    outliers...
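Two of the quality-control filters listed on this slide, the Hardy-Weinberg test and the call rate, can be sketched in a few lines. This is an illustrative sketch only: it uses the 1-degree-of-freedom chi-square goodness-of-fit test (one common choice; exact tests are often preferred for rare alleles), and the function names and the `passes_basic_qc` helper are hypothetical. The thresholds are the ones quoted on the slide.

```python
from math import erfc, sqrt


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of the 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square with 1 df: erfc(sqrt(x / 2)).
    return erfc(sqrt(chi2 / 2.0))


def call_rate(n_called, n_samples):
    """Fraction of samples with a genotype call for this SNP."""
    return n_called / n_samples


def passes_basic_qc(n_aa, n_ab, n_bb, n_samples,
                    min_call_rate=0.95, min_hwe_p=0.005):
    """Apply the call-rate and HWE thresholds quoted on the slide."""
    n_called = n_aa + n_ab + n_bb
    return (call_rate(n_called, n_samples) > min_call_rate
            and hwe_chi2_pvalue(n_aa, n_ab, n_bb) > min_hwe_p)


if __name__ == "__main__":
    print(passes_basic_qc(250, 500, 250, n_samples=1020))
    print(passes_basic_qc(500, 0, 500, n_samples=1020))
```

A SNP with no heterozygotes at all, as in the second call, is a typical signature of a genotyping problem rather than of biology.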
59
Analytical challenges in genetic association
studies
  • III. Guidelines for proper discovery and
    replication association study designs
  • Discovery
  • Study designs
  • Phenotype
  • Gene identification strategies
  • 4) Genotyping methodology and quality control
    procedures
  • 5) Statistical analysis

60
Statistical analysis
  • power calculation
  • limited number of hypotheses tested
  • multiple testing (FDR, Bonferroni)
  • adjustment for confounding factors
  • caution with subgroup analyses
  • best-fitting inheritance model
  • conditional analyses
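The slide names FDR alongside Bonferroni without specifying a procedure. As an illustration, here is a minimal sketch of the Benjamini-Hochberg step-up procedure, one widely used FDR method (the choice of this method and the function name are assumptions, not taken from the slides).

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Benjamini-Hochberg step-up: find the largest rank k such that
    p_(k) <= q * k / m, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            max_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
                0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p_values))
    # Bonferroni on the same data, for comparison.
    print([p <= 0.05 / len(p_values) for p in p_values])
```

On this made-up list of ten P-values, the FDR procedure rejects the two smallest while Bonferroni rejects only the first, which shows why FDR is the less conservative option.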
61
Analytical challenges in genetic association
studies
  • III. Guidelines for proper discovery and
    replication association study designs
  • Discovery
  • Study designs
  • Phenotype
  • Gene identification strategies
  • 4) Genotyping methodology and quality control
    procedures
  • 5) Statistical analysis
  • 6) Population stratification

62
Population stratification
  • correction for self-reported ethnicity
  • exclusion of ethnic outliers
  • genomic control (Ancestry Informative Markers)
  • family-based association tests
  • cases and controls matched for age, sex, geography
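Genomic control, listed above, can be sketched as follows. This is a minimal illustration under the usual assumptions: each marker contributes a 1-degree-of-freedom chi-square association statistic, the inflation factor lambda is the median observed statistic divided by the median of the chi-square distribution with 1 df (about 0.455), and statistics are deflated only when lambda exceeds 1. Function names are hypothetical.

```python
from statistics import median

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation_factor(chi2_stats):
    """Lambda: median observed chi-square over its expected median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control_adjust(chi2_stats):
    """Divide every statistic by lambda (never inflating when lambda < 1)."""
    lam = max(1.0, genomic_inflation_factor(chi2_stats))
    return [stat / lam for stat in chi2_stats]


if __name__ == "__main__":
    # Made-up statistics whose median is twice the expected value.
    stats = [0.1, 0.5, 0.9098728, 2.0, 4.0]
    print(genomic_inflation_factor(stats))
    print(genomic_control_adjust(stats))
```

A lambda well above 1 across many markers suggests population stratification or another systematic bias rather than many true associations.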
63
Analytical challenges in genetic association
studies
  • III. Guidelines for proper discovery and
    replication association study designs
  • Replication
  • Systematic replication and reporting of promising
    associations

64
Analytical challenges in genetic association
studies
  • III. Guidelines for proper discovery and
    replication association study designs
  • Replication
  • Systematic replication and reporting of promising
    associations
  • Statistical power (Winner's curse effect)

65
Analytical challenges in genetic association
studies
  • III. Guidelines for proper discovery and
    replication association study designs
  • Replication
  • Systematic replication and reporting of promising
    associations
  • Statistical power
  • Heterogeneity

66
How to lower heterogeneity in replication studies?
  • same ethnicity / country
  • same study design
  • same ascertainment criteria
  • same phenotype
  • same genetic markers
  • same age window, same sex ratio
  • same inheritance model
  • same statistical analysis
  • same covariate adjustments
67
Analytical challenges in genetic association
studies
  • III. Guidelines for proper discovery and
    replication association study designs
  • Replication
  • Systematic replication and reporting of promising
    associations
  • Statistical power
  • Heterogeneity
  • Meta-analyses

68
Analytical challenges in genetic association
studies
  • III. Guidelines for proper discovery and
    replication association study designs
  • Replication
  • Systematic replication and reporting of promising
    associations
  • Statistical power
  • Heterogeneity
  • Meta-analyses
  • Additional studies

69
Additional studies
  • worldwide contribution
  • extension to different study designs,
    ascertainment criteria
  • association with obesity endophenotypes
  • gene x environment interactions
  • fine-mapping, causative gene variants
  • functional experiments
  • biological insights
  • FTO in 2007: a gene of unknown function in an
    unknown pathway
  • 2014: > 740 articles published
70
  • 1997: first identification of a monogenic obesity
    gene (LEP)
  • 2007: first gene variant in FTO conclusively
    associated with obesity
  • 2012: 40 monogenic (syndromic / non-syndromic)
    obesity genes, > 100 common gene variants
    conclusively associated with polygenic obesity
71
ANY QUESTIONS?
The French fair-play!