Association mapping: finding genetic variants for common traits & diseases. Manuel Ferreira, Genetic Epidemiology, Queensland Institute of Medical Research.

1
Association mapping: finding genetic variants for common traits & diseases
Manuel Ferreira
Genetic Epidemiology
Queensland Institute of Medical Research, Brisbane
WEHI Postgraduate seminar, 31 May 2010
2
Why?
Understand disease aetiology
Predict disease risk / drug response
Personalized Medicine
Lancet 2010; 375: 1525-35
3
Rare, monogenic traits
Ng et al. Nature Genetics 2010 42 30-35.
4
Common, complex traits
5
GENETICS OF COMMON DISEASES
[Timeline, 1990 to 2015 (marks at 1990, 2000, 2005, 2008, 2009, 2010, 2015): phenotypic modelling, linkage analysis, association analysis]
6
Recent advances: assays/analysis of genetic variation
  • HapMap, 1000 Genomes
  • High-throughput genotyping & sequencing
  • Analytic methods: genome-wide association, imputation, stratification, CNVs, risk prediction
7
HapMap project
1. GOALS
The HapMap was designed to determine the frequencies and patterns of association among roughly 3 million common Single Nucleotide Polymorphisms (SNPs) in four populations, for use in genetic association studies [4].
[Figure: genotype matrix, individuals x SNPs]
[1] The International HapMap Consortium. Nature 2003; 426: 789.
[2] International HapMap Consortium. Nature 2005; 437: 1299.
[3] International HapMap Consortium. Nature 2007; 449: 851.
[4] Manolio et al. J Clin Invest 2008; 118: 1590.
8
HapMap project
2. STRATEGY
Genome-wide SNP discovery (dbSNP)
  • 2002: 1.7 million
  • 2005: 9.2 million
  • 2009: 14.7 million (6.5 million validated)
SNP selection
  • Phase 1: MAF > 0.05, validated, non-synonymous SNPs prioritised (1.27 million total)
Genotyping
  • 7 genotyping platforms used/developed by 12 centres
  • 30 trios, Yoruba in Ibadan, Nigeria (YRI)
  • 30 trios, European descent in Utah (CEU)
  • 45 unrelated Han Chinese from Beijing (CHB)
  • 45 unrelated Japanese from Tokyo (JPT)
Phases 2 and 3 expanded SNP (4 million) and population (11) coverage
http://www.hapmap.org/
9
HapMap project
3. OUTCOMES
  • Systematic catalogue of common human variation
  • Designing and refining high-throughput genotyping platforms
  • Population genetics (selection, sub-structure, recombination, mutation)
  • Linkage disequilibrium (LD), or correlation between SNPs (tagging, fine-mapping, imputation)
10
Correlation (LD) between SNPs
[Figure: SNPs across Gene A]
  • Measures: D' and r2
  • Haplotypes
  • SNP tags (Haploview, Tagger)
Genetic coverage
  • Proportion of known (HapMap) SNPs tagged (Haploview)
  • eg. SNP 1 tags 4/10 variants
Fine-mapping
  • Interesting SNPs to follow up
[Figure: example haplotypes T A, G C, T A]
Cross-study comparisons
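The LD measures and the "SNP 1 tags 4/10 variants" coverage idea on this slide can be illustrated with a small sketch. It assumes phased haplotype and allele frequencies are already known; the function names `ld_stats` and `tagged_fraction` and the r2 threshold of 0.8 are illustrative choices, not taken from the slides or from Haploview/Tagger.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r2) for two biallelic SNPs.

    p_ab : frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a  : frequency of allele A at SNP 1
    p_b  : frequency of allele B at SNP 2
    """
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both SNPs must be polymorphic")
    # D: departure of the haplotype frequency from independence
    d = p_ab - p_a * p_b
    # D' rescales |D| by the largest value it could take given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max
    # r2: squared correlation between the two SNPs
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


def tagged_fraction(r2_with_tag, threshold=0.8):
    """Fraction of variants whose r2 with the tag SNP reaches the threshold.

    The 0.8 default is an assumed cut-off for this sketch.
    """
    return sum(r2 >= threshold for r2 in r2_with_tag) / len(r2_with_tag)


d, d_prime, r2 = ld_stats(p_ab=0.2, p_a=0.5, p_b=0.2)
print(round(d, 3), round(d_prime, 3), round(r2, 3))
```

In this made-up example D' is 1 while r2 is only 0.25, which is why a SNP in complete D' with a neighbour can still be a poor tag for it.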
11
1000 Genomes project
http://www.1000genomes.org/
GOAL
The 1000 Genomes Project aims to achieve a nearly complete catalog of common human genetic variants (defined as frequency 1% or higher) by generating high-quality sequence data for >85% of the genome for three sets of 400-500 individuals (...)
2,500 samples at 4x by 2011
12
High-throughput genotyping & sequencing
Whole-genome genotyping (from 300 USD/sample)
  • Affymetrix 6.0 chip: >900,000 SNPs + CNV probes; 82% coverage of CEU HapMap; accuracy 99.90%
  • Illumina Human1M BeadChip: >1 million SNPs + CNV probes; 95% coverage of CEU HapMap; accuracy 99.94%
Whole-genome sequencing (from 10,000 USD/sample)
  • Complete Genomics: 40x coverage, 35 bp read length
  • Illumina HiSeq 2000: 30x coverage, 100 bp read length
13
Recent advances: assays/analysis of genetic variation
  • HapMap, 1000 Genomes
  • High-throughput genotyping & sequencing
  • Analytic methods: genome-wide association, stratification, imputation, CNV, risk prediction
  • Examples: recent GWAS
14
Analytic methods
1. GENOME-WIDE ASSOCIATION
[Figure: genotype matrix, SNPs x individuals, with individuals split into cases and controls]
15
Analytic methods
Study design        Unrelated individuals                 Families
Association tests   Between-individual effects            Between- & within-family effects
Software            Many (eg. PLINK)                      Merlin, etc
Pros                More power per $ spent; easier to     Assess inheritance (CNVs); robust to
                    collect and analyse                   population stratification
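For the unrelated case-control design, one basic between-individual test is an allelic chi-square on a 2x2 table of allele counts. The sketch below is only an illustration with made-up counts; the function name `allelic_test` is hypothetical and this is not a description of how PLINK or any other package implements its tests.

```python
import math


def allelic_test(case_a, case_b, ctrl_a, ctrl_b):
    """Allelic association test on a 2x2 table of allele counts.

    Returns (odds_ratio, chi2, p_value), where chi2 is the Pearson
    statistic with 1 degree of freedom and no continuity correction.
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    if 0 in (row_case, row_ctrl, col_a, col_b):
        raise ValueError("table has an empty row or column")
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        row_case * row_ctrl * col_a * col_b
    )
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    # Odds ratio for allele A (infinite if a denominator cell is empty)
    denom = case_b * ctrl_a
    odds_ratio = (case_a * ctrl_b) / denom if denom else math.inf
    return odds_ratio, chi2, p_value


# Hypothetical counts: allele A seen 30 times in cases, 20 times in controls
odds_ratio, chi2, p_value = allelic_test(30, 70, 20, 80)
print(round(odds_ratio, 2), round(chi2, 2), round(p_value, 3))
```

In a genome-wide scan the same test would be repeated for every SNP, so the resulting P values need a correspondingly strict significance threshold.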
16
Analytic methods
2. POPULATION STRATIFICATION
Genetic matching
Ind1   Ind2   % shared
A1     A2     100
A1     A3     50
A1     A4     25
A1     A5     10
A1     A6     8
A1     B1     5
[Figure: individuals grouped into clusters A and B]
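The slide does not say how the pairwise sharing values were computed. As a rough illustration only, the sketch below uses identity-by-state (IBS) similarity on genotypes coded as 0/1/2 copies of an allele, and then picks the closest match for one individual; `ibs_similarity`, `best_match` and the toy genotypes are all assumptions made for this example.

```python
def ibs_similarity(g1, g2):
    """Mean identity-by-state similarity (0 to 1) between two individuals.

    Genotypes are coded 0, 1 or 2 (copies of one allele); None marks a
    missing call and that SNP is skipped.
    """
    pairs = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    if not pairs:
        raise ValueError("no SNPs genotyped in both individuals")
    # Each SNP contributes 2, 1 or 0 shared alleles out of 2
    return 1 - sum(abs(a - b) for a, b in pairs) / (2 * len(pairs))


def best_match(target, candidates):
    """Return the name of the candidate most similar to the target."""
    return max(candidates, key=lambda name: ibs_similarity(target, candidates[name]))


# Toy genotypes for one index individual and three possible matches
a1 = [0, 1, 2, 1, 0, 2]
others = {
    "A2": [0, 1, 2, 1, 0, 2],
    "A3": [0, 1, 2, 0, 1, 2],
    "B1": [2, 1, 0, 1, 2, 0],
}
for name, genotypes in others.items():
    print(name, round(ibs_similarity(a1, genotypes), 2))
print(best_match(a1, others))
```

Matching cases to genetically similar controls in this way is one approach to limiting confounding by population stratification.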
17
Analytic methods
3. IMPUTATION OF UNMEASURED GENOTYPES
Genotyped dataset (SNPs x individuals)
Reference panel (eg. HapMap)
Genotyped + imputed dataset
Software: PLINK (Shaun Purcell, Doug Ruderfer), MACH, IMPUTE, BEAGLE
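The idea of filling in unmeasured genotypes from a reference panel can be shown with a deliberately simple toy: copy the missing alleles from the reference haplotype that best matches the observed ones. This is not how MACH, IMPUTE, BEAGLE or PLINK work internally (those use probabilistic models); `impute_haplotype` and the toy panel are assumptions for illustration.

```python
def impute_haplotype(target, reference_panel):
    """Fill missing alleles (None) in a haplotype from the closest reference.

    target          : list of alleles, None where the SNP was not genotyped
    reference_panel : list of complete haplotypes over the same SNPs
    """
    def n_matches(ref):
        # Count agreement only at SNPs observed in the target
        return sum(t == r for t, r in zip(target, ref) if t is not None)

    best = max(reference_panel, key=n_matches)
    # Keep observed alleles, copy the rest from the best-matching reference
    return [r if t is None else t for t, r in zip(target, best)]


# Toy reference panel over four SNPs
panel = [
    ["A", "C", "G", "T"],
    ["A", "T", "G", "A"],
    ["G", "C", "C", "T"],
]
# Target genotyped at SNPs 1, 3 and 4 only
print(impute_haplotype(["A", None, "G", "A"], panel))
```

Because imputation puts every study on the same set of reference SNPs, it is what allows datasets genotyped on different platforms to be combined.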
18
Combine data from studies genotyped using
different platforms
19
Example 1: Bipolar Disorder GWAS
325,690 SNPs
>1.7 million SNPs
Ferreira et al (2008) Nature Genetics 40: 1056
20
ANK3 (Ankyrin G)
  • Cases 7.0%, controls 5.3%; odds ratio 1.45
  • Not related to sex, psychosis or age-of-onset
Replicated recently
  • Smith et al (2009) Mol Psychiatry 14: 755-63.
  • Scott et al (2009) Proc Natl Acad Sci USA 106: 7501-6.
  • Lee et al (2010) Mol Psychiatry, Apr 13 (Han Chinese population)
21
Example 2: analysis of lymphocyte subsets
2,538 individuals; CD4+ T cell levels, CD8+ T cell levels, CD4:CD8 ratio
MHC class I
  • rs2524054, C
  • Increased CD8+ T levels
  • Improved host control of HIV (OR = 0.32, P = 10^-9)

MHC class II
  • rs9270986, A
  • Increased CD4+ T levels
  • Protective effect for type-1 diabetes (OR = 0.04, P = 10^-125)
  • Protective effect for rheumatoid arthritis (OR = 0.60, P = 10^-15)

Ferreira et al. (2010) Am J Hum Genet 86: 88-92
22
Analytic methods
4. STRUCTURAL VARIANTS
Genomic alterations involving a segment of DNA >1 kb
  • Quantitative (copy number variants): deletions, duplications, insertions
  • Positional: translocations
  • Orientational: inversions
23
Detection of CNVs
Non-polymorphic probes
McCarroll et al 2008 Nat Genet 40 1166
24
Detection of CNVs
Use polymorphic probes from genotyping arrays to identify and genotype new, potentially rarer CNVs
Example: rs1006737 A/G
  • probe 1: ... AGCCCGAAATGTTTTCAGA...
  • probe 2: ... AGCCCGAAGTGTTTTCAGA...
[Figure: intensity of probe 1 vs intensity of probe 2, with genotype clusters AA, AG and GG]
25
Detection of CNVs
Polymorphic probe (A/G): copy number by genotype pattern

Ind   Pattern (Mat/Pat)   Copies of A   Copies of G   Total
1     A/G                 1             1             2
2     A/-                 1             0             1
3     AA/-                2             0             2
4     -/G                 0             1             1
5     -/-                 0             0             0
6     AAA/G               3             1             4
26
Detection of CNVs
Polymorphic probe (A/G) in CNV region
[Figure: normalized intensity of allele A vs normalized intensity of allele G, with genotype clusters A/A, A/G and G/G]
  • Individuals with duplication(s), ie. total CN > 2
  • Individuals with deletion(s), ie. total CN < 2
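The allele-specific copy numbers tabulated for the maternal/paternal patterns, and the total CN > 2 or < 2 rule, can be reproduced with a few lines. This is a bookkeeping sketch on known patterns, not a CNV caller working from probe intensities; the function names are hypothetical.

```python
def copy_numbers(pattern):
    """Count copies of A and G in a 'maternal/paternal' pattern such as 'AAA/G'.

    A '-' stands for a deleted copy. Returns (copies_a, copies_g, total).
    """
    copies_a = pattern.count("A")
    copies_g = pattern.count("G")
    return copies_a, copies_g, copies_a + copies_g


def classify_total(total, expected=2):
    """Label an individual from total copy number alone (diploid expectation of 2)."""
    if total > expected:
        return "duplication"
    if total < expected:
        return "deletion"
    return "total CN = 2"


for pattern in ["A/G", "A/-", "AA/-", "-/G", "-/-", "AAA/G"]:
    a, g, total = copy_numbers(pattern)
    print(pattern, a, g, total, classify_total(total))
```

Note that the pattern AA/- carries both a duplication and a deletion yet has a total copy number of 2, so total CN alone would not flag it; only the allele-specific counts (2 copies of A, 0 of G) distinguish it from an ordinary A/A genotype.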
27
Detection of CNVs
  • Birdseye (Affy 5.0, 6.0): Korn et al 2008 Nat Genet 40: 1253
  • PennCNV (Affymetrix and Illumina): Wang et al 2007 Genome Res 17: 1665
Combine information across probes to identify new CNVs
For example...
                          Cases      Controls
100kb deletion, chr. 2    10/5,000   1/5,000
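A rare CNV seen in 10 of 5,000 cases and 1 of 5,000 controls gives counts too small for a chi-square approximation, so an exact test is a natural choice. The sketch below computes a one-sided Fisher's exact P value from the hypergeometric distribution; the slide does not state which test was intended, and `fisher_one_sided` is a hypothetical helper.

```python
from math import comb


def fisher_one_sided(case_carriers, n_cases, ctrl_carriers, n_ctrls):
    """One-sided Fisher's exact test for an excess of carriers among cases.

    Returns the probability, with all margins fixed, of seeing at least
    `case_carriers` carriers among the cases.
    """
    carriers = case_carriers + ctrl_carriers
    n_total = n_cases + n_ctrls
    denominator = comb(n_total, carriers)
    upper = min(carriers, n_cases)
    # Sum hypergeometric probabilities for k = observed .. maximum possible
    numerator = sum(
        comb(n_cases, k) * comb(n_ctrls, carriers - k)
        for k in range(case_carriers, upper + 1)
    )
    return numerator / denominator


# Counts from the slide's example: 10/5,000 cases vs 1/5,000 controls
p_value = fisher_one_sided(10, 5000, 1, 5000)
print(f"{p_value:.4f}")
```

With equal numbers of cases and controls, this is close to asking how often at least 10 of 11 carriers would fall among the cases by chance.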
28
Example 3: Autism whole-genome CNV analysis

Sample                    Platform     16p11         Cases     Controls   P
Discovery                 Affy 500K    Del (600kb)   5/1,441   3/4,234    1.1 x 10^-4
                                       Dup           7/1,441   2/4,234    1.1 x 10^-4
Replication 1 (CHB)       array-CGH    Del           5/512     0/434      0.007
                                       Dup           4/512     0/434      0.007
Replication 2 (deCODE)    Illumina     Del           3/299     2/18,834   4.2 x 10^-4
                                       Dup           0/299     5/18,834   4.2 x 10^-4

COPPER, Birdseye, CNAT

Deletion frequency (Iceland): autism 1%, psychiatric disorder 0.1%, general population 0.01%

            del   dup
inherited   2     6
de novo     10    1
unknown     1     4

Weiss et al. N Engl J Med 2008; 358: 667
29
Example 4: SCZ whole-genome CNV analysis
  • Specific loci
  • Genome-wide burden
[Figure: CNVs along a chromosome, cases vs controls]
Shaun Purcell
30
Genome-wide burden of rare CNVs in SCZ
  • 3,391 patients with SCZ, 3,181 controls
  • Filter for <1% MAF, >100kb: 6,753 CNVs
  • Cases have a greater rate of CNVs than controls: 1.15-fold increase, P = 3 x 10^-5
  • Results invariant to obvious statistical controls: array type, genotyping plate, sample collection site, mean probe intensity
Shaun Purcell
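A genome-wide burden comparison of this kind can be sketched as a permutation test on per-individual CNV counts: compute the case/control rate ratio, then shuffle case-control labels to see how often chance alone gives as large a difference. This is a generic illustration with made-up counts, not the analysis behind the 1.15-fold result; `burden_test` and its parameters are assumptions.

```python
import random


def burden_test(case_counts, ctrl_counts, n_perm=999, seed=0):
    """Permutation test for a higher mean CNV count per individual in cases.

    Returns (rate_ratio, p_value). The one-sided p-value is the share of
    label permutations whose case-minus-control difference in mean count
    is at least as large as the observed one.
    """
    def mean(values):
        return sum(values) / len(values)

    n_cases = len(case_counts)
    observed_diff = mean(case_counts) - mean(ctrl_counts)
    rate_ratio = mean(case_counts) / mean(ctrl_counts)

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    pooled = list(case_counts) + list(ctrl_counts)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_cases]) - mean(pooled[n_cases:])
        if diff >= observed_diff:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero
    return rate_ratio, (hits + 1) / (n_perm + 1)


# Made-up counts: every case carries 2 rare CNVs, every control carries 1
ratio, p_value = burden_test([2] * 50, [1] * 50)
print(ratio, p_value)
```

Checking that a result is invariant to array type, plate or collection site would amount to repeating such a test within strata, or permuting labels only within them.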
31
Similar successes for other common diseases
32
Crohn's disease (31 loci, 10% of variance)
[Figure: N confirmed loci (axis 0 to 30), before Jan 2006 and from Jan 2006 to Jan 2008]
http://www.genome.gov/gwastudies
Altshuler, Daly & Lander. Science 2008; 322: 881
Manolio, Brooks & Collins. J Clin Invest 2008; 118: 1590
33
Summary
  • Tremendous recent technological advances
  • Large-scale genetic association studies feasible
  • >150 disease loci unequivocally identified since 2006
  • Provide a solid base to build our knowledge about disease mechanisms
  • Hundreds of loci yet to be identified for most diseases