Title: Complex traits and the HapMap Project
2Identification of genes
- Identify the phenotype
- Map
- Clone
3Trait / Phenotype
- Mendelian (single gene traits)
- establish diagnostic criteria
- determine mode of inheritance, gene frequency and penetrance
- linkage mapping
4Linkage
[Figure: pedigree spanning three generations (I, II, III)]
5Mapping and sequencing
[Figure: map scales, from markers (10,000 kb) to DNA clones (100 kb)]
6Trait / Phenotype
- Mendelian (single gene traits)
- establish diagnostic criteria
- determine mode of inheritance, gene frequency and penetrance
- linkage mapping
- Non-Mendelian (complex traits)
- difficult to lay out diagnostic criteria (e.g. psychiatric disorders)
- difficult to determine mode of inheritance, gene penetrance and frequency
- So, what do we do?
7Association Studies
- Population based studies that test if markers are
enriched in one population compared to a second
population
8SNPs (snips)
- A SNP is a site in the DNA where different
chromosomes differ in the base they have.
9SNPs
Paternal allele: CCCGCCTTCTTGGCTTTACA
Maternal allele: CCCGCCTTCTCGGCTTTACA

Paternal allele: CCCGCCTTCTTGGCTTTACA
Maternal allele: CCCGCCTTCTTGGCTTTACA
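The two sequence pairs on this slide can be compared base by base to locate the SNP. The sketch below is only an illustration: the function name `find_snps` is an assumption, and it presumes the two sequences are already aligned and of equal length.

```python
def find_snps(seq_a, seq_b):
    """Return (1-based position, base in seq_a, base in seq_b) for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return [(i + 1, a, b)
            for i, (a, b) in enumerate(zip(seq_a, seq_b))
            if a != b]


# Sequences copied from the slide.
paternal = "CCCGCCTTCTTGGCTTTACA"
maternal_1 = "CCCGCCTTCTCGGCTTTACA"   # first pair: differs from the paternal allele
maternal_2 = "CCCGCCTTCTTGGCTTTACA"   # second pair: identical to the paternal allele

print(find_snps(paternal, maternal_1))  # [(11, 'T', 'C')]
print(find_snps(paternal, maternal_2))  # []
```

In the first pair the chromosomes differ at position 11 (T versus C), so that individual is heterozygous at the site; in the second pair no difference is found.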
10Case-Control Association Studies
- Genetic factors are compared within large case
(affected) and control populations to identify
correlations with a defined phenotype
11Advantages Over Linkage Analysis
- No need for multi-generation family pedigrees; individuals are unrelated
- More powerful at detecting small and moderate genetic effects, i.e. the phenotype can be mild or demonstrate incomplete penetrance
12Case-Control Association Studies
- Family studies and population studies
[Figure: genotypes (C/C, C/T) of family members and of unrelated individuals. Allele frequencies: cases 40% T, 60% C; controls 15% T, 85% C.]
- Polygenic, complex inheritance: the majority of common disease
- Single-gene disorders, Mendelian inheritance: <10% of common disease
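A case-control comparison of allele frequencies is commonly tested with a chi-square test on a 2x2 table of allele counts. The sketch below reads the slide's figures as allele percentages (40% T in cases, 15% T in controls) and assumes, purely for illustration, 100 chromosomes per group; the sample sizes and the function name `allele_chi_square` are not from the slides.

```python
import math


def allele_chi_square(case_t, case_c, ctrl_t, ctrl_c):
    """Pearson chi-square (1 df, no continuity correction) on allele counts.

    Returns (chi-square statistic, p-value, odds ratio for the T allele).
    """
    a, b, c, d = case_t, case_c, ctrl_t, ctrl_c
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c)
    return chi2, p_value, odds_ratio


# Hypothetical counts: 100 chromosomes in each group (an assumption).
chi2, p_value, odds_ratio = allele_chi_square(case_t=40, case_c=60,
                                              ctrl_t=15, ctrl_c=85)
print(f"chi2 = {chi2:.2f}, p = {p_value:.1e}, OR = {odds_ratio:.2f}")
```

With these assumed counts the T allele is enriched in cases (odds ratio about 3.8). With real data the result depends on the actual number of chromosomes sampled.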
13Case-Control Association Studies
- Single nucleotide polymorphisms (SNPs) are the markers of choice
- Approx. 1 SNP every kb
- Low mutation rate
- Easily genotyped (i.e. high-throughput)
14Identification of genes underlying human
Mendelian traits and genetically complex traits
in humans and other species
Science, Volume 298, Number 5602, Issue of 20 Dec
2002, pp. 2345-2349
15The HapMap project
- $100 million, three years, to catalogue common haplotype blocks.
- Francis Collins: "The HapMap will provide the missing link between the DNA sequence of the genome and the way in which the genome influences the risk of disease."
16How many SNPs?
- >9.8 million SNPs have been identified (17 August 2004).
- 93% of genes contain a SNP.
- 98% of genes are within 5000 bp of a SNP.
- 10 million common SNPs in the human genome
17- The HapMap project allows a more systematic approach to searching for genetic factors associated with common diseases.
- It allows rapid scanning of the genome for disease genes by using a number of SNPs and characterising patterns of linkage disequilibrium (LD).
18Linkage Disequilibrium (LD)
- Non-random association between alleles at
different loci
[Figure: maternal and paternal chromosomes showing the alleles at locus 1 and locus 2]
Locus 1 alleles: C (50%), T (50%)
Locus 2 alleles: A (50%), G (50%)

Haplotype   Hardy-Weinberg       LD
C A         0.5 x 0.5 = 0.25     0.40
C G         0.5 x 0.5 = 0.25     0.10
T A         0.5 x 0.5 = 0.25     0.10
T G         0.5 x 0.5 = 0.25     0.40
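The gap between the expected and observed haplotype frequencies on this slide can be summarised with the standard LD measures D, D' and r^2. The sketch below uses the slide's values (C-A haplotype observed at 0.40, both allele frequencies 0.5); the function name `ld_stats` is an assumption made for illustration.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: observed frequency of the haplotype carrying allele a and allele b
    p_a, p_b: frequencies of allele a (locus 1) and allele b (locus 2)
    """
    d = p_ab - p_a * p_b                      # observed minus expected
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Values from the slide: haplotype C-A observed at 0.40, expected 0.25.
d, d_prime, r2 = ld_stats(p_ab=0.40, p_a=0.5, p_b=0.5)
print(round(d, 2), round(d_prime, 2), round(r2, 2))  # 0.15 0.6 0.36
```

A value of D = 0 would mean the alleles at the two loci associate at random; here D = 0.15, so the C-A and T-G haplotypes are over-represented.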
19Haplotype blocks
- Genome is divided into blocks?!
- High LD
- Low diversity
- separated by recombination hotspots
20Haplotype blocks
21Gabriel et al., The structure of haplotype blocks in the human genome. Science 296 (2002), pp. 2225-2229.
- 51 autosomal regions
- average size 250,000 bp
- 13 megabases of the human genome (0.4% of the human genome)
- Samples: Africa, Europe, and Asia
- haplotype blocks can be reliably identified by genotyping a sample of common markers within their span
- Half of the human genome exists in blocks of 22 kb or larger in African and African-American samples, and 44 kb or larger in European and Asian samples
- Within each block, only three to five haplotypes capture 90% of all chromosomes in each population.
- Both the boundaries of blocks and the specific haplotypes observed are shared to a remarkable extent across populations.
- 300,000 to 1,000,000 SNPs are adequate in an association study to survey for common haplotypes
- but see Goldstein et al.: 150,000 - 300,000
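The statement that a few haplotypes capture 90% of chromosomes within a block can be checked on any sample by counting haplotypes from most to least common. The sketch below uses invented haplotype strings and counts (hypothetical data, not from Gabriel et al.), and the function name `haplotypes_to_cover` is an assumption.

```python
from collections import Counter


def haplotypes_to_cover(chromosomes, target=0.90):
    """Return the most common haplotypes needed to reach `target` coverage,
    together with the fraction of chromosomes they actually cover."""
    if not chromosomes:
        raise ValueError("need at least one chromosome")
    counts = Counter(chromosomes)
    total = len(chromosomes)
    chosen, covered = [], 0
    for haplotype, n in counts.most_common():
        chosen.append(haplotype)
        covered += n
        if covered / total >= target:
            break
    return chosen, covered / total


# Hypothetical block typed on 100 chromosomes (invented for illustration).
sample = (["ACGT"] * 45 + ["GTAA"] * 30 + ["ACAA"] * 17 +
          ["GCGT"] * 5 + ["ATGA"] * 3)
chosen, fraction = haplotypes_to_cover(sample)
print(chosen, fraction)  # ['ACGT', 'GTAA', 'ACAA'] 0.92
```

In this made-up sample three haplotypes account for 92% of chromosomes, which is the kind of low diversity that the block description refers to.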
22Issues
- Block definition
- hotspots defined directly for only one region (HLA)
- over-representation of common alleles
- could miss rare variants involved in disease,
- but what if the Common disease - Common variant hypothesis is true?
- What is the optimal map density (higher SNP density -> shorter blocks)?
23SNP density and haplotype blocks
24Do we need to genotype all markers?
- redundant SNPs
- haplotype tagging SNPs
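One simple way to see why some SNPs are redundant: within a block, SNPs that split the observed haplotypes in exactly the same way carry the same information, so one of them can serve as a tag for the rest. The sketch below groups SNPs that are in perfect LD and keeps the first of each group. It is a simplified illustration on invented haplotypes, not a published tagging algorithm, and the names `pick_tag_snps`, `snp1` to `snp4` are assumptions.

```python
def pick_tag_snps(haplotypes, snp_names):
    """Group SNPs whose alleles partition the haplotypes identically
    and return {tag SNP: [all SNPs it represents]}."""
    groups = {}
    for j, name in enumerate(snp_names):
        column = [h[j] for h in haplotypes]
        # Relabel alleles by order of first appearance so that, e.g.,
        # A/A/G/G and C/C/T/T are recognised as the same pattern.
        labels = {}
        pattern = tuple(labels.setdefault(allele, len(labels))
                        for allele in column)
        groups.setdefault(pattern, []).append(name)
    return {members[0]: members for members in groups.values()}


# Hypothetical haplotypes over four SNPs (invented for illustration).
haplotypes = ["ACGT", "ACGA", "GTAT", "GTAA"]
snp_names = ["snp1", "snp2", "snp3", "snp4"]
tags = pick_tag_snps(haplotypes, snp_names)
print(tags)  # {'snp1': ['snp1', 'snp2', 'snp3'], 'snp4': ['snp4']}
```

Here genotyping snp1 and snp4 is enough to distinguish all four haplotypes; snp2 and snp3 are redundant with snp1. Real data rarely show perfect LD, so practical methods use an LD threshold instead of exact identity.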
28Genome-wide or candidate gene scans?
- Genome-wide
- costly
- 3 orders of magnitude more SNPs needed
- chance of finding a causal SNP is 10-fold lower
- So why not start with candidate genes and regions?
29Cloning DPP10 gene involved in asthma
- Linkage
- interval between 2q14 and 2q32
- mouse bronchial hyper-responsiveness linked to the region homologous to 2q14
- M. Allen et al. (2003), Positional cloning of a novel gene influencing asthma from chromosome 2q14, Nature Genet. 35:258-263.
30- Samples
- 244 families including 239 children with asthma
- used the total serum IgE concentration (LnIgE) as
a quantitative measure of atopy.
31- Found association between asthma and the microsatellite D2S308 (near IL1 locus).
- Allele 3 of D2S308 (D2S308*3) positively associated with asthma.
32TDT: transmission disequilibrium test
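In its basic form the TDT looks only at parents who are heterozygous for the allele of interest and asks whether they pass that allele to affected children more often than the 50% expected by chance. The statistic is (b - c)^2 / (b + c), compared with a chi-square distribution with 1 degree of freedom, where b is the number of transmissions and c the number of non-transmissions. The counts below are hypothetical and are not taken from the asthma study; the function name `tdt` is an assumption.

```python
import math


def tdt(transmitted, untransmitted):
    """Basic TDT for one allele.

    transmitted / untransmitted: how often heterozygous parents did / did not
    pass the allele to an affected child.
    Returns (chi-square statistic with 1 df, p-value).
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival, 1 df
    return chi2, p_value


# Hypothetical counts (invented for illustration).
chi2, p_value = tdt(transmitted=60, untransmitted=40)
print(f"chi2 = {chi2:.1f}, p = {p_value:.3f}")
```

Because transmitted and untransmitted alleles come from the same parents, the test is not confounded by population stratification in the way a case-control comparison can be.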
33- Constructed BAC/PAC contig surrounding D2S308 marker.
- Used SNPs to measure LD
- LD: the tendency of a set of alleles at one locus to segregate with a particular group of alleles on a second, closely linked locus
34- Found strongest association with WTC122P SNP, 1 kb proximal to D2S308
- genotyped WTC91P, WTC122P and D2S308 in 1,047 schoolchildren aged 9-11 years
35- measured the frequency of the WTC122P*1 D2S308*3 haplotype in
- 178 severe steroid-dependent asthmatics
- 92 mild asthmatics
- 304 unrelated controls
- WTC122P*1 D2S308*3 haplotype was significantly more frequent among the severe asthmatics but not among the mild asthmatics
36- Identified a novel gene called DPP10
- no coding polymorphisms in DPP10
- strongest association with WTC122P
- WTC122P alleles showed differential protein
binding with nuclear extracts from the T-cell
line C8166
37Exon structure of DPP10
- Effects on asthma susceptibility may result in
part from the presence or absence of the CdxA
promoter element before exon Ib, leading to
alternative splicing between membrane-bound and
other forms of the protein.
38- A haplotype-based molecular analysis of CFTR mutations associated with respiratory and pancreatic diseases
- JH Lee et al. (2003), Human Molecular Genetics, Vol. 12, pp. 2321-2332.
39- Mutations in cystic fibrosis transmembrane conductance regulator (CFTR)
- cystic fibrosis (F508del, G542X, N1303K)
- severe and high penetrance
- pancreatic and other respiratory disorders
- congenital bilateral aplasia of the vas deferens (CBAVD), which leads to male infertility, may occur in isolation or as a manifestation of cystic fibrosis
40- Samples
- South Asian
- healthy controls
- bronchiectasis patients
- chronic pancreatitis patients
41Identified mutations
43- 97% of controls: V470-2694T or M470-2694G
44- Of the two IVS8 T5-containing haplotypes, only haplotype 5, which also has the low-activity variation of V470, showed an association with the disease phenotype
45- Haplotype 12
- frequency in normal Belgians 7.3%
- gives rise to >95% of the three most common CF mutations, F508del, G542X, N1303K
- but in Koreans frequency is 0.9% (CF is rare in South Asia)!