Title: Genotyping and Genetic Maps
1. Genotyping and Genetic Maps
Bas Heijmans
Leiden University Medical Centre, The Netherlands
2. Pedigree file in linkage format
1 1 0 0 1 1 1 2
1 2 0 0 2 1 3 4
1 3 1 2 2 1 2 3
1 4 1 2 1 1 2 4
2 1 0 0 1 1 0 0
2 2 0 0 2 1 0 0
2 3 1 2 1 1 5 8
2 4 1 2 2 1 6 7
2 5 1 2 1 1 5 7
3. Pedigree file in linkage format
Columns, left to right: family id, person id, father, mother, sex, disease status, marker data (1 marker, two alleles)
1 1 0 0 1 1 1 2
1 2 0 0 2 1 3 4
1 3 1 2 2 1 2 3
1 4 1 2 1 1 2 4
2 1 0 0 1 1 0 0
2 2 0 0 2 1 0 0
2 3 1 2 1 1 5 8
2 4 1 2 2 1 6 7
2 5 1 2 1 1 5 7
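As a minimal sketch of how such a file can be read, the snippet below parses one line per person with the fields family id, person id, father, mother, sex, disease status and two alleles per marker. The names `Person`, `parse_pedigree` and `PEDIGREE_TEXT` are illustrative assumptions and are not part of any linkage package.

```python
from typing import List, NamedTuple, Tuple


class Person(NamedTuple):
    family: str
    person: str
    father: str   # "0" means unknown (founder)
    mother: str   # "0" means unknown (founder)
    sex: int      # 1 = male, 2 = female
    status: int   # disease status code
    genotypes: List[Tuple[int, int]]  # one (allele1, allele2) pair per marker


# The example pedigree from the slide, one person per line.
PEDIGREE_TEXT = """\
1 1 0 0 1 1 1 2
1 2 0 0 2 1 3 4
1 3 1 2 2 1 2 3
1 4 1 2 1 1 2 4
2 1 0 0 1 1 0 0
2 2 0 0 2 1 0 0
2 3 1 2 1 1 5 8
2 4 1 2 2 1 6 7
2 5 1 2 1 1 5 7
"""


def parse_pedigree(text: str) -> List[Person]:
    """Parse whitespace-separated pedigree lines into Person records."""
    people = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        alleles = [int(a) for a in fields[6:]]
        if len(alleles) % 2:
            raise ValueError("odd number of alleles in line: " + line)
        genotypes = list(zip(alleles[0::2], alleles[1::2]))
        people.append(Person(fields[0], fields[1], fields[2], fields[3],
                             int(fields[4]), int(fields[5]), genotypes))
    return people


people = parse_pedigree(PEDIGREE_TEXT)
founders = [p for p in people if p.father == "0" and p.mother == "0"]
```

In this example the two founders of family 2 have genotype 0 0, which is the usual code for a missing genotype.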
4. Marker choice for genome-wide linkage scans
- Short tandem repeats (STRs, a.k.a. microsatellites), because:
  - High heterozygosity (1 STR ≈ 5 SNPs)
  - There are more than enough (1 per 30 kb, thus >>1 per cM)
  - Reliable genetic maps (Marshfield, Decode)
  - Optimized marker sets, spacing down to 5 cM (Marshfield/Applied Biosystems)
  - Reasonably automated measurement (2 persons → 40,000 checked genotypes in database per week)
  - Low cost per genotype (<0.15 for consumables)
  - Reasonable success and error rates (>92% and <0.8%)
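To illustrate why a multi-allelic STR is more informative than a biallelic SNP, the sketch below computes the expected heterozygosity H = 1 - sum(p_i^2) under Hardy-Weinberg proportions. The allele frequencies are hypothetical and chosen only for illustration; the sketch does not derive the "1 STR ≈ 5 SNPs" rule of thumb.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity H = 1 - sum(p_i^2) for allele frequencies p_i."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


# Hypothetical frequencies: a SNP with two equally common alleles
# (the best case for a SNP) and an STR with eight equally common alleles.
snp_h = expected_heterozygosity([0.5, 0.5])
str_h = expected_heterozygosity([0.125] * 8)

print("SNP heterozygosity:", snp_h)
print("STR heterozygosity:", str_h)
```

A biallelic marker can never exceed H = 0.5, whereas a marker with many alleles can approach 1.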
6. Short tandem repeats
Tetranucleotide repeat
- Paternal allele: 4 repeats
- Maternal allele: 2 repeats
Dinucleotide repeat
- Paternal allele: 8 repeats
  CACACACACACACACA
  GTGTGTGTGTGTGTGT
- Maternal allele: 3 repeats
  CACACA
  GTGTGT
There are also tri- and pentanucleotide repeats.
7. Principle of genotyping methods
- Short tandem repeats → length differences
  CACACACACACACACA / GTGTGTGTGTGTGTGT
  CACACA / GTGTGT
- SNPs → only a sequence difference (e.g. a G-C versus an A-T base pair)
  - Destruction of a restriction site (RFLP)
  - Hybridization differences (TaqMan)
  - One-base-pair sequencing reaction: primer extension (Sequenom, Orchid)
  - Ligation assay (Illumina)
- VNTRs, insertion/deletion polymorphisms (1 bp to 300 bp for an Alu repeat)
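8. Genotyping STRs, step 1: PCR
9. Genotyping STRs, step 1: PCR
(Figure: PCR products of two alleles, with segment lengths in bp)
- CACA / GTGT allele (4 bp repeat tract): 20 + 35 + 25 + 4 + 20 = 104 bp
- CACACACA / GTGTGTGT allele (8 bp repeat tract): 20 + 35 + 25 + 8 + 20 = 108 bp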
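The arithmetic in the figure can be sketched as follows: the PCR product length is the sum of the fixed segments outside the repeat plus the length of the repeat tract. The function names are illustrative, and treating the 20, 35, 25 and 20 bp segments as fixed is an assumption read off the figure.

```python
# Fixed segment lengths (bp) outside the repeat tract, as shown in the figure.
FIXED_SEGMENTS = [20, 35, 25, 20]


def pcr_product_length(repeat_unit, n_repeats, fixed_segments=FIXED_SEGMENTS):
    """Length of the PCR product for an allele with n_repeats copies of repeat_unit."""
    return sum(fixed_segments) + len(repeat_unit) * n_repeats


def repeats_from_length(length, repeat_unit, fixed_segments=FIXED_SEGMENTS):
    """Invert the calculation: number of repeats implied by a product length."""
    tract = length - sum(fixed_segments)
    if tract < 0 or tract % len(repeat_unit):
        raise ValueError("length is not consistent with a whole number of repeats")
    return tract // len(repeat_unit)


short_allele = pcr_product_length("CA", 2)  # CACA
long_allele = pcr_product_length("CA", 4)   # CACACACA
print(short_allele, long_allele)
```

Because the fixed segments are identical for both alleles, the 4 bp difference between the products reflects only the two extra CA repeats.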
10. Genotyping STRs, step 1: PCR in practice
- genomic DNA
- primers
- Taq DNA polymerase
- dNTPs (ACGT)
- buffer
11. Genotyping STRs, step 2: electrophoresis
Detect length differences
- Agarose or polyacrylamide slab gel
- DNA is negatively charged
- Longer fragments migrate more slowly than shorter ones through the polymer network
(Figure: gel between two electrodes)
12. To scan the whole human genome
- 1 short tandem repeat every 10 cM
- makes 400 markers per individual
- Assuming 1000 individuals (preferably 1000s)
- One whole-genome scan = 400,000 genotypings
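The scale of a scan follows directly from these numbers. The short sketch below redoes the arithmetic; the map length of about 4000 cM is only what the slide's own figures (400 markers at 10 cM spacing) imply, not an independently sourced value.

```python
MARKER_SPACING_CM = 10          # one STR every 10 cM (from the slide)
MARKERS_PER_INDIVIDUAL = 400    # from the slide
INDIVIDUALS = 1000              # from the slide

# Map length implied by the slide's numbers (an inference, not a measured value).
implied_map_length_cm = MARKER_SPACING_CM * MARKERS_PER_INDIVIDUAL

total_genotypings = MARKERS_PER_INDIVIDUAL * INDIVIDUALS

print("Implied map length (cM):", implied_map_length_cm)
print("Genotypings per whole-genome scan:", total_genotypings)
```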
13. Not like this...
14. Not like this... but like this:
- 96-well plates
- 384-well plates
15-18. Not like this... but like this (image slides)
19. Electrophoresis using an automated sequencer
- 96 capillaries (no lanes) (ABI3700)
- Load the plate into the machine and the run proceeds automatically
- Primers are labelled with fluorescent dye
- The machine detects the labelled PCR products with a laser
20. Throughput
- A 384-well plate takes about one night
- 384 samples minus 16 controls = 368
- 15 markers per sample
- makes 5520 genotypes (if the success rate is 100%)
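A rough sketch of what this throughput means for a whole-genome scan of 400,000 genotypings (the figure from the earlier slide). The 92% success rate is the lower bound quoted earlier in the deck; equating one plate with one overnight run, and ignoring repeats of failed genotypes, are simplifying assumptions.

```python
import math

WELLS_PER_PLATE = 384
CONTROLS_PER_PLATE = 16
MARKERS_PER_SAMPLE = 15
SCAN_GENOTYPINGS = 400_000  # whole-genome scan size from the earlier slide


def genotypes_per_plate(success_rate=1.0):
    """Expected number of successful genotypes from one 384-well plate."""
    samples = WELLS_PER_PLATE - CONTROLS_PER_PLATE
    return samples * MARKERS_PER_SAMPLE * success_rate


def plates_needed(success_rate=1.0):
    """Number of plate runs (nights) needed for one whole-genome scan."""
    return math.ceil(SCAN_GENOTYPINGS / genotypes_per_plate(success_rate))


ideal_per_plate = genotypes_per_plate()
ideal_plates = plates_needed()
realistic_plates = plates_needed(0.92)

print(ideal_per_plate, ideal_plates, realistic_plates)
```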
21. Tetranucleotide repeat marker (e.g. multiples of AACT)
22.
- The detected length of a PCR product depends on the machine
- Standards are used to correct for this (CEPH DNA samples)
- Take this into account when analysing data from different machines/labs
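One simple way such a correction could work is sketched below: a control sample with known reference allele sizes is run on each machine, the mean offset is estimated, and that offset is subtracted from all observed lengths. This is an assumption for illustration only; the slides do not state how the correction is done, and all numbers and function names are hypothetical.

```python
def machine_offset(observed_control, reference_control):
    """Mean difference between observed and reference sizes of a control sample."""
    diffs = [obs - ref for obs, ref in zip(observed_control, reference_control)]
    return sum(diffs) / len(diffs)


def correct_lengths(observed, offset):
    """Subtract the machine offset and round to whole base pairs."""
    return [round(x - offset) for x in observed]


# Hypothetical control sample: reference allele sizes and sizes seen on one machine.
REFERENCE_CONTROL = [104, 108]
OBSERVED_CONTROL = [104.9, 108.7]

offset = machine_offset(OBSERVED_CONTROL, REFERENCE_CONTROL)

# Hypothetical study sample measured on the same machine.
corrected = correct_lengths([102.8, 110.9], offset)
print(round(offset, 2), corrected)
```

A constant offset is the simplest possible model; if the shift depends on fragment length, a per-size correction would be needed instead.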
23. Dinucleotide repeat marker (e.g. multiples of CA)
24.
- Dinucleotide repeats give less clean pictures, but in practice this is no problem as long as the pattern is always the same
- However, markers not in standard 10 cM screening sets are often more problematic (different stutter patterns for different samples, non-constant ratio of real peak to plus-A peak)
- → increased error rates?
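25. The result: allele lengths
(Figure: PCR products of two alleles, with segment lengths in bp)
- CACA / GTGT allele (4 bp repeat tract): 20 + 35 + 25 + 4 + 20 = 104 bp
- CACACACA / GTGTGTGT allele (8 bp repeat tract): 20 + 35 + 25 + 8 + 20 = 108 bp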
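26. Pedigree file in linkage format
Raw marker data (allele lengths)
Columns, left to right: family id, person id, father, mother, sex, disease status, allele 1, allele 2
1 1 0 0 1 1 102 104
1 2 0 0 2 1 106 110
1 3 1 2 2 1 104 106
1 4 1 2 1 1 104 110
2 1 0 0 1 1 0 0
2 2 0 0 2 1 0 0
2 3 1 2 1 1 111 118
2 4 1 2 2 1 112 114
2 5 1 2 1 1 111 114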
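The integer allele codes 1 to 8 in the earlier pedigree slides coincide with the rank order of these allele lengths. The sketch below shows one way to do such a recoding; whether the original data were recoded exactly this way is an assumption, and the function name is illustrative.

```python
def recode_alleles(lengths, missing=0):
    """Map each distinct allele length to its rank (1, 2, ...); keep the missing code."""
    observed = sorted({x for x in lengths if x != missing})
    code = {length: rank for rank, length in enumerate(observed, start=1)}
    code[missing] = missing
    return code


# Allele lengths per person, in the order the persons appear in the pedigree file.
ALLELE_1 = [102, 106, 104, 104, 0, 0, 111, 112, 111]
ALLELE_2 = [104, 110, 106, 110, 0, 0, 118, 114, 114]

code = recode_alleles(ALLELE_1 + ALLELE_2)
recoded_1 = [code[x] for x in ALLELE_1]
recoded_2 = [code[x] for x in ALLELE_2]

print(recoded_1)
print(recoded_2)
```

Ranking across all families at once keeps a given length mapped to the same code everywhere, which matters when allele frequencies are estimated from the pooled data.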
27. Genetic map of measured markers
- For IBD estimation using Merlin or other software:
  - Pedigree file
  - Genetic map
28. Markers measured on chromosome 19
16 markers: d19s247, d19s1034, d19s391, d19s865, d19s394, d19s588, d19s49, d19s433, d19s47, d19s420, d19s178, apoc2, d19s246, d19s180, d19s210, d19s254
29. Genetic maps
- Available from:
  - Marshfield Center for Medical Genetics
    - http://research.marshfieldclinic.org/genetics/
  - Decode Genetics (most accurate)
    - Supplemental data to Kong et al. Nat Genet 2002;31:241-7.
    - see F:\Bas\GenotypingMaps\DecodeMap.xls
38. Merlin Map File
CHROMOSOME MARKER LOCATION
19 d19s247 9.84
19 d19s1034 20.75
19 d19s391 28.83
19 d19s865 32.39
19 d19s394 34.25
19 d19s588 42.28
19 d19s49 50.81
19 d19s433 51.88
19 d19s47 63.10
19 d19s420 66.30
19 d19s178 68.08
19 apoc2 69.50
19 d19s246 78.08
19 d19s180 87.66
19 d19s210 100.01
19 d19s254 100.61
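As a quick sanity check on a map like this, the sketch below parses the three-column text and reports the spacing between adjacent markers. It is a minimal illustration in plain Python, not Merlin's own parser; the function names are assumptions.

```python
MAP_TEXT = """\
CHROMOSOME MARKER LOCATION
19 d19s247 9.84
19 d19s1034 20.75
19 d19s391 28.83
19 d19s865 32.39
19 d19s394 34.25
19 d19s588 42.28
19 d19s49 50.81
19 d19s433 51.88
19 d19s47 63.10
19 d19s420 66.30
19 d19s178 68.08
19 apoc2 69.50
19 d19s246 78.08
19 d19s180 87.66
19 d19s210 100.01
19 d19s254 100.61
"""


def parse_map(text):
    """Return (chromosome, marker, position_cM) tuples, skipping the header line."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3 or fields[0].upper() == "CHROMOSOME":
            continue
        rows.append((fields[0], fields[1], float(fields[2])))
    return rows


def spacings(rows):
    """Distance in cM between each pair of adjacent markers."""
    return [(a[1], b[1], b[2] - a[2]) for a, b in zip(rows, rows[1:])]


markers = parse_map(MAP_TEXT)
gaps = spacings(markers)
mean_gap = sum(g for _, _, g in gaps) / len(gaps)
widest = max(gaps, key=lambda g: g[2])

print(len(markers), "markers, mean spacing %.2f cM" % mean_gap)
print("widest gap: %s to %s, %.2f cM" % widest)
```

For these 16 markers the mean spacing is about 6 cM, with the widest gap between d19s180 and d19s210.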