Title: Low-Cost, Low-Density Genotyping and its Potential Applications
1. Low-Cost, Low-Density Genotyping and its Potential Applications
K.A. Weigel, O. González-Recio, G. de los Campos,
H. Naya, N. Long, D. Gianola, and G.J.M. Rosa
University of Wisconsin
2. Illumina BovineSNP50 Genotyping BeadChip
< $250 per animal today
3. Low-Cost Genotyping Assays
- At the current price, the BovineSNP50 BeadChip is limited to applications involving males and elite females
- A low-cost assay with 300-1000 SNPs might deliver a substantial portion of the gain for a small fraction of the price
- Applications may include preliminary screening of young bulls, selection of replacement heifers, genomic mating programs, and parentage discovery
4. Which SNPs to Select?
Pick the SNPs with largest estimated effects?
How many do we need?
VanRaden, 2008
5. Which SNPs to Select?
Pick evenly spaced SNPs?
How many do we need?
VanRaden, 2008
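The two selection strategies raised on these slides can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the `(name, position, effect)` records, and the toy values are all assumptions, and "evenly spaced" is taken here to mean evenly spaced by rank along the ordered map rather than by physical distance.

```python
def pick_largest_effects(snps, k):
    """Return the k SNPs with the largest absolute estimated effects."""
    return sorted(snps, key=lambda s: abs(s[2]), reverse=True)[:k]


def pick_evenly_spaced(snps, k):
    """Return k SNPs spread evenly (by rank) over the position-ordered map."""
    ordered = sorted(snps, key=lambda s: s[1])
    if k >= len(ordered):
        return ordered
    if k == 1:
        return [ordered[len(ordered) // 2]]
    step = (len(ordered) - 1) / (k - 1)
    return [ordered[round(i * step)] for i in range(k)]


# Hypothetical (name, position, estimated effect) records on one chromosome.
snps = [
    ("snp1", 100, 0.01), ("snp2", 200, -0.20), ("snp3", 300, 0.02),
    ("snp4", 400, 0.15), ("snp5", 500, -0.03), ("snp6", 600, 0.00),
    ("snp7", 700, 0.08),
]

top3 = [s[0] for s in pick_largest_effects(snps, 3)]
even3 = [s[0] for s in pick_evenly_spaced(snps, 3)]
```

The first strategy concentrates markers where estimated effects are large; the second ignores effects entirely and aims for coverage of the genome.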
6. Entropy
- A measure of the impurity of an arbitrary collection of examples (S)
- $\mathrm{Entropy}(S) = -p_{+} \log_2 p_{+} - p_{-} \log_2 p_{-}$
- where
- $p_{+}$ = proportion of positive examples in S
- $p_{-}$ = proportion of negative examples in S
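As a small worked illustration of the entropy definition, the sketch below computes it from counts of positive and negative examples. The function name and the example counts are assumptions made for illustration only.

```python
import math


def entropy(n_pos, n_neg):
    """Binary entropy (in bits) of a collection with n_pos positive
    and n_neg negative examples."""
    total = n_pos + n_neg
    if total == 0:
        return 0.0
    result = 0.0
    for count in (n_pos, n_neg):
        if count > 0:  # the term 0 * log2(0) is taken as 0
            p = count / total
            result -= p * math.log2(p)
    return result


balanced = entropy(10, 10)  # equal mix of positive and negative examples
pure = entropy(20, 0)       # all examples in one class
skewed = entropy(15, 5)     # somewhere in between
```

A balanced collection has the maximum entropy of 1 bit, a pure collection has entropy 0, and mixed collections fall in between.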
7. Information Gain
- A measure of the effectiveness of an attribute in classifying the data
- Reduction in entropy caused by partitioning the examples into subsets ($S_1, \ldots, S_n$) based on values of a given attribute (A)
- $\mathrm{Information\ Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{i=1}^{n} \frac{|S_i|}{|S|}\, \mathrm{Entropy}(S_i)$
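The information gain of a single SNP could be computed along these lines. This is a sketch under assumptions: bulls are labelled "high" or "low", the attribute is the SNP genotype coded 0/1/2, and the function names and toy data are hypothetical rather than taken from the presentation.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(genotypes, labels):
    """Entropy of the labels minus the weighted entropy of the subsets
    obtained by partitioning the animals on genotype."""
    groups = defaultdict(list)
    for g, y in zip(genotypes, labels):
        groups[g].append(y)
    remainder = sum(len(sub) / len(labels) * entropy(sub) for sub in groups.values())
    return entropy(labels) - remainder


# Hypothetical data: four high bulls followed by four low bulls.
labels = ["high"] * 4 + ["low"] * 4
informative = [2, 2, 2, 2, 0, 0, 0, 0]    # genotype separates the groups perfectly
uninformative = [0, 1, 2, 1, 0, 1, 2, 1]  # same genotype mix in both groups

gain_informative = information_gain(informative, labels)
gain_uninformative = information_gain(uninformative, labels)
```

Ranking all SNPs by this quantity and keeping the highest-scoring fraction gives a candidate low-density panel.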
8. Top 10% of SNPs for Net Merit (Info Gain for 20 highest bulls vs. 20 lowest bulls)
[Figure: Number of SNPs by Chromosome; 3252 SNPs]
1181 SNPs (36.2%) in common (though many more in linkage disequilibrium)
9. Effects of Top Net Merit SNPs
[Figure: two panels of 3252 SNPs each; Estimate by Chromosome]
10. Top 10% of SNPs for Specific Traits (Info Gain for 20 highest bulls vs. 20 lowest bulls; traditional coding)
[Figure: two panels of 3252 SNPs each; Number of SNPs by Chromosome]
11. Top 2.5% of SNPs for Specific Traits (Info Gain for 20 highest bulls vs. 20 lowest bulls; traditional coding)
[Figure: two panels of 813 SNPs each; Number of SNPs by Chromosome]
12. Top Info Gain SNPs in Common by Trait (20 highest vs. 20 lowest bulls; traditional coding)
Trait   Milk   Fat  Prot    PL   SCS   DPR    NM
Milk       -   901  2265   340   439   468  1600
Fat      114     -  1044   341   445   319   726
Prot     484   158     -   342   587   490  1634
PL        21    24    18     -   272  1056   952
SCS       46    32    64     7     -   283   270
DPR       56    29    52   159    20     -   843
NM       274    81   276   149    12   117     -
Top 10% of SNPs (3252) above the diagonal
Top 2.5% of SNPs (813) below the diagonal
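Each cell of a table like this one is a count of SNPs shared between the top-ranked sets for two traits. A minimal sketch of that count is shown below; the function name, the trait dictionaries, and the scores are hypothetical.

```python
def top_fraction(scores, fraction):
    """Return the set of SNP names whose score ranks in the top `fraction`."""
    k = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


# Hypothetical information-gain scores for ten SNPs and two traits.
milk_gain = {
    "snp1": 0.90, "snp2": 0.10, "snp3": 0.70, "snp4": 0.20, "snp5": 0.50,
    "snp6": 0.30, "snp7": 0.05, "snp8": 0.40, "snp9": 0.15, "snp10": 0.25,
}
protein_gain = {
    "snp1": 0.80, "snp2": 0.20, "snp3": 0.10, "snp4": 0.30, "snp5": 0.60,
    "snp6": 0.05, "snp7": 0.15, "snp8": 0.75, "snp9": 0.25, "snp10": 0.40,
}

common = top_fraction(milk_gain, 0.3) & top_fraction(protein_gain, 0.3)
n_common = len(common)
```

With 32,518 SNPs, the same rounding gives 3252 SNPs for the top 10% and 813 SNPs for the top 2.5%, matching the set sizes quoted on the slides.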
13. Bayesian LASSO
- Bayesian least absolute shrinkage and selection operator
- One-step method for estimating effects of important SNPs while shrinking estimates for unimportant SNPs towards zero
- Assumes SNP effects follow a double exponential distribution (a few with large effects, many with negligible effects)
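To see why a double exponential (Laplace) distribution encodes "a few large effects, many negligible ones", it can be compared with a normal distribution of the same variance. The sketch below only compares the two densities; it is not an implementation of the Bayesian LASSO, and the variable names and the unit variance are assumptions.

```python
import math


def laplace_pdf(x, scale):
    """Density of a zero-mean double exponential (Laplace) distribution."""
    return math.exp(-abs(x) / scale) / (2.0 * scale)


def normal_pdf(x, sd):
    """Density of a zero-mean normal distribution."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


sd = 1.0
scale = sd / math.sqrt(2.0)  # Laplace variance is 2 * scale**2, so variances match

# Ratio of Laplace to normal density at zero and far out in the tail.
peak_ratio = laplace_pdf(0.0, scale) / normal_pdf(0.0, sd)
tail_ratio = laplace_pdf(4.0 * sd, scale) / normal_pdf(4.0 * sd, sd)
```

Both ratios exceed 1: relative to the normal, the double exponential puts more mass near zero and more mass in the tails, at the expense of moderate values.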
14. Distribution of SNP Effects (analysis of Net Merit in training set with 32,518 SNPs)
[Figure: Number of SNPs by Estimated SNP Effect (genetic SD)]
15. Distribution of SNP Effects (analysis of Net Merit in training set with 32,518 SNPs)
[Figure: Estimated Effect (genetic SD)]
16. Distribution of SNP Effects
Panel             Mean       SD     Min.     Max.
300 SNPs        0.0031   0.0540  -0.1749   0.1474
500 SNPs       -0.0008   0.0436  -0.1078   0.1317
750 SNPs       -0.0002   0.0372  -0.0912   0.1084
1000 SNPs      -0.0011   0.0321  -0.1094   0.1006
1250 SNPs      -0.0017   0.0286  -0.1022   0.0818
1500 SNPs      -0.0014   0.0259  -0.1090   0.0850
2000 SNPs      -0.0009   0.0229  -0.0898   0.0900
32,518 SNPs    -0.0001   0.0030  -0.0405   0.0230
17. Validation of Genomic PTAs
- Compute parent averages and genomic PTAs using 2003 data from 3,305 Holstein bulls born in 1952-1998 (Training Set)
- Compare ability to predict daughter deviations in 2008 data for 1,398 bulls born from 1999-2002 (Testing Set)
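The validation design above amounts to splitting bulls by birth year and correlating predictions with later outcomes in the testing set. A minimal sketch follows; the record layout, bull IDs, and values are hypothetical, and Pearson correlation is assumed as the measure of predictive ability.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical records: (bull id, birth year, predicted genomic PTA, later progeny-test PTA)
bulls = [
    ("A", 1990, 0.5, 0.4), ("B", 1997, -0.2, 0.1),
    ("C", 1999, 1.0, 1.5), ("D", 2000, 2.0, 1.0),
    ("E", 2001, 3.0, 3.5), ("F", 2002, 4.0, 3.0),
]

training = [b for b in bulls if 1952 <= b[1] <= 1998]  # used to estimate SNP effects
testing = [b for b in bulls if 1999 <= b[1] <= 2002]   # used only for validation

predictive_ability = pearson([b[2] for b in testing], [b[3] for b in testing])
```

Splitting by birth year rather than at random mimics the real use case, in which young animals are predicted from data on older ones.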
18. Predictive Ability for Net Merit (Genomic PTA vs. Progeny Test PTA in Testing Set)
[Figure: PTA from Progeny Testing (SD) vs. Predicted Genomic PTA from All SNPs (gen. SD); 32,518 SNPs; Corr. 0.61]
19. Predictive Ability for Net Merit (Genomic PTA from SNPs vs. Progeny Test PTA in Testing Set)
[Figure: four panels (300, 750, 1250, and 2000 SNPs) of PTA from Progeny Testing vs. Predicted Genomic PTA from Top ___ SNPs (gen. SD); correlations shown: 0.43, 0.52, 0.55, and 0.57]
20. Predictive Ability for Net Merit (Genomic PTA vs. Progeny Test PTA in Testing Set)
[Figure: Predictive Ability in Testing Set by Number of SNPs used for Prediction]
21. No. Bulls Chosen Correctly (of 1399)
Panel        Top 50%       Top 25%       Top 10%       Top 5%        Top 2½%       Top 1%
Group size   700 bulls     350 bulls     140 bulls     70 bulls      35 bulls      14 bulls
300 SNPs     460 (65.7%)   161 (46.0%)   31 (22.1%)    13 (18.6%)    3 (8.6%)      0 (0.0%)
500 SNPs     460 (65.7%)   180 (48.6%)   39 (27.9%)    12 (17.1%)    3 (8.6%)      1 (7.1%)
750 SNPs     479 (68.4%)   180 (48.6%)   39 (27.9%)    15 (21.4%)    4 (11.4%)     2 (14.3%)
1000 SNPs    484 (69.1%)   180 (48.6%)   40 (28.6%)    11 (15.7%)    3 (8.6%)      2 (14.3%)
1250 SNPs    482 (68.9%)   179 (51.1%)   43 (30.7%)    12 (17.1%)    4 (11.3%)     1 (7.1%)
1500 SNPs    479 (68.4%)   183 (52.2%)   46 (32.9%)    14 (20.0%)    5 (14.3%)     1 (7.1%)
2000 SNPs    489 (69.9%)   186 (53.1%)   42 (30.0%)    17 (24.3%)    6 (17.1%)     2 (14.3%)
32,518 SNPs  499 (71.3%)   191 (54.6%)   49 (35.0%)    16 (22.9%)    5 (14.3%)     2 (14.3%)
Note that we are predicting 2008 PTAs that have REL much less than 99% (not the true genetic merit of the bulls)
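A count of "bulls chosen correctly" can be read as the overlap between the top group ranked on predicted genomic PTA and the top group ranked on progeny-test PTA. The sketch below implements that reading; it is an assumption about how the table was built, and the function name and toy values are hypothetical.

```python
def chosen_correctly(predicted, observed, fraction):
    """Return (number of animals in the top `fraction` on both rankings,
    size of the top group)."""
    k = round(len(predicted) * fraction)
    top_pred = set(sorted(predicted, key=predicted.get, reverse=True)[:k])
    top_obs = set(sorted(observed, key=observed.get, reverse=True)[:k])
    return len(top_pred & top_obs), k


# Hypothetical predicted and observed merit for eight bulls.
predicted = {"b1": 8, "b2": 7, "b3": 6, "b4": 5, "b5": 4, "b6": 3, "b7": 2, "b8": 1}
observed = {"b1": 7, "b2": 3, "b3": 8, "b4": 6, "b5": 5, "b6": 2, "b7": 4, "b8": 1}

top_half = chosen_correctly(predicted, observed, 0.50)
top_quarter = chosen_correctly(predicted, observed, 0.25)
```

Dividing the overlap by the group size gives a percentage comparable to those shown in parentheses in the table.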
22. Animal ID Applications (96 SNPs in the parentage panel)
- Verify reported parents
- Discover parents if unknown or incorrect
- Trace animals or animal products
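One common way to verify a reported parent from SNP genotypes is to count opposing homozygotes: loci where parent and offspring carry opposite homozygous genotypes, which is impossible under Mendelian inheritance barring genotyping error. The slides do not state which method is used, so the sketch below is an assumption; the 0/1/2 genotype coding, the function names, the toy genotypes, and the `max_conflicts` threshold are all hypothetical.

```python
def opposing_homozygotes(parent, offspring):
    """Count loci where parent and offspring are opposite homozygotes (0 vs. 2).
    Genotypes are coded as the number of copies of one allele; None is missing."""
    conflicts = 0
    for p, o in zip(parent, offspring):
        if p is None or o is None:  # skip missing calls
            continue
        if {p, o} == {0, 2}:
            conflicts += 1
    return conflicts


def verify_parent(parent, offspring, max_conflicts=1):
    """Accept the parent if conflicts stay within an assumed error allowance."""
    return opposing_homozygotes(parent, offspring) <= max_conflicts


# Hypothetical 8-SNP genotypes (a real panel would use many more SNPs).
offspring = [0, 1, 2, 2, 0, 1, 1, 2]
reported_sire = [0, 2, 2, 1, 1, 1, 0, 2]
unrelated_sire = [2, 1, 0, 0, 2, 1, 2, 0]

reported_ok = verify_parent(reported_sire, offspring)
unrelated_ok = verify_parent(unrelated_sire, offspring)
```

Parent discovery can reuse the same count by screening all candidate parents and keeping those with few or no conflicts.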
23. Effects on Inbreeding
- Traditional animal model evaluations favor co-selection of families or relatives
- Genomic selection allows within-family selection, which leads to less inbreeding
- Low-cost, low-density genotyping assays will allow widespread screening of families that might provide unique genetic contributions to the population
- Identification and control of inherited defects will be greatly enhanced as well
24. Potential for Mate Selection
- Millions of cows are mated using computerized programs each year, based on faults in conformation or avoidance of inbreeding
- SNP genotypes of AI sires and potential mates could be used to minimize inbreeding or to identify parents with complementary DNA profiles
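One simple way to use SNP genotypes in mate selection is to score each candidate sire by the expected homozygosity of the resulting offspring and pick the lowest. This is a sketch under assumptions, not a description of any existing mating program: SNP homozygosity is used as a proxy for inbreeding, genotypes are coded 0/1/2, and all names and genotypes are hypothetical.

```python
def expected_homozygosity(sire, dam):
    """Mean probability, over loci, that an offspring is homozygous."""
    total = 0.0
    for s, d in zip(sire, dam):
        ps, pd = s / 2.0, d / 2.0  # chance each parent transmits the counted allele
        total += ps * pd + (1.0 - ps) * (1.0 - pd)
    return total / len(sire)


def best_sire(sires, dam):
    """Return the name of the sire giving the lowest expected homozygosity."""
    return min(sires, key=lambda name: expected_homozygosity(sires[name], dam))


# Hypothetical 4-SNP genotypes.
dam = [2, 2, 0, 1]
sires = {
    "similar": [2, 2, 0, 1],
    "complementary": [0, 0, 2, 1],
}

scores = {name: expected_homozygosity(g, dam) for name, g in sires.items()}
choice = best_sire(sires, dam)
```

In practice such a score would be one term among several, alongside genetic merit and conformation.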
25. Possibilities for Novel Traits
- Opportunities to collect DNA and phenotypes for traits not routinely assessed in national recording schemes
- Examples include feed intake, hormone level, immune function, hoof care, etc.
- Potential resource populations include experimental herds, calf ranches, heifer growers, commercial herds with specific milking/feeding/management equipment, veterinary databases (without sire ID)
26. Novel Traits and Genomics
[Diagram: a Recorded Population (10,000-25,000 animals per trait or trait group) feeds two paths]
- Full genotyping, leading to Whole Genome Selection: cost $7.0-17.5 mln per trait group
- Selective genotyping, leading to QTL Detection and MAS: cost $1.2-3.0 mln per trait
- Other diagram labels: additive or non-additive inheritance; no selection bias; refine estimates of location or effect, add SNPs; update estimates of SNP effects
- Cost assumptions: $200/genotype, $100/trait, 5 traits/group, select high/low 10%
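The cost ranges on this slide can be reproduced from the stated unit costs under one reading of the diagram: every recorded animal is phenotyped, full genotyping genotypes all of them for a group of 5 traits, and selective genotyping genotypes 10% of them for a single trait. That reading, the function names, and the assumption that the amounts are in dollars are mine, not stated in the source.

```python
def full_genotyping_cost(n_animals, genotype_cost=200, trait_cost=100, traits_per_group=5):
    """Cost per trait group when every recorded animal is genotyped."""
    return n_animals * (genotype_cost + trait_cost * traits_per_group)


def selective_genotyping_cost(n_animals, genotype_cost=200, trait_cost=100,
                              genotyped_fraction=0.10):
    """Cost per trait when all animals are phenotyped but only a fraction
    (assumed here to be 10% in total) is genotyped."""
    return n_animals * trait_cost + n_animals * genotyped_fraction * genotype_cost


full_low = full_genotyping_cost(10_000)
full_high = full_genotyping_cost(25_000)
selective_low = selective_genotyping_cost(10_000)
selective_high = selective_genotyping_cost(25_000)
```

Under these assumptions the four totals are 7.0, 17.5, 1.2, and 3.0 million, which matches the ranges shown on the slide.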
27. Synergy with Herd Management
- Personalized medicine is the Holy Grail of biomedical research
- Examples include genotype-guided Warfarin dosing using two major genes
- Cost-effective applications in livestock will involve a series of small returns from enhanced vaccination programs, ration formulation, mate selection, veterinary care, and animal grouping decisions
- Integration with herd management software will be the key to success
28. UW-Madison Dairy Science: Committed to Excellence in Research, Extension and Instruction
http://www.wisc.edu/dysci
Any Questions?