Title: Abstract
Optimal allocation of sample size in two-stage association studies: A grid-search algorithm
S. H. Wen, Department of Public Health, Tzu-Chi University, Taiwan
C. K. Hsiao, Department of Public Health and Institute of Epidemiology, National Taiwan University
- Abstract
- Several powerful two-stage strategies for multiple testing in genome-wide association studies have recently received great attention. We propose optimal designs for these two-stage procedures under two situations: fixed total genotyping cost (FTGC) and fixed sample sizes (FSS). Under FTGC, allocating at least 80% of the total cost to stage one provides maximum power. Under a limited total sample size, evaluating all the markers on 55% of the subjects in the first stage provides maximum power, with a cost reduction of approximately 43%.
Background: Family-wise error rate (FWER) controlling methods can fail because they are too conservative, and single-stage strategies, such as false discovery rate (FDR) controlling methods, are not cost-efficient under limited resources, especially when a large number of markers is tested.
Objective: We propose a grid-search algorithm that finds an optimal design for sample size allocation under two-stage multiple testing procedures. Two situations are considered: (1) fixed total genotyping cost (FTGC); (2) fixed sample sizes (FSS).
[Diagram: the M markers are split into Mw and M(1-w).]
- Grid-Search Algorithm
- N1 → k or p → E(R) → FPR, TPR
- FTGC: cost = M·N1 + E(R)·N2
- Let N2 = k·N1 and k = (cost - M·N1)/(N1·E(R))
- FSS: N = N1 + N2
- Let N1 = p·N (e.g. N = 1000)
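The two allocation rules above reduce to simple arithmetic, which the following minimal Python sketch spells out. The function names and the example value E(R) = 50 are hypothetical and chosen only for illustration; the formulas themselves are the FTGC relation cost = M·N1 + E(R)·N2 with N2 = k·N1, and the FSS relation N = N1 + N2 with N1 = p·N.

```python
def stage2_ratio_ftgc(cost, M, N1, expected_R):
    """Return k = (cost - M*N1) / (N1 * E(R)) for a fixed total genotyping cost.

    Follows from cost = M*N1 + E(R)*N2 with N2 = k*N1.
    """
    remaining = cost - M * N1  # budget left after genotyping M markers on N1 subjects
    if remaining < 0:
        raise ValueError("stage one alone exceeds the total cost")
    return remaining / (N1 * expected_R)


def stage_sizes_fss(N, p):
    """Split a fixed total sample size N into N1 = p*N and N2 = N - N1."""
    N1 = round(p * N)
    return N1, N - N1


# Hypothetical illustration: cost/M = 600, M = 500, N1 = 400, E(R) = 50.
k = stage2_ratio_ftgc(cost=600 * 500, M=500, N1=400, expected_R=50)
print(k)                            # 5.0, so N2 = k * N1 = 2000
print(stage_sizes_fss(1000, 0.55))  # (550, 450)
```

In the first example the budget check works out as 500·400 + 50·2000 = 300000, which equals the total cost.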
Note: cost/M = 600, M = 500, w = 0.95.
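As a rough illustration of how a grid search over N1 could proceed under FTGC, here is a minimal Python sketch. It is not the authors' implementation and rests on several assumptions that do not come from the poster: each marker is tested with a one-sided z-test, all truly associated markers share one standardized effect `delta`, the two stages use independent subjects with fixed levels `alpha1` and `alpha2`, and w is taken to be the proportion of markers with no true association. Under those assumptions E(R) = M·(w·alpha1 + (1 - w)·power1), TPR = power1·power2 and FPR = alpha1·alpha2.

```python
from statistics import NormalDist

_norm = NormalDist()  # standard normal distribution


def z_power(n, delta, alpha):
    """Power of a one-sided level-alpha z-test with n subjects (assumed test model)."""
    z_crit = _norm.inv_cdf(1 - alpha)
    return 1 - _norm.cdf(z_crit - delta * n ** 0.5)


def grid_search_ftgc(cost, M, w, delta, alpha1, alpha2, n1_grid):
    """Return the grid point N1 with the largest TPR under a fixed total cost."""
    best = None
    for N1 in n1_grid:
        if M * N1 >= cost:  # nothing left for stage two
            continue
        power1 = z_power(N1, delta, alpha1)
        # Expected number of markers carried to stage two (assumes w = null proportion).
        expected_R = M * (w * alpha1 + (1 - w) * power1)
        k = (cost - M * N1) / (N1 * expected_R)  # from cost = M*N1 + E(R)*k*N1
        N2 = k * N1
        candidate = {
            "N1": N1,
            "k": k,
            "E(R)": expected_R,
            "TPR": power1 * z_power(N2, delta, alpha2),
            "FPR": alpha1 * alpha2,
            "stage1_share": M * N1 / cost,  # fraction of cost spent in stage one
        }
        if best is None or candidate["TPR"] > best["TPR"]:
            best = candidate
    return best


# cost/M = 600, M = 500, w = 0.95 as in the note; delta and the alphas are hypothetical.
best = grid_search_ftgc(cost=600 * 500, M=500, w=0.95, delta=0.15,
                        alpha1=0.05, alpha2=0.001, n1_grid=range(50, 600, 10))
print(best)
```

The same loop could be extended to search over `alpha1` as well, or to reject grid points whose FPR exceeds a chosen bound; the returned optimum depends entirely on the assumed test model and effect size.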
- Comparison with existing 2-stage methods
- Overall Type I error
- The proposed optimal design produced fewer false positives than the existing alternatives, regardless of the allelic odds ratio and the total number of markers.
- Overall power
- The power of the optimal 2-stage design was consistently larger than that of existing methods.
- Cost-effectiveness
- The superiority remains when the designs are compared in terms of total sample size or cost-efficiency.
- Conclusions
- The proposed approach provides specific criteria for formal testing, with a pre-specified significance level for each stage.
- The pair (N1, k) or (N1, p) can be determined analytically to give optimal TPR, a tolerable FPR and an acceptable cost.
- Under fixed cost, allocating approximately 88% of the total cost to the earlier stage produces optimal power when 5000 markers are screened.
- If the sample size is restricted, we recommend N1/N between 0.5 and 0.6 to obtain higher overall power and a substantial cost reduction.
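To see why a stage-one fraction N1/N near 0.5 to 0.6 leaves room for a substantial saving, the sketch below computes the relative genotyping cost of a two-stage design under fixed sample size. It assumes, as a labeled assumption not stated on the poster, that the reference is a one-stage design typing all M markers on all N subjects; the value E(R) = 250 is hypothetical. With two-stage cost M·N1 + E(R)·N2 and N1 = p·N, the saving simplifies to (1 - p)·(1 - E(R)/M), so it can never exceed 1 - p.

```python
def fss_cost_reduction(M, N, p, expected_R):
    """Fraction of genotyping cost saved versus typing all M markers on all N subjects.

    Two-stage cost is M*N1 + E(R)*N2 with N1 = p*N and N2 = N - N1.
    """
    N1 = p * N
    N2 = N - N1
    two_stage = M * N1 + expected_R * N2
    one_stage = M * N  # assumed reference design
    return 1 - two_stage / one_stage


# Hypothetical illustration: M = 5000 markers, N = 1000 subjects, E(R) = 250.
for p in (0.5, 0.55, 0.6):
    print(p, round(fss_cost_reduction(5000, 1000, p, 250), 4))
```

For p = 0.55 the ceiling on the saving is 45%, which is in line with the reported reduction of approximately 43% when few markers are carried into the second stage.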
[Figure note: Y-axis; 1-2: M = 5000, w = 0.999; 3-4: M = 100, w = 0.95.]