Title: Field Design and the Search for Quantitative Trait Loci in Plants
1Field Design and the Search for Quantitative
Trait Loci in Plants
- Kent M. Eskridge
- Department of Biometry
- University of Nebraska
2Searching for QTL
- Identify positions and effects of QTL that
control quantitative traits (e.g., yield, oil)
- Used to
  - find superior genotypes
  - understand basic mechanisms
3Example data set (Campbell et al.)
LOC    REP  VARIETY    YIELD  cdo549  bcd907  barc57
LIN00  1    CCW3A62    2.983   1       1       1
LIN00  1    CNN(WI3A)  3.601   .       .       .
LIN00  1    CCW3A52    3.474  -1      -1      -1
LIN00  1    JAGGER     3.108   .       .       .
LIN00  1    CCW3A08    2.809   1       1       1
LIN00  1    CCW3A108   2.217  -1      -1      -1
LIN00  1    CCW3A03    3.407   1       1       1
LIN00  1    CCW3A50    3.121   1       1       1
...    (etc.)
4Interval map
5PROBLEMS WITH QTL PROJECTS
- Many projects neglect major sources of environmental
variation in both
  - field design
  - statistical analysis
6OBJECTIVES
- Discuss consequences of not adequately
accommodating environmental variation
- Describe some field designs and statistical
analyses that account for environmental variation
in QTL research
7Design Issues
- QTL effects often small
- Need many lines: > 250? (h², Q, etc.)
- Large amount of field variation => need for
large, complex designs, e.g., incomplete block
designs
- Traits sensitive to environment => need multiple
environments
8Crop Science QTL papers
- Sample of 40 papers (2000-2003)
- 7.5% had more than 250 genotypes
- Field design
  - 34% used incomplete blocks
  - 45% used complete block design
  - 13% used other designs
  - 8% gave no information on design
9Advantages of Inc. block designs
- Increases chances of detecting QTL with same
resources
- Or decreases cost with same power
- ½ reps of complete block (Landes)
- Incomplete block designs widely available (SAS,
Alphagen, CycDesigN, Gendex, etc.)
10Some useful designs for a large number of entries
- Lattices
  - Simple, triple, rectangular, squares
- α-lattice (Patterson and Williams, 1976)
  - no. of treatments a multiple of block size
- Other designs
11Alpha lattice t = 24, k = 4, b = 12
rep 1             rep 2
1   7  13  19     1   8  16  24
2   8  14  20     2   9  17  19
3   9  15  21     3  10  18  20
4  10  16  22     4  11  13  21
5  11  17  23     5  12  14  22
6  12  18  24     6   7  15  23
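The layout above can be rebuilt from column shifts, which also makes it easy to check the pairwise concurrences. The following is a minimal sketch, not the construction used by the author or by any design package: the function names are hypothetical, and the shift vector (0, 1, 3, 5) for rep 2 is inferred from the printed layout rather than stated in the slides.

```python
from collections import Counter
from itertools import combinations

def alpha_rep(shifts, s=6):
    """Build one replicate of s blocks.

    Column c holds treatments c*s+1 .. c*s+s, rotated by shifts[c],
    so block j takes one treatment from each column.
    """
    return [[c * s + (j + shift) % s + 1 for c, shift in enumerate(shifts)]
            for j in range(s)]

rep1 = alpha_rep((0, 0, 0, 0))
rep2 = alpha_rep((0, 1, 3, 5))  # assumption: shifts read off the slide layout

def concurrences(blocks):
    """Count how often each pair of treatments shares a block."""
    counts = Counter()
    for block in blocks:
        counts.update(combinations(sorted(block), 2))
    return counts

pair_counts = concurrences(rep1 + rep2)
replication = Counter(t for block in rep1 + rep2 for t in block)

print("rep 2 blocks:", rep2)
print("largest pair concurrence:", max(pair_counts.values()))
```

In this layout no pair of entries shares a block more than once, which is the property that keeps comparisons among entries roughly equally precise.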
12Cyclic and row/column designs
t = 7, k = 3, b = 14
rep 1       rep 2
1  2  4     1  2  6
2  3  5     2  3  7
3  4  6     3  4  1
4  5  7     4  5  2
5  6  1     5  6  3
6  7  2     6  7  4
7  1  3     7  1  5
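Both sets of blocks above follow from cycling an initial block modulo t = 7. A minimal sketch of that construction is given here; the function name is hypothetical, and the initial blocks (1, 2, 4) and (1, 2, 6) are simply the first rows of the two columns on the slide.

```python
from collections import Counter
from itertools import combinations

def cyclic_blocks(initial, t=7):
    """Develop an initial block cyclically: add 0, 1, ..., t-1 to every
    treatment label, wrapping around so labels stay in 1..t."""
    return [[(x - 1 + i) % t + 1 for x in initial] for i in range(t)]

rep1 = cyclic_blocks((1, 2, 4))
rep2 = cyclic_blocks((1, 2, 6))

# Count how many blocks each pair of treatments shares across all 14 blocks.
pair_counts = Counter(
    pair
    for block in rep1 + rep2
    for pair in combinations(sorted(block), 2)
)

print(rep1)
print(rep2)
print(sorted(set(pair_counts.values())))
```

Every one of the 21 treatment pairs occurs together equally often across the 14 blocks, so all pairwise comparisons are made with the same precision.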
13CUSTOM DESIGN
t = 27, k = 4, b = 14
rep 1             rep 2
1   8  15  22      2   3   4   5
2   9  16  23      6   7   8   9
3  10  17  24     10  11  12  13
4  11  18  25     14  15  16  17
5  12  19  26     18  19  20  21
6  13  20  27     22  23  24  25
7  14  21   1     26  27   1   2
14Statistical Analysis Issues
- Linkage analysis
  - two-locus, linkage blocks
- Order markers
  - two or multi-locus, map distance
- Estimate QTL effects and position
  - 1-factor, regression, CIM, LS or ML
- All depend on mating structure
15Crop Science QTL papers
- Statistical analysis and evaluation of
environmental effects
  - 67% did not account for E or QTL x E
16POSITION AND EFFECTS OF QTL
- Marker regression
- Interval mapping - maximum likelihood
- Composite interval mapping - regression
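As a rough illustration of marker regression, the sketch below regresses yield on the cdo549 marker code using the six lines from the example data set that have a non-missing score. It is only a toy: six plots from one rep at one location say nothing about a real QTL effect, and the helper name `simple_regression` is hypothetical.

```python
# Lines from the example data set with a scored cdo549 marker
# (CCW3A62, CCW3A52, CCW3A08, CCW3A108, CCW3A03, CCW3A50).
marker = [1, -1, 1, -1, 1, 1]
yield_ = [2.983, 3.474, 2.809, 2.217, 3.407, 3.121]

def simple_regression(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

intercept, slope = simple_regression(marker, yield_)

# With a +1/-1 marker code the slope is half the difference
# between the two marker-class means.
mean_plus = sum(y for m, y in zip(marker, yield_) if m == 1) / marker.count(1)
mean_minus = sum(y for m, y in zip(marker, yield_) if m == -1) / marker.count(-1)

print(f"slope = {slope:.5f}")
print(f"half the class-mean difference = {(mean_plus - mean_minus) / 2:.5f}")
```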
17Markers A,B and QTL Q (Sari-Gorla et al., 1997)
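[Diagram: QTL Q at map position P between flanking markers A (position P_A) and B (position P_B), with intervals labeled r_A, r_B and r]
r_A = (P - P_A)/100          r_B = (P_B - P)/100
R_A = 2 r_A/(1 + 2 r_A)      R_B = 2 r_B/(1 + 2 r_B)
R = R_A + R_B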
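The quantities on this slide can be computed directly from map positions. The sketch below assumes positions are given in cM, so that dividing by 100 gives the recombination fraction per meiosis as on the slide, and it reads the last line of the slide as R = R_A + R_B. The function names and the example positions (10, 15 and 30 cM) are hypothetical.

```python
def ril_fraction(r):
    """Recombination fraction observed in recombinant inbred lines,
    R = 2r / (1 + 2r), given the per-meiosis fraction r."""
    return 2 * r / (1 + 2 * r)

def flanking_fractions(p_a, p, p_b):
    """Fractions for a putative QTL at position p (cM) between
    flanking markers at p_a and p_b (cM)."""
    if not p_a <= p <= p_b:
        raise ValueError("QTL position must lie between the two markers")
    r_a = (p - p_a) / 100
    r_b = (p_b - p) / 100
    big_r_a = ril_fraction(r_a)
    big_r_b = ril_fraction(r_b)
    return {
        "r_A": r_a,
        "r_B": r_b,
        "R_A": big_r_a,
        "R_B": big_r_b,
        "R": big_r_a + big_r_b,  # assumption: slide's last line read as R = R_A + R_B
    }

# Hypothetical example: markers at 10 and 30 cM, QTL tested at 15 cM.
fractions = flanking_fractions(10, 15, 30)
for name, value in fractions.items():
    print(f"{name} = {value:.4f}")
```

Scanning `p` from `p_a` to `p_b` in small steps gives the sequence of test positions used in an interval map.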
18X(A,B,P) for recombinant inbred line
Marker genotype     X(A,B,P)
A1A1 B1B1           1
A1A1 B2B2           (R_B - R_A)/R_A
A2A2 B1B1           (R_A - R_B)/R_B
A2A2 B2B2           -1
19Assumptions of all models
- Independent observations
- No environmental effects
- Trait means over environments and reps
are analyzed
20Consequences
True model Fitted model
QTL effects biased and inefficient!
21Example QTL for chromosome 3A in Wheat (Campbell
et al.)
- 104 entries, 98 of them recombinant inbred
chromosome substitution lines
- Design: alpha lattice with 4 reps,
13 plots per incomplete block, 7 environments
- Markers: 15 RFLP and 5 microsatellite; 16 used
22Combined Anova
Source                     df
env                        6
Blk(env)                   210
Geno                       103
  (QTL markers)            q
  residual                 103-q
Geno x env                 618
  (QTL markers) x env      6q
  residual                 618-6q
Error                      1974
23Better Analyses
- Nearest Neighbor (NNA) and Spatial models often
better than ANOVA (Landes et al., 2002).
  - NNA: use neighbor plot values as
    covariates for each plot
  - Spatial: model correlation among
    neighboring plots using an anisotropic power
    structure,
    Corr(y_i, y_j) = \prod_{k=1}^{2} \rho_k^{d(i,j,k)}
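Both ideas can be sketched on a small field grid. The code below is an illustration under stated assumptions, not the analysis used in the study: the function names and the 2 x 3 grid of plot values are hypothetical, d(i,j,k) is taken as the absolute row or column distance between plots, and edge plots simply use whichever neighbors exist. Many NNA variants use residuals from entry means rather than raw plot values; the slide does not specify which.

```python
def neighbor_covariates(grid):
    """Return (ns, ew): for each plot, the mean of its north/south
    neighbors and the mean of its east/west neighbors."""
    n_rows, n_cols = len(grid), len(grid[0])

    def mean_at(cells):
        vals = [grid[r][c] for r, c in cells
                if 0 <= r < n_rows and 0 <= c < n_cols]
        return sum(vals) / len(vals) if vals else None

    ns = [[mean_at([(r - 1, c), (r + 1, c)]) for c in range(n_cols)]
          for r in range(n_rows)]
    ew = [[mean_at([(r, c - 1), (r, c + 1)]) for c in range(n_cols)]
          for r in range(n_rows)]
    return ns, ew

def aniso_power_corr(coords, rho_row, rho_col):
    """Anisotropic power correlation between every pair of plots:
    rho_row ** |row distance| * rho_col ** |column distance|."""
    return [[rho_row ** abs(ri - rj) * rho_col ** abs(ci - cj)
             for (rj, cj) in coords]
            for (ri, ci) in coords]

# Hypothetical 2 x 3 field of plot values.
grid = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0]]
ns, ew = neighbor_covariates(grid)

coords = [(r, c) for r in range(2) for c in range(3)]
corr = aniso_power_corr(coords, rho_row=0.5, rho_col=0.8)

print("NS covariates:", ns)
print("EW covariates:", ew)
print("corr between plot (0,0) and plot (1,2):", corr[0][5])
```

Separate parameters for rows and columns let correlation decay at different rates along and across the field, which is what makes the structure anisotropic.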
24NNA and Spatial models
NNA                          Spatial
Source               df      Source               df
env                  6       env                  6
NS(env)              6       geno                 103
EW(env)              6         (QTL markers)      q
geno                 103       residual           103-q
  (QTL markers)      q       geno x env           618
  residual           103-q     (QTL m) x env      6q
geno x env           618       residual           618-6q
  (QTL m) x env      6q      Error                2184
  residual           618-6q
Error                2172
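The error degrees of freedom in the ANOVA, NNA and Spatial tables can be checked by subtraction from the total. This sketch uses the figures given for the wheat example (7 environments, 4 reps, 104 entries, and 210 df for blocks within environments); the variable names are hypothetical.

```python
n_env, n_rep, n_entries = 7, 4, 104

# One observation per entry per rep per environment.
total_df = n_env * n_rep * n_entries - 1

# Terms shared by all three models.
common = {
    "env": n_env - 1,
    "geno": n_entries - 1,
    "geno x env": (n_env - 1) * (n_entries - 1),
}

# Terms that differ: blocks for ANOVA, two neighbor covariates per
# environment term for NNA, nothing extra in the Spatial mean model.
extra = {
    "ANOVA": {"blk(env)": 210},
    "NNA": {"NS(env)": 6, "EW(env)": 6},
    "Spatial": {},
}

error_df = {
    model: total_df - sum(common.values()) - sum(terms.values())
    for model, terms in extra.items()
}

for model, df in error_df.items():
    print(f"{model}: error df = {df}")
```

The differences in error df come entirely from how each model handles within-environment field variation: 210 df for blocks, 12 df for neighbor covariates, or correlation parameters that use no fixed-effect df.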
25Data Analysis
Used (i) markers as QTL and (ii) composite
interval mapping
For all 4 models:
1. Marker regression on entry means
2. Combined ANOVA
3. NNA
4. Spatial
26Markers = QTL
27Composite Interval Map - Four Models
28Partitioning GxE interaction
Source                           df
Geno x env                       618
  (QTL markers) x env            6q
    (QTL markers) x precip 1     q
    (QTL markers) x precip 2     q
    (QTL markers) x precip 3     q
    (QTL markers) x temp 1       q
    (QTL markers) x temp 2       q
    (QTL markers) x temp 3       q
  residual                       618-6q
Error                            1974
29 GxE Covariates Interval Map - ANOVA
30Major Points
- Many plant QTL studies based on weak designs and
analyses
- Good design and stat. analysis =>
precise and cost-efficient
results, evaluation of QTL x E
interaction
- Encourages use of time-tested designs and linear
model analyses: ANOVA, NNA, Spatial models, etc.
- Relatively easy to implement via standard
software
31Conclusion
Example of a research area where we have not
done a very good job of ensuring scientists are
using sound statistical design and analysis.
We (statistical community) need to ensure that
scientists trained in the latest scientific
techniques know how to accommodate all major
sources of variability in both the design and
analysis of their research.
32Acknowledgements
Todd Campbell, Steve Baenziger, Daryl Travnicek,
Steve Westerholt, Carol Disney and Reid Landes