1. A Control Workgroup of Twenty-Four Protein Targets Used to Test Platform Changes
David J. Aceti
Center for Eukaryotic Structural Genomics
University of Wisconsin-Madison, Department of Biochemistry
2009 NIGMS Workshop on Enabling Technologies for Structural Biology
March 5, 2009
2.
- The Control Workgroup is a selected set of proteins whose behavior is known at all stages of our platform.
- The Control Workgroup has been used to:
  - (1) Compare expression vectors
  - (2) Compare methods of mRNA and plasmid preparation for wheat germ cell-free expression
  - (3) Compare E. coli and cell-free protein production platforms
  - (4) Compare growth medium formulations
  - (5) Test new crystallization equipment
  - (6) Test aspects of the purification process
  - (7) Train staff
3. The Control Workgroup
# | ORF ID | Protein Name | Source Organism | MW (Da) | Assay
1 | 34382 | cytoplasmic dynein light chain | M. musculus | 10990 |
2 | 6042 | unknown protein At1g77540.1 | A. thaliana | 11736 |
3 | 14751 | thioredoxin h1 | A. thaliana | 12673 | Spec. assay following NADPH/insulin-DS redox reaction
4 | 33810 | zinc finger protein | H. sapiens | 13169 |
5 | 81370 | unknown heme-binding protein | C. merolae | 16474 | Red-colored from heme binding
6 | 2361 | unknown protein At1g01470.1 | A. thaliana | 16543 |
7 | 13193 | unknown protein At3g03773.1 | A. thaliana | 17366 |
8 | 11624 | thioredoxin-like protein | A. thaliana | 17947 |
9 | 91592 | allene oxide cyclase variant1 | A. thaliana | 19522 |
10 | 91593 | allene oxide cyclase variant2 | A. thaliana | 21232 |
11 | 35683 | cysteine dioxygenase 1 | M. musculus | 23026 | HPLC assay for cysteine sulfinic acid formation
12 | 605 | phosphatase | A. thaliana | 24537 | pNPP phosphatase assay
13 | 91571 | Enhanced C3 green fluorescent protein | A. victoria | 26748 | Green/fluorescent
14 | 91591 | TEV protease | Tobacco etch virus | 26922 | Fluorescence anisotropy-based protease assay
15 | 74368 | Pre-mRNA processing factor 24 | S. cerevisiae | 27223 | Gel mobility shift assay
16 | 80048 | sarcosine dimethylglycine methyltransferase | G. sulphuraria | 33324 | Coupled spec. assay following deamination of adenine
17 | 37540 | glyoxylate/hydroxypyruvate reductase | H. sapiens | 35668 | NADPH-linked spectrophotometric assay
18 | 79368 | aspartoacylase | H. sapiens | 35735 | Coupled spec. assay following deamination of aspartic acid
19 | 70653 | dimetal phosphatase | D. rerio | 36645 | pNPP phosphatase assay
20 | 7312 | putative steroid sulfotransferase | A. thaliana | 37140 |
21 | 8210 | 12-oxophytodienoate-10,11-reductase | A. thaliana | 42691 | NADPH-dependent reduction of TNT
22 | 24674 | agmatine iminohydrolase | A. thaliana | 43156 | Assay of ammonia product by Berthelot reaction
23 | 34351 | unknown protein BC065058 | M. musculus | 50256 |
24 | 74329 | Photinus (Firefly) luciferase | P. pyralis | 60844 | Luciferase assay
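The spread of the set across organisms and molecular weights can be tallied directly from the table. The following is a minimal sketch, not part of the original presentation: the tuple layout, the `summarize` helper, and the reading of the MW column as daltons are assumptions made for illustration, and the last field only records whether the Assay column has an entry (including color or fluorescence readouts).

```python
from collections import Counter

# (ORF ID, source organism, MW, assay-or-readout listed?) transcribed from the table.
# Assumption: MW values are in daltons.
TARGETS = [
    (34382, "M. musculus", 10990, False),
    (6042, "A. thaliana", 11736, False),
    (14751, "A. thaliana", 12673, True),
    (33810, "H. sapiens", 13169, False),
    (81370, "C. merolae", 16474, True),
    (2361, "A. thaliana", 16543, False),
    (13193, "A. thaliana", 17366, False),
    (11624, "A. thaliana", 17947, False),
    (91592, "A. thaliana", 19522, False),
    (91593, "A. thaliana", 21232, False),
    (35683, "M. musculus", 23026, True),
    (605, "A. thaliana", 24537, True),
    (91571, "A. victoria", 26748, True),
    (91591, "Tobacco etch virus", 26922, True),
    (74368, "S. cerevisiae", 27223, True),
    (80048, "G. sulphuraria", 33324, True),
    (37540, "H. sapiens", 35668, True),
    (79368, "H. sapiens", 35735, True),
    (70653, "D. rerio", 36645, True),
    (7312, "A. thaliana", 37140, False),
    (8210, "A. thaliana", 42691, True),
    (24674, "A. thaliana", 43156, True),
    (34351, "M. musculus", 50256, False),
    (74329, "P. pyralis", 60844, True),
]


def summarize(targets):
    """Return target count, per-organism counts, MW range, and number with an assay entry."""
    organisms = Counter(organism for _, organism, _, _ in targets)
    weights = [mw for _, _, mw, _ in targets]
    with_assay = sum(1 for *_, has_assay in targets if has_assay)
    return {
        "n_targets": len(targets),
        "by_organism": organisms,
        "mw_range": (min(weights), max(weights)),
        "with_assay": with_assay,
    }


if __name__ == "__main__":
    summary = summarize(TARGETS)
    print(summary["n_targets"], "targets from", len(summary["by_organism"]), "sources")
    print("MW range:", summary["mw_range"])
    print("Targets with an assay or readout listed:", summary["with_assay"])
```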
4. History of the Control Workgroup
# | ORF ID | Protein Name | MW (Da) | Wheat Germ Cell-Free History | E. coli Cell-Based History
1 | 34382 | cytoplasmic dynein light chain | 10990 | NMR structure 1Y4O | Small amount purified
2 | 6042 | unknown protein At1g77540.1 | 11736 | None | NMR structure 2EVN; X-ray structure 1XMT
3 | 14751 | thioredoxin h1 | 12673 | NMR structure 1XFL | Small amount 15N purified
4 | 33810 | zinc finger protein | 13169 | NMR structure 1ZR9 | None
5 | 81370 | unknown heme-binding protein | 16474 | Expressed soluble | No crystals, HSQC-, good yield
6 | 2361 | unknown protein At1g01470.1 | 16543 | HSQC | NMR structure 1XO8, high yields, crystal
7 | 13193 | unknown protein At3g03773.1 | 17366 | None | Variable yields
8 | 11624 | thioredoxin-like protein | 17947 | No expression | NMR structure 1X0Y, high yields
9 | 91592 | allene oxide cyclase variant1 | 19522 | None | Did not cleave
10 | 91593 | allene oxide cyclase variant2 | 21232 | Expressed soluble | X-ray structure 1Z8K
11 | 35683 | cysteine dioxygenase 1 | 23026 | None | X-ray structure 2ATF, moderate yield
12 | 605 | phosphatase | 24537 | None | X-ray structure 1XRI, good yields
13 | 91571 | Enhanced C3 green fluorescent protein | 26748 | None | X-ray structure 2QU1
14 | 91591 | TEV protease | 26922 | None | High yields
15 | 74368 | Pre-mRNA processing factor 24 | 27223 | None | X-ray structure 2GHP, high yields
16 | 80048 | sarcosine dimethylglycine methyltransferase | 33324 | None | X-ray structure 2O57, very high yield
17 | 37540 | glyoxylate/hydroxypyruvate reductase | 35668 | None | X-ray structure 2H1S, low yield
18 | 79368 | aspartoacylase | 35735 | Expressed soluble | X-ray structure 2I3C, low yield
19 | 70653 | dimetal phosphatase | 36645 | None | X-ray structure 2NXF, variable yields
20 | 7312 | putative steroid sulfotransferase | 37140 | None | X-ray structure 1Q44
21 | 8210 | 12-oxophytodienoate-10,11-reductase | 42691 | None | X-ray structure 1Q45, high yield
22 | 24674 | agmatine iminohydrolase | 43156 | Expressed soluble | X-ray structure 1VKP, very high yields
23 | 34351 | unknown protein | 50256 | None | X-ray structure 2GNX, low yield
24 | 74329 | Photinus (Firefly) luciferase | 60844 | None | None
5. Expression Vector Comparison
Vector | Cloning Method | Antibiotic | Promoter/Repressor | Tag Cleavage | Fusion Protein
Base Vectors
pVP16 (E. coli) | Gateway | Amp | T5/lacIq | TEV | His-MBP-TEV-S-ORF
pEU-His (cell-free) | Restriction | N/A | SP6 | None | His-XX-ORF
Experimental Vectors
pVP68K (E. coli) | Flexi | Kan | T5/lacI | TEV | His-MBP-3CP-TEV-S-ORF
pVP65K (E. coli) | Flexi | Kan | T5/lacI | Self (TVMV)/TEV | MBP-TVMV-His-TEV-S-ORF
pEU-His-FV (cell-free) | Flexi | N/A | SP6 | TEV | His-TEV-S-ORF
(1) Flexi Cloning for both E. coli cell-based and wheat germ cell-free systems would allow great savings in time and money.
(2) Vectors with wild-type lacI are more compatible with autoinduction (Blommel et al., 2007, Biotechnol. Prog. 23:585-598).
(3) A self-cleaving vector might cleave tags more efficiently and save time and money.
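The fusion architectures listed for each vector can be read as ordered lists of elements from N-terminus to ORF, which makes it easy to see what stays attached to the target after each cleavage step. The sketch below is illustrative only: the helper names are hypothetical, treating "3CP" as a protease site is an assumption (the table's Tag Cleavage column names only TEV and TVMV), and dropping the whole site element after cleavage is a simplification.

```python
# Fusion architectures transcribed from the vector table, N-terminus first.
FUSIONS = {
    "pVP16": "His-MBP-TEV-S-ORF",
    "pEU-His": "His-XX-ORF",
    "pVP68K": "His-MBP-3CP-TEV-S-ORF",
    "pVP65K": "MBP-TVMV-His-TEV-S-ORF",
    "pEU-His-FV": "His-TEV-S-ORF",
}

# Assumption: these element names denote protease recognition sites.
# TEV and TVMV appear in the table's Tag Cleavage column; 3CP is assumed.
PROTEASE_SITES = ("TEV", "3CP", "TVMV")


def elements(fusion):
    """Split a fusion description into its ordered elements."""
    return [part.strip() for part in fusion.split("-")]


def cleavage_sites(fusion):
    """Return the protease-site elements in N-to-C order."""
    return [part for part in elements(fusion) if part in PROTEASE_SITES]


def product_after(fusion, site):
    """Return the elements still attached to the ORF after cleavage at `site`.

    Simplification: the site element itself is dropped entirely.
    """
    parts = elements(fusion)
    if site not in parts:
        raise ValueError(f"{site} is not present in {fusion}")
    return parts[parts.index(site) + 1:]


if __name__ == "__main__":
    for vector, fusion in FUSIONS.items():
        print(vector, cleavage_sites(fusion))
    # Self-cleavage of pVP65K at TVMV leaves a His-tagged, TEV-cleavable target.
    print(product_after(FUSIONS["pVP65K"], "TVMV"))
```

Under these assumptions, TVMV self-cleavage of the pVP65K fusion removes MBP but leaves the His tag on the target, whereas TEV cleavage of pVP16 or pVP68K removes both His and MBP in one step.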
6. Performance of 3 E. coli Expression Vectors in the Production of 24 SeMet-labeled Control Workgroup Targets
[Bar chart. Axis: Success Rate (Targets). Series: lacIq GW; lacI Flexi; lacI Flexi self-cleaving. Two entries marked NA.]
7. Performance of 3 E. coli Expression Vectors in the Production of 12 15N-labeled Control Workgroup Targets
[Bar chart. Axis: Success Rate (Targets). Series: lacIq GW; lacI Flexi; lacI Flexi self-cleaving. Two entries marked NA.]
8. Mean Yield (mg) of Fusion and Target Proteins from Three E. coli Vectors
Vector | 15N Fusion | SeMet Fusion | 15N Target | SeMet Target
pVP16 (lacIq) | 133 | 184 | 9 | 19
pVP68K (lacI) | 281 | 255 | 30 | 26
pVP65K (lacI self-cleaving) | 38 | 80 | 2 | 4
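The mean yields above can be compared as fold changes relative to the base vector pVP16. The following is a small sketch rather than the analysis used in the presentation: the function names are hypothetical, and the target-to-fusion figure is a ratio of mean masses, not a per-target or molar recovery.

```python
# Mean yields in mg, transcribed from the yield table.
MEAN_YIELD_MG = {
    "pVP16": {"15N Fusion": 133, "SeMet Fusion": 184, "15N Target": 9, "SeMet Target": 19},
    "pVP68K": {"15N Fusion": 281, "SeMet Fusion": 255, "15N Target": 30, "SeMet Target": 26},
    "pVP65K": {"15N Fusion": 38, "SeMet Fusion": 80, "15N Target": 2, "SeMet Target": 4},
}


def fold_change(vector, baseline="pVP16"):
    """Ratio of each mean yield for `vector` to the same column for `baseline`."""
    base = MEAN_YIELD_MG[baseline]
    return {column: value / base[column] for column, value in MEAN_YIELD_MG[vector].items()}


def target_to_fusion_ratio(vector, label):
    """Mean target yield divided by mean fusion yield for one labeling scheme.

    `label` is "15N" or "SeMet". This is a ratio of means by mass.
    """
    row = MEAN_YIELD_MG[vector]
    return row[f"{label} Target"] / row[f"{label} Fusion"]


if __name__ == "__main__":
    for vector in ("pVP68K", "pVP65K"):
        changes = fold_change(vector)
        print(vector, {column: round(ratio, 2) for column, ratio in changes.items()})
```

On these numbers, pVP68K gives roughly 2.1 times the 15N fusion yield and 3.3 times the 15N target yield of pVP16, while pVP65K falls below pVP16 in every column.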
9. E. coli Vectors: Historical vs. pVP16 vs. pVP68K
ORF ID Protein Name MW 15N HSQC (Historical, pVP16, pVP68K) SeMet Crystallization (Historical, pVP16, pVP68K)
1 34382 cytoplasmic dynein light chain 10990 X
2 6042 unknown protein At1g77540.1 11736
3 14751 thioredoxin h1 12673 X
4 33810 zinc finger protein 13169 X X X X
5 81370 unknown heme-binding protein 16474 X X X X In Progress X
6 2361 unknown protein At1g01470.1 16543
7 13193 unknown protein At3g03773.1 17366 X
8 11624 thioredoxin-like protein 17947 X X X
9 91592 allene oxide cyclase variant1 19522 X X X X
10 91593 allene oxide cyclase variant2 21232 X X X
11 35683 cysteine dioxygenase 1 23026 X X X
12 605 phosphatase 24537 X X X
13 91571 Enhanced C3 green fluorescent protein 26748
14 91591 TEV protease 26922 X X X
15 74368 Pre-mRNA processing factor 24 27223 X
16 80048 sarcosine dimethylglycine methyl-xase 33324
17 37540 glyoxylate/hydroxypyruvate reductase 35668
18 79368 aspartoacylase 35735
19 70653 dimetal phosphatase 36645 X
20 7312 putative steroid sulfotransferase 37140 X
21 8210 12-oxophytodienoate-10,11-reductase 42691
22 24674 agmatine iminohydrolase 43156
23 34351 unknown protein 50256 X
24 74329 Photinus (Firefly) luciferase 60844
10. Performance of a Wheat Germ Cell-Free Expression Vector in the Production of 24 15N-labeled Control Workgroup Targets
[Bar chart. Axis: Success Rate (Targets).]
11. Wheat Germ Cell-Free Historical Vectors vs. Current Vector
ORF ID Protein Name MW Historical HSQC pEU-His-FV HSQC
34382 cytoplasmic dynein light chain 10990
6042 unknown protein At1g77540.1 11736
14751 thioredoxin h1 12673
33810 zinc finger protein 13169
81370 unknown heme-binding protein 16474 X
2361 unknown protein At1g01470.1 16543 X
13193 unknown protein At3g03773.1 17366
11624 thioredoxin-like protein 17947 X X
91592 allene oxide cyclase variant1 19522
91593 allene oxide cyclase variant2 21232 X
35683 cysteine dioxygenase 1 23026
605 phosphatase 24537 X
91571 Enhanced C3 green fluorescent protein 26748
91591 TEV protease 26922 X
74368 Pre-mRNA processing factor 24 27223 X
80048 sarcosine dimethylglycine methyl-xase 33324 X
37540 glyoxylate/hydroxypyruvate reductase 35668 X
79368 aspartoacylase 35735 X X
70653 dimetal phosphatase 36645 X
7312 putative steroid sulfotransferase 37140 X
8210 12-oxophytodienoate-10,11-reductase 42691 X
24674 agmatine iminohydrolase 43156 X X
34351 unknown protein 50256 X
74329 Photinus (Firefly) luciferase 60844 X
12. Distinct Control Workgroup Targets Expressed Solubly from E. coli and Wheat Germ Cell-Free
[Venn diagram: Wheat Germ Cell-Free vs. E. coli. Counts shown: 18, 3, 2.]
13. Distinct Arabidopsis Targets Expressed Solubly from E. coli and Wheat Germ Cell-Free
[Venn diagram: Wheat Germ Cell-Free vs. E. coli. Counts shown: 20, 29, 25.]
Tyler, R.C., et al. (2005) Comparison of cell-based and cell-free protocols for producing target proteins from the Arabidopsis thaliana genome for structural studies. Proteins 59:633-643.
14. Use of Control Workgroup to test preparation of plasmid and mRNA for wheat germ cell-free production
Shin-ichi Makino
- mRNA coding for target protein is degraded by endogenous RNase activity carried over from plasmid template preparation, and this lowers the protein yield.
- mRNA has been concentrated by ethanol precipitation and is sometimes difficult to resolubilize, resulting in lower and less consistent protein yields.
- Plasmids are now treated with proteases to inactivate RNase activity, and ethanol precipitation of mRNA has been discontinued. Higher and more consistent levels of expression result, as shown with the Control Workgroup.
[Bar charts: Old protocol vs. New protocol. Panels: Expression; Solubility. Categories: None, Weak, Moderate, High. Axis: of Targets.]
15. Conclusions
- The Control Workgroup is a useful tool for comparing platform performance over a wide variety and a significant number of proteins.
- The lacI Flexi vector pVP68K does no harm and produces greater quantities of protein. The Flexi Cloning system works for both E. coli and wheat germ platforms. This system saves time and money.
- The self-cleaving vector pVP65K is not viable; the concept needs further research.
- An improved method of preparing mRNA and plasmid for cell-free production was demonstrated.
- The Control Workgroup will be available shortly from the PSI Materials Repository as sets of twenty-four targets in 4 expression vectors.
16. Credit Goes To
Lai Bergeman, Craig Bingman, Sethe Burgie, Mike Cassidy, Claudia Cornilescu, Brian Fox, Ronnie Frederick, Kasia Gromek, Leigh Grundhoefer, Andrew Larkin, Betsy Lytle, Shin-ichi Makino, John Markley, Yuko Matsubara, Karl Nichols, Xiaokang Pan, Francis Peterson, George Phillips, Mike Popelars, John Primm, Greg Sabat, Sarata Sahu, Kory Seder, Donna Troestler, Frank Vojtik, Gary Wesenberg, Russ Wrobel, Brian Volkman, Zsolt Zolnai, many CESG alumni, and many students.