Title: An approach to gene discovery in ALS crops
1. An approach to gene discovery in ALS crops
- Kevin Holland, Patrick Thimote, Ernesto Almira, Li Liu, and William Farmerie
- Interdisciplinary Center for Biotechnology Research
- University of Florida
2. ALS Crops
- Compact, high yield cultivars of crop plants developed for food production in space
- Significant challenges for ALS crops -- coping with non-terrestrial conditions
- CO2 concentration
- Light quantity and quality
- Temperature
- Relative humidity
- Microgravity
3. USU-Apogee Wheat
- Utah State University -- Bruce Bugbee
- Extremely short, high yield cultivar
- Full-dwarf hard red spring wheat (Triticum aestivum L.)
- High yields in controlled environments
- Higher yielding alternative to 'Yecora Rojo' and 'Veery-10'
4. Characterization of ALS Crops
- Measuring and interpreting cellular responses
- Quantitative gene expression monitoring
- Microarray
- Preferred method for global gene expression analysis
- Commercial gene chips unavailable for ALS crops
- Species specificity of gene probes likely precludes using commercial chips
5. DIY gene chip construction
- Custom gene chip development technology rapidly evolving
- No longer cost prohibitive
- Need probe nucleotide sequences
- Public availability of a catalog of wheat gene cDNA sequences
- GenBank Unigene Database
- Open question: is the public wheat catalog applicable to dwarf cultivars?
6. Why not sequence Apogee wheat?
- Comparison with existing publicly-available wheat sequences
- Identify possible unique genes
- Establish nucleotide sequence database for Apogee wheat
- Facilitate construction of Apogee-specific gene chips
- Relatively cheap insurance that probe sequences match targets
7. Apogee wheat cDNA library construction
- 6 g leaf tissue harvested for RNA prep
- Invitrogen cDNA library construction service
- 3 million primary clones
- 1.6 kb average insert
8. Naïve Cluster and Assembly
- 10,842 Apogee EST sequences
- Clustering: 3,740 singlet ESTs; 7,095 ESTs in clusters
- Assembly of the clustered ESTs: 1,321 contigs; 950 contig singlets
- 6,011 unique assembly elements (3,740 singlets + 1,321 contigs + 950 contig singlets)
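As a rough illustration of what a clustering step does, the sketch below groups toy sequences that share at least one exact k-mer, using a union-find structure; clusters of size one are the singlets. This is not the pipeline used for the Apogee ESTs (the slides do not name the software). The function name `cluster_by_shared_kmer`, the value of `k`, and the sequences are all hypothetical, and reverse-complement matches are ignored for brevity.

```python
from collections import defaultdict


def cluster_by_shared_kmer(seqs, k=8):
    """Group sequences that share at least one exact k-mer (single linkage)."""
    parent = {name: name for name in seqs}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # k-mer -> first sequence name that contained it
    for name, seq in seqs.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if kmer in first_seen:
                # Merge this sequence's cluster with the earlier one.
                parent[find(name)] = find(first_seen[kmer])
            else:
                first_seen[kmer] = name

    clusters = defaultdict(list)
    for name in seqs:
        clusters[find(name)].append(name)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical toy reads: est1 and est2 overlap, est3 shares nothing with them.
toy_ests = {
    "est1": "ATGGCGTACGTTAGCCTA",
    "est2": "CGTTAGCCTAGGATCCAA",
    "est3": "TTTTTTTTGGGGGGGGCC",
}
clusters = cluster_by_shared_kmer(toy_ests, k=8)
singlets = [c for c in clusters if len(c) == 1]
```

Real EST clustering tools use alignment-based overlap criteria rather than a single shared k-mer; the point here is only the split into multi-member clusters and singlets.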
9. NCBI Triticum aestivum UniGene Build 33
- 1,594 mRNAs
- 56,513 EST, 3' reads
- 156,352 EST, 5' reads
- 158,789 EST, other/unknown
- 373,248 total sequences in clusters
- 22,841 cluster sets total
10. Seed Cluster
- Input: 10,842 Apogee EST sequences and 22,841 Unigene sequences from GenBank
- 4,647 Apogee ESTs co-cluster with 1,976 Unigenes
- 6,195 Apogee ESTs do not co-cluster with any Unigene
- 20,865 unclaimed Unigenes
11. Seed Assembly
- Assembly of the 6,195 Apogee ESTs: 2,953 singlets; 770 contigs; 509 contig singlets
- Assembly of the 4,647 Apogee ESTs with 1,976 Unigenes: 1,607 seed contigs
- 5,839 unique assembly elements
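The counts on the Seed Cluster and Seed Assembly slides can be cross-checked with a few lines of arithmetic. The sketch below only re-adds the reported numbers; the variable names are illustrative, and reading the 6,195 ESTs as "not matched to any Unigene" is an interpretation of the diagram rather than something the slides state outright.

```python
# Counts copied from the Seed Cluster and Seed Assembly slides.
apogee_ests = 10_842
unigenes = 22_841

co_clustered_ests, matched_unigenes = 4_647, 1_976
unmatched_ests, unclaimed_unigenes = 6_195, 20_865

# Every EST and every Unigene should land in exactly one partition.
assert co_clustered_ests + unmatched_ests == apogee_ests
assert matched_unigenes + unclaimed_unigenes == unigenes

# Assembly products of the unmatched ESTs plus the seed contigs
# should add up to the reported number of unique assembly elements.
singlets, contigs, contig_singlets, seed_contigs = 2_953, 770, 509, 1_607
assembly_elements = singlets + contigs + contig_singlets + seed_contigs

# Share of Apogee ESTs on each side of the seed clustering, in percent.
pct_co_clustered = round(100 * co_clustered_ests / apogee_ests)
pct_unmatched = round(100 * unmatched_ests / apogee_ests)
print(assembly_elements, pct_co_clustered, pct_unmatched)
```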
12. Assembly Statistics
13. EST Characterization
- What do our ESTs do?
- Gene Ontology (GO) categories
- Biological process
- Cellular component
- Molecular function
- Known biochemical pathways (KEGG)
- Transcription and Translation
- Replication, regulation, and repair
14. GO Classification
- Swiss-Prot database
- Most carefully annotated protein database
- Each Swiss-Prot entry contains a list of GO associations
- BLASTX search of our Apogee assembly set against the Swiss-Prot database
- Associates Apogee wheat ESTs with Swiss-Prot entries and hence with GO terms
- Rank associations on the basis of BLAST bit score and e-value
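The ranking step above might look roughly like the following sketch, which keeps the top-ranked hit per query from BLAST tabular output and looks up its GO terms. It assumes the standard 12-column tabular format (`-outfmt 6` in BLAST+); the query names, the subject accessions `SP_A` and `SP_B`, the scores, and the accession-to-GO table are all hypothetical stand-ins, not data from the Apogee assembly.

```python
import csv
import io

# Hypothetical BLASTX tabular output (12 standard columns):
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
BLAST_TSV = """\
contig_0001\tSP_A\t78.2\t210\t45\t1\t3\t632\t10\t219\t1e-95\t345.0
contig_0001\tSP_B\t55.0\t180\t80\t2\t9\t548\t30\t209\t2e-40\t160.0
contig_0002\tSP_B\t40.1\t90\t52\t3\t1\t270\t5\t94\t5e-04\t42.0
"""

# Hypothetical accession -> GO term table; a real one would be built from
# the GO cross-references listed in each Swiss-Prot entry.
GO_BY_ACCESSION = {
    "SP_A": ["GO:0003723", "GO:0005634"],  # RNA binding (mf), nucleus (cc)
    "SP_B": ["GO:0005524"],                # ATP binding (mf)
}


def best_hits(tsv_text):
    """Return {query: (subject, evalue, bitscore)} keeping the top-ranked hit."""
    best = {}
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        # Rank by highest bit score, breaking ties with the lowest e-value.
        rank = (-bitscore, evalue)
        if query not in best or rank < (-best[query][2], best[query][1]):
            best[query] = (subject, evalue, bitscore)
    return best


hits = best_hits(BLAST_TSV)
go_terms = {q: GO_BY_ACCESSION.get(subj, []) for q, (subj, _, _) in hits.items()}
```

Keeping only the single best hit is one possible design choice; retaining several ranked hits per query would yield more GO associations at the cost of more weakly supported ones.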
15. GO classifications for Apogee EST Assembly-Set
- 3,445 of 5,839 unique assembly elements classified
- Have at least 1 associated GO Term
- Most have multiple GO associations
- 2,776 biological process
- 3,062 molecular function
- 2,058 cellular component
16. Tae.116.C1
- Nucleolar RNA helicase II (DEAD-box protein 21)
- Associated GO Terms (6 of 9)
- Nucleic acid binding (mf)
- RNA binding (mf)
- Adenosinetriphosphatase (mf)
- ATP dependent RNA helicase (mf)
- Nucleus (cc)
- Nucleolus (cc)
17. KEGG Classification
- Kyoto Encyclopedia of Genes and Genomes
- Wiring diagrams of life
- KEGG Protein Networks
- Metabolic pathways
- Regulatory pathways
- Molecular complexes
- Network-network relations
- Network-environment relations
18. KEGG Summary
- Apogee EST assembly elements associated with KEGG ortholog database by BLASTX search
- 3,190 of 5,839 assembly elements at e < 10^-3
- 2,724 of 5,839 assembly elements at e < 10^-10
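Counting elements at two e-value cutoffs, as summarized above, can be sketched as follows. The helper `count_below` and the per-element e-values are hypothetical; only the totals in the last two lines come from the slide.

```python
def count_below(best_evalue_by_element, threshold):
    """Count assembly elements whose best hit has an e-value below threshold."""
    return sum(1 for e in best_evalue_by_element.values() if e < threshold)


# Hypothetical best e-value per assembly element.
best_evalues = {
    "element_a": 1e-50,
    "element_b": 3e-7,
    "element_c": 2e-2,
}

loose = count_below(best_evalues, 1e-3)    # elements with e < 10^-3
strict = count_below(best_evalues, 1e-10)  # elements with e < 10^-10

# Fractions implied by the counts reported on the slide.
frac_loose = 3190 / 5839
frac_strict = 2724 / 5839
```

With the reported counts, roughly 55% of the assembly elements have a KEGG hit at the looser cutoff and roughly 47% at the stricter one.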
19. KEGG Pathways
20. [Figure legend: Unique to non-Unigene; Common to both; Unique to Unigene]
21. [Figure legend: Unique to non-Unigene; Common to both]
22. Summary
- 10,842 Apogee wheat EST sequences
- 43% have significant homology with domestic wheat Unigene ESTs (co-cluster)
- 57% are significantly different from Unigenes at the nucleotide sequence level
- > 50% of the assembled Apogee ESTs are classified by GO and/or KEGG associations