The Complete Arabidopsis Transcriptome MicroArray
(CATMA) Project
P. Hilson, T. Altmann, S. Aubourg, J. Beynon, F.
Bitton, M. Caboche, M. Crowe, P. Dehais, H.
Eickhoff, E. Kuhn, S. May, W. Nietfeld, J.
Paz-Ares, W. Rensink, P. Reymond, P. Rouzé, U.
Schneider, C. Serizet, A. Tabrett, V. Thareau, M.
Trick, G. van den Ackerveken, P. Van Hummelen, P.
Weisbeek, M. Zabeau
http://jic-bioinfo.bbsrc.ac.uk/CATMA/
2. Automated design of GSTs
Introduction
Most cDNA clones included in DNA arrays are
identified by an EST covering only a portion of
their length. The complete clone sequence is
generally unknown and is not selected to yield
hybridisation results specific to a single gene.
ESTs represent only about half the genes
identified in model eukaryote genomes. To bypass
these shortcomings, we are constructing a
collection of high-quality Gene Specific Tags
(GSTs) representing most Arabidopsis genes for
use in microarray transcriptome analyses and in
other functional genomic approaches.
1. Gene structural annotation
The identification of every gene in the
Arabidopsis genome is at the root of any
genome-wide effort to study gene expression.
Since the structure of only a minority of
Arabidopsis genes has been determined
experimentally so far, annotation still relies on
gene prediction to identify the boundaries of
transcription units and of the exon(s) within
them (The AGI Consortium, 2000). Using the AGI
nuclear genome sequence, we have generated an
updated structural annotation of all 5
Arabidopsis chromosomes. The annotation process
has been automated. It uses the EuGène software
(Schiex et al., 2001) with a single set of
parameters and algorithms applied to all
chromosome regions (Figure 1A). Its prediction
quality has been tested by matching results
against a set of experimentally defined
full-length cDNAs, as described by Rouzé and
collaborators (Pavy et al., 1999). Quality
assessment parameters for chromosome 2 annotation
are shown in Table 1. EuGène identifies 29,804
genes in the Arabidopsis nuclear genome, which is
higher than the 25,470 identified by the AGI
(Figure 2). The detailed comparative analysis of
the EuGène and AGI annotations is currently
underway. Preliminary observations indicate that
EuGène's higher number results from the
combination of several factors: EuGène can
predict two genes where AGI annotates one, it
predicts genes where none is annotated by AGI
(3,369) more often than the contrary (1,533), and
it seems biased towards overprediction in
pericentromeric regions rich in repeated
sequences.
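The counts quoted above can be combined into a rough back-of-the-envelope breakdown. The sketch below is only illustrative: it assumes that the two "unmatched" figures (3,369 and 1,533) are genome-wide gene counts, and that whatever surplus they do not explain comes from loci where the two annotations disagree on the number of genes (for example one AGI gene split into two EuGène genes). The poster does not give this breakdown, and the function name is hypothetical.

```python
# Figures quoted in the text; everything derived from them is an illustration.
EUGENE_GENES = 29_804   # genes identified by EuGène
AGI_GENES = 25_470      # genes identified by the AGI
EUGENE_ONLY = 3_369     # EuGène predicts a gene where AGI annotates none
AGI_ONLY = 1_533        # AGI annotates a gene where EuGène predicts none


def decompose_surplus(eugene_total, agi_total, eugene_only, agi_only):
    """Split the EuGène surplus into a part explained by unmatched genes
    and a remainder attributed (by assumption) to other factors such as
    one annotation splitting a locus that the other keeps as one gene."""
    surplus = eugene_total - agi_total
    net_unmatched = eugene_only - agi_only
    remainder = surplus - net_unmatched
    return {
        "surplus": surplus,
        "net_unmatched": net_unmatched,
        "remainder": remainder,
    }


result = decompose_surplus(EUGENE_GENES, AGI_GENES, EUGENE_ONLY, AGI_ONLY)
print(result)
```

Under these assumptions, the net excess of unmatched predictions accounts for 1,836 of the 4,334 additional genes, which is consistent with the statement that several factors contribute.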
Figure 4. GST characteristics
A. Distribution of GST lengths
   150-200 bp   42%
   200-300 bp   36%
   300-500 bp   22%
B. Position of GSTs
3. Structure of the GST collection
Each primer designed to synthesize a GST carries
a gene-specific 3' domain corresponding to the
sequence selected by SPADS (18-25 nt) and a 5'
extension (17 nt) added to allow for
reamplification of the GSTs with a limited set of
universal primers. A set of 40 extensions has
been designed so that each sample in a 384-well
plate can be amplified with the unique
combination of one row primer and one column
primer, hence avoiding the cross-contamination
which often plagues the storage and dissemination
of large-scale clone collections. The primary
amplicons obtained from BAC DNA templates in
large excess can be conveniently reamplified and
distributed. Also, amplicon production using BACs
increases the quality of the GSTs and the
fraction of successful PCR amplifications by
reducing the complexity of the templates (Figure
5). All GSTs are oriented with regard to
transcription, with column primers at the 5' end
(see above picture). As of 26 September 2001, the
Consortium had PCR-amplified 16,280 GSTs.
Figure 1. Gene identification and GST selection
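The row and column primer scheme can be illustrated with a short sketch. It assumes the standard 384-well layout of 16 rows (A to P) by 24 columns, which is what makes 16 + 24 = 40 extensions sufficient to give every well its own pair. The extension labels (ROW_A, COL_01, ...) and function names are hypothetical placeholders; the actual extension sequences are not given here.

```python
import string

ROW_LABELS = list(string.ascii_uppercase[:16])  # rows A..P (assumed layout)
COLUMN_LABELS = list(range(1, 25))              # columns 1..24 (assumed layout)
EXTENSION_LENGTH = 17                           # 5' extension, nt (from the text)
SPECIFIC_RANGE = (18, 25)                       # gene-specific 3' domain, nt (from the text)


def well_primer_pair(well):
    """Return the (column, row) extension labels for a well such as 'C7'."""
    row, column = well[0].upper(), int(well[1:])
    if row not in ROW_LABELS or column not in COLUMN_LABELS:
        raise ValueError(f"not a 384-well position: {well!r}")
    # The column primer is listed first because it sits at the 5' end
    # of the GST with regard to transcription.
    return (f"COL_{column:02d}", f"ROW_{row}")


def primer_length(specific_length):
    """Total primer length: 5' extension plus gene-specific 3' domain."""
    low, high = SPECIFIC_RANGE
    if not low <= specific_length <= high:
        raise ValueError("gene-specific domain must be 18-25 nt")
    return EXTENSION_LENGTH + specific_length


# Enumerate the whole plate and collect every extension that is used.
all_pairs = {
    f"{row}{column}": well_primer_pair(f"{row}{column}")
    for row in ROW_LABELS
    for column in COLUMN_LABELS
}
extensions = {label for pair in all_pairs.values() for label in pair}

print(len(all_pairs), len(set(all_pairs.values())), len(extensions))
print(primer_length(18), primer_length(25))
```

With these assumptions, the 384 wells map to 384 distinct pairs built from 40 extensions, and full primers range from 35 to 42 nt.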
Figure 2. Gene density according to the EuGène
and AGI annotations
Conclusion
The project is based on a novel, complete and
unified annotation of the Arabidopsis nuclear
genome, generated with our upgraded EuGène
software, from which GSTs are selected with
SPADS. We are currently studying how best to
complement the current GST collection to minimize
the presence of non-specific probes that allow
hybridisation with transcripts from non-cognate
genes. Given the structure of the GST collection,
it can be adapted to a variety of microarray
protocols and procedures. It can also serve as a
key resource for other large-scale functional
genomic endeavours based on specific nucleic acid
hybridisations, such as systematic Arabidopsis
RNAi programmes.