Title: MAGE and ArrayExpress
1. MAGE and ArrayExpress
2. ArrayExpress Infrastructure
[Diagram; component labels only]
- Local MIAMExpress installations
- EBI
- Submissions
- www
- MIAMExpress (MySQL)
- Array manufacturers
- MAGE-ML
- Queries
- Data pipelines
- ArrayExpress (Oracle, Tomcat)
- LIMS
- Data analysis
- Data analysis software
- Expression Profiler
- MAGE-ML import, export
- Microarray software
- External bioinformatics databases
- Other microarray databases
3. ArrayExpress architecture
[Diagram; component labels only]
- Web page template
- MAGE-ML (DTD)
- Tomcat
- Velocity template engine
- MAGE-ML (doc)
- MAGE-OM
- Java servlets
- MAGE loader
- MAGE validator
- MAGE unloader
- Castor (object/relational mapping)
- ArrayExpress (Oracle)
- error.log
4. AE schema
- Generated from (modified) MAGE-OM
- Classes → tables
- Attributes → fields
- 1..1, 1..n associations → foreign keys
- m..n associations → link tables
- Superclasses with incoming associations → special tables supporting the object-relational layer (Castor)
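The mapping rules above can be illustrated with a minimal sketch. It is not the generator actually used for ArrayExpress: the `Association` and `ModelClass` types, the `to_ddl` function, the toy four-class model and its associations are all assumptions made for illustration, and the superclass rule is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Association:
    name: str    # role name of the association (hypothetical)
    target: str  # name of the class at the other end
    kind: str    # "one" (1..1), "many" (1..n) or "many_to_many" (m..n)


@dataclass
class ModelClass:
    name: str
    attributes: list = field(default_factory=list)
    associations: list = field(default_factory=list)


def to_ddl(classes):
    """Apply the slide's mapping rules to a list of ModelClass objects."""
    # Rule: classes -> tables, attributes -> fields.
    columns = {
        c.name: ["id INTEGER PRIMARY KEY"] + [f"{a} TEXT" for a in c.attributes]
        for c in classes
    }
    link_tables = []
    for c in classes:
        for assoc in c.associations:
            if assoc.kind == "one":
                # 1..1: the owning table holds the foreign key.
                columns[c.name].append(
                    f"{assoc.name}_id INTEGER REFERENCES {assoc.target}(id)")
            elif assoc.kind == "many":
                # 1..n: each row on the "n" side points back at its owner.
                columns[assoc.target].append(
                    f"{c.name}_id INTEGER REFERENCES {c.name}(id)")
            elif assoc.kind == "many_to_many":
                # m..n: a separate link table holds pairs of keys.
                link_tables.append(
                    f"CREATE TABLE {c.name}_{assoc.name} ("
                    f"{c.name}_id INTEGER REFERENCES {c.name}(id), "
                    f"{assoc.target}_id INTEGER REFERENCES {assoc.target}(id))")
            else:
                raise ValueError(f"unknown association kind: {assoc.kind}")
    tables = [f"CREATE TABLE {name} ({', '.join(cols)})"
              for name, cols in columns.items()]
    return tables + link_tables


# Toy model; the associations are illustrative, not taken from MAGE-OM.
model = [
    ModelClass("Experiment", ["accession", "name"],
               [Association("bioassays", "BioAssay", "many"),
                Association("providers", "Contact", "many_to_many")]),
    ModelClass("BioAssay", ["name"],
               [Association("array", "ArrayDesign", "one")]),
    ModelClass("ArrayDesign", ["accession", "name"]),
    ModelClass("Contact", ["name"]),
]

ddl = to_ddl(model)
for statement in ddl:
    print(statement)
```

Because the tables are derived mechanically from the model, a change to the object model only requires regenerating the schema, not redesigning it by hand.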
5. AE schema
- Some say that this cannot work, but
  - AE must be able to import any valid MAGE-ML and not lose information
  - good for dumping in MAGE-ML
  - good for navigating through data in terms of the object model
  - if some queries don't work well, add something to the schema
    - Experiment-Biomaterial, Experiment-Protocol links
  - so far works for 80 GB of data
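The idea of adding a direct link to the schema when a query performs poorly can be sketched as follows. This is an assumption-laden illustration using SQLite, not the ArrayExpress schema: the table names, the simplified Experiment → BioAssay → BioMaterial path and the `E-TEST-…` accessions are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE experiment  (id INTEGER PRIMARY KEY, accession TEXT);
CREATE TABLE bioassay    (id INTEGER PRIMARY KEY, experiment_id INTEGER);
CREATE TABLE biomaterial (id INTEGER PRIMARY KEY, name TEXT);
-- m..n link between bioassays and the biomaterials they were made from
CREATE TABLE bioassay_biomaterial (bioassay_id INTEGER, biomaterial_id INTEGER);

INSERT INTO experiment VALUES (1, 'E-TEST-1'), (2, 'E-TEST-2');
INSERT INTO bioassay VALUES (10, 1), (11, 1), (20, 2);
INSERT INTO biomaterial VALUES (100, 'liver sample'), (101, 'kidney sample');
INSERT INTO bioassay_biomaterial VALUES (10, 100), (11, 100), (20, 101);

-- Shortcut added to the schema: one direct row per (experiment, biomaterial).
CREATE TABLE experiment_biomaterial (experiment_id INTEGER, biomaterial_id INTEGER);
INSERT INTO experiment_biomaterial
SELECT DISTINCT b.experiment_id, l.biomaterial_id
FROM bioassay b JOIN bioassay_biomaterial l ON l.bioassay_id = b.id;
""")

# Navigating the full object-model path: three joins.
long_path = conn.execute("""
SELECT DISTINCT e.accession, m.name FROM experiment e
JOIN bioassay b ON b.experiment_id = e.id
JOIN bioassay_biomaterial l ON l.bioassay_id = b.id
JOIN biomaterial m ON m.id = l.biomaterial_id
ORDER BY e.accession, m.name""").fetchall()

# Same question answered through the shortcut link: two joins.
shortcut = conn.execute("""
SELECT e.accession, m.name FROM experiment e
JOIN experiment_biomaterial x ON x.experiment_id = e.id
JOIN biomaterial m ON m.id = x.biomaterial_id
ORDER BY e.accession, m.name""").fetchall()

print(shortcut)
```

The shortcut table is redundant with the normalised path, so it has to be refreshed whenever new data is loaded; the trade is extra load-time work for simpler queries.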
6. ArrayExpress purpose
- main objective: help in finding and initial exploration of data
  - download for detailed analysis
- data repository (now), data warehouse (in development)
7. Queries - logical structure
Organisation - name
Person - last name
Array Design - accession - name
Array
Experiment - accession
Hybridisation
Species
Sample
ExperimentType
Protocol Type
ExperimentDesign
ExperimentalFactor
Protocol - accession
8. ArrayExpress other technical details
- Data matrices - stored in NetCDF format
  - binary format for efficient storage of multidimensional arrays
- Arrays - stored as ADF spreadsheets (in addition to normal MAGE structures)
9. Data representation
[Figure panels: in MAGE/ArrayExpress; in typical data analysis programs]
10. ArrayExpress data export
BioAssayData2
11. Data export form
12. Array representation - ADF format
13. More complicated queries
- retrieve all experiments that use S. cerevisiae, in which any of the genes in my favorite pathway were down-regulated
- retrieve all experiments performed on mice, in any of the three tissues a, b, or c, where the fold change for gene x is greater than 2.0
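As a rough sketch of how the second query might look against a warehouse-style schema, the example below uses SQLite with a handful of made-up rows. The table and column names, the `experiments_with_fold_change` function and the `E-TEST-…` accessions are all hypothetical; the real ArrayExpress warehouse is described in this deck as being in development, and its schema is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE experiment (id INTEGER PRIMARY KEY, accession TEXT);
CREATE TABLE sample     (id INTEGER PRIMARY KEY, species TEXT, tissue TEXT);
CREATE TABLE bioassay   (id INTEGER PRIMARY KEY, experiment_id INTEGER, sample_id INTEGER);
CREATE TABLE expression (bioassay_id INTEGER, gene TEXT, fold_change REAL);

INSERT INTO experiment VALUES (1, 'E-TEST-1'), (2, 'E-TEST-2'), (3, 'E-TEST-3');
INSERT INTO sample VALUES (1, 'Mus musculus', 'a'),
                          (2, 'Mus musculus', 'd'),
                          (3, 'Saccharomyces cerevisiae', 'a');
INSERT INTO bioassay VALUES (1, 1, 1), (2, 2, 2), (3, 3, 3);
INSERT INTO expression VALUES (1, 'x', 3.1), (1, 'y', 0.4), (2, 'x', 5.0), (3, 'x', 2.5);
""")


def experiments_with_fold_change(conn, species, tissues, gene, threshold):
    """Accessions of experiments on `species`, in any of `tissues`,
    where `gene` has a fold change above `threshold`."""
    # One "?" per tissue so the values are bound, not spliced into the SQL.
    placeholders = ", ".join("?" for _ in tissues)
    sql = f"""
        SELECT DISTINCT e.accession
        FROM experiment e
        JOIN bioassay b   ON b.experiment_id = e.id
        JOIN sample s     ON s.id = b.sample_id
        JOIN expression v ON v.bioassay_id = b.id
        WHERE s.species = ? AND s.tissue IN ({placeholders})
          AND v.gene = ? AND v.fold_change > ?
        ORDER BY e.accession"""
    params = [species, *tissues, gene, threshold]
    return [row[0] for row in conn.execute(sql, params)]


hits = experiments_with_fold_change(conn, "Mus musculus", ["a", "b", "c"], "x", 2.0)
print(hits)  # ['E-TEST-1']
```

The query combines sample annotation (species, tissue) with expression values in one statement, which is the kind of question a repository organised purely around the object model answers less directly.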
14. ArrayExpress DataWarehouse - very simplified view
Design element/ gene
Sample
BioAssay (hyb)
Array design
Experiment
15. AE DW - some more details
[Diagram; labels only: gene name, bioassay type, Properties (repeated)]
16. Next Jamboree
- EBI, Hinxton (close to Cambridge, UK)
- December 1 to 5/6, 2003
- Main objective: ensure that MAGE-ML that comes from various sources is comparable
- Registration fee (for cost recovery)
- Participation mostly by invitation
17. Acknowledgements
- Gonzalo Garcia Lara - web interface
- Ahmet Oezcimen - DBA
- Anjan Sharma - curation tool
- Niran Abeygunawardena - webmaster
- Curation team
  - Helen Parkinson, Philippe Rocca-Serra, Ele Holloway, Susanna Sansone, Gaurab Mukherjee
- Expression Profiler team
  - Jaak Vilo, Misha Kapushesky, Patrick Kemmeren
- Alvis Brazma