Conclusions - PowerPoint PPT Presentation

Provided by: james295
Integration of mzXML and mzData Formats: Reference Implementation of Open-Source MS Data Interchange Conversion Software
Joshua M. Tasman (1), Eric W. Deutsch (1), James S. Eddes (1), David D. Shteynberg (1), Patrick G. A. Pedrioli (2), Jimmy K. Eng (1), Ruedi Aebersold (1)
(1) Institute for Systems Biology, Seattle, WA
(2) Institute for Molecular Systems Biology (ETH), Zurich, Switzerland
Integration with Existing Proteomics Pipeline
This poster presents work on converting raw data to XML formats. Once these XML files are available, only slight modifications to the existing open-source Trans-proteomic Pipeline tools are necessary: the tools rely on common parsers, RAMP (C) and JRAP (Java), which can be extended to support the new dataXML file format.
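A rough illustration of why a shared parser layer keeps downstream changes small: the sketch below dispatches on the root element of a file and hands back the same scan tuples whatever the format. It is written in Python for brevity and is not how RAMP or JRAP are implemented. The function names, the handler table, and the dataXML element names are all assumptions made up for this example (the dataXML draft is not reproduced here), and the toy documents omit the namespaces and metadata that real files carry.

```python
import xml.etree.ElementTree as ET

# Toy documents; real files carry XML namespaces and far more metadata.
MZXML_DOC = """<mzXML><msRun>
  <scan num="1" msLevel="1"/>
  <scan num="2" msLevel="2"/>
</msRun></mzXML>"""

# Hypothetical dataXML layout, invented for illustration only.
DATAXML_DOC = """<dataXML><spectrumList>
  <spectrum id="1" msLevel="1"/>
  <spectrum id="2" msLevel="2"/>
  <spectrum id="3" msLevel="2"/>
</spectrumList></dataXML>"""


def scans_from_mzxml(root):
    """Return (scan number, MS level) pairs from an mzXML-like tree."""
    return [(int(s.get("num")), int(s.get("msLevel"))) for s in root.iter("scan")]


def scans_from_dataxml(root):
    """Return (scan number, MS level) pairs from the hypothetical dataXML tree."""
    return [(int(s.get("id")), int(s.get("msLevel"))) for s in root.iter("spectrum")]


# Supporting a new format means registering one more handler here;
# callers of read_scans() do not change.
HANDLERS = {"mzXML": scans_from_mzxml, "dataXML": scans_from_dataxml}


def read_scans(xml_text):
    """Parse a document and dispatch on its root element name."""
    root = ET.fromstring(xml_text)
    try:
        handler = HANDLERS[root.tag]
    except KeyError:
        raise ValueError(f"unsupported format: {root.tag}")
    return handler(root)
```

Under these assumptions, a downstream tool would call only `read_scans()` and would not need to know which format it was given.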
Methods
While the project initially began as an update of existing C++ code, C# became the language of choice for the project. Several reasons informed this change. For one, the language has stronger automatic support for safety features such as garbage collection and array bounds checking. Secondly, C# provides facilities that ease the task of working with third-party code. .NET assemblies can of course be easily incorporated. For older interfaces, such as those provided through COM and DLLs, Microsoft IDE-provided tools can auto-generate bridging code to access them from C#.
The Thermo raw file format was chosen for the initial implementation simply because of familiarity with its application programming interface. In fact, the availability or lack of vendor support is the greatest issue facing expansion of the project. Because API styles differ greatly between vendors, fellow developers have questioned the efficiency of the adaptor design pattern used in this project.
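The adaptor pattern mentioned above can be sketched as follows. This is a minimal Python illustration, not the project's C# code: both vendor classes are hypothetical stand-ins with deliberately different call styles, and every class and method name is an assumption made for this example.

```python
from abc import ABC, abstractmethod


class VendorAApi:
    """Hypothetical vendor library: index-based calls, parallel arrays."""

    def get_num_spectra(self):
        return 2

    def get_mass_list(self, scan_number):
        mzs = [100.0 * scan_number, 200.0 * scan_number]
        intensities = [10.0, 20.0]
        return mzs, intensities


class VendorBApi:
    """Hypothetical vendor library: one record (dict) per scan, 0-based."""

    def scans(self):
        yield {"index": 0, "peaks": [(150.0, 5.0)]}
        yield {"index": 1, "peaks": [(250.0, 7.0), (350.0, 9.0)]}


class SpectrumReader(ABC):
    """Common interface the converter codes against."""

    @abstractmethod
    def spectra(self):
        """Yield (scan_number, [(mz, intensity), ...]) with 1-based numbers."""


class VendorAAdapter(SpectrumReader):
    def __init__(self, api):
        self._api = api

    def spectra(self):
        for n in range(1, self._api.get_num_spectra() + 1):
            mzs, intensities = self._api.get_mass_list(n)
            # Translate parallel arrays into (mz, intensity) pairs.
            yield n, list(zip(mzs, intensities))


class VendorBAdapter(SpectrumReader):
    def __init__(self, api):
        self._api = api

    def spectra(self):
        for record in self._api.scans():
            # Translate 0-based indices into 1-based scan numbers.
            yield record["index"] + 1, list(record["peaks"])


def count_peaks(reader):
    """Works with any SpectrumReader, regardless of vendor."""
    return sum(len(peaks) for _, peaks in reader.spectra())
```

In this sketch each adaptor performs one translation step per scan. That extra layer is presumably the kind of overhead the efficiency question refers to, weighed against keeping vendor-specific code out of the rest of the converter.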
Overview
We present a prototype open-source framework for converting vendor-specific raw MS/MS data files to open-source XML formats. The mzXML format (developed by SPC/ISB) and the PSI consortium's dataXML format are both target outputs. Currently, conversion is designed to accept Thermo's RAW data format, but the project is designed to be extendable to other input formats. The dataXML format is still in flux, but is nearing final ratification. Once this happens, with minor modifications to some supporting programs, data converted to dataXML format can be supported by the rest of the SPC/ISB's open-source Trans-proteomic Pipeline toolchain.
[Figure: integration with the existing pipeline. File formats: mzXML, mzData, dataXML. Parsers: RAMP, JRAP. Existing TPP pipeline tools: Prophets, web display/interaction, quantitation, etc. Legend: existing, modified, new.]
Conclusions
What's next?
Implementation of additional input formats
Additional vendor support: As vendors become more open with their APIs for accessing raw data, implementation of projects like this one can proceed much more easily. Additionally, thorough documentation can allow ...
Cross-platform support: If vendors move towards software libraries that operate entirely within the .NET Framework, and allow the required libraries to be copied, the code could be executed on Linux and Mac OS X platforms.
Motivation
Driven in large part by recent rapid advances in proteomics, the need for a vendor-independent means of accurate and robust representation and exchange of mass spectrometry data has become apparent. Two major formats have emerged: mzXML, developed at the Institute for Systems Biology (ISB) and highly integrated into the Trans-proteomic Pipeline (TPP) software tool chain, and mzData, developed by the HUPO Proteomics Standards Initiative (PSI) MS working group. Both the proteomics research community and instrument vendors would clearly benefit from a single standard. Recently, the PSI-MS group, the ISB, and instrument vendors collaborated to produce a draft specification for a unified data format, tentatively titled "dataXML", with the intention of combining the best features of the mzXML and mzData formats. For example, the dataXML format allows additional information not encoded in the XML schema to be included in the file through the use of supplemental controlled vocabularies. Here, we present work towards an open-source reference implementation for converters from raw data to both the mzXML and dataXML formats, which could be extended to other formats as well.
Software
Software will be available at http://tools.proteomecenter.org/PrototypeCSharpConverter.php
Converter software (conceptual design)
Readers: Thermo Reader (XCalibur), ABI/Sciex Reader (Analyst), Waters Reader (MassLynx), Agilent
Common Internal Model
Writers: dataXML writer (produces a dataXML instance document), mzXML writer (produces an mzXML instance document)
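The writer side of the conceptual design, in which one common internal model feeds several format writers, might look roughly like the Python sketch below. The `Run` and `Spectrum` classes, the writer functions, and the `convert` helper are assumptions for illustration, not the project's actual model. The mzXML-like output borrows only a few element and attribute names and leaves out peak data, and the dataXML layout is hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Spectrum:
    scan_number: int
    ms_level: int
    peaks: list = field(default_factory=list)  # [(mz, intensity), ...]


@dataclass
class Run:
    """Stand-in for the common internal model (hypothetical fields)."""
    source_file: str
    spectra: list = field(default_factory=list)


def write_mzxml_like(run):
    # Names loosely follow mzXML; peak encoding is omitted in this sketch.
    root = ET.Element("mzXML")
    ms_run = ET.SubElement(root, "msRun", scanCount=str(len(run.spectra)))
    for s in run.spectra:
        ET.SubElement(ms_run, "scan", num=str(s.scan_number),
                      msLevel=str(s.ms_level), peaksCount=str(len(s.peaks)))
    return ET.tostring(root, encoding="unicode")


def write_dataxml_like(run):
    # Hypothetical layout; the real dataXML draft is not reproduced here.
    root = ET.Element("dataXML")
    spectrum_list = ET.SubElement(root, "spectrumList",
                                  count=str(len(run.spectra)))
    for s in run.spectra:
        ET.SubElement(spectrum_list, "spectrum", id=str(s.scan_number),
                      msLevel=str(s.ms_level))
    return ET.tostring(root, encoding="unicode")


# Another writer would be added by registering one more function here.
WRITERS = {"mzXML": write_mzxml_like, "dataXML": write_dataxml_like}


def convert(run, target):
    """Serialize the same internal model to the requested target format."""
    return WRITERS[target](run)
```

Because every writer consumes the same model, adding an input format and adding an output format stay independent tasks in this arrangement.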
References
mzXML: A common open representation of mass spectrometry data and its application to proteomics research. Nat Biotechnol. 2004 Nov;22(11):1459-66.
PSI-MS: Mass Spectrometry Standards Working Group, http://psidev.info/
[Conceptual design diagram, further labels: Bruker, ... (others), other writer?]