Title: Organizing MS/MS Proteomic Data for Publication with Scaffold
1. Organizing MS/MS Proteomic Data for Publication with Scaffold
- Brian C. Searle
- Proteome Software Inc.
- Portland, Oregon
- ABRF2009, Memphis TN
- February 10th, 2009
Creative Commons Attribution
2. Publication Standards
- In 2006, Molecular & Cellular Proteomics (MCP) published guidelines for reporting peptide and protein identifications
- Other proteomics journals have adopted similar standards
4. What do the Guidelines Do?
- The guidelines ensure there is enough information to
  - Understand and critically assess the results
  - Enforce a minimum level of reliability
  - Provide enough data concerning potentially questionable results to allow for reassessment
- Publication guidelines have played a critical role in the acceptance of proteomic analysis
8. Publication Standards are Hard!
- They're hard for
  - the authors who have to comply
  - the reviewers who must police compliance
  - the journals, because there's a huge amount of supplemental data in non-standard formats
    - PowerPoint guarantees readability, but is difficult to use
    - RAW file formats can expire, but are much more useful
9. Publication Standards are Hard!
- They're hard for
  - the authors who have to comply
  - the reviewers who must police compliance
  - the journals, because there's a huge amount of supplemental data in non-standard formats
- But they don't have to be!
10. Scaffold Goals
- Make it easier to organize data
- Make tables useful for publication
- Make figures that could be dropped directly into manuscripts
- A clear fit when the guidelines were announced
12. Scaffold Makes the Guidelines Easier for Authors
- Collate relevant data from search engine files
  - Most search engines supported: Mascot, SEQUEST, X! Tandem, Phenyx, SpectrumMill, OMSSA, IdentityE
- Collapse related parameters across all files
- Clearly point out missing metadata
- Produce comparable probabilities that are independent of search engine and instrument
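As a rough illustration of what collapsing related parameters across files could look like, the sketch below gathers the values each search parameter takes across several result files, then separates the parameters that agree from those that conflict or are absent in some file. The file names, parameter names, and both helper functions are hypothetical; this is not Scaffold's implementation.

```python
from collections import defaultdict


def collapse_parameters(files):
    """Map each parameter name to the set of values seen across all files.

    `files` maps a file name to a dict of search parameters (hypothetical layout).
    A value of None records that a file did not report the parameter.
    """
    all_names = set()
    for params in files.values():
        all_names.update(params)

    collapsed = defaultdict(set)
    for params in files.values():
        for name in all_names:
            collapsed[name].add(params.get(name))
    return dict(collapsed)


def summarize(collapsed):
    """Split parameters into consistent, conflicting, and partly missing."""
    summary = {"consistent": {}, "conflicting": {}, "missing": []}
    for name, values in sorted(collapsed.items()):
        if None in values:
            summary["missing"].append(name)
        reported = values - {None}
        if len(reported) == 1:
            summary["consistent"][name] = next(iter(reported))
        elif len(reported) > 1:
            summary["conflicting"][name] = sorted(reported)
    return summary


# Hypothetical parameters parsed from two search result files.
example_files = {
    "run1.dat": {"enzyme": "Trypsin", "missed_cleavages": 2, "fragment_tol_da": 0.5},
    "run2.dat": {"enzyme": "Trypsin", "missed_cleavages": 1},
}
report = summarize(collapse_parameters(example_files))
print(report)
```

Here a parameter that every file agrees on can be reported once, while conflicting or missing values are surfaced for the author to resolve before submission.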
13. Collate Relevant Data
- Peak picking software, version, altered parameters
- Database selection
  - Database name and version
  - Species restriction
  - Number of proteins searched
- Database search parameters
  - Search engine name and version
  - Enzyme specificity
  - Missed cleavages
  - Fixed/variable modifications
  - Mass tolerances
- Peptide selection criteria
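The checklist above can be sketched as a simple record type, which also makes it easy to point out which items are still missing. This is a minimal illustration only: the `SearchMetadata` class, its field names, and the placeholder values are assumptions made for the example, not a format that Scaffold or any journal defines.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class SearchMetadata:
    """Metadata items from the checklist; field names are illustrative."""
    peak_picking_software: Optional[str] = None
    database_name: Optional[str] = None
    database_version: Optional[str] = None
    species_restriction: Optional[str] = None
    proteins_searched: Optional[int] = None
    search_engine: Optional[str] = None
    search_engine_version: Optional[str] = None
    enzyme: Optional[str] = None
    missed_cleavages: Optional[int] = None
    fixed_modifications: Optional[str] = None
    variable_modifications: Optional[str] = None
    precursor_tolerance: Optional[str] = None
    fragment_tolerance: Optional[str] = None
    peptide_selection_criteria: Optional[str] = None


def missing_fields(meta):
    """Return the names of every field that was left unset (None)."""
    return [f.name for f in fields(meta) if getattr(meta, f.name) is None]


meta = SearchMetadata(
    database_name="ExampleDB",  # placeholder values, not real data
    search_engine="Mascot",
    enzyme="Trypsin",
    missed_cleavages=2,
)
print(missing_fields(meta))
```

Checking against `None` rather than truthiness matters here, since a legitimate value such as zero missed cleavages should not be reported as missing.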
16. Collate Relevant Data
- This button gets
  - Protein accession numbers
  - Number of unique peptides
  - Sequence coverage
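For readers unfamiliar with the last two items, the sketch below shows one common way to compute them: unique peptides counted as distinct peptide sequences, and sequence coverage as the fraction of protein residues covered by at least one identified peptide. The protein sequence and peptides are made up, and the `sequence_coverage` helper is hypothetical; Scaffold's exact definitions may differ.

```python
def sequence_coverage(protein, peptides):
    """Fraction of protein residues covered by at least one identified peptide."""
    covered = [False] * len(protein)
    for peptide in set(peptides):
        if not peptide:
            continue
        start = protein.find(peptide)
        while start != -1:  # mark every occurrence, including overlapping ones
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return sum(covered) / len(protein) if protein else 0.0


protein = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # made-up sequence
peptides = ["TAYIAK", "QISFVK", "TAYIAK"]      # one peptide identified twice
unique = len(set(peptides))                     # distinct sequences only
coverage = sequence_coverage(protein, peptides)
print(unique, round(coverage * 100, 1))
```

Repeated identifications of the same peptide add spectra but change neither the unique peptide count nor the coverage.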
17. Collate Relevant Data
- While this button gets
  - Peptide sequences identified
  - Precursor m/z and charge
  - Score and peptide probability
18. Collate Relevant Data
- One-hit wonders and modifications require validation
- Distribute the Scaffold file to reviewers
19. Collate Relevant Data
- Similar tables and reports for iTRAQ and TMT quantitative data in Scaffold Q
- That's it!
22. Making the Guidelines Easier for Reviewers to Police
- All metadata is guaranteed to be present
- Quick to review key proteins and spectra
- Easy to validate statistical assumptions
23. Probability Assumptions Usually Work
24. But Not Always!
27. Distributing Guideline-Compliant Data
- Journals can distribute Scaffold files containing the entire data set
- Authors can share their raw data
  - Peak list exports in a variety of formats (MGF, DTA, PKL, etc.)
- Journals are allowed to distribute the original version of the free viewer so the files can ALWAYS be opened!
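To give a sense of what a peak list export contains, the sketch below writes spectra in Mascot Generic Format (MGF), one of the formats named above. It is a minimal illustration assuming positively charged precursors; the `format_mgf` helper, the dictionary layout, and the example spectrum are hypothetical and unrelated to Scaffold's own exporter.

```python
def format_mgf(spectra):
    """Render spectra as Mascot Generic Format (MGF) text.

    Each spectrum is a dict with 'title', 'precursor_mz', 'charge' (positive
    integer), and 'peaks' as a list of (m/z, intensity) pairs.
    """
    lines = []
    for spec in spectra:
        lines.append("BEGIN IONS")
        lines.append(f"TITLE={spec['title']}")
        lines.append(f"PEPMASS={spec['precursor_mz']:.4f}")
        lines.append(f"CHARGE={spec['charge']}+")
        for mz, intensity in spec["peaks"]:
            lines.append(f"{mz:.4f} {intensity:.1f}")
        lines.append("END IONS")
        lines.append("")  # blank line between spectra
    return "\n".join(lines)


# Made-up spectrum for illustration only.
example_spectra = [
    {
        "title": "example_scan_1",
        "precursor_mz": 421.7585,
        "charge": 2,
        "peaks": [(175.119, 1200.0), (262.151, 850.5)],
    }
]
mgf_text = format_mgf(example_spectra)
print(mgf_text)
```

Because MGF is plain text, files like this remain readable without vendor software, which is part of what makes peak lists a practical complement to RAW files.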
28. Conclusions
- Scaffold makes it
  - Easier to meet data publication guidelines
  - Easier to review data and to police standards
  - Easier to safely distribute supplemental material in a useful format