Title:
1Proteomics Bioinformatics
MBI, Master's Degree Program in Helsinki, Finland
Lecture 3
9 May, 2007
Sophia Kossida, BRF, Academy of Athens, Greece
Esa Pitkänen, University of Helsinki, Finland
Juho Rouso, University of Helsinki, Finland
2Protein Identification with MS/MS
Proteins
Peptides
The peptides are ionized and separated according
to their m/z ratio. One ion of interest is
selected.
The selected parent ion is fragmented.
A mass spectrum of the parent ion's fragments is
acquired.
Database search (Sequest, Mascot) against
theoretical spectra
Protein identification
3Fragmentation of peptides
The peptide is a linear chain of amino acids
N-terminus
C-terminus
Ionisation and fragmentation
Modified from N. Edwards, University of
Maryland, College Park
4Fragmentation of peptides
The most commonly observed cleavage is of the
peptide bond between the carbonyl carbon and the
amide nitrogen. Cleaving this bond forms a y-ion
and a b-ion.
y-ion: the fragment in which the positive charge
is retained on the C-terminus of the original
peptide ion.
b-ion: the fragment in which the charge is
retained on the N-terminus.
5b- and y-ions
When singly charged ions fragment, either a b- or
a y-ion is formed. The other half of the peptide
is lost as a neutral fragment. Doubly charged ions
are most likely to carry their charges at opposite
ends of the molecule, so both b- and y-ions are
formed (twice as much information).
6More about fragmentation
- Amino-terminal fragments: a, b, c
- Carboxy-terminal fragments: x, y, z
- Various side-chain reactions
- The fragments can themselves fragment further,
  especially if they have a mobile H; for example,
  b-ions fragment to a-ions
7Fragment spectrum
The sequence S-G-F-L-E-E-D-E-L-K will show a
spectrum like this:
y-ions  1166 1080 1022  875  762  633  504  389  260  147
residue    S    G    F    L    E    E    D    E    L    K
b-ions    88  145  292  405  534  663  778  907 1020 1166
Modified from N. Edwards, University of
Maryland, College Park
8Fragmentation of sequence
Peptide S-G-F-L-E-E-D-E-L-K
Mw    ion  N-terminal fragment  C-terminal fragment  ion  Mw
88    b1   S                    GFLEEDELK            y9   1080
145   b2   SG                   FLEEDELK             y8   1022
292   b3   SGF                  LEEDELK              y7   875
405   b4   SGFL                 EEDELK               y6   762
534   b5   SGFLE                EDELK                y5   633
663   b6   SGFLEE               DELK                 y4   504
778   b7   SGFLEED              ELK                  y3   389
907   b8   SGFLEEDE             LK                   y2   260
1020  b9   SGFLEEDEL            K                    y1   147
Modified from N. Edwards, University of
Maryland, College Park
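The b- and y-ion ladder for S-G-F-L-E-E-D-E-L-K can be reproduced with a few lines of code. The following is a minimal sketch, assuming singly charged fragments and standard monoisotopic residue masses; the function name `by_ladder` is hypothetical. Because the slide values are rounded, the computed m/z values differ from them by a fraction of a dalton.

```python
# Monoisotopic residue masses in Da (standard values; only the residues used here).
RESIDUE_MASS = {
    "S": 87.03203, "G": 57.02146, "F": 147.06841, "L": 113.08406,
    "E": 129.04259, "D": 115.02694, "K": 128.09496,
}
PROTON = 1.00728   # mass of the ionizing proton
WATER = 18.01056   # H2O retained by C-terminal (y) fragments


def by_ladder(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    # b_i: the first i residues plus a proton
    b_ions = [sum(masses[:i]) + PROTON for i in range(1, n)]
    # y_i: the last i residues plus water and a proton
    y_ions = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b_ions, y_ions


b_ions, y_ions = by_ladder("SGFLEEDELK")
for i, (b, y) in enumerate(zip(b_ions, y_ions), start=1):
    print(f"b{i} {b:8.2f}    y{i} {y:8.2f}")
```

Here b1 pairs with y9, b2 with y8, and so on: each pair splits the peptide at the same bond.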
9Peptide identification
Input: the mass of the parent ion, and the MS/MS
spectrum
Methods: de novo interpretation, sequence database
search
Output: the amino-acid sequence of the peptide
10De Novo spectrum
y-ions  1166 1080 1022  875  762  633  504  389  260  147
residue    S    G    F    L    E    E    D    E    L    K
The distance between y5 and y4 is 129 amu,
corresponding to the residue mass of glutamic acid
(E). The gap between y9 and y8 is 58 amu in these
rounded values, close to the 57 amu residue mass
of glycine (G).
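Reading residues off the gaps between neighbouring y-ions can be sketched in code. This is an illustration only, not how any particular de novo tool works: it assumes a complete, already identified y-ion ladder, uses standard monoisotopic residue masses, and picks the nearest residue within a hypothetical tolerance of 1.5 Da (loose, because the slide values are rounded). Isoleucine is left out because it has the same mass as leucine. The last residue (K) sits in y1 itself and is not resolved here.

```python
# Monoisotopic residue masses in Da (standard values).
# Isoleucine is omitted: it is isobaric with leucine and cannot be told apart by mass.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def nearest_residue(gap, tolerance=1.5):
    """Return the residue whose mass is closest to the gap, or '?' if none is close."""
    best = min(RESIDUE_MASS, key=lambda aa: abs(RESIDUE_MASS[aa] - gap))
    return best if abs(RESIDUE_MASS[best] - gap) <= tolerance else "?"


def read_ladder(y_peaks):
    """Translate gaps between consecutive y-ions (largest first) into residues."""
    peaks = sorted(y_peaks, reverse=True)
    return "".join(nearest_residue(hi - lo) for hi, lo in zip(peaks, peaks[1:]))


y_ions = [1166, 1080, 1022, 875, 762, 633, 504, 389, 260, 147]
print(read_ladder(y_ions))  # SGFLEEDEL
```

Several residues lie within about 1 Da of each other (K, Q and E; N and D), so with real, noisy data the nearest-mass choice is often ambiguous.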
11De Novo sequencing
Input: experimental MS spectrum
- Identify the b- and y-ions in the spectrum.
- Find valid peak pairs.
- Search for alignment: examine all combinations
  of MS spectrum peak intervals, and the protein
  fragments the intervals may represent, and
  construct a most likely sequence.
http://gridweb.cti.depaul.edu/twiki/pub/Main/GilKwak/HypotheticalSeqAnalyzer.pdf
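One possible reading of "valid peak pairs" is complementary b/y pairs: a singly charged b-ion and the y-ion from the same cleavage together contain the whole peptide plus two protons, so their m/z values sum to the parent MH+ plus one proton mass. The sketch below is written under that assumption; the slide does not define the term, and the function name and the 1 Da tolerance are hypothetical.

```python
from itertools import combinations

PROTON = 1.00728  # proton mass in Da


def complementary_pairs(peaks, parent_mh, tolerance=1.0):
    """Return (low, high) peak pairs whose m/z values sum to parent MH+ plus a proton."""
    target = parent_mh + PROTON
    return [
        (lo, hi)
        for lo, hi in combinations(sorted(peaks), 2)
        if abs(lo + hi - target) <= tolerance
    ]


# Unlabelled fragment peaks of S-G-F-L-E-E-D-E-L-K (rounded slide values).
peaks = [88, 145, 292, 405, 534, 663, 778, 907, 1020,
         1080, 1022, 875, 762, 633, 504, 389, 260, 147]
for lo, hi in complementary_pairs(peaks, parent_mh=1166):
    print(lo, hi)
```

For this peak list the check recovers nine pairs, one per peptide bond. It does not say which member of a pair is the b-ion and which the y-ion.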
12Advantage/Disadvantage (de novo)
Advantage:
- Gets sequences that are not necessarily in the
  database
Disadvantages:
- Requires high quality data
- The best de novo interpretation may have no
  biological relevance
- Not well suited for high throughput workflow
- Difficulty in detecting post-translational
  modifications and wild-type mutants
- Incomplete ladders create ambiguity
Modified from http://gridweb.cti.depaul.edu/twiki/pub/Main/GilKwak/HypotheticalSeqAnalyzer.pdf
13De Novo Software
http://www.hairyfatguy.com/lutefisk/
http://www.bioinfor.com:8080/peaksonline/
14Sequence Database Search
Sequence database: AVAGCAGARCVAAGAAGRVGGACAAAR..
Input: experimental fragmentation spectrum with
precursor mass and charge state (MH+ 775.8)
- Select peptides from the database whose mass
  equals the input mass, to get candidate sequences.
- Theoretically fragment the candidate peptides to
  generate virtual MS/MS spectra (theoretical
  spectra).
- Compare the virtual spectra to the real spectrum.
- Compute correlation scores.
- Rank hits.
- Peptide/protein validation.
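The search steps (filter candidates by precursor mass, generate theoretical fragments, score, rank) can be sketched as a toy example. This is not the Sequest or Mascot algorithm: the score is a plain shared-peak count, the tolerances are hypothetical, and the candidate list stands in for a real sequence database. The second candidate is a permutation of the first, so it passes the precursor filter but matches fewer fragments.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON, WATER = 1.00728, 18.01056


def mh_plus(peptide):
    """Singly protonated peptide mass (MH+)."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER + PROTON


def theoretical_fragments(peptide):
    """Singly charged b- and y-ion m/z values (the 'virtual spectrum')."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    n = len(masses)
    b = [sum(masses[:i]) + PROTON for i in range(1, n)]
    y = [sum(masses[n - i:]) + WATER + PROTON for i in range(1, n)]
    return b + y


def shared_peak_count(peptide, observed, tolerance=1.0):
    """Number of theoretical fragments that have an observed peak within tolerance."""
    return sum(
        any(abs(frag - peak) <= tolerance for peak in observed)
        for frag in theoretical_fragments(peptide)
    )


def search(candidates, observed, precursor_mh, precursor_tol=1.0):
    """Keep candidates matching the precursor mass, then rank them by score."""
    hits = [
        (pep, shared_peak_count(pep, observed))
        for pep in candidates
        if abs(mh_plus(pep) - precursor_mh) <= precursor_tol
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


observed = [88, 145, 292, 405, 534, 663, 778, 907, 1020,
            1080, 1022, 875, 762, 633, 504, 389, 260, 147]
candidates = ["AVAGCAGAR", "SGFLEDEELK", "SGFLEEDELK"]
for peptide, score in search(candidates, observed, precursor_mh=1166.0):
    print(peptide, score)
```

Real search engines replace the shared-peak count with statistical or correlation-based scores and add a validation step on top of the ranking.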
15Amino acids
16Sequence Database Search
Modified from Jimmy Eng, MS/MS Database
Searching, http://tools.proteomecenter.org/course/lectures/0610Day1.Eng.pdf
17Advantages / Disadvantages
- No need for complete ladders
- All candidates have some biological relevance
- Proteins with lots of identified peptides are
  not more likely to be present
- Practical for high throughput peptide
  identification
- Incomplete databases
- Poor quality of fragmentation
18Software tools for MS/MS identification
The outcome of these programs depends on the
quality of the MS/MS data obtained and the
completeness and accuracy of the database used.
- Sequest
- Mascot (Matrix Science)
- OMSSA (NCBI)
- X!Hunter (Global Proteome Machine Organization)
19Sequest
Commercially available, distributed by Finnigan
Corp.
Developed by Jimmy Eng and John Yates.
Correlates uninterpreted tandem mass spectra of
peptides with amino acid sequences from protein
and nucleotide databases.
Determines the amino acid sequence and thus the
protein(s) and organism(s) that correspond to the
mass spectrum being analyzed.
http://fields.scripps.edu/sequest/
20Monoisotopic vs. average mass
Modified from Jimmy Eng, MS/MS Database
Searching, http://tools.proteomecenter.org/course/lectures/0610-Day1.Eng.pdf
21Missed cleavage site
22Parameters of MS/MS id search
Modifications
- Cysteine is almost always modified
- Variable modifications increase search time
  exponentially
- Basic residues (K, R) at the C-terminus attract
  the ionizing charge, leading to strong y-ions
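Why variable modifications are costly can be seen by enumerating the forms of one peptide: each modifiable site is either modified or not, so a peptide with n such sites has 2^n forms to score. The sketch below uses methionine oxidation (+15.9949 Da) as an assumed example of a variable modification; the function name and the example peptide are hypothetical.

```python
from itertools import product

OXIDATION = 15.9949  # mass shift of methionine oxidation in Da


def modified_forms(peptide, residue="M", delta=OXIDATION):
    """List every on/off combination of a variable modification.

    Each form is (modified_positions, total_mass_shift).
    """
    sites = [i for i, aa in enumerate(peptide) if aa == residue]
    forms = []
    for choice in product((False, True), repeat=len(sites)):
        modified = tuple(s for s, on in zip(sites, choice) if on)
        forms.append((modified, round(delta * len(modified), 4)))
    return forms


for positions, shift in modified_forms("MAMKM"):
    print(positions, shift)
print(len(modified_forms("MAMKM")))  # 8 forms for 3 methionines
```

A fixed modification, such as the one usually applied to cysteine, changes the residue mass once and adds no extra forms.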
Digestion
- Enzyme: trypsin (specific)
- Non-tryptic searches increase search time by two
  orders of magnitude
- Large sequence databases contain many irrelevant
  peptide candidates
http://www.matrixscience.com/
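An in-silico tryptic digest with a missed-cleavage setting can be sketched as follows. The cleavage rule is an assumption not stated on the slides: cut after K or R unless the next residue is P, a common simplification. The function names are hypothetical, and the input is only the visible part of the example sequence from the database search slide.

```python
def cleavage_fragments(protein):
    """Split a protein after every K or R that is not followed by P."""
    fragments, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            fragments.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        fragments.append(protein[start:])  # C-terminal remainder
    return fragments


def digest(protein, missed_cleavages=0):
    """Return peptides that span up to `missed_cleavages` uncut sites."""
    frags = cleavage_fragments(protein)
    peptides = []
    for i in range(len(frags)):
        last = min(i + missed_cleavages, len(frags) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(frags[i:j + 1]))
    return peptides


protein = "AVAGCAGARCVAAGAAGRVGGACAAAR"
print(digest(protein))                      # fully cleaved peptides
print(digest(protein, missed_cleavages=1))  # also peptides with one missed site
```

Allowing missed cleavages enlarges the candidate list, which is one reason the enzyme settings affect search time.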
23Mascot MS/MS ion search
24MS/MS ion search result
It is the ions scores for individual peptide
matches that are statistically significant
25Peptide summary
The proteins are listed by descending score, each
with a table summarising the matched peptides.
Protein view
- Experimental m/z value
- Calculated relative molecular mass
- Expectation value for the peptide match: the
  number of times we would expect to obtain an
  equal or higher score purely by chance. The
  lower this value, the more significant the
  result.
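The link between a score and an expectation value can be illustrated numerically. The sketch below assumes the relations given in the Matrix Science help pages as far as they are recalled here: an ions score of -10*log10(P), where P is the probability that the match is a random event, and an expectation value derived from the score and a significance threshold. Treat both formulas and the function names as assumptions to check against the Mascot version in use.

```python
import math


def ions_score(p_random):
    """Score for a match with probability p_random of being a random event."""
    return -10.0 * math.log10(p_random)


def expectation_value(score, threshold_score, p_threshold=0.05):
    """Expected number of matches scoring this high or higher purely by chance.

    threshold_score is the score at which the match is significant at p_threshold.
    """
    return p_threshold * 10 ** ((threshold_score - score) / 10.0)


print(ions_score(0.001))
print(expectation_value(score=54, threshold_score=44))
```

Under these assumptions every 10 score units above the threshold lower the expectation value by a factor of ten.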
26Peptide view
27Peptide fragmentation of APGFGDNR
Matches (bold red): 9/58 fragment ions using the
18 most intense peaks
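Restricting the comparison to the most intense peaks, as in the "18 most intense peaks" above, can be sketched like this. The peak list and its intensities are made-up illustration data, and the function names and the 0.5 Da tolerance are hypothetical; the three fragment values are the y1 to y3 m/z values of APGFGDNR computed from standard residue masses.

```python
def top_peaks(spectrum, n=18):
    """Keep the n most intense (m/z, intensity) peaks, returned sorted by m/z."""
    strongest = sorted(spectrum, key=lambda peak: peak[1], reverse=True)[:n]
    return sorted(strongest)


def count_matches(fragments, peaks, tolerance=0.5):
    """Count theoretical fragments that have a retained peak within tolerance."""
    return sum(
        any(abs(frag - mz) <= tolerance for mz, _ in peaks)
        for frag in fragments
    )


# Hypothetical spectrum: (m/z, intensity) pairs.
spectrum = [(175.1, 80.0), (289.2, 5.0), (404.2, 60.0), (512.3, 2.0), (620.4, 40.0)]
fragments = [175.12, 289.16, 404.19]  # y1, y2, y3 of APGFGDNR

kept = top_peaks(spectrum, n=3)
print(f"{count_matches(fragments, kept)}/{len(fragments)} fragment ions matched")
```

Keeping fewer peaks removes noise but can also drop weak true fragments, as with the low-intensity y2 peak in this made-up spectrum.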
28OMSSA
http://pubchem.ncbi.nlm.nih.gov/omssa/
29GPM X
(The Global Proteome Machine Organization),
http://www.thegpm.org