A New Approach to Protein Identification
Lecture Notes in Computer Science
Advances in tandem mass-spectrometry (MS/MS) steadily increase the rate of generation of MS/MS spectra and make it more computationally challenging to analyze such huge datasets. As a result, the existing approaches that compare spectra against databases are already facing a bottleneck, particularly when interpreting spectra of post-translationally modified peptides. In this paper we introduce a new idea that allows one to perform MS/MS database search . . . without ever comparing a spectrum against a database. The idea has two components: experimental and computational. Our experimental idea is counter-intuitive: we propose to intentionally introduce chemical damage to the sample. Although it does not appear to make any sense from the experimental perspective, it creates a large number of "spectral pairs" that, as we show below, open up computational avenues that were never explored before. Pairing a spectrum of a modified peptide with a spectrum of an unmodified peptide allows one to separate the prefix and suffix ladders, to greatly reduce the number of noise peaks, and to generate a small number of peptide reconstructions that are very likely to contain the correct one. The MS/MS database search is thus reduced to extremely fast pattern matching (rather than time-consuming matching of spectra against databases). In addition to speed, our approach provides a new paradigm for identifying post-translational modifications.

1 Hunyadi-Goulyas and Medzihradszky (2004) give a table of over 30 common chemical adducts that are currently viewed as annoyances.

2 Probably the easiest way to chemically damage the sample is to warm it up in urea solution, or simply to bring it to a mildly acidic pH and add a hefty concentration of hydrogen peroxide. See Levine et al. (1996) for an example of a slightly more involved protocol that generates samples with a desired extent of oxidation in a controlled fashion. Also, to create a mixture of modified and unmodified peptides, one can split the sample in half, chemically damage one half, and combine both halves again.

3 We remark that the Peptide Sequence Tag approach reduces the number of considered peptides but does not eliminate the need to match spectra against the filtered database. For example, Tanner et al. (2005) describe a dynamic programming approach for matching spectra against a filtered database.
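
To make the pattern-matching step concrete, the following is a minimal Python sketch of how a handful of peptide reconstructions might be located in a protein database by plain substring search, with no spectrum-to-database comparison. It is an illustration under stated assumptions, not the implementation used in this work: the function names, the toy protein identifiers, and the sequences are all hypothetical, and the sketch assumes exact matching after collapsing isoleucine (I) and leucine (L), which have identical mass and cannot be told apart from peak positions alone.

```python
from typing import Dict, Iterable, List, Tuple


def normalize(seq: str) -> str:
    """Collapse I into L, since the two residues have identical mass.

    Assumption of this sketch: no other mass ambiguities are handled.
    """
    return seq.upper().replace("I", "L")


def match_reconstructions(
    reconstructions: Iterable[str],
    proteins: Dict[str, str],
) -> List[Tuple[str, str, int]]:
    """Return (reconstruction, protein_id, offset) for every exact hit.

    Hypothetical helper: each candidate reconstruction is searched as a
    substring of each protein sequence, so no spectrum is ever scored
    against the database.
    """
    normalized = {pid: normalize(seq) for pid, seq in proteins.items()}
    hits = []
    for rec in reconstructions:
        pattern = normalize(rec)
        for pid, seq in normalized.items():
            start = seq.find(pattern)
            while start != -1:  # report every occurrence, not only the first
                hits.append((rec, pid, start))
                start = seq.find(pattern, start + 1)
    return hits


# Toy data, invented for illustration only.
proteins = {"P1": "MKTAYIAKQRQISFVK", "P2": "GGSLLEAKR"}
reconstructions = ["AYLAK", "SIIEAK", "WWWW"]
hits = match_reconstructions(reconstructions, proteins)
```

With this toy input, the first two reconstructions are each found once (despite I/L differences) and the third yields no hit. A realistic database would likely call for an indexed or multi-pattern search (for example, a suffix array or an Aho-Corasick automaton) rather than repeated linear scans, but the principle of matching strings instead of spectra is the same.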