A Statistical Technique for Monoisotopic Peak Detection in a Mass Spectrum

Mourad Atlas, Susmita Datta
2009 Journal of Proteomics & Bioinformatics  
Mass spectrometry has emerged as a core technology for high-throughput proteomic profiling, and it has enormous potential in biomedical research. However, the complexity of the data poses new statistical challenges for the analysis. Statistical methods and software development for analyzing proteomic data are likely to remain a major area of research in the coming years. In this paper, a novel statistical method for analyzing high-dimensional MALDI-TOF mass-spectrometry data in proteomic research is proposed. Chemical knowledge of the isotopic distribution of peptide molecules, together with quantitative modeling, is used to detect chemically valuable peaks in each spectrum. More specifically, a mixture of location-shifted Poisson distributions is fitted to the deamidated isotopic distribution of a peptide molecule. Maximum likelihood estimation by the expectation-maximization (EM) technique is used to estimate the parameters of the distribution. A formal statistical test is then constructed to determine whether a cluster of consecutive features (intensity values) in a mass spectrum corresponds to a true isotopic pattern. In this way, the monoisotopic peaks in an individual spectrum are identified. The performance of the method is examined through extensive simulations. We also provide a numerical illustration of our method on a real dataset and compare it with an existing method of peak detection. External biochemical validation of the detected peaks is provided.
Research Article, JPB/Vol.2/May 2009. JPB, an open access journal.
[...] m/z values and the intensities y's are influenced by confounding factors. [...] Breen et al. (2000, 2003) utilized existing database knowledge to establish a linear equation between M the mean of a [...]
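The abstract describes fitting a mixture of location-shifted Poisson distributions by EM. The following is a minimal sketch of that general technique only, not the authors' implementation: it assumes integer-valued observations and known component shifts, and the function names (`shifted_poisson_pmf`, `em_shifted_poisson_mixture`) are hypothetical. The paper's actual parameterization and its treatment of intensity values are not given in the abstract.

```python
import math


def shifted_poisson_pmf(x, lam, shift):
    """P(X = x) when X - shift ~ Poisson(lam); zero below the shift."""
    k = x - shift
    if k < 0:
        return 0.0
    # Work on the log scale to avoid overflow in lam**k / k!
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def em_shifted_poisson_mixture(xs, shifts, n_iter=200):
    """EM for a mixture of location-shifted Poissons.

    Assumption (not from the source): the shifts are known and fixed,
    and every observation is >= the smallest shift.
    Returns (mixing weights, Poisson rates).
    """
    n_comp = len(shifts)
    weights = [1.0 / n_comp] * n_comp
    lams = [1.0] * n_comp  # arbitrary starting rates
    for _ in range(n_iter):
        # E-step: posterior probability of each component per observation
        resp = []
        for x in xs:
            p = [w * shifted_poisson_pmf(x, lam, s)
                 for w, lam, s in zip(weights, lams, shifts)]
            total = sum(p)
            resp.append([v / total for v in p])
        # M-step: weighted updates of mixing weights and rates
        for k in range(n_comp):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / len(xs)
            if nk > 0:
                rate = sum(r[k] * (x - shifts[k])
                           for r, x in zip(resp, xs)) / nk
                lams[k] = max(rate, 1e-9)  # keep log(lam) defined
    return weights, lams
```

With a single component, the rate estimate reduces to the sample mean minus the shift, which is a convenient sanity check. Estimating the shifts themselves, or building the formal test mentioned in the abstract, would need further modeling that this sketch does not attempt.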
doi:10.4172/jpb.1000078 fatcat:5gyf6u6wobdh3bh57kcxhtopiy