Assessing multiple evidence streams to decide on confidence for identification of post-translational modifications, within and across data sets [article]

Oscar M Camacho, Kerry Ramsbottom, Andrew Collins, Andrew R Jones
2022 bioRxiv pre-print
Phosphorylation is a post-translational modification of great interest to researchers due to its relevance in many biological processes. LC-MS/MS techniques have enabled high-throughput data acquisition, with studies claiming identification and localisation of thousands of phosphosites. The identification and localisation of phosphosites emerge from different analytical pipelines and scoring algorithms, with uncertainty embedded throughout the pipeline. For many pipelines and algorithms, ... y thresholding is used, but little is known about the actual global false localisation rate in these studies. Recently, the use of decoy amino acids has been suggested to estimate global false localisation rates of phosphosites amongst the peptide-spectrum matches reported. Here we describe a simple pipeline that aims to maximise the information extracted from these studies by objectively collapsing from peptide-spectrum match to peptidoform-site level, and by combining findings from multiple studies while keeping track of false localisation rates. We show that the approach is more effective than current processes that use a simpler mechanism for handling phosphosite identification redundancy within and across studies. In our case study using 8 rice phosphoproteomics data sets, 6,368 unique sites were confidently identified using our decoy approach, compared to 4,687 using traditional thresholding in which false localisation rates are unknown.
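The abstract does not give the formula behind the decoy-based estimate, so the following is only a minimal sketch of the general idea, written in the style of a generic target-decoy calculation rather than as the authors' pipeline. It assumes each site-level hit carries a localisation score (higher is better) and a flag saying whether the modification was placed on a decoy amino acid. The names `SiteHit`, `estimate_flr`, `count_sites_at_flr` and the `target_to_decoy_ratio` parameter are hypothetical; the ratio stands in for whatever correction is needed when target and decoy residues do not occur equally often.

```python
from dataclasses import dataclass


@dataclass
class SiteHit:
    score: float    # localisation score; higher means more confident (assumed)
    is_decoy: bool  # True if the site was localised to a decoy amino acid


def estimate_flr(hits, target_to_decoy_ratio=1.0):
    """Rank hits by score (descending) and attach a running FLR estimate.

    At each rank the estimate is (decoys so far * ratio) / (targets so far),
    capped at 1.0. This formula is an assumption made for illustration.
    """
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    decoys = 0
    targets = 0
    result = []
    for hit in ranked:
        if hit.is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets:
            flr = min(decoys * target_to_decoy_ratio / targets, 1.0)
        else:
            flr = 1.0
        result.append((hit, flr))
    return result


def count_sites_at_flr(hits, max_flr, target_to_decoy_ratio=1.0):
    """Count target sites in the largest top-ranked set whose FLR <= max_flr."""
    best = 0
    targets = 0
    for hit, flr in estimate_flr(hits, target_to_decoy_ratio):
        if not hit.is_decoy:
            targets += 1
        if flr <= max_flr:
            best = targets
    return best


# Toy data, invented purely to exercise the functions.
hits = [
    SiteHit(0.99, False),
    SiteHit(0.95, False),
    SiteHit(0.90, False),
    SiteHit(0.80, True),
    SiteHit(0.70, False),
    SiteHit(0.60, True),
]
sites_at_5_percent = count_sites_at_flr(hits, 0.05)
```

Unlike a fixed score cut-off, this kind of ranking lets the accepted set be chosen by the estimated error rate itself, which is the contrast the abstract draws with traditional thresholding.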
doi:10.1101/2022.12.15.520504 fatcat:lyumc6v3tngulercwrocpbkxza