A mass-tolerant database search identifies a large proportion of unassigned spectra in shotgun proteomics as modified peptides

2015 ◽  
Vol 33 (7) ◽  
pp. 743-749 ◽  
Author(s):  
Joel M Chick ◽  
Deepak Kolippakkam ◽  
David P Nusinow ◽  
Bo Zhai ◽  
Ramin Rad ◽  
...  
2015 ◽  
Vol 33 (8) ◽  
pp. 882-882 ◽  
Author(s):  
Joel M Chick ◽  
Deepak Kolippakkam ◽  
David P Nusinow ◽  
Bo Zhai ◽  
Ramin Rad ◽  
...  

2020 ◽  
Author(s):  
John T. Halloran ◽  
Gregor Urban ◽  
David Rocke ◽  
Pierre Baldi

Abstract: Semi-supervised machine learning post-processors critically improve peptide identification from shotgun proteomics data. Such post-processors accept the peptide-spectrum matches (PSMs) and feature vectors resulting from a database search, train a machine learning classifier, and recalibrate PSMs using the trained parameters, often yielding significantly more identified peptides across q-value thresholds. However, current state-of-the-art post-processors rely on shallow machine learning methods, such as support vector machines. In contrast, the powerful training capabilities of deep learning models have displayed superior performance to shallow models in an ever-growing number of other fields. In this work, we show that deep models significantly improve the recalibration of PSMs compared to the most accurate and widely used post-processors, such as Percolator and PeptideProphet. Furthermore, we show that deep learning is able to adaptively analyze complex datasets and features for more accurate universal post-processing, leading to both improved Prosit analysis and markedly better recalibration of recently developed database-search functions.


PROTEOMICS ◽  
2012 ◽  
Vol 12 (22) ◽  
pp. 3403-3406 ◽  
Author(s):  
Abdulqader A. Alhaider ◽  
Nervana Bayoumy ◽  
Evelyn Argo ◽  
Abdel G. M. A. Gader ◽  
David A. Stead

2018 ◽  
Author(s):  
Andy Lin ◽  
J. Jeffry Howbert ◽  
William Stafford Noble

Abstract: To achieve accurate assignment of peptide sequences to observed fragmentation spectra, a shotgun proteomics database search tool must make good use of the very high resolution information produced by state-of-the-art mass spectrometers. However, making use of this information while also ensuring that the search engine’s scores are well calibrated (i.e., that the score assigned to one spectrum can be meaningfully compared to the score assigned to a different spectrum) has proven to be challenging. Here, we describe a database search score function, the “residue evidence” (res-ev) score, that achieves both of these goals simultaneously. We also demonstrate how to combine calibrated res-ev scores with calibrated XCorr scores to produce a “combined p-value” score function. We provide a benchmark consisting of four mass spectrometry data sets, which we use to compare the combined p-value to the score functions used by several existing search engines. Our results suggest that the combined p-value achieves state-of-the-art performance, generally outperforming MS Amanda and Morpheus and performing comparably to MS-GF+. The res-ev and combined p-value score functions are freely available as part of the Tide search engine in the Crux mass spectrometry toolkit (http://crux.ms).
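
The abstract above does not say how the two calibrated p-values are combined. As a rough illustration of the general idea only, the sketch below uses Fisher's method for two p-values, which assumes the two p-values are independent under the null. This is a standard textbook technique swapped in for illustration, not necessarily what Tide implements; the function name `fisher_combine_two` is hypothetical.

```python
import math


def fisher_combine_two(p1: float, p2: float) -> float:
    """Combine two p-values with Fisher's method (illustrative assumption).

    Assumes p1 and p2 are independent and uniform under the null.
    Fisher's statistic is T = -2 * ln(p1 * p2), chi-square with 4 degrees
    of freedom. For 4 degrees of freedom the upper tail has the closed
    form x * (1 - ln x), where x = p1 * p2.
    """
    if not (0.0 < p1 <= 1.0 and 0.0 < p2 <= 1.0):
        raise ValueError("p-values must lie in (0, 1]")
    x = p1 * p2
    return x * (1.0 - math.log(x))


if __name__ == "__main__":
    # Two moderately small p-values (made-up inputs for illustration).
    print(fisher_combine_two(0.05, 0.05))
```

Under these assumptions the combined value is smaller than either input when both inputs are small, which is the behavior one would want from a score that pools two sources of evidence.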


2018 ◽  
Author(s):  
Uri Keich ◽  
Kaipo Tamura ◽  
William Stafford Noble

Abstract: Decoy database search with target-decoy competition (TDC) provides an intuitive, easy-to-implement method for estimating the false discovery rate (FDR) associated with spectrum identifications from shotgun proteomics data. However, the procedure can yield different results for a fixed dataset analyzed with different decoy databases, and this decoy-induced variability is particularly problematic for smaller FDR thresholds, datasets or databases. In such cases, the nominal FDR might be 1% but the true proportion of false discoveries might be 10%. The averaged TDC (aTDC) protocol combats this problem by exploiting multiple independently shuffled decoy databases to provide an FDR estimate with reduced variability. We provide a tutorial introduction to aTDC, describe an improved variant of the protocol that offers increased statistical power, and discuss how to deploy aTDC in practice using the Crux software toolkit.
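
To make the TDC idea in the abstract above concrete, here is a minimal sketch of a TDC-style FDR estimate at a fixed score threshold. Several details are assumptions and do not come from the abstract: higher scores are better, ties count as decoy wins, and the estimate uses the common "+1" correction in the numerator. The helper `naive_mean_fdr` simply averages the estimate over several decoy sets to illustrate the variability-reduction idea. It is not the aTDC protocol, which the abstract does not describe in detail. All function names are hypothetical.

```python
from statistics import mean
from typing import Sequence


def tdc_fdr(target_scores: Sequence[float],
            decoy_scores: Sequence[float],
            threshold: float) -> float:
    """Estimate FDR at `threshold` via target-decoy competition.

    target_scores[i] and decoy_scores[i] are the best target and best
    decoy match scores for spectrum i (assumption: higher is better).
    """
    target_wins = 0
    decoy_wins = 0
    for t, d in zip(target_scores, decoy_scores):
        # Competition: only the better of the two matches survives.
        if max(t, d) < threshold:
            continue
        if t > d:
            target_wins += 1
        else:
            # Assumption: ties are counted as decoy wins (conservative).
            decoy_wins += 1
    if target_wins == 0:
        # No accepted target matches; report the most conservative value.
        return 1.0
    # "+1" correction in the numerator, capped at 1.
    return min(1.0, (decoy_wins + 1) / target_wins)


def naive_mean_fdr(target_scores: Sequence[float],
                   decoy_score_sets: Sequence[Sequence[float]],
                   threshold: float) -> float:
    """Average the TDC estimate over several decoy sets (illustration only)."""
    return mean(tdc_fdr(target_scores, d, threshold) for d in decoy_score_sets)


if __name__ == "__main__":
    # Made-up scores for four spectra and two shuffled decoy databases.
    targets = [5.0, 4.0, 3.0, 6.0]
    decoys_a = [1.0, 2.0, 3.5, 0.5]
    decoys_b = [1.0, 2.0, 1.0, 0.5]
    print(tdc_fdr(targets, decoys_a, 2.5))
    print(tdc_fdr(targets, decoys_b, 2.5))
    print(naive_mean_fdr(targets, [decoys_a, decoys_b], 2.5))
```

With these made-up inputs the two decoy sets give different estimates for the same target scores, which is the decoy-induced variability the abstract refers to.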


2012 ◽  
Vol 35 (14) ◽  
pp. 1771-1778 ◽  
Author(s):  
Eugene Moskovets ◽  
Anton A. Goloborodko ◽  
Alexander V. Gorshkov ◽  
Mikhail V. Gorshkov
