Protein inference using PIA workflows and PSI standard file formats

2018
Author(s): Julian Uszkoreit, Yasset Perez-Riverol, Britta Eggers, Katrin Marcus, Martin Eisenacher

Abstract: Proteomics using LC-MS/MS has become one of the main methods for analyzing the proteins in biological samples at high throughput. However, existing mass spectrometry instruments are still limited in resolution and measurable mass range, which is one of the main reasons why shotgun proteomics is the dominant approach. In this approach, proteins are digested, so that peptides rather than proteins are identified and quantified. Although often neglected, the important step of protein inference is then needed to infer, from the identified peptides, the proteins actually present in the original sample.

In this work, we highlight some of the previously published and newly added features of the tool PIA (Protein Inference Algorithms), which helps the user with the protein inference of measured samples. We also highlight the importance of using PSI standard file formats, as PIA is currently the only software supporting all available standards used for spectrum identification and protein inference. Additionally, we briefly describe the benefits of working with workflow environments for proteomics analyses and show the new features of the PIA nodes for the KNIME Analytics Platform. Finally, we benchmark PIA against a recently published dataset for isoform detection.

PIA is open source and available for download on GitHub (https://github.com/mpc-bioinformatics/pia) or directly via the community extensions inside the KNIME Analytics Platform.

2018, Vol 25 (2), pp. 251-258
Author(s): Estelle Rathahao-Paris, Sandra Alves, Nawel Boussaid, Nicole Picard-Hagen, Véronique Gayrard, ...

Direct injection–mass spectrometry can be used to perform high-throughput metabolomic fingerprinting. This work aims to evaluate a global analytical workflow in terms of sample preparation (urine sample dilution), high-resolution detection (quality of the generated data based on criteria such as mass measurement accuracy and detection sensitivity) and data analysis using dedicated bioinformatics tools. The investigation was performed on a large number of biological samples collected from sheep infected or not with scrapie. The direct injection–mass spectrometry approach is usually affected by matrix effects, which can hamper the detection of some relevant biomarkers. Reference compounds were spiked into biological samples to help evaluate the quality of the direct injection–mass spectrometry data produced by Fourier transform mass spectrometry. Despite the potential of high-resolution detection, some drawbacks remain. The most critical is the presence of matrix effects, which could be minimized by optimizing the sample dilution factor. The data quality was evaluated in terms of mass measurement accuracy and intensity reproducibility. Good repeatability was obtained for the chosen dilution factor (i.e., 2000). More than 150 analyses were performed in less than 16 hours using the optimized direct injection–mass spectrometry approach. Sheep of different status with respect to scrapie infection (i.e., scrapie-affected, preclinical scrapie or healthy) were discriminated by applying Shrinkage Discriminant Analysis to the direct injection–mass spectrometry data. The most relevant variables for this discrimination were selected and annotated. This study demonstrated that choosing an appropriate dilution factor is indispensable for producing informative, high-quality direct injection–mass spectrometry data. The successful application of the direct injection–mass spectrometry approach to the high-throughput analysis of a large number of biological samples constitutes a proof of concept.
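The abstract names mass measurement accuracy as a data quality criterion without stating how it is computed. A common convention, assumed here rather than taken from the study, is the relative mass error in parts per million (ppm). The sketch below illustrates that convention; the function name and the m/z values are hypothetical.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Relative mass error in parts per million (assumed convention)."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# Hypothetical m/z values, chosen only to illustrate the arithmetic.
observed = 100.0001
theoretical = 100.0
error = ppm_error(observed, theoretical)
print(f"mass error: {error:.2f} ppm")
```

A positive value means the instrument reported a mass above the theoretical one; whether a given error is acceptable depends on the instrument and is not specified in the abstract.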


2010, Vol 11 (1)
Author(s): Chongle Pan, Byung H Park, William H McDonald, Patricia A Carey, Jillian F Banfield, ...

2017
Author(s): Matthew The, Fredrik Edfors, Yasset Perez-Riverol, Samuel H. Payne, Michael R. Hoopmann, ...

Abstract: A natural way to benchmark the performance of an analytical experimental setup is to use samples of known content and see to what degree one can correctly infer the content of such a sample from the data. For shotgun proteomics, one of the inherent problems of interpreting data is that the measured analytes are peptides and not the proteins themselves. As some proteins share proteolytic peptides, more than one set of proteins may explain a given set of peptides, so mechanisms are needed that infer proteins from lists of detected peptides. A weakness of commercially available samples of known content is that they consist of proteins deliberately selected to produce tryptic peptides that are unique to a single protein. Unfortunately, such samples do not expose any complications in protein inference. A realistic benchmark of protein inference procedures therefore requires samples of known content in which the present proteins share peptides with known absent proteins. Here, we present such a standard, based on E. coli-expressed human protein fragments. To illustrate the usage of this standard, we benchmark a set of different protein inference procedures on the data. We observe that inference procedures excluding shared peptides provide more accurate estimates of errors than methods that include information from shared peptides, while still giving reasonable performance in terms of the number of identified proteins. We also demonstrate that using a sample of known protein content without proteins with shared tryptic peptides can give a false sense of accuracy for many protein inference methods.
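The effect of shared peptides described above can be illustrated with a toy example. The sketch below is not one of the benchmarked procedures and not the PIA algorithm. It is a deliberately naive rule, assumed for illustration only, that reports a protein as soon as one of its peptides is detected. The protein names, peptide sequences, and function names are all hypothetical.

```python
from collections import defaultdict


def map_peptides_to_proteins(protein_peptides):
    """Invert a protein -> peptides mapping into peptide -> set of proteins."""
    peptide_to_proteins = defaultdict(set)
    for protein, peptides in protein_peptides.items():
        for peptide in peptides:
            peptide_to_proteins[peptide].add(protein)
    return peptide_to_proteins


def infer_proteins(detected_peptides, protein_peptides, use_shared=False):
    """Naive inference: report every protein supported by a detected peptide.

    With use_shared=False, only peptides that map to exactly one protein
    count as evidence; shared peptides are ignored.
    """
    peptide_to_proteins = map_peptides_to_proteins(protein_peptides)
    inferred = set()
    for peptide in detected_peptides:
        candidates = peptide_to_proteins.get(peptide, set())
        if use_shared or len(candidates) == 1:
            inferred |= candidates
    return inferred


# Hypothetical database: P_absent shares the peptide "CCR" with P_present.
database = {
    "P_present": {"AAK", "CCR"},
    "P_absent": {"CCR", "DDK"},
}
# Only P_present is in the sample, so only its peptides are detected.
detected = {"AAK", "CCR"}

unique_only = infer_proteins(detected, database)
with_shared = infer_proteins(detected, database, use_shared=True)
print(sorted(unique_only))
print(sorted(with_shared))
```

In this toy case the unique-peptide rule reports only the protein that is present, while counting the shared peptide also reports the absent one. Real inference procedures that use shared peptides apply scoring or parsimony rather than this naive rule, so the sketch only shows why a benchmark needs present proteins that share peptides with known absent ones.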


2016, Vol 31 (11), pp. 2227-2232
Author(s): Luca Bamonti, Sarah Theiner, Nataliya Rohr-Udilova, Bernhard K. Keppler, Gunda Koellensperger

Different strategies for the analysis of selenium in human serum were validated by tandem ICP-MS and isotope dilution.


2020
Author(s): Dominik Kopczynski, Nils Hoffmann, Bing Peng, Robert Ahrends

We introduce Goslin, a polyglot grammar for common lipid shorthand nomenclatures based on the LipidMaps nomenclature and the shorthand nomenclature established by Liebisch et al. and used by LipidHome and SwissLipids. Goslin was designed to address the following pressing issues in the lipidomics field: 1) to simplify the implementation of lipid name handling for developers of mass spectrometry-based lipidomics tools; 2) to offer a tool that unifies and normalizes the main existing lipid name dialects enabling a lipidomics analysis in a high-throughput fashion. We provide implementations of Goslin in four major programming languages, namely C++, Java, Python 3, and R to kick-start adoption and integration. Further, we set up a web service for users to work with Goslin directly. All implementations are available free of charge under a permissive open source license.

