SpecDB: A Database for Storing and Managing Mass Spectrometry Proteomics Data

Author(s):  
Mario Cannataro ◽  
Pierangelo Veltri
2018 ◽  
Vol 90 (21) ◽  
pp. 13112-13117 ◽  
Author(s):  
Lindsay K. Pino ◽  
Brian C. Searle ◽  
Eric L. Huang ◽  
William Stafford Noble ◽  
Andrew N. Hoofnagle ◽  
...  

Proteomes ◽  
2020 ◽  
Vol 8 (1) ◽  
pp. 3 ◽  
Author(s):  
Zhujia Ye ◽  
Sasikiran Reddy Sangireddy ◽  
Chih-Li Yu ◽  
Dafeng Hui ◽  
Kevin Howe ◽  
...  

Switchgrass plants were grown in a Sandwich tube system to induce gradual drought stress by withholding watering. After 29 days, the leaf photosynthetic rate had decreased significantly compared to the control plants, which were watered regularly. The drought-treated plants recovered to the same leaf water content after three days of re-watering. The root tip (1 cm basal fragment, designated as RT1 hereafter) and the elongation/maturation zone (the next upper 1 cm of tissue, designated as RT2 hereafter) were collected on the 29th day of drought stress treatment (named SDT, for severe drought treated) and after one (D1W) and three days (D3W) of re-watering. Tandem mass tag mass spectrometry-based quantitative proteomics analysis was performed to identify the proteomes and the drought-induced differentially accumulated proteins (DAPs). From RT1 tissues, 6156, 7687, and 7699 proteins were quantified, and 296, 535, and 384 DAPs were identified in the SDT, D1W, and D3W samples, respectively. From RT2 tissues, 7382, 7255, and 6883 proteins were quantified, and 393, 587, and 321 DAPs were identified in the SDT, D1W, and D3W samples, respectively. Between RT1 and RT2 tissues, very few DAPs overlapped at SDT, but the number of such proteins increased during the recovery phase. A large number of hydrophilic proteins and stress-responsive proteins were induced during SDT and remained at a higher level during the recovery stages. A large number of DAPs in RT1 tissues maintained the same expression pattern throughout the drought treatment and the recovery phases. The DAPs in RT1 tissues were classified in cell proliferation, mitotic cell division, and chromatin modification, and those in RT2 were placed in cell wall remodeling and cell expansion processes. This study provides information on root zone-specific proteome changes during the drought and recovery phases, which will allow us to select proteins (genes) as better-defined targets for developing drought-tolerant plants.
The mass spectrometry proteomics data are available via ProteomeXchange with identifier PXD017441.


2015 ◽  
Author(s):  
Lisa M. Breckels ◽  
Sean Holden ◽  
David Wojnar ◽  
Claire M. Mulvey ◽  
Andy Christoforou ◽  
...  

Abstract
Sub-cellular localisation of proteins is an essential post-translational regulatory mechanism that can be assayed using high-throughput mass spectrometry (MS). These MS-based spatial proteomics experiments enable us to pinpoint the sub-cellular distribution of thousands of proteins in a specific system under controlled conditions. Recent advances in high-throughput MS methods have yielded a plethora of experimental spatial proteomics data for the cell biology community. Yet, there are many third-party data sources, such as immunofluorescence microscopy or protein annotations and sequences, which represent a rich and vast source of complementary information. We present a unique transfer learning classification framework that utilises a nearest-neighbour or support vector machine system to integrate heterogeneous data sources to considerably improve on the quantity and quality of sub-cellular protein assignment. We demonstrate the utility of our algorithms through evaluation of five experimental datasets, from four different species in conjunction with four different auxiliary data sources, to classify proteins to tens of sub-cellular compartments with high generalisation accuracy. We further apply the method to an experiment on pluripotent mouse embryonic stem cells to classify a set of previously unknown proteins, and validate our findings against a recent high resolution map of the mouse stem cell proteome. The methodology is distributed as part of the open-source Bioconductor pRoloc suite for spatial proteomics data analysis.

Abbreviations
LOPIT: Localisation of Organelle Proteins by Isotope Tagging
PCP: Protein Correlation Profiling
ML: Machine learning
TL: Transfer learning
SVM: Support vector machine
PCA: Principal component analysis
GO: Gene Ontology
CC: Cellular compartment
iTRAQ: Isobaric tags for relative and absolute quantitation
TMT: Tandem mass tags
MS: Mass spectrometry
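The abstract above does not spell out how the primary (MS) and auxiliary data sources are combined in the nearest-neighbour variant. As a minimal sketch, assuming that each class receives a weighted sum of the neighbour votes from the two sources, the idea could look like the following. The function names, the per-class `weights` dictionary, and the default weight of 0.5 are illustrative assumptions, not the pRoloc API.

```python
import math
from collections import Counter


def knn_votes(query, labelled, k):
    """Count class labels among the k labelled points nearest to `query`.

    `labelled` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(labelled, key=lambda item: math.dist(query, item[0]))[:k]
    return Counter(label for _, label in nearest)


def combined_prediction(primary_query, aux_query,
                        primary_train, aux_train, weights, k=3):
    """Predict a class from weighted primary and auxiliary neighbour votes.

    Assumption (hypothetical): weights[c] in [0, 1] is the trust placed in the
    primary data for class c; (1 - weights[c]) goes to the auxiliary data.
    """
    p_votes = knn_votes(primary_query, primary_train, k)
    a_votes = knn_votes(aux_query, aux_train, k)
    scores = {}
    for c in set(p_votes) | set(a_votes):
        w = weights.get(c, 0.5)
        scores[c] = w * p_votes[c] + (1 - w) * a_votes[c]
    # Iterate over sorted class names so that ties resolve deterministically.
    return max(sorted(scores), key=scores.get)
```

In this sketch, a weight of 1.0 for every class reduces to ordinary k-nearest-neighbour classification on the MS data alone, and a weight of 0.0 uses only the auxiliary source. In practice the weights would have to be tuned, for example by cross-validation on marker proteins.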


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Shisheng Wang ◽  
Hongwen Zhu ◽  
Hu Zhou ◽  
Jingqiu Cheng ◽  
Hao Yang

Abstract
Background: Mass spectrometry (MS) has become a promising analytical technique to acquire proteomics information for the characterization of biological samples. Nevertheless, most studies focus on the final proteins identified through a suite of algorithms that compare partial MS spectra with a sequence database, while the pattern recognition and classification of raw mass-spectrometric data remain unresolved.
Results: We developed an open-source and comprehensive platform, named MSpectraAI, for analyzing large-scale MS data through deep neural networks (DNNs); this system involves spectral-feature swath extraction, classification, and visualization. Moreover, this platform allows users to create their own DNN model by using Keras. To evaluate this tool, we collected the publicly available proteomics datasets of six tumor types (a total of 7,997,805 mass spectra) from the ProteomeXchange consortium and classified the samples based on the spectra profiling. The results suggest that MSpectraAI can distinguish different types of samples based on the fingerprint spectrum and achieves better prediction accuracy at the MS1 level (average 0.967).
Conclusion: This study deciphers proteome profiling of raw mass spectrometry data and broadens the promising application of the classification and prediction of proteomics data from multi-tumor samples using deep learning methods. MSpectraAI also shows better performance than other classical machine learning approaches.


Author(s):  
Mario Cannataro ◽  
Pietro Hiram Guzzi ◽  
Giuseppe Tradigo ◽  
Pierangelo Veltri

Recent advances in high-throughput technologies for analysing biological samples have enabled researchers to collect huge amounts of data. In particular, mass spectrometry-based proteomics uses mass spectrometry to investigate the proteins expressed in an organism or a cell. Manual inspection of spectra is unfeasible, hence the need for a set of algorithms, tools and platforms to manage and analyze them. Computational proteomics concerns the computational methods for analyzing spectra data in qualitative proteomics (i.e. peptide/protein identification in tandem mass spectrometry) and quantitative proteomics (i.e. protein expression in samples), as well as in biomarker discovery (i.e. the identification of a molecular signature of a disease directly from spectra). This chapter presents the main standards, tools, and technologies for building scalable, reusable, and portable applications in this field. The chapter surveys available solutions for computational proteomics and includes an in-depth description of MS-Analyzer, a Grid-based software platform for the integrated management and analysis of spectra data. MS-Analyzer provides efficient spectra management through a specialized spectra database, and supports the semantic composition of pre-processing and data mining services to analyze spectra on the Grid.
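The chapter summary above mentions a specialized spectra database but gives no schema. As a rough sketch of what storing and querying spectra can involve, the following uses Python's built-in `sqlite3` module. The table names, column names, and helper functions are all hypothetical and are not taken from MS-Analyzer or SpecDB.

```python
import sqlite3


def create_spectra_db(path=":memory:"):
    """Create a small, hypothetical two-table schema for spectra and peaks."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE spectrum (
            id INTEGER PRIMARY KEY,
            sample TEXT NOT NULL,
            ms_level INTEGER NOT NULL,
            precursor_mz REAL          -- NULL for MS1 spectra
        );
        CREATE TABLE peak (
            spectrum_id INTEGER NOT NULL REFERENCES spectrum(id),
            mz REAL NOT NULL,
            intensity REAL NOT NULL
        );
        CREATE INDEX idx_spectrum_precursor ON spectrum(precursor_mz);
    """)
    return conn


def add_spectrum(conn, sample, ms_level, precursor_mz, peaks):
    """Insert one spectrum with its (mz, intensity) peaks; return its id."""
    cur = conn.execute(
        "INSERT INTO spectrum (sample, ms_level, precursor_mz) VALUES (?, ?, ?)",
        (sample, ms_level, precursor_mz),
    )
    spectrum_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO peak (spectrum_id, mz, intensity) VALUES (?, ?, ?)",
        [(spectrum_id, mz, intensity) for mz, intensity in peaks],
    )
    conn.commit()
    return spectrum_id


def find_by_precursor(conn, mz, tolerance):
    """Return ids of spectra whose precursor m/z lies within mz +/- tolerance."""
    rows = conn.execute(
        "SELECT id FROM spectrum WHERE precursor_mz BETWEEN ? AND ? ORDER BY id",
        (mz - tolerance, mz + tolerance),
    )
    return [row[0] for row in rows]
```

Keeping peaks in their own table makes range queries on individual m/z values possible, at the cost of many rows per spectrum. A production system might instead store each peak list as a compressed binary blob.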


2018 ◽  
Vol 17 (4) ◽  
pp. 1547-1558 ◽  
Author(s):  
Salvador Martínez-Bartolomé ◽  
J. Alberto Medina-Aunon ◽  
Miguel Ángel López-García ◽  
Carmen González-Tejedo ◽  
Gorka Prieto ◽  
...  

2018 ◽  
Author(s):  
Aikaterini Geladaki ◽  
Nina Kočevar Britovšek ◽  
Lisa M. Breckels ◽  
Tom S. Smith ◽  
Claire M. Mulvey ◽  
...  

Abstract
Hyperplexed Localisation of Organelle Proteins by Isotope Tagging (hyperLOPIT) is a well-established method for studying protein subcellular localisation in complex biological samples. As a simpler alternative, we developed a second workflow named Localisation of Organelle Proteins by Isotope Tagging after Differential ultraCentrifugation (LOPIT-DC), which is faster and less resource-intensive. We present the most comprehensive high-resolution mass spectrometry-based human dataset to date and deliver a flexible set of subcellular proteomics protocols for sample preparation and data analysis. For the first time, we methodically compare these two different mass spectrometry-based spatial proteomics methods within the same study and also apply QSep, the first tool that objectively and robustly quantifies subcellular resolution in spatial proteomics data. Using both approaches, we highlight suborganellar resolution and isoform-specific subcellular niches, as well as the locations of large protein complexes and proteins involved in signalling pathways which play important roles in cancer and metabolism. Finally, we showcase an extensive analysis of the multilocalising proteome identified via both methods.


Author(s):  
Meng Wang ◽  
Lihua Jiang ◽  
Ruiqi Jian ◽  
Joanne Y Chan ◽  
Qing Liu ◽  
...  

Abstract
Motivation: Data normalization is an important step in processing proteomics data generated in mass spectrometry experiments; it aims to reduce sample-level variation and facilitate comparisons of samples. Previously published methods for normalization primarily depend on the assumption that the distribution of protein expression is similar across all samples. However, this assumption fails when the protein expression data are generated from heterogeneous samples, such as from various tissue types. This led us to develop a novel data-driven method for improved normalization that corrects the systematic bias while maintaining the underlying biological heterogeneity.
Results: To robustly correct the systematic bias, we used the density-power-weight method to down-weight outliers and extended the one-dimensional robust fitting method described in the previous work to our structured data. We then constructed a robustness criterion and developed a new normalization algorithm, called RobNorm. In simulation studies and analysis of real data from the genotype-tissue expression project, we compared and evaluated the performance of RobNorm against other normalization methods. We found that the RobNorm approach exhibits the greatest reduction in systematic bias while maintaining across-tissue variation, especially for datasets from highly heterogeneous samples.
Availability and implementation: https://github.com/mwgrassgreen/RobNorm.
Supplementary information: Supplementary data are available at Bioinformatics online.
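To make the idea of down-weighting outliers concrete, here is a simplified sketch of estimating a per-sample shift with density-power-style weights. It is not the RobNorm algorithm. It assumes log-scale intensities, uses the per-protein median as a reference, and keeps the noise scale `sigma` and the exponent `gamma` fixed; all of these are illustrative choices, as are the function names.

```python
import math
from statistics import median


def robust_shift(residuals, gamma=0.5, sigma=1.0, iterations=50):
    """Estimate a location shift, down-weighting residuals far from it.

    Each weight is a Gaussian density raised to the power gamma (up to a
    constant factor), so outlying proteins contribute almost nothing.
    """
    nu = median(residuals)  # robust starting point
    for _ in range(iterations):
        weights = [math.exp(-gamma * (r - nu) ** 2 / (2 * sigma ** 2))
                   for r in residuals]
        nu = sum(w * r for w, r in zip(weights, residuals)) / sum(weights)
    return nu


def normalize(matrix, gamma=0.5, sigma=1.0):
    """Remove per-sample shifts from a proteins-by-samples log-intensity matrix.

    Returns (normalized_matrix, shifts). The per-protein median is used as
    the reference level, which is an assumption of this sketch.
    """
    refs = [median(row) for row in matrix]
    shifts = []
    for j in range(len(matrix[0])):
        residuals = [row[j] - ref for row, ref in zip(matrix, refs)]
        shifts.append(robust_shift(residuals, gamma, sigma))
    normalized = [[x - shifts[j] for j, x in enumerate(row)] for row in matrix]
    return normalized, shifts
```

For residuals such as `[1.0, 1.0, 1.0, 1.0, 9.0]`, the plain mean is 2.6, whereas `robust_shift` stays close to 1.0 because the outlying protein receives a weight near zero. This is the behaviour one wants when a few proteins differ for genuine biological reasons rather than because of systematic bias.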


PROTEOMICS ◽  
2005 ◽  
Vol 5 (13) ◽  
pp. 3501-3505 ◽  
Author(s):  
Lennart Martens ◽  
Alexey I. Nesvizhskii ◽  
Henning Hermjakob ◽  
Marcin Adamski ◽  
Gilbert S. Omenn ◽  
...  
