MoMo: Discovery of post-translational modification motifs

2017 ◽  
Author(s):  
Alice Cheng ◽  
Charles E. Grant ◽  
Timothy L. Bailey ◽  
William Stafford Noble

Abstract Motivation Post-translational modifications (PTMs) of proteins are associated with many significant biological functions and can be identified in high throughput using tandem mass spectrometry. Many PTMs are associated with short sequence patterns called “motifs” that help localize the modifying enzyme. Accordingly, many algorithms have been designed to identify these motifs from mass spectrometry data. Results MoMo is a software tool for identifying motifs among sets of PTMs. The program re-implements two previously described algorithms, Motif-X and MoDL, packaging them in a web-accessible user interface. In addition to reading sequence files in FASTA format, MoMo is capable of directly parsing output files produced by commonly used mass spectrometry search engines. The resulting motifs are presented to the user in an HTML summary with motif logos and linked text files in MEME motif format. Availability Source code and web server available at http://[email protected] and [email protected] Supplementary information Supplementary figures are available at Bioinformatics online.

2018 ◽  
Vol 35 (16) ◽  
pp. 2774-2782 ◽  
Author(s):  
Alice Cheng ◽  
Charles E Grant ◽  
William S Noble ◽  
Timothy L Bailey

Abstract Motivation Post-translational modifications (PTMs) of proteins are associated with many significant biological functions and can be identified in high throughput using tandem mass spectrometry. Many PTMs are associated with short sequence patterns called ‘motifs’ that help localize the modifying enzyme. Accordingly, many algorithms have been designed to identify these motifs from mass spectrometry data. Accurate statistical confidence estimates for discovered motifs are critically important for proper interpretation and in the design of downstream experimental validation. Results We describe a method for assigning statistical confidence estimates to PTM motifs, and we demonstrate that this method provides accurate P-values on both simulated and real data. Our methods are implemented in MoMo, a software tool for discovering motifs among sets of PTMs that we make available as a web server and as downloadable source code. MoMo re-implements the two most widely used PTM motif discovery algorithms—motif-x and MoDL—while offering many enhancements. Relative to motif-x, MoMo offers improved statistical confidence estimates and more accurate calculation of motif scores. The MoMo web server offers more proteome databases, more input formats, larger inputs and longer running times than the motif-x web server. Finally, our study demonstrates that the confidence estimates produced by motif-x are inaccurate. This inaccuracy stems in part from the common practice of drawing ‘background’ peptides from an unshuffled proteome database. Our results thus suggest that many of the papers that use motif-x to find motifs may be reporting results that lack statistical support. Availability and implementation The MoMo web server and source code are provided at http://meme-suite.org. Supplementary information Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 35 (18) ◽  
pp. 3365-3371 ◽  
Author(s):  
Alon Diament ◽  
Iddo Weiner ◽  
Noam Shahar ◽  
Shira Landman ◽  
Yael Feldman ◽  
...  

Abstract Motivation Regulation of the amount of protein that is synthesized from genes has proved to be a serious challenge in terms of analysis and prediction, and in terms of engineering and optimization, due to the large diversity in expression machinery across species. Results To address this challenge, we developed a methodology and a software tool (ChimeraUGEM) for predicting gene expression as well as adapting the coding sequence of a target gene to any host organism. We demonstrate these methods by predicting protein levels in seven organisms, in seven human tissues, and by increasing in vivo the expression of a synthetic gene up to 26-fold in the single-cell green alga Chlamydomonas reinhardtii. The underlying model is designed to capture sequence patterns and regulatory signals with minimal prior knowledge on the host organism and can be applied to a multitude of species and applications. Availability and implementation Source code (MATLAB, C) and binaries are freely available for download for non-commercial use at http://www.cs.tau.ac.il/~tamirtul/ChimeraUGEM/, and supported on macOS, Linux and Windows. Supplementary information Supplementary data are available at Bioinformatics online.


2018 ◽  
Author(s):  
Alice Cheng ◽  
Charles E. Grant ◽  
William S. Noble ◽  
Timothy L. Bailey

Abstract Motivation Post-translational modifications (PTMs) of proteins are associated with many significant biological functions and can be identified in high throughput using tandem mass spectrometry. Many PTMs are associated with short sequence patterns called “motifs” that help localize the modifying enzyme. Accordingly, many algorithms have been designed to identify these motifs from mass spectrometry data. Accurate statistical confidence estimates for discovered motifs are critically important for proper interpretation and in the design of downstream experimental validation. Results We describe a method for assigning statistical confidence estimates to PTM motifs, and we demonstrate that this method provides accurate p-values on both simulated and real data. Our methods are implemented in MoMo, a software tool for discovering motifs among sets of PTMs that we make available as a web server and as downloadable source code. MoMo reimplements the two most widely used PTM motif discovery algorithms—motif-x and MoDL—while offering many enhancements. Relative to motif-x, MoMo offers improved statistical confidence estimates and more accurate calculation of motif scores. The MoMo web server offers more proteome databases, more input formats, larger inputs and longer running times than the motif-x web server. Finally, our study demonstrates that the confidence estimates produced by motif-x are inaccurate. This inaccuracy stems in part from the common practice of drawing “background” peptides from an unshuffled proteome database. Our results thus suggest that many of the hundreds of papers that use motif-x to find motifs may be reporting results that lack statistical support. Availability http://[email protected]


2020 ◽  
Vol 64 (1) ◽  
pp. 97-110
Author(s):  
Christian Sibbersen ◽  
Mogens Johannsen

Abstract In living systems, nucleophilic amino acid residues are prone to non-enzymatic post-translational modification by electrophiles. α-Dicarbonyl compounds are a special type of electrophiles that can react irreversibly with lysine, arginine, and cysteine residues via complex mechanisms to form post-translational modifications known as advanced glycation end-products (AGEs). Glyoxal, methylglyoxal, and 3-deoxyglucosone are the major endogenous dicarbonyls, with methylglyoxal being the most well-studied. There are several routes that lead to the formation of dicarbonyl compounds, most originating from glucose and glucose metabolism, such as the non-enzymatic decomposition of glycolytic intermediates and fructosyl amines. Although dicarbonyls are removed continuously mainly via the glyoxalase system, several conditions lead to an increase in dicarbonyl concentration and thereby AGE formation. AGEs have been implicated in diabetes and aging-related diseases, and for this reason the elucidation of their structure as well as protein targets is of great interest. Though the dicarbonyls and reactive protein side chains are of relatively simple nature, the structures of the adducts as well as their mechanism of formation are not that trivial. Furthermore, detection of sites of modification can be demanding and current best practices rely on either direct mass spectrometry or various methods of enrichment based on antibodies or click chemistry followed by mass spectrometry. Future research into the structure of these adducts and protein targets of dicarbonyl compounds may improve the understanding of how the mechanisms of diabetes and aging-related physiological damage occur.


2018 ◽  
Author(s):  
Zhiwu An ◽  
Fuzhou Gong ◽  
Yan Fu

We have developed PTMiner, a first software tool for automated, confident filtering, localization and annotation of protein post-translational modifications identified by open (mass-tolerant) search of large tandem mass spectrometry datasets. The performance of the software was validated on carefully designed simulation data.


Author(s):  
Marcela Aguilera Flores ◽  
Iulia M Lazar

Abstract Summary The ‘Unknown Mutation Analysis (XMAn)’ database is a compilation of Homo sapiens mutated peptides in FASTA format, that was constructed for facilitating the identification of protein sequence alterations by tandem mass spectrometry detection. The database comprises 2 539 031 non-redundant mutated entries from 17 599 proteins, of which 2 377 103 are missense and 161 928 are nonsense mutations. It can be used in conjunction with search engines that seek the identification of peptide amino acid sequences by matching experimental tandem mass spectrometry data to theoretical sequences from a database. Availability and implementation XMAn v2 can be accessed from github.com/lazarlab/XMAnv2. Supplementary information Supplementary data are available at Bioinformatics online.


2018 ◽  
Vol 46 (5) ◽  
pp. 1381-1392 ◽  
Author(s):  
Ivar W. Dilweg ◽  
Remus T. Dame

Post-translational modification (PTM) of histones has been investigated in eukaryotes for years, revealing its widespread occurrence and functional importance. Many PTMs affect chromatin folding and gene activity. Only recently the occurrence of such modifications has been recognized in bacteria. However, it is unclear whether PTM of the bacterial counterparts of eukaryotic histones, nucleoid-associated proteins (NAPs), bears a comparable significance. Here, we scrutinize proteome mass spectrometry data for PTMs of the four most abundantly present NAPs in Escherichia coli (H-NS, HU, IHF and FIS). This approach allowed us to identify a total of 101 unique PTMs in the 11 independent proteomic studies covered in this review. Combined with structural and genetic information on these proteins, we describe potential effects of these modifications (perturbed DNA-binding, structural integrity or interaction with other proteins) on their function.


2015 ◽  
Vol 32 (6) ◽  
pp. 955-957 ◽  
Author(s):  
Filippo Piccinini ◽  
Alexa Kiss ◽  
Peter Horvath

Abstract Motivation: Time-lapse experiments play a key role in studying the dynamic behavior of cells. Single-cell tracking is one of the fundamental tools for such analyses. The vast majority of the recently introduced cell tracking methods are limited to fluorescently labeled cells. An equally important limitation is that most software cannot be effectively used by biologists without reasonable expertise in image processing. Here we present CellTracker, a user-friendly open-source software tool for tracking cells imaged with various imaging modalities, including fluorescent, phase contrast and differential interference contrast (DIC) techniques. Availability and implementation: CellTracker is written in MATLAB (The MathWorks, Inc., USA). It works with Windows, Macintosh and UNIX-based systems. Source code and graphical user interface (GUI) are freely available at: http://celltracker.website/. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


2018 ◽  
Author(s):  
Jiaan Dai ◽  
Fengchao Yu ◽  
Ning Li ◽  
Weichuan Yu

Abstract Motivation Analyzing tandem mass spectrometry data to recognize peptides in a sample is the fundamental task in computational proteomics. Traditional peptide identification algorithms perform well when identifying unmodified peptides. However, when peptides have post-translational modifications (PTMs), these methods cannot provide satisfactory results. Recently, Chick et al. (2015) and Yu et al. (2016) proposed the spectrum-based and tag-based open search methods, respectively, to identify peptides with PTMs. While the performance of these two methods is promising, the identification results vary greatly with respect to the quality of tandem mass spectra and the number of PTMs in peptides. This motivates us to systematically study the relationship between the performance of open search methods and quality parameters of tandem mass spectrum data, as well as the number of PTMs in peptides. Results Through large-scale simulations, we obtain the performance trend when simulated tandem mass spectra are of different quality. We propose an analytical model to describe the relationship between the probability of obtaining correct identifications and the spectrum quality as well as the number of PTMs. Based on the analytical model, we can quantitatively describe the necessary condition to effectively apply open search methods. Availability Source code of the simulation is available at http://bioinformatics.ust.hk/[email protected] or [email protected] Supplementary information Supplementary data are available at Bioinformatics online.

