MoMo: Discovery of post-translational modification motifs

MoMo: discovery of statistically significant post-translational modification motifs

Bioinformatics ◽

10.1093/bioinformatics/bty1058 ◽

2018 ◽

Vol 35 (16) ◽

pp. 2774-2782 ◽

Cited By ~ 18

Author(s):

Alice Cheng ◽

Charles E Grant ◽

William S Noble ◽

Timothy L Bailey

Keyword(s):

Mass Spectrometry ◽

Motif Discovery ◽

Source Code ◽

Web Server ◽

Software Tool ◽

Mass Spectrometry Data ◽

Supplementary Information ◽

Post Translational Modification ◽

Statistical Confidence ◽

Confidence Estimates

Abstract Motivation Post-translational modifications (PTMs) of proteins are associated with many significant biological functions and can be identified in high throughput using tandem mass spectrometry. Many PTMs are associated with short sequence patterns called ‘motifs’ that help localize the modifying enzyme. Accordingly, many algorithms have been designed to identify these motifs from mass spectrometry data. Accurate statistical confidence estimates for discovered motifs are critically important for proper interpretation and in the design of downstream experimental validation. Results We describe a method for assigning statistical confidence estimates to PTM motifs, and we demonstrate that this method provides accurate P-values on both simulated and real data. Our methods are implemented in MoMo, a software tool for discovering motifs among sets of PTMs that we make available as a web server and as downloadable source code. MoMo re-implements the two most widely used PTM motif discovery algorithms—motif-x and MoDL—while offering many enhancements. Relative to motif-x, MoMo offers improved statistical confidence estimates and more accurate calculation of motif scores. The MoMo web server offers more proteome databases, more input formats, larger inputs and longer running times than the motif-x web server. Finally, our study demonstrates that the confidence estimates produced by motif-x are inaccurate. This inaccuracy stems in part from the common practice of drawing ‘background’ peptides from an unshuffled proteome database. Our results thus suggest that many of the papers that use motif-x to find motifs may be reporting results that lack statistical support. Availability and implementation The MoMo web server and source code are provided at http://meme-suite.org. Supplementary information Supplementary data are available at Bioinformatics online.

ChimeraUGEM: unsupervised gene expression modeling in any given organism

Bioinformatics ◽

10.1093/bioinformatics/btz080 ◽

2019 ◽

Vol 35 (18) ◽

pp. 3365-3371 ◽

Cited By ~ 4

Author(s):

Alon Diament ◽

Iddo Weiner ◽

Noam Shahar ◽

Shira Landman ◽

Yael Feldman ◽

...

Keyword(s):

Gene Expression ◽

Target Gene ◽

Source Code ◽

Software Tool ◽

Supplementary Information ◽

Host Organism ◽

Protein Levels ◽

Commercial Use ◽

Sequence Patterns

Abstract Motivation Regulation of the amount of protein that is synthesized from genes has proved to be a serious challenge in terms of analysis and prediction, and in terms of engineering and optimization, due to the large diversity in expression machinery across species. Results To address this challenge, we developed a methodology and a software tool (ChimeraUGEM) for predicting gene expression as well as adapting the coding sequence of a target gene to any host organism. We demonstrate these methods by predicting protein levels in seven organisms, in seven human tissues, and by increasing in vivo the expression of a synthetic gene up to 26-fold in the single-cell green alga Chlamydomonas reinhardtii. The underlying model is designed to capture sequence patterns and regulatory signals with minimal prior knowledge on the host organism and can be applied to a multitude of species and applications. Availability and implementation Source code (MATLAB, C) and binaries are freely available for download for non-commercial use at http://www.cs.tau.ac.il/~tamirtul/ChimeraUGEM/, and supported on macOS, Linux and Windows. Supplementary information Supplementary data are available at Bioinformatics online.

MoMo: Discovery of statistically significant post-translational modification motifs

10.1101/410050 ◽

2018 ◽

Cited By ~ 2

Author(s):

Alice Cheng ◽

Charles E. Grant ◽

William S. Noble ◽

Timothy L. Bailey

Keyword(s):

Mass Spectrometry ◽

Motif Discovery ◽

Web Server ◽

Software Tool ◽

Real Data ◽

Mass Spectrometry Data ◽

Post Translational Modification ◽

Statistical Confidence ◽

Proteome Database ◽

Confidence Estimates

Abstract Motivation Post-translational modifications (PTMs) of proteins are associated with many significant biological functions and can be identified in high throughput using tandem mass spectrometry. Many PTMs are associated with short sequence patterns called “motifs” that help localize the modifying enzyme. Accordingly, many algorithms have been designed to identify these motifs from mass spectrometry data. Accurate statistical confidence estimates for discovered motifs are critically important for proper interpretation and in the design of downstream experimental validation. Results We describe a method for assigning statistical confidence estimates to PTM motifs, and we demonstrate that this method provides accurate p-values on both simulated and real data. Our methods are implemented in MoMo, a software tool for discovering motifs among sets of PTMs that we make available as a web server and as downloadable source code. MoMo reimplements the two most widely used PTM motif discovery algorithms—motif-x and MoDL—while offering many enhancements. Relative to motif-x, MoMo offers improved statistical confidence estimates and more accurate calculation of motif scores. The MoMo web server offers more proteome databases, more input formats, larger inputs and longer running times than the motif-x web server. Finally, our study demonstrates that the confidence estimates produced by motif-x are inaccurate. This inaccuracy stems in part from the common practice of drawing “background” peptides from an unshuffled proteome database. Our results thus suggest that many of the hundreds of papers that use motif-x to find motifs may be reporting results that lack statistical support. Availability http://[email protected]

Dicarbonyl derived post-translational modifications: chemistry bridging biology and aging-related disease

Essays in Biochemistry ◽

10.1042/ebc20190057 ◽

2020 ◽

Vol 64 (1) ◽

pp. 97-110

Author(s):

Christian Sibbersen ◽

Mogens Johannsen

Keyword(s):

Mass Spectrometry ◽

Future Research ◽

Amino Acid Residues ◽

Post Translational Modification ◽

Dicarbonyl Compounds ◽

Protein Targets ◽

Post Translational Modifications ◽

Reactive Protein ◽

Glycation End Products ◽

Direct Mass Spectrometry

Abstract In living systems, nucleophilic amino acid residues are prone to non-enzymatic post-translational modification by electrophiles. α-Dicarbonyl compounds are a special type of electrophiles that can react irreversibly with lysine, arginine, and cysteine residues via complex mechanisms to form post-translational modifications known as advanced glycation end-products (AGEs). Glyoxal, methylglyoxal, and 3-deoxyglucosone are the major endogenous dicarbonyls, with methylglyoxal being the most well-studied. There are several routes that lead to the formation of dicarbonyl compounds, most originating from glucose and glucose metabolism, such as the non-enzymatic decomposition of glycolytic intermediates and fructosyl amines. Although dicarbonyls are removed continuously mainly via the glyoxalase system, several conditions lead to an increase in dicarbonyl concentration and thereby AGE formation. AGEs have been implicated in diabetes and aging-related diseases, and for this reason the elucidation of their structure as well as protein targets is of great interest. Though the dicarbonyls and reactive protein side chains are of relatively simple nature, the structures of the adducts as well as their mechanism of formation are not that trivial. Furthermore, detection of sites of modification can be demanding and current best practices rely on either direct mass spectrometry or various methods of enrichment based on antibodies or click chemistry followed by mass spectrometry. Future research into the structure of these adducts and protein targets of dicarbonyl compounds may improve the understanding of how the mechanisms of diabetes and aging-related physiological damage occur.

Unrestrictive protein modification localization and quality control for open search of mass spectra

10.26434/chemrxiv.5797995 ◽

2018 ◽

Author(s):

Zhiwu An ◽

Fuzhou Gong ◽

Yan Fu

Keyword(s):

Mass Spectrometry ◽

Quality Control ◽

Tandem Mass Spectrometry ◽

Mass Spectra ◽

Protein Modification ◽

Software Tool ◽

Tandem Mass ◽

Simulation Data ◽

Post Translational Modifications

We have developed PTMiner, a first software tool for automated, confident filtering, localization and annotation of protein post-translational modifications identified by open (mass-tolerant) search of large tandem mass spectrometry datasets. The performance of the software was validated on carefully designed simulation data.

MSTracer: A Machine Learning Software Tool for Peptide Feature Detection from Liquid Chromatography–Mass Spectrometry Data

Journal of Proteome Research ◽

10.1021/acs.jproteome.0c01029 ◽

2021 ◽

Author(s):

Xiangyuan Zeng ◽

Bin Ma

Keyword(s):

Machine Learning ◽

Mass Spectrometry ◽

Liquid Chromatography ◽

Feature Detection ◽

Software Tool ◽

Mass Spectrometry Data ◽

Liquid Chromatography Mass Spectrometry ◽

Chromatography Mass Spectrometry ◽

Learning Software

XMAn v2—a database of Homo sapiens mutated peptides

Bioinformatics ◽

10.1093/bioinformatics/btz693 ◽

2019 ◽

Cited By ~ 1

Author(s):

Marcela Aguilera Flores ◽

Iulia M Lazar

Keyword(s):

Mass Spectrometry ◽

Tandem Mass Spectrometry ◽

Homo Sapiens ◽

Amino Acid Sequences ◽

Mass Spectrometry Data ◽

Supplementary Information ◽

Tandem Mass ◽

Mass Spectrometry Detection ◽

Nonsense Mutations ◽

Tandem Mass Spectrometry Data

Abstract Summary The ‘Unknown Mutation Analysis (XMAn)’ database is a compilation of Homo sapiens mutated peptides in FASTA format, that was constructed for facilitating the identification of protein sequence alterations by tandem mass spectrometry detection. The database comprises 2 539 031 non-redundant mutated entries from 17 599 proteins, of which 2 377 103 are missense and 161 928 are nonsense mutations. It can be used in conjunction with search engines that seek the identification of peptide amino acid sequences by matching experimental tandem mass spectrometry data to theoretical sequences from a database. Availability and implementation XMAn v2 can be accessed from github.com/lazarlab/XMAnv2. Supplementary information Supplementary data are available at Bioinformatics online.

Post-translational modification of nucleoid-associated proteins: an extra layer of functional modulation in bacteria?

Biochemical Society Transactions ◽

10.1042/bst20180488 ◽

2018 ◽

Vol 46 (5) ◽

pp. 1381-1392 ◽

Cited By ~ 16

Author(s):

Ivar W. Dilweg ◽

Remus T. Dame

Keyword(s):

Mass Spectrometry ◽

Genetic Information ◽

Structural Integrity ◽

Mass Spectrometry Data ◽

Post Translational Modification ◽

Functional Importance ◽

Widespread Occurrence ◽

Associated Proteins ◽

Chromatin Folding ◽

Functional Modulation

Post-translational modification (PTM) of histones has been investigated in eukaryotes for years, revealing its widespread occurrence and functional importance. Many PTMs affect chromatin folding and gene activity. Only recently the occurrence of such modifications has been recognized in bacteria. However, it is unclear whether PTM of the bacterial counterparts of eukaryotic histones, nucleoid-associated proteins (NAPs), bears a comparable significance. Here, we scrutinize proteome mass spectrometry data for PTMs of the four most abundantly present NAPs in Escherichia coli (H-NS, HU, IHF and FIS). This approach allowed us to identify a total of 101 unique PTMs in the 11 independent proteomic studies covered in this review. Combined with structural and genetic information on these proteins, we describe potential effects of these modifications (perturbed DNA-binding, structural integrity or interaction with other proteins) on their function.

CellTracker (not only) for dummies

Bioinformatics ◽

10.1093/bioinformatics/btv686 ◽

2015 ◽

Vol 32 (6) ◽

pp. 955-957 ◽

Cited By ~ 46

Author(s):

Filippo Piccinini ◽

Alexa Kiss ◽

Peter Horvath

Keyword(s):

Graphical User Interface ◽

Open Source Software ◽

Phase Contrast ◽

Cell Tracking ◽

Source Code ◽

Software Tool ◽

Time Lapse ◽

Supplementary Information ◽

Differential Interference Contrast ◽

User Friendly

Abstract Motivation: Time-lapse experiments play a key role in studying the dynamic behavior of cells. Single-cell tracking is one of the fundamental tools for such analyses. The vast majority of the recently introduced cell tracking methods are limited to fluorescently labeled cells. An equally important limitation is that most software cannot be effectively used by biologists without reasonable expertise in image processing. Here we present CellTracker, a user-friendly open-source software tool for tracking cells imaged with various imaging modalities, including fluorescent, phase contrast and differential interference contrast (DIC) techniques. Availability and implementation: CellTracker is written in MATLAB (The MathWorks, Inc., USA). It works with Windows, Macintosh and UNIX-based systems. Source code and graphical user interface (GUI) are freely available at: http://celltracker.website/. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

Understanding the limit of open search in the identification of peptides with post-translational modifications — A simulation-based study

10.1101/289710 ◽

2018 ◽

Author(s):

Jiaan Dai ◽

Fengchao Yu ◽

Ning Li ◽

Weichuan Yu

Keyword(s):

Analytical Model ◽

Mass Spectra ◽

Mass Spectrometry Data ◽

Supplementary Information ◽

Necessary Condition ◽

Tandem Mass ◽

Search Methods ◽

Post Translational Modifications ◽

Tandem Mass Spectra ◽

The Relationship

Abstract Motivation Analyzing tandem mass spectrometry data to recognize peptides in a sample is the fundamental task in computational proteomics. Traditional peptide identification algorithms perform well when identifying unmodified peptides. However, when peptides have post-translational modifications (PTMs), these methods cannot provide satisfactory results. Recently, Chick et al., 2015 and Yu et al., 2016 proposed the spectrum-based and tag-based open search methods, respectively, to identify peptides with PTMs. While the performance of these two methods is promising, the identification results vary greatly with respect to the quality of tandem mass spectra and the number of PTMs in peptides. This motivates us to systematically study the relationship between the performance of open search methods and quality parameters of tandem mass spectrum data, as well as the number of PTMs in peptides. Results Through large-scale simulations, we obtain the performance trend when simulated tandem mass spectra are of different quality. We propose an analytical model to describe the relationship between the probability of obtaining correct identifications and the spectrum quality as well as the number of PTMs. Based on the analytical model, we can quantitatively describe the necessary condition to effectively apply open search methods. Availability Source codes of the simulation are available at http://bioinformatics.ust.hk/[email protected] or [email protected] Supplementary information Supplementary data are available at Bioinformatics online.
