Functional genetic discovery of enzymes using full-scan mass spectrometry metabolomics

2019 ◽  
Vol 97 (1) ◽  
pp. 73-84 ◽  
Author(s):  
Amy A. Caudy ◽  
Julia A. Hanchard ◽  
Alan Hsieh ◽  
Saravannan Shaan ◽  
Adam P. Rosebrock

Our understanding of metabolic networks is incomplete, and new enzymatic activities await discovery in well-studied organisms. Mass spectrometric measurement of cellular metabolites reveals compounds inside cells that are unexplained by current maps of metabolic reactions, and existing computational models are unable to account for all activities observed within cells. Additional large-scale genetic and biochemical approaches are required to elucidate metabolic gene function. We have used full-scan mass spectrometry metabolomics of polar small molecules to examine deletion mutants of candidate enzymes in the model yeast Saccharomyces cerevisiae. We report the identification of 25 genes whose deletion results in focal metabolic changes consistent with loss of enzymatic activity and describe the informatic approaches used to enrich for candidate enzymes from uncharacterized open reading frames. Triumphs and pitfalls of metabolic phenotyping screens are discussed, including estimates of the frequency of uncharacterized eukaryotic genes that affect metabolism and key issues to consider when searching for new enzymatic functions in other organisms.

2021 ◽  
Author(s):  
Hannah D. Hoang

The goal of this research was to use two-dimensional electrophoresis to examine changes in abundance of enzymes of the glycolytic pathway in the yeast Saccharomyces cerevisiae grown on carbon sources that support either fermentation to ethanol or oxidative metabolism. Large-scale profiling of protein abundances (expression proteomics) often detects changes in protein abundance between physiological states. Such changes in enzyme abundance are often interpreted as evidence of metabolic change, although most textbooks emphasise control of enzyme activities, not enzyme amount. Two-dimensional difference gel electrophoresis (2D-DIGE) was therefore used to examine differences in protein abundance between S. cerevisiae strain BY4741 grown on either glucose (fermentation) or glycerol. Growth on 2% glucose, but not on glycerol, was accompanied by extensive production of ethanol. Doubling times for growth were 2 h 5 min in glucose and 9 h 41 min in glycerol. Conditions for extraction and two-dimensional electrophoresis of proteins were established. One hundred and seventy-nine proteins were identified by MALDI mass spectrometry of tryptic digests of protein spots excised from Coomassie-stained gels. All of the enzymes for conversion of glucose to ethanol, except for the second enzyme of glycolysis, phosphoglucose isomerase, were identified using two-dimensional electrophoresis of 100 μg of protein from cells grown on 2% glucose. Identification of proteins excised from the DIGE gels was more challenging, partly because of the lower amount of protein. Eight of the proteins that showed statistically significant differences in abundance (≥ 2-fold, p ≤ 0.01) between glucose and glycerol were identified by mass spectrometry of proteins excised from the 2D-DIGE gels, and a further 18 varying proteins were matched to proteins identified from the Coomassie-stained gels.
Of these 26 identified or matched proteins, subunits of five of the enzymes for conversion of glucose to ethanol were more abundant in the fermentative cells grown on glucose. The more abundant glycolytic enzymes were phosphofructokinase 2, fructose-1,6-bisphosphate aldolase, triosephosphate isomerase and enolase, plus pyruvate decarboxylase, which is required for conversion of the glycolytic product pyruvate to acetaldehyde. The alcohol dehydrogenases Adh1 and Adh4 that convert acetaldehyde to ethanol were detected but did not vary significantly between growth on glucose and glycerol. The results confirmed that in this case changes in abundance of some enzymes were consistent with the altered metabolic output. Future studies should examine whether changes in the abundance and activity of these enzymes are responsible for the differences in metabolism.
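The two doubling times quoted above can be compared on a common scale by converting them to specific growth rates. The following is a minimal sketch that assumes simple exponential growth (rate = ln 2 / doubling time); the function name `specific_growth_rate` is hypothetical and is not taken from the thesis.

```python
import math


def specific_growth_rate(doubling_time_h: float) -> float:
    """Return the specific growth rate (per hour), assuming exponential growth."""
    if doubling_time_h <= 0:
        raise ValueError("doubling time must be positive")
    return math.log(2) / doubling_time_h


# Doubling times reported in the abstract, converted to hours.
glucose_td_h = 2 + 5 / 60    # 2 h 5 min
glycerol_td_h = 9 + 41 / 60  # 9 h 41 min

mu_glucose = specific_growth_rate(glucose_td_h)
mu_glycerol = specific_growth_rate(glycerol_td_h)
rate_ratio = mu_glucose / mu_glycerol

print(f"glucose:  {mu_glucose:.3f} per h")
print(f"glycerol: {mu_glycerol:.3f} per h")
print(f"glucose/glycerol rate ratio: {rate_ratio:.2f}")
```

Under this assumption, growth on glucose is roughly 4.6 times faster than growth on glycerol.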


2020 ◽  
Vol 86 (7) ◽  
pp. 12-19
Author(s):  
I. V. Plyushchenko ◽  
D. G. Shakhmatov ◽  
I. A. Rodin

The rapid development of statistical data processing, computing capabilities, chromatography-mass spectrometry, and omics technologies (technologies based on the achievements of genomics, transcriptomics, proteomics, and metabolomics) in recent decades has not led to the formation of a unified protocol for untargeted profiling. Systematic errors reduce the reproducibility and reliability of the results and hinder the consolidation and analysis of data from large-scale multi-day experiments. We propose an algorithm for conducting omics profiling to identify potential markers in samples of complex composition and present a case study of urine samples obtained from different clinical groups of patients. Profiling was carried out by liquid chromatography-mass spectrometry. The markers were selected using multivariate analysis methods, including machine learning and feature selection. The approach was tested on an independent dataset by clustering and projection onto principal components.


2020 ◽  
Vol 27 ◽  
Author(s):  
Zaheer Ullah Khan ◽  
Dechang Pi

Background: S-sulfenylation (S-sulphenylation, or sulfenic acid) of proteins is a special kind of post-translational modification that plays an important role in various physiological and pathological processes such as cytokine signaling, transcriptional regulation, and apoptosis. Given this significance, and to complement existing wet-lab methods, several computational models have been developed for predicting cysteine sulfenylation sites. However, the performance of these models was not satisfactory due to inefficient feature schemes, severe class imbalance, and the lack of an intelligent learning engine. Objective: In this study, our motivation is to establish a strong and novel computational predictor for discriminating between sulfenylation and non-sulfenylation sites. Methods: We report an innovative bioinformatics feature encoding tool, named DeepSSPred, in which the encoded features are obtained via an n-segmented hybrid feature; the synthetic minority oversampling technique (SMOTE) was then employed to cope with the severe imbalance between SC-sites (minority class) and non-SC sites (majority class). A state-of-the-art 2D convolutional neural network was employed, with rigorous 10-fold jackknife cross-validation for model validation and authentication. Results: The proposed framework, with a strong discrete presentation of the feature space, a machine learning engine, and an unbiased presentation of the underlying training data, yielded an excellent model that outperforms all existing established studies. The proposed approach is 6% higher in terms of MCC than the first best. On an independent dataset, the existing first best study failed to provide sufficient details.
The model obtained an increase of 7.5% in accuracy, 1.22% in Sn, 12.91% in Sp and 13.12% in MCC on the training data and 12.13% in ACC, 27.25% in Sn, 2.25% in Sp, and 30.37% in MCC on an independent dataset in comparison with the 2nd best method. These empirical analyses show the superlative performance of the proposed model on both the training and independent datasets in comparison with existing literature studies. Conclusion: In this research, we have developed a novel sequence-based automated predictor for SC-sites, called DeepSSPred. The empirical simulation outcomes with a training dataset and an independent validation dataset have revealed the efficacy of the proposed theoretical model. The good performance of DeepSSPred is due to several reasons, such as novel discriminative feature encoding schemes, the SMOTE technique, and careful construction of the prediction model through the tuned 2D-CNN classifier. We believe that our research work will provide potential insight into further prediction of S-sulfenylation characteristics and functionalities. Thus, we hope that our developed predictor will be significantly helpful for large-scale discrimination of unknown SC-sites in particular and for designing new pharmaceutical drugs in general.
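The abstract reports accuracy (ACC), sensitivity (Sn), specificity (Sp) and Matthews correlation coefficient (MCC). As a reminder of how these four quantities relate to a binary confusion matrix, here is a minimal sketch using the standard textbook definitions. The function name `binary_metrics` and the counts in the example are illustrative assumptions; they are not the authors' code or data.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute ACC, Sn, Sp and MCC from binary confusion-matrix counts."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    acc = (tp + tn) / total
    sn = tp / (tp + fn) if (tp + fn) else 0.0  # sensitivity (recall)
    sp = tn / (tn + fp) if (tn + fp) else 0.0  # specificity
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "Sn": sn, "Sp": sp, "MCC": mcc}


# Invented counts for illustration only (imbalanced: 50 positives, 100 negatives).
metrics = binary_metrics(tp=40, fp=10, tn=90, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

MCC uses all four cells of the confusion matrix, which is one reason it is often preferred to accuracy alone when classes are imbalanced, as they are for SC-sites versus non-SC sites.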


1994 ◽  
Vol 40 (2) ◽  
pp. 216-220 ◽  
Author(s):  
A H Wu ◽  
D Ostheimer ◽  
M Cremese ◽  
E Forte ◽  
D Hill

Interference by substances coeluting with targeted drugs is a general problem for gas chromatographic/mass spectrometric analysis of urine. To characterize these interferences, we examined human urine samples containing benzoylecgonine and fluconazole, and other drug combinations including deuterated internal standards that coelute (ISd,c) with target drugs, by selected-ion monitoring (SIM) and full-scan mass spectrometry. We show that, by SIM analysis, detecting the presence of an interferent is dependent on the specific IS used for the assay. When an ISd,c is used, the presence of another coeluting substance (interferent) suggests that the intensity of IS ions is substantially diminished, because the interferent affects both the ISd,c and target drug. When a noncoeluting IS (ISnc) is used, the interferent cannot be discerned unless it coincidentally contains one or more of the ions monitored for either the target drug or ISnc. Under full-scan analysis, a coeluting interferent is directly discernible by examining the total ion gas chromatogram.


Cancers ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 2111
Author(s):  
Bo-Wei Zhao ◽  
Zhu-Hong You ◽  
Lun Hu ◽  
Zhen-Hao Guo ◽  
Lei Wang ◽  
...  

Identification of drug-target interactions (DTIs) is a significant step in the drug discovery or repositioning process. Compared with the time-consuming and labor-intensive in vivo experimental methods, the computational models can provide high-quality DTI candidates in an instant. In this study, we propose a novel method called LGDTI to predict DTIs based on large-scale graph representation learning. LGDTI can capture the local and global structural information of the graph. Specifically, the first-order neighbor information of nodes can be aggregated by the graph convolutional network (GCN); on the other hand, the high-order neighbor information of nodes can be learned by the graph embedding method called DeepWalk. Finally, the two kinds of feature are fed into the random forest classifier to train and predict potential DTIs. The results show that our method obtained area under the receiver operating characteristic curve (AUROC) of 0.9455 and area under the precision-recall curve (AUPR) of 0.9491 under 5-fold cross-validation. Moreover, we compare the presented method with some existing state-of-the-art methods. These results imply that LGDTI can efficiently and robustly capture undiscovered DTIs. Moreover, the proposed model is expected to bring new inspiration and provide novel perspectives to relevant researchers.
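The abstract names DeepWalk as the source of high-order neighbor information. DeepWalk builds node embeddings from truncated random walks over the graph, which are then treated like sentences for a skip-gram model. The sketch below illustrates only the walk-generation step on a toy graph; the function name `random_walks`, the node labels and the parameter values are hypothetical, and the skip-gram training, the GCN and the random forest stages described in the abstract are omitted.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate truncated uniform random walks starting from every node."""
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adjacency[walk[-1]]
                if not neighbours:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Toy bipartite drug-target graph (invented for illustration).
toy_graph = {
    "drug_A": ["target_1", "target_2"],
    "drug_B": ["target_2"],
    "target_1": ["drug_A"],
    "target_2": ["drug_A", "drug_B"],
}

walks = random_walks(toy_graph, walk_length=5, walks_per_node=2)
for walk in walks[:3]:
    print(" -> ".join(walk))
```

Longer walks let a node co-occur with nodes several hops away, which is how this family of methods captures higher-order structure that a single round of first-order neighbor aggregation does not see.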


2021 ◽  
Vol 28 (1) ◽  
pp. e100251
Author(s):  
Ian Scott ◽  
Stacey Carter ◽  
Enrico Coiera

Machine learning algorithms are being used to screen and diagnose disease, prognosticate and predict therapeutic responses. Hundreds of new algorithms are being developed, but whether they improve clinical decision making and patient outcomes remains uncertain. If clinicians are to use algorithms, they need to be reassured that key issues relating to their validity, utility, feasibility, safety and ethical use have been addressed. We propose a checklist of 10 questions that clinicians can ask of those advocating for the use of a particular algorithm, but which do not expect clinicians, as non-experts, to demonstrate mastery over what can be highly complex statistical and computational concepts. The questions are: (1) What is the purpose and context of the algorithm? (2) How good were the data used to train the algorithm? (3) Were there sufficient data to train the algorithm? (4) How well does the algorithm perform? (5) Is the algorithm transferable to new clinical settings? (6) Are the outputs of the algorithm clinically intelligible? (7) How will this algorithm fit into and complement current workflows? (8) Has use of the algorithm been shown to improve patient care and outcomes? (9) Could the algorithm cause patient harm? and (10) Does use of the algorithm raise ethical, legal or social concerns? We provide examples where an algorithm may raise concerns and apply the checklist to a recent review of diagnostic imaging applications. This checklist aims to assist clinicians in assessing algorithm readiness for routine care and identify situations where further refinement and evaluation is required prior to large-scale use.


2006 ◽  
Vol 3 (2) ◽  
pp. 109-122 ◽  
Author(s):  
Christopher H. Bryant ◽  
Graham J.L. Kemp ◽  
Marija Cvijovic

We have taken a first step towards learning which upstream Open Reading Frames (uORFs) regulate gene expression (i.e., which uORFs are functional) in the yeast Saccharomyces cerevisiae. We do this by integrating data from several resources and combining a bioinformatics tool, ORF Finder, with a machine learning technique, inductive logic programming (ILP). Here, we report the challenge of using ILP as part of this integrative system, in order to automatically generate a model that identifies functional uORFs. Our method makes searching for novel functional uORFs more efficient than random sampling. An attempt has been made to predict novel functional uORFs using our method. Some preliminary evidence that our model may be biologically meaningful is presented.
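To make the notion of an upstream open reading frame concrete, the sketch below scans the forward strand of a leader sequence for an ATG followed by an in-frame stop codon. It is an illustration of the general idea only, not the ORF Finder tool or the authors' pipeline; the function name `find_orfs`, the `min_codons` parameter and the example sequence are invented for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2):
    """Return (start, end) index pairs of forward-strand ORFs.

    Each ORF runs from an ATG to the first in-frame stop codon (inclusive).
    `min_codons` counts the start codon but not the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from the position after the start codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                break  # the first in-frame stop ends this ORF
    return orfs


# Invented leader sequence containing two short ORFs.
leader = "CCATGAAATAGGGATGCCCTGA"
orfs = find_orfs(leader)
for start, end in orfs:
    print(start, end, leader[start:end])
```

A scan like this only enumerates candidates; deciding which uORFs are functional is the part the abstract assigns to the learned ILP model.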


2021 ◽  
Vol 20 (2) ◽  
pp. 1280-1295
Author(s):  
Aleksandr Gaun ◽  
Kaitlyn N. Lewis Hardell ◽  
Niclas Olsson ◽  
Jonathon J. O’Brien ◽  
Sudha Gollapudi ◽  
...  
