Functional genetic discovery of enzymes using full-scan mass spectrometry metabolomics

Proteomic analysis of Saccharomyces cerevisiae grown on glucose or glycerol

10.26686/wgtn.17006830.v1 ◽

2021 ◽

Author(s):

Hannah D. Hoang

Keyword(s):

Mass Spectrometry ◽

Saccharomyces Cerevisiae ◽

Large Scale ◽

Triosephosphate Isomerase ◽

Lower Amount ◽

Protein Abundance ◽

Two Dimensional ◽

Yeast Saccharomyces Cerevisiae ◽

Two Dimensional Electrophoresis ◽

Dimensional Electrophoresis

The goal of this research was to use two-dimensional electrophoresis to examine changes in the abundance of glycolytic enzymes in the yeast Saccharomyces cerevisiae grown on carbon sources that support either fermentation to ethanol or oxidative metabolism. Large-scale profiling of protein abundances (expression proteomics) often detects changes in protein abundance between physiological states. Such changes in enzyme abundance are often interpreted as evidence of metabolic change, although most textbooks emphasise control of enzyme activity rather than enzyme amount. Two-dimensional difference gel electrophoresis (2D-DIGE) was therefore used to examine differences in protein abundance between S. cerevisiae strain BY4741 grown on either glucose (fermentation) or glycerol. Growth on 2% glucose, but not on glycerol, was accompanied by extensive production of ethanol. Doubling times were 2 h 5 min in glucose and 9 h 41 min in glycerol. Conditions for extraction and two-dimensional electrophoresis of proteins were established. One hundred and seventy-nine proteins were identified by MALDI mass spectrometry of tryptic digests of protein spots excised from Coomassie-stained gels. All of the enzymes for conversion of glucose to ethanol, except phosphoglucose isomerase (the second enzyme of glycolysis), were identified using two-dimensional electrophoresis of 100 μg of protein from cells grown on 2% glucose. Identification of proteins excised from the DIGE gels was more challenging, partly because of the lower amount of protein. Eight of the proteins that showed statistically significant differences in abundance (≥ 2-fold, p ≤ 0.01) between glucose and glycerol were identified by mass spectrometry of proteins excised from the 2D-DIGE gels, and a further 18 varying proteins were matched to proteins identified from the Coomassie-stained gels. Of these 26 identified or matched proteins, subunits of five of the enzymes for conversion of glucose to ethanol were more abundant in the fermentative cells grown on glucose. The more abundant glycolytic enzymes were phosphofructokinase 2, fructose-1,6-bisphosphate aldolase, triosephosphate isomerase and enolase, plus pyruvate decarboxylase, which is required for conversion of the glycolytic product pyruvate to acetaldehyde. The alcohol dehydrogenases Adh1 and Adh4, which convert acetaldehyde to ethanol, were detected but did not vary significantly between growth on glucose and glycerol. The results confirmed that, in this case, changes in the abundance of some enzymes were consistent with the altered metabolic output. Future studies should examine whether changes in the abundance and activity of these enzymes are responsible for the differences in metabolism.
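
The two quantitative statements in this abstract, the doubling times and the selection thresholds (≥ 2-fold, p ≤ 0.01), can be illustrated with a minimal Python sketch. The function names are hypothetical, and treating a 0.5x ratio as a 2-fold decrease is an assumption made here, not something the abstract states.

```python
import math

def specific_growth_rate(hours, minutes):
    """Return the specific growth rate (per hour) for a given doubling time."""
    doubling_time_h = hours + minutes / 60.0
    return math.log(2) / doubling_time_h

def is_differential(fold_change, p_value, min_fold=2.0, max_p=0.01):
    """Apply the abstract's thresholds: at least 2-fold change and p <= 0.01.

    Assumption: fold_change is a positive abundance ratio, and a ratio of
    0.5 counts as a 2-fold decrease, so the test is symmetric.
    """
    magnitude = max(fold_change, 1.0 / fold_change)
    return magnitude >= min_fold and p_value <= max_p

mu_glucose = specific_growth_rate(2, 5)    # doubling time 2 h 5 min
mu_glycerol = specific_growth_rate(9, 41)  # doubling time 9 h 41 min
print(f"glucose:  {mu_glucose:.3f} per hour")
print(f"glycerol: {mu_glycerol:.3f} per hour")
print(f"ratio:    {mu_glucose / mu_glycerol:.2f}")
```

Because the growth rate is inversely proportional to the doubling time, the ratio of the two rates equals 581 min / 125 min, so growth on glucose is roughly 4.6 times faster than on glycerol.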

Algorithm of combining chromatography mass spectrometry-untargeted profiling and multivariate analysis for identification of marker-substances in samples of complex composition

Industrial laboratory Diagnostics of materials ◽

10.26896/1028-6861-2020-86-7-12-19 ◽

2020 ◽

Vol 86 (7) ◽

pp. 12-19

Author(s):

I. V. Plyushchenko ◽

D. G. Shakhmatov ◽

I. A. Rodin

Keyword(s):

Mass Spectrometry ◽

Multivariate Analysis ◽

Large Scale ◽

Complex Composition ◽

Unified Protocol ◽

Chromatography Mass Spectrometry ◽

Marker Substances ◽

Selection Testing ◽

Untargeted Profiling

Despite the rapid development of statistical data processing, computing capabilities, chromatography-mass spectrometry, and omics technologies (genomics, transcriptomics, proteomics, metabolomics) in recent decades, no unified protocol for untargeted profiling has emerged. Systematic errors reduce the reproducibility and reliability of the results and hinder the consolidation and analysis of data from large-scale, multi-day experiments. We propose an algorithm for omics profiling to identify potential markers in samples of complex composition and present a case study of urine samples obtained from different clinical groups of patients. Profiling was carried out by liquid chromatography-mass spectrometry. Markers were selected using multivariate analysis, including machine learning and feature selection. The approach was tested on an independent dataset by clustering and projection onto principal components.

DeepSSPred: A Deep Learning Based Sulfenylation site predictor via a novel n-segmented optimize federated feature encoder

Protein and Peptide Letters ◽

10.2174/0929866527666201202103411 ◽

2020 ◽

Vol 27 ◽

Author(s):

Zaheer Ullah Khan ◽

Dechang Pi

Keyword(s):

Large Scale ◽

Computational Models ◽

Research Work ◽

Training Data ◽

Training Dataset ◽

Validation Dataset ◽

Cytokine Signaling ◽

Minority Class ◽

Independent Dataset ◽

Feature Encoding

Background: S-sulfenylation (S-sulphenylation, or sulfenic acid formation) is a post-translational modification of proteins that plays an important role in various physiological and pathological processes such as cytokine signaling, transcriptional regulation, and apoptosis. Given this significance, and to complement existing wet-lab methods, several computational models have been developed for predicting cysteine sulfenylation sites. However, the performance of these models has not been satisfactory because of inefficient feature schemes, severe class imbalance, and the lack of an intelligent learning engine. Objective: Our motivation in this study is to establish a strong and novel computational predictor for discriminating sulfenylation from non-sulfenylation sites. Methods: We report an innovative bioinformatics feature encoding tool, named DeepSSPred, in which the encoded features are obtained via an n-segmented hybrid feature, and the resampling technique called synthetic minority oversampling (SMOTE) is then employed to cope with the severe imbalance between SC-sites (minority class) and non-SC sites (majority class). A state-of-the-art 2D convolutional neural network was employed, with rigorous 10-fold jackknife cross-validation for model validation and authentication. Results: Following the proposed framework, a strong discrete representation of the feature space, the machine learning engine, and an unbiased presentation of the underlying training data yielded an excellent model that outperforms all existing established studies. The proposed approach is 6% higher in terms of MCC than the best existing method. On an independent dataset, the best existing study failed to provide sufficient details. The model obtained an increase of 7.5% in accuracy, 1.22% in Sn, 12.91% in Sp and 13.12% in MCC on the training data, and 12.13% in ACC, 27.25% in Sn, 2.25% in Sp, and 30.37% in MCC on an independent dataset, in comparison with the second-best method. These empirical analyses show the superior performance of the proposed model on both the training and independent datasets in comparison with existing studies. Conclusion: In this research, we have developed a novel sequence-based automated predictor for SC-sites, called DeepSSPred. The empirical simulation outcomes on a training dataset and an independent validation dataset have revealed the efficacy of the proposed theoretical model. The good performance of DeepSSPred is due to several factors, such as novel discriminative feature encoding schemes, the SMOTE technique, and careful construction of the prediction model through the tuned 2D-CNN classifier. We believe that our research work will provide insight into further prediction of S-sulfenylation characteristics and functionalities. Thus, we hope that our predictor will be significantly helpful for large-scale discrimination of unknown SC-sites in particular and for designing new pharmaceutical drugs in general.
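
The metrics quoted in this abstract (ACC, Sn, Sp, MCC) are all derived from a confusion matrix. The sketch below shows the standard definitions in Python; the function name and the example counts are hypothetical and are not taken from the DeepSSPred paper.

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, sensitivity, specificity and MCC from confusion-matrix counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    total = tp + tn + fp + fn
    acc = (tp + tn) / total
    sn = tp / (tp + fn)  # sensitivity: fraction of true sites recovered
    sp = tn / (tn + fp)  # specificity: fraction of non-sites rejected
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "Sn": sn, "Sp": sp, "MCC": mcc}

# Hypothetical counts, chosen only to illustrate an imbalanced test set
# (100 positive sites versus 900 negative sites).
metrics = classification_metrics(tp=80, tn=850, fp=50, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

On imbalanced data such as this, accuracy can look high even when many minority-class sites are missed, which is one reason MCC is often reported alongside it.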

Characterization of drug interferences caused by coelution of substances in gas chromatography/mass spectrometry confirmation of targeted drugs in full-scan and selected-ion monitoring modes

Clinical Chemistry ◽

10.1093/clinchem/40.2.216 ◽

1994 ◽

Vol 40 (2) ◽

pp. 216-220 ◽

Cited By ~ 16

Author(s):

A H Wu ◽

D Ostheimer ◽

M Cremese ◽

E Forte ◽

D Hill

Keyword(s):

Mass Spectrometry ◽

Gas Chromatography Mass Spectrometry ◽

Spectrometric Analysis ◽

Targeted Drugs ◽

Selected Ion Monitoring ◽

Internal Standards ◽

Target Drug ◽

Full Scan ◽

Chromatographic Mass

Interference by substances coeluting with targeted drugs is a general problem in gas chromatographic/mass spectrometric analysis of urine. To characterize these interferences, we examined human urine samples containing benzoylecgonine and fluconazole, and other drug combinations including deuterated internal standards that coelute (ISd,c) with target drugs, by selected-ion monitoring (SIM) and full-scan mass spectrometry. We show that, in SIM analysis, detecting the presence of an interferent depends on the specific IS used for the assay. When an ISd,c is used, the presence of another coeluting substance (interferent) suggests that the intensity of IS ions is substantially diminished, because the interferent affects both the ISd,c and target drug. When a noncoeluting IS (ISnc) is used, the interferent cannot be discerned unless it coincidentally contains one or more of the ions monitored for either the target drug or ISnc. Under full-scan analysis, a coeluting interferent is directly discernible by examining the total ion gas chromatogram.

Ultra‐Fast Retroactive Processing by MetAlign of Liquid‐Chromatography High‐Resolution Full‐Scan Orbitrap Mass Spectrometry Data in WADA Human Urine Sample Monitoring Program

Rapid Communications in Mass Spectrometry ◽

10.1002/rcm.9141 ◽

2021 ◽

Author(s):

Safa Khelifi ◽

Khadija Saad ◽

Ariadni Vonaparti ◽

Souhila Mahieddine ◽

Sofia Salama ◽

...

Keyword(s):

Mass Spectrometry ◽

Liquid Chromatography ◽

High Resolution ◽

Urine Sample ◽

Human Urine ◽

Monitoring Program ◽

Mass Spectrometry Data ◽

Human Urine Sample ◽

Orbitrap Mass Spectrometry ◽

Full Scan

A Novel Method to Predict Drug-Target Interactions Based on Large-Scale Graph Representation Learning

Cancers ◽

10.3390/cancers13092111 ◽

2021 ◽

Vol 13 (9) ◽

pp. 2111

Author(s):

Bo-Wei Zhao ◽

Zhu-Hong You ◽

Lun Hu ◽

Zhen-Hao Guo ◽

Lei Wang ◽

...

Keyword(s):

Drug Target ◽

Large Scale ◽

Computational Models ◽

Structural Information ◽

Characteristic Curve ◽

Representation Learning ◽

Graph Representation ◽

Convolutional Network ◽

Novel Method

Identification of drug-target interactions (DTIs) is a significant step in drug discovery and repositioning. Compared with time-consuming and labor-intensive in vivo experimental methods, computational models can provide high-quality DTI candidates quickly. In this study, we propose a novel method called LGDTI to predict DTIs based on large-scale graph representation learning. LGDTI can capture both the local and global structural information of the graph. Specifically, the first-order neighbor information of nodes is aggregated by a graph convolutional network (GCN), while the high-order neighbor information of nodes is learned by the graph embedding method DeepWalk. Finally, the two kinds of features are fed into a random forest classifier to train and predict potential DTIs. The results show that our method obtained an area under the receiver operating characteristic curve (AUROC) of 0.9455 and an area under the precision-recall curve (AUPR) of 0.9491 under 5-fold cross-validation. We also compare the presented method with some existing state-of-the-art methods. These results imply that LGDTI can efficiently and robustly capture undiscovered DTIs. Moreover, the proposed model is expected to bring new inspiration and provide novel perspectives to relevant researchers.
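
The AUROC reported in this abstract can be read as the probability that a randomly chosen true interaction receives a higher score than a randomly chosen non-interaction. The following Python sketch computes it by direct pairwise comparison; it is an illustration of the metric only, not the evaluation code used by LGDTI, and the labels and scores are hypothetical.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))

# Hypothetical classifier scores for six candidate drug-target pairs
# (label 1 = known interaction, label 0 = no known interaction).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(auroc(labels, scores))
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations; a rank-based computation is the usual choice for large-scale evaluation.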

Clinician checklist for assessing suitability of machine learning applications in healthcare

BMJ Health & Care Informatics ◽

10.1136/bmjhci-2020-100251 ◽

2021 ◽

Vol 28 (1) ◽

pp. e100251

Author(s):

Ian Scott ◽

Stacey Carter ◽

Enrico Coiera

Keyword(s):

Machine Learning ◽

Large Scale ◽

Clinical Decision Making ◽

Improve Patient Care ◽

Clinical Decision ◽

Routine Care ◽

Machine Learning Algorithms ◽

Clinical Settings ◽

Machine Learning Applications ◽

Key Issues

Machine learning algorithms are being used to screen and diagnose disease, prognosticate and predict therapeutic responses. Hundreds of new algorithms are being developed, but whether they improve clinical decision making and patient outcomes remains uncertain. If clinicians are to use algorithms, they need to be reassured that key issues relating to their validity, utility, feasibility, safety and ethical use have been addressed. We propose a checklist of 10 questions that clinicians can ask of those advocating for the use of a particular algorithm, but which does not expect clinicians, as non-experts, to demonstrate mastery over what can be highly complex statistical and computational concepts. The questions are: (1) What is the purpose and context of the algorithm? (2) How good were the data used to train the algorithm? (3) Were there sufficient data to train the algorithm? (4) How well does the algorithm perform? (5) Is the algorithm transferable to new clinical settings? (6) Are the outputs of the algorithm clinically intelligible? (7) How will this algorithm fit into and complement current workflows? (8) Has use of the algorithm been shown to improve patient care and outcomes? (9) Could the algorithm cause patient harm? and (10) Does use of the algorithm raise ethical, legal or social concerns? We provide examples where an algorithm may raise concerns and apply the checklist to a recent review of diagnostic imaging applications. This checklist aims to assist clinicians in assessing algorithm readiness for routine care and in identifying situations where further refinement and evaluation are required prior to large-scale use.

A First Step towards Learning which uORFs Regulate Gene Expression

Journal of Integrative Bioinformatics ◽

10.1515/jib-2006-31 ◽

2006 ◽

Vol 3 (2) ◽

pp. 109-122 ◽

Cited By ~ 1

Author(s):

Christopher H. Bryant ◽

Graham J.L. Kemp ◽

Marija Cvijovic

Keyword(s):

Gene Expression ◽

Inductive Logic ◽

Preliminary Evidence ◽

Open Reading Frames ◽

Yeast Saccharomyces Cerevisiae ◽

Regulate Gene Expression ◽

Integrative System ◽

Upstream Open Reading Frames ◽

Reading Frames ◽

Regulate Gene

We have taken a first step towards learning which upstream Open Reading Frames (uORFs) regulate gene expression (i.e., which uORFs are functional) in the yeast Saccharomyces cerevisiae. We do this by integrating data from several resources and combining a bioinformatics tool, ORF Finder, with a machine learning technique, inductive logic programming (ILP). Here, we report the challenge of using ILP as part of this integrative system to automatically generate a model that identifies functional uORFs. Our method makes searching for novel functional uORFs more efficient than random sampling. An attempt has been made to predict novel functional uORFs using our method. Some preliminary evidence that our model may be biologically meaningful is presented.

Automated 16-Plex Plasma Proteomics with Real-Time Search and Ion Mobility Mass Spectrometry Enables Large-Scale Profiling in Naked Mole-Rats and Mice

Journal of Proteome Research ◽

10.1021/acs.jproteome.0c00681 ◽

2021 ◽

Vol 20 (2) ◽

pp. 1280-1295

Author(s):

Aleksandr Gaun ◽

Kaitlyn N. Lewis Hardell ◽

Niclas Olsson ◽

Jonathon J. O’Brien ◽

Sudha Gollapudi ◽

...

Keyword(s):

Mass Spectrometry ◽

Real Time ◽

Ion Mobility ◽

Large Scale ◽

Ion Mobility Mass Spectrometry ◽

Plasma Proteomics ◽

Rats And Mice
