BERT4Bitter: a bidirectional encoder representations from transformers (BERT)-based model for improving the prediction of bitter peptides

Author(s):  
Phasit Charoenkwan ◽  
Chanin Nantasenamat ◽  
Md Mehedi Hasan ◽  
Balachandran Manavalan ◽  
Watshara Shoombuatong

Abstract Motivation The identification of bitter peptides through experimental approaches is an expensive and time-consuming endeavor. Due to the huge number of newly available peptide sequences in the post-genomic era, the development of automated computational models for the identification of novel bitter peptides is highly desirable. Results In this work, we present BERT4Bitter, a bidirectional encoder representations from transformers (BERT)-based model for predicting bitter peptides directly from their amino acid sequence without using any structural information. To the best of our knowledge, this is the first time a BERT-based model has been employed to identify bitter peptides. Compared to widely used machine learning models, BERT4Bitter achieved the best performance, with an accuracy of 0.861 and 0.922 for cross-validation and independent tests, respectively. Furthermore, extensive empirical benchmarking experiments on the independent dataset demonstrated that BERT4Bitter clearly outperformed the existing method, with improvements of 8.0% in accuracy and 16.0% in Matthews correlation coefficient, highlighting the effectiveness and robustness of BERT4Bitter. We believe that the BERT4Bitter method proposed herein will be a useful tool for rapidly screening and identifying novel bitter peptides for drug development and nutritional research. Availability and implementation The user-friendly web server of the proposed BERT4Bitter is freely accessible at http://pmlab.pythonanywhere.com/BERT4Bitter. Supplementary information Supplementary data are available at Bioinformatics online.
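The abstract above reports accuracy and the Matthews correlation coefficient (MCC). As a minimal sketch of how these two metrics are computed from a binary confusion matrix, the following may help; the counts used are hypothetical and are not taken from the paper, and the function names are illustrative only.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero (a common convention),
    because the coefficient is undefined in that case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a bitter / non-bitter classifier (assumption).
tp, tn, fp, fn = 45, 40, 10, 5
print("accuracy:", round(accuracy(tp, tn, fp, fn), 3))
print("MCC:", round(mcc(tp, tn, fp, fn), 3))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside accuracy for classifiers of this kind.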

Author(s):  
Yoochan Myung ◽  
Carlos H M Rodrigues ◽  
David B Ascher ◽  
Douglas E V Pires

Abstract Motivation A lack of accurate computational tools to guide rational mutagenesis has made affinity maturation a recurrent challenge in antibody (Ab) development. We previously showed that graph-based signatures can be used to predict the effects of mutations on Ab binding affinity. Results Here we present an updated and refined version of this approach, mCSM-AB2, capable of accurately modelling the effects of mutations on Ab–antigen binding affinity, through the inclusion of evolutionary and energetic terms. Using a new and expanded database of over 1800 mutations with experimental binding measurements and structural information, mCSM-AB2 achieved a Pearson’s correlation of 0.73 and 0.77 across training and blind tests, respectively, outperforming available methods currently used for rational Ab engineering. Availability and implementation mCSM-AB2 is available as a user-friendly and freely accessible web server providing rapid analysis of both individual mutations or the entire binding interface to guide rational antibody affinity maturation at http://biosig.unimelb.edu.au/mcsm_ab2 Supplementary information Supplementary data are available at Bioinformatics online.
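The performance figures above are Pearson correlations between predicted and measured changes in binding affinity. A minimal sketch of that statistic follows; the values are made up for illustration and do not come from the mCSM-AB2 dataset, and the function name is an assumption.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Hypothetical measured vs. predicted affinity changes (assumption).
measured = [-1.2, 0.3, 0.8, -0.5, 1.5]
predicted = [-0.9, 0.1, 1.0, -0.2, 1.1]
print("Pearson r:", round(pearson(measured, predicted), 3))
```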


2020 ◽  
Vol 27 ◽  
Author(s):  
Zaheer Ullah Khan ◽  
Dechang Pi

Background: S-sulfenylation (S-sulphenylation, or sulfenic acid) of proteins is a special kind of post-translational modification that plays an important role in various physiological and pathological processes such as cytokine signaling, transcriptional regulation, and apoptosis. Given this significance, and to complement existing wet-lab methods, several computational models have been developed for predicting cysteine sulfenylation sites. However, the performance of these models has not been satisfactory due to inefficient feature schemes, severe class imbalance, and the lack of an intelligent learning engine. Objective: In this study, our motivation is to establish a strong and novel computational predictor for discriminating sulfenylation from non-sulfenylation sites. Methods: We report an innovative bioinformatics feature encoding tool, named DeepSSPred, in which the encoded features are obtained via an n-segmented hybrid feature scheme, and the resampling technique called synthetic minority oversampling is then employed to cope with the severe imbalance between SC-sites (minority class) and non-SC sites (majority class). A state-of-the-art 2D convolutional neural network was employed with a rigorous 10-fold jackknife cross-validation technique for model validation and authentication. Results: Following the proposed framework, a strong discrete representation of the feature space, the machine learning engine, and an unbiased representation of the underlying training data yielded an excellent model that outperforms all existing established studies. The proposed approach is 6% higher in terms of MCC than the best existing method. On an independent dataset, the best existing study failed to provide sufficient details. 
The model obtained an increase of 7.5% in accuracy, 1.22% in Sn, 12.91% in Sp and 13.12% in MCC on the training data, and 12.13% in ACC, 27.25% in Sn, 2.25% in Sp, and 30.37% in MCC on an independent dataset, in comparison with the second-best method. These empirical analyses show the superior performance of the proposed model on both the training and independent datasets in comparison with existing studies in the literature. Conclusion: In this research, we have developed a novel sequence-based automated predictor for SC-sites, called DeepSSPred. The empirical simulation outcomes on a training dataset and an independent validation dataset have revealed the efficacy of the proposed theoretical model. The good performance of DeepSSPred is due to several factors, such as novel discriminative feature encoding schemes, the SMOTE technique, and careful construction of the prediction model through the tuned 2D-CNN classifier. We believe that our research work will provide potential insight into further prediction of S-sulfenylation characteristics and functionalities. Thus, we hope that our developed predictor will be significantly helpful for large-scale discrimination of unknown SC-sites in particular and for designing new pharmaceutical drugs in general.
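The abstract relies on synthetic minority oversampling (SMOTE) to balance the two classes. The sketch below illustrates the core idea only: each synthetic point is an interpolation between a minority sample and one of its nearest minority neighbours. It is not the DeepSSPred implementation; the function name, parameters, and toy feature vectors are all assumptions made for illustration.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic samples by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # All other minority samples, sorted by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical 2-D feature vectors for the minority class (assumption).
sc_sites = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0], [2.5, 2.5]]
for point in smote(sc_sites, n_new=3, k=2, seed=1):
    print([round(v, 3) for v in point])
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies rather than duplicating existing samples exactly.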


Cancers ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 2111
Author(s):  
Bo-Wei Zhao ◽  
Zhu-Hong You ◽  
Lun Hu ◽  
Zhen-Hao Guo ◽  
Lei Wang ◽  
...  

Identification of drug-target interactions (DTIs) is a significant step in the drug discovery or repositioning process. Compared with time-consuming and labor-intensive in vivo experimental methods, computational models can provide high-quality DTI candidates quickly. In this study, we propose a novel method called LGDTI to predict DTIs based on large-scale graph representation learning. LGDTI can capture both the local and the global structural information of the graph. Specifically, the first-order neighbor information of nodes is aggregated by a graph convolutional network (GCN), while the high-order neighbor information of nodes is learned by the graph embedding method called DeepWalk. Finally, the two kinds of features are fed into a random forest classifier to train and predict potential DTIs. The results show that our method obtained an area under the receiver operating characteristic curve (AUROC) of 0.9455 and an area under the precision-recall curve (AUPR) of 0.9491 under 5-fold cross-validation. Moreover, we compare the presented method with some existing state-of-the-art methods. These results imply that LGDTI can efficiently and robustly capture undiscovered DTIs. Moreover, the proposed model is expected to bring new inspiration and provide novel perspectives to relevant researchers.
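DeepWalk, mentioned above for learning high-order neighbor information, starts by sampling truncated random walks over the graph and then trains a Skip-gram model on them. The sketch below covers only the walk-sampling step, on a hypothetical toy drug-target graph; the node names, parameters, and function name are assumptions, and the embedding training, the GCN, and the random forest stages of LGDTI are not shown.

```python
import random


def random_walks(adj, walks_per_node=2, walk_length=5, seed=0):
    """Sample uniform truncated random walks starting from every node.

    adj maps each node to a list of its neighbours. A walk stops early
    only if it reaches a node without neighbours.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh random order
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adj[walk[-1]]
                if not neighbours:
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Hypothetical bipartite drug-target graph (assumption).
graph = {
    "drug_A": ["target_1", "target_2"],
    "drug_B": ["target_2"],
    "target_1": ["drug_A"],
    "target_2": ["drug_A", "drug_B"],
}
for walk in random_walks(graph):
    print(" -> ".join(walk))
```

Each walk plays the role of a sentence whose words are nodes, so nodes that co-occur within a few steps of each other end up with similar embeddings once a Skip-gram model is trained on the walks.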


2021 ◽  
Vol 14 (2) ◽  
pp. 205979912110307
Author(s):  
Dennis Mathysen ◽  
Ignace Glorieux

Virtual reality (VR) is still very much a niche technology despite its increasing popularity in recent years. VR has now reached a point where it can offer photorealistic experiences while also being consumer-friendly and affordable. However, so far only a very limited amount of software has been developed for the specific purpose of conducting (social science) research. In this article, we illustrate that integrating virtual reality to good effect in social science research does not necessarily require specialized hardware or software, an abundance of expertise regarding VR technology, or even a large budget. We do this by discussing our use of a method we have come to call ‘VR-assisted interviews’: conducting a (semi-structured) interview while respondents are confronted with a virtual environment viewed via a VR headset. This method allows respondents to focus on what they are seeing and experiencing, instead of having them worry about how to operate a device and navigate an interface they are using for the first time. ‘VR-assisted interviews’ are very user-friendly for respondents but also limit options for interactivity. We believe this method can be a valuable alternative, for both methodological and practical reasons, to more complex applications of VR technology in social science research.


Author(s):  
Ferhat Alkan ◽  
Joana Silva ◽  
Eric Pintó Barberà ◽  
William J Faller

Abstract Motivation Ribosome Profiling (Ribo-seq) has revolutionized the study of RNA translation by providing information on ribosome positions across all translated RNAs at nucleotide resolution. Yet several technical limitations restrict the sequencing depth of such experiments, the most common of which is the overabundance of rRNA fragments. Various strategies can be employed to tackle this issue, including the use of commercial rRNA depletion kits. However, as they are designed for more standardized RNAseq experiments, they may perform suboptimally in Ribo-seq. To overcome this, it is possible to use custom biotinylated oligos complementary to the most abundant rRNA fragments; however, no computational framework currently exists to aid the design of optimal oligos. Results Here, we first show that a major confounding issue is that the rRNA fragments generated via Ribo-seq vary significantly with differing experimental conditions, suggesting that a “one-size-fits-all” approach may be inefficient. Therefore, we developed Ribo-ODDR, an oligo design pipeline integrated with a user-friendly interface that assists in oligo selection for efficient experiment-specific rRNA depletion. Ribo-ODDR uses preliminary data to identify the most abundant rRNA fragments and calculates the rRNA depletion efficiency of potential oligos. We experimentally show that Ribo-ODDR-designed oligos outperform commercially available kits and lead to a significant increase in rRNA depletion in Ribo-seq. Availability Ribo-ODDR is freely accessible at https://github.com/fallerlab/Ribo-ODDR Supplementary information Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 36 (12) ◽  
pp. 3913-3915
Author(s):  
Hemi Luan ◽  
Xingen Jiang ◽  
Fenfen Ji ◽  
Zhangzhang Lan ◽  
Zongwei Cai ◽  
...  

Abstract Motivation Liquid chromatography–mass spectrometry-based non-targeted metabolomics is routinely performed to qualitatively and quantitatively analyze a tremendous number of metabolite signals in complex biological samples. However, many popular software tools commonly detect false-positive peaks in the datasets as metabolite signals, resulting in unreliable measurements. Results To reduce false-positive calling, we developed an interactive web tool, termed CPVA, for visualization and accurate annotation of the detected peaks in non-targeted metabolomics data. We used a chromatogram-centric strategy to unfold the characteristics of chromatographic peaks through visualization of peak morphology metrics, with additional functions to annotate adducts, isotopes and contaminants. CPVA is a free, user-friendly tool that helps users identify peak background noise and contaminants, resulting in a decrease in false-positive or redundant peak calling and thereby improving the data quality of non-targeted metabolomics studies. Availability and implementation CPVA is freely available at http://cpva.eastus.cloudapp.azure.com. Source code and installation instructions are available on GitHub: https://github.com/13479776/cpva. Supplementary information Supplementary data are available at Bioinformatics online.


Author(s):  
Zhuohang Yu ◽  
Zengrui Wu ◽  
Weihua Li ◽  
Guixia Liu ◽  
Yun Tang

Abstract Summary MetaADEDB is an online database we developed to integrate comprehensive information on adverse drug events (ADEs). The first version of MetaADEDB was released in 2013 and has been widely used by researchers. However, it has not been updated for more than seven years. Here, we report its second version, built by collecting more and newer data from the U.S. FDA Adverse Event Reporting System (FAERS) and the Canada Vigilance Adverse Reaction Online Database, in addition to the original three sources. The new version consists of 744 709 drug–ADE associations between 8498 drugs and 13 193 ADEs, an increase of over 40% in drug–ADE associations compared to the previous version. We also developed a new and user-friendly web interface for data search and analysis. We hope that MetaADEDB 2.0 can provide a useful tool for drug safety assessment and related studies in drug discovery and development. Availability and implementation The database is freely available at: http://lmmd.ecust.edu.cn/metaadedb/. Supplementary information Supplementary data are available at Bioinformatics online.


2006 ◽  
Vol 273 (1592) ◽  
pp. 1407-1414 ◽  
Author(s):  
Joachim Kurtz ◽  
K. Mathias Wegner ◽  
Martin Kalbe ◽  
Thorsten B.H Reusch ◽  
Helmut Schaschl ◽  
...  

Individual variation in the susceptibility to infection may result from the varying ability of hosts to specifically recognize different parasite strains. Alternatively, there could be individual host differences in fitness costs of immune defence. Although these two explanations are not mutually exclusive, they have so far been treated in separate experimental approaches. To analyse potential relationships, we studied body condition and oxidative stress, which may reflect costs of immunity, in three-spined sticklebacks that had been experimentally exposed to three species of naturally occurring parasite. These sticklebacks differed in a trait that is crucial to specific parasite defence, i.e. individual genetic diversity at major histocompatibility complex (MHC) class IIB loci. Oxidative stress was quantified as tissue acrolein, a technique that has been applied to questions of immuno-ecology for the first time. We measured gene expression at the MHC and other estimates of immune activation. We found that fish with high levels of MHC expression had poor condition and elevated oxidative stress. These results indicate that MHC-based specific immunity is connected with oxidative stress. They could thus also be relevant in the broader context of the evolution of sexually selected signals that are based on carotenoids and are thus supposed to reflect oxidative stress resistance.


Ars Adriatica ◽  
2016 ◽  
pp. 103
Author(s):  
Barbara Španjol-Pandelo

Matteo Moronzon, a member of the Venetian family of woodcarvers, is first mentioned in 1407 according to the currently known archival documents. Probably after being trained in his father's workshop in Venice, he moved to Zadar with his family – his wife Francisca and sons Pietro and Francesco. In 1418 he undertook the commission to furnish carved choir stalls for the cathedral of St. Anastasia in Zadar. Various archival documents testify that Matteo lived and worked in Zadar for many years. It can therefore be assumed that he founded his own workshop in Zadar, where his son Francesco was also trained. Apart from attempting to reconstruct Matteo's life and career, this paper aims to interpret one important work of woodcarving preserved in situ: the choir stalls in the former cathedral of Rab, today the arch parish church of the Assumption of the Blessed Virgin Mary in Rab. Without doubt Matteo was the master carver in the production of the choir stalls in Zadar. Since he lived in Zadar, it was not unusual that he had the main role in carving the stalls. In Zadar the selection of motifs is more balanced and there are no significant differences in the modelling of decorative elements. However, the question arises whether Matteo carved absolutely everything himself or had assistants. Considering the amount of work that had to be done, it must be assumed that he had assistants who participated in the work and helped him to shape the stalls. In the literature, however, Matteo has been considered the sole and undisputed author of the choir stalls in Zadar, mostly because of the preserved document. The analysis of the choir stalls in Rab by Ivo Petricioli, as well as their evident formal and stylistic similarities with the stalls from the cathedral in Zadar, has led to the general acceptance of the hypothesis that they were carved at the workshop of Matteo Moronzon. 
However, a comprehensive comparative analysis that could confirm that hypothesis was still missing. The analysis of the details and of the whole led to the overall conclusion that there are a great many similarities between the choir stalls in Rab and Zadar. It was therefore concluded that Matteo was the principal designer of the choir stalls in Rab and also carved their best parts, while other, less successful parts were made by his apprentices and assistants who at the time lived on the island of Rab. In this respect, if Matteo is accepted as the author of the choir stalls of the cathedral in Zadar, he must also be accepted as the author of the choir stalls from the former cathedral in Rab.


2021 ◽  
Author(s):  
Daniyal Kiani ◽  
Sagar Sourav ◽  
Jonas Baltrusaitis ◽  
Israel E Wachs

The experimentally validated computational models developed herein, for the first time, show that Mn-promotion does not enhance the activity of the surface Na2WO4 catalytic active sites for CH4 heterolytic dissociation...

