scholarly journals Analyzing Fluctuation Properties in Protein Elastic Networks with Sequence-Specific and Distance-Dependent Interactions

Biomolecules ◽  
2019 ◽  
Vol 9 (10) ◽  
pp. 549
Author(s):  
Romain Amyot ◽  
Yuichi Togashi ◽  
Holger Flechsig

Simple protein elastic networks which neglect amino-acid information often yield reasonable predictions of conformational dynamics and are broadly used. Recently, model variants which incorporate sequence-specific and distance-dependent interactions of residue pairs have been constructed and demonstrated to improve agreement with experimental data. We have applied the new variants in a systematic study of protein fluctuation properties and compared their predictions with those of conventional anisotropic network models. We find that the quality of predictions is frequently linked to poor estimations in highly flexible protein regions. An analysis of a large set of protein structures shows that fluctuations of very weakly connected network residues are intrinsically prone to be significantly overestimated by all models. This problem persists in the new models and is not resolved by taking into account sequence information. The effect becomes even enhanced in the model variant which takes into account very soft long-ranged residue interactions. Beyond these shortcomings, we find that model predictions are largely insensitive to the integration of chemical information, at least regarding the fluctuation properties of individual residues. One can furthermore conclude that the inherent drawbacks may present a serious hindrance when improvement of elastic network models are attempted.

2013 ◽  
Vol 184 (12) ◽  
pp. 2860-2865 ◽  
Author(s):  
E. Sarti ◽  
S. Zamuner ◽  
P. Cossio ◽  
A. Laio ◽  
F. Seno ◽  
...  

2020 ◽  
Author(s):  
Qi Zhang ◽  
Shan Li ◽  
Bin Yu ◽  
Yang Li ◽  
Yandan Zhang ◽  
...  

ABSTRACTProteins play a significant part in life processes such as cell growth, development, and reproduction. Exploring protein subcellular localization (SCL) is a direct way to better understand the function of proteins in cells. Studies have found that more and more proteins belong to multiple subcellular locations, and these proteins are called multi-label proteins. They not only play a key role in cell life activities, but also play an indispensable role in medicine and drug development. This article first presents a new prediction model, MpsLDA-ProSVM, to predict the SCL of multi-label proteins. Firstly, the physical and chemical information, evolution information, sequence information and annotation information of protein sequences are fused. Then, for the first time, use a weighted multi-label linear discriminant analysis framework based on entropy weight form (wMLDAe) to refine and purify features, reduce the difficulty of learning. Finally, input the optimal feature subset into the multi-label learning with label-specific features (LIFT) and multi-label k-nearest neighbor (ML-KNN) algorithms to obtain a synthetic ranking of relevant labels, and then use Prediction and Relevance Ordering based SVM (ProSVM) classifier to predict the SCLs. This method can rank and classify related tags at the same time, which greatly improves the efficiency of the model. Tested by jackknife method, the overall actual accuracy (OAA) on virus, plant, Gram-positive bacteria and Gram-negative bacteria datasets are 98.06%, 98.97%, 99.81% and 98.49%, which are 0.56%-9.16%, 5.37%-30.87%, 3.51%-6.91% and 3.99%-8.59% higher than other advanced methods respectively. The source codes and datasets are available at https://github.com/QUST-AIBBDRC/MpsLDA-ProSVM/.


2020 ◽  
Author(s):  
Maxim Ivanov ◽  
Albin Sandelin ◽  
Sebastian Marquardt

Abstract Background: The quality of gene annotation determines the interpretation of results obtained in transcriptomic studies. The growing number of genome sequence information calls for experimental and computational pipelines for de novo transcriptome annotation. Ideally, gene and transcript models should be called from a limited set of key experimental data. Results: We developed TranscriptomeReconstructoR, an R package which implements a pipeline for automated transcriptome annotation. It relies on integrating features from independent and complementary datasets: i) full-length RNA-seq for detection of splicing patterns and ii) high-throughput 5' and 3' tag sequencing data for accurate definition of gene borders. The pipeline can also take a nascent RNA-seq dataset to supplement the called gene model with transient transcripts.We reconstructed de novo the transcriptional landscape of wild type Arabidopsis thaliana seedlings as a proof-of-principle. A comparison to the existing transcriptome annotations revealed that our gene model is more accurate and comprehensive than the two most commonly used community gene models, TAIR10 and Araport11. In particular, we identify thousands of transient transcripts missing from the existing annotations. Our new annotation promises to improve the quality of A.thaliana genome research.Conclusions: Our proof-of-concept data suggest a cost-efficient strategy for rapid and accurate annotation of complex eukaryotic transcriptomes. We combine the choice of library preparation methods and sequencing platforms with the dedicated computational pipeline implemented in the TranscriptomeReconstructoR package. The pipeline only requires prior knowledge on the reference genomic DNA sequence, but not the transcriptome. The package seamlessly integrates with Bioconductor packages for downstream analysis.


2020 ◽  
Vol 34 (05) ◽  
pp. 9282-9289
Author(s):  
Qingyang Wu ◽  
Lei Li ◽  
Hao Zhou ◽  
Ying Zeng ◽  
Zhou Yu

Many social media news writers are not professionally trained. Therefore, social media platforms have to hire professional editors to adjust amateur headlines to attract more readers. We propose to automate this headline editing process through neural network models to provide more immediate writing support for these social media news writers. To train such a neural headline editing model, we collected a dataset which contains articles with original headlines and professionally edited headlines. However, it is expensive to collect a large number of professionally edited headlines. To solve this low-resource problem, we design an encoder-decoder model which leverages large scale pre-trained language models. We further improve the pre-trained model's quality by introducing a headline generation task as an intermediate task before the headline editing task. Also, we propose Self Importance-Aware (SIA) loss to address the different levels of editing in the dataset by down-weighting the importance of easily classified tokens and sentences. With the help of Pre-training, Adaptation, and SIA, the model learns to generate headlines in the professional editor's style. Experimental results show that our method significantly improves the quality of headline editing comparing against previous methods.


10.14311/1121 ◽  
2009 ◽  
Vol 49 (2) ◽  
Author(s):  
M. Chvalina

This article analyses the existing possibilities for using Standard Statistical Methods and Artificial Intelligence Methods for a short-term forecast and simulation of demand in the field of telecommunications. The most widespread methods are based on Time Series Analysis. Nowadays, approaches based on Artificial Intelligence Methods, including Neural Networks, are booming. Separate approaches will be used in the study of Demand Modelling in Telecommunications, and the results of these models will be compared with actual guaranteed values. Then we will examine the quality of Neural Network models. 


2018 ◽  
Vol 111 (1) ◽  
pp. 97-112
Author(s):  
Tim vor der Brück

Abstract Rule-based natural language generation denotes the process of converting a semantic input structure into a surface representation by means of a grammar. In the following, we assume that this grammar is handcrafted and not automatically created for instance by a deep neural network. Such a grammar might comprise of a large set of rules. A single error in these rules can already have a large impact on the quality of the generated sentences, potentially causing even a complete failure of the entire generation process. Searching for errors in these rules can be quite tedious and time-consuming due to potentially complex and recursive dependencies. This work proposes a statistical approach to recognizing errors and providing suggestions for correcting certain kinds of errors by cross-checking the grammar with the semantic input structure. The basic assumption is the correctness of the latter, which is usually a valid hypothesis due to the fact that these input structures are often automatically created. Our evaluation reveals that in many cases an automatic error detection and correction is indeed possible.


2000 ◽  
Vol 14 (7) ◽  
pp. 559-564
Author(s):  
E A Gladkov ◽  
A V Maloletkov ◽  
R A Perkovskii ◽  
A I Gavrilov

Author(s):  
Arun G. Ingale

To predict the structure of protein from a primary amino acid sequence is computationally difficult. An investigation of the methods and algorithms used to predict protein structure and a thorough knowledge of the function and structure of proteins are critical for the advancement of biology and the life sciences as well as the development of better drugs, higher-yield crops, and even synthetic bio-fuels. To that end, this chapter sheds light on the methods used for protein structure prediction. This chapter covers the applications of modeled protein structures and unravels the relationship between pure sequence information and three-dimensional structure, which continues to be one of the greatest challenges in molecular biology. With this resource, it presents an all-encompassing examination of the problems, methods, tools, servers, databases, and applications of protein structure prediction, giving unique insight into the future applications of the modeled protein structures. In this chapter, current protein structure prediction methods are reviewed for a milieu on structure prediction, the prediction of structural fundamentals, tertiary structure prediction, and functional imminent. The basic ideas and advances of these directions are discussed in detail.


2019 ◽  
Vol 33 (9) ◽  
pp. 787-797 ◽  
Author(s):  
Zoltán Orgován ◽  
György G. Ferenczy ◽  
György M. Keserű

Abstract Stabilizing unique receptor conformations, allosteric modulators of G-protein coupled receptors (GPCRs) might open novel treatment options due to their new pharmacological action, their enhanced specificity and selectivity in both binding and signaling. Ligand binding occurs at intrahelical allosteric sites and involves significant induced fit effects that include conformational changes in the local protein environment and water networks. Based on the analysis of available crystal structures of metabotropic glutamate receptor 5 (mGlu5) we investigated these effects in the binding of mGlu5 receptor negative allosteric modulators. A large set of retrospective virtual screens revealed that the use of multiple protein structures and the inclusion of selected water molecules improves virtual screening performance compared to conventional docking strategies. The role of water molecules and protein flexibility in ligand binding can be taken into account efficiently by the proposed docking protocol that provided reasonable enrichment of true positives. This protocol is expected to be useful also for identifying intrahelical allosteric modulators for other GPCR targets.


1998 ◽  
Vol 12 (3) ◽  
pp. 215-219 ◽  
Author(s):  
E A Gladkov ◽  
A V Maloletkov ◽  
R A Perkovskii

Sign in / Sign up

Export Citation Format

Share Document