scholarly journals 4mCpred-EL: An Ensemble Learning Framework for Identification of DNA N4-methylcytosine Sites in the Mouse Genome

Cells ◽  
2019 ◽  
Vol 8 (11) ◽  
pp. 1332 ◽  
Author(s):  
Manavalan ◽  
Basith ◽  
Shin ◽  
Lee ◽  
Wei ◽  
...  

DNA N4-methylcytosine (4mC) is one of the key epigenetic alterations, playing essential roles in DNA replication, differentiation, cell cycle, and gene expression. To better understand 4mC biological functions, it is crucial to gain knowledge on its genomic distribution. In recent times, few computational studies, in particular machine learning (ML) approaches have been applied in the prediction of 4mC site predictions. Although ML-based methods are promising for 4mC identification in other species, none are available for detecting 4mCs in the mouse genome. Our novel computational approach, called 4mCpred-EL, is the first method for identifying 4mC sites in the mouse genome where four different ML algorithms with a wide range of seven feature encodings are utilized. Subsequently, those feature encodings predicted probabilistic values are used as a feature vector and are once again inputted to ML algorithms, whose corresponding models are integrated into ensemble learning. Our benchmarking results demonstrated that 4mCpred-EL achieved an accuracy and MCC values of 0.795 and 0.591, which significantly outperformed seven other classifiers by more than 1.5–5.9% and 3.2–11.7%, respectively. Additionally, 4mCpred-EL attained an overall accuracy of 79.80%, which is 1.8–5.1% higher than that yielded by seven other classifiers in the independent evaluation. We provided a user-friendly web server, namely 4mCpred-EL which could be implemented as a pre-screening tool for the identification of potential 4mC sites in the mouse genome.

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Yanjuan Li ◽  
Zhengnan Zhao ◽  
Zhixia Teng

As one of important epigenetic modifications, DNA N4-methylcytosine (4mC) plays a crucial role in controlling gene replication, expression, cell cycle, DNA replication, and differentiation. The accurate identification of 4mC sites is necessary to understand biological functions. In the paper, we use ensemble learning to develop a model named i4mC-EL to identify 4mC sites in the mouse genome. Firstly, a multifeature encoding scheme consisting of Kmer and EIIP was adopted to describe the DNA sequences. Secondly, on the basis of the multifeature encoding scheme, we developed a stacked ensemble model, in which four machine learning algorithms, namely, BayesNet, NaiveBayes, LibSVM, and Voted Perceptron, were utilized to implement an ensemble of base classifiers that produce intermediate results as input of the metaclassifier, Logistic. The experimental results on the independent test dataset demonstrate that the overall rate of predictive accurate of i4mC-EL is 82.19%, which is better than the existing methods. The user-friendly website implementing i4mC-EL can be accessed freely at the following.


2011 ◽  
Vol 2011 ◽  
pp. 1-12 ◽  
Author(s):  
Karim El-Laithy ◽  
Martin Bogdan

An integration of both the Hebbian-based and reinforcement learning (RL) rules is presented for dynamic synapses. The proposed framework permits the Hebbian rule to update the hidden synaptic model parameters regulating the synaptic response rather than the synaptic weights. This is performed using both the value and the sign of the temporal difference in the reward signal after each trial. Applying this framework, a spiking network with spike-timing-dependent synapses is tested to learn the exclusive-OR computation on a temporally coded basis. Reward values are calculated with the distance between the output spike train of the network and a reference target one. Results show that the network is able to capture the required dynamics and that the proposed framework can reveal indeed an integrated version of Hebbian and RL. The proposed framework is tractable and less computationally expensive. The framework is applicable to a wide class of synaptic models and is not restricted to the used neural representation. This generality, along with the reported results, supports adopting the introduced approach to benefit from the biologically plausible synaptic models in a wide range of intuitive signal processing.


Synlett ◽  
2020 ◽  
Author(s):  
Akira Yada ◽  
Kazuhiko Sato ◽  
Tarojiro Matsumura ◽  
Yasunobu Ando ◽  
Kenji Nagata ◽  
...  

AbstractThe prediction of the initial reaction rate in the tungsten-catalyzed epoxidation of alkenes by using a machine learning approach is demonstrated. The ensemble learning framework used in this study consists of random sampling with replacement from the training dataset, the construction of several predictive models (weak learners), and the combination of their outputs. This approach enables us to obtain a reasonable prediction model that avoids the problem of overfitting, even when analyzing a small dataset.


2021 ◽  
Vol 12 (6) ◽  
pp. 1-23
Author(s):  
Shuo Tao ◽  
Jingang Jiang ◽  
Defu Lian ◽  
Kai Zheng ◽  
Enhong Chen

Mobility prediction plays an important role in a wide range of location-based applications and services. However, there are three problems in the existing literature: (1) explicit high-order interactions of spatio-temporal features are not systemically modeled; (2) most existing algorithms place attention mechanisms on top of recurrent network, so they can not allow for full parallelism and are inferior to self-attention for capturing long-range dependence; (3) most literature does not make good use of long-term historical information and do not effectively model the long-term periodicity of users. To this end, we propose MoveNet and RLMoveNet. MoveNet is a self-attention-based sequential model, predicting each user’s next destination based on her most recent visits and historical trajectory. MoveNet first introduces a cross-based learning framework for modeling feature interactions. With self-attention on both the most recent visits and historical trajectory, MoveNet can use an attention mechanism to capture the user’s long-term regularity in a more efficient way. Based on MoveNet, to model long-term periodicity more effectively, we add the reinforcement learning layer and named RLMoveNet. RLMoveNet regards the human mobility prediction as a reinforcement learning problem, using the reinforcement learning layer as the regularization part to drive the model to pay attention to the behavior with periodic actions, which can help us make the algorithm more effective. We evaluate both of them with three real-world mobility datasets. MoveNet outperforms the state-of-the-art mobility predictor by around 10% in terms of accuracy, and simultaneously achieves faster convergence and over 4x training speedup. Moreover, RLMoveNet achieves higher prediction accuracy than MoveNet, which proves that modeling periodicity explicitly from the perspective of reinforcement learning is more effective.


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 3675-3693 ◽  
Author(s):  
Salman Salloum ◽  
Joshua Zhexue Huang ◽  
Yulin He ◽  
Xiaojun Chen

2012 ◽  
Vol 522 ◽  
pp. 823-827
Author(s):  
Jian Jiang Fang ◽  
Wen Jun Qi

The gear drive is the wide range of applications and is particularly important as a form of mechanical transmission, but the design process requires large amounts of data access and computation. In the paper, computer integrated technology and object-oriented technology is used to research and develop the intelligent design of Straight gear reducer system with user-friendly interactive platform, easy to use, high design efficiency and reliable data.


Sign in / Sign up

Export Citation Format

Share Document