scholarly journals iEnhancer-MFGBDT: Identifying enhancers and their strength by fusing multiple features and gradient boosting decision tree

2021 ◽  
Vol 18 (6) ◽  
pp. 8797-8814
Author(s):  
Yunyun Liang ◽  
◽  
Shengli Zhang ◽  
Huijuan Qiao ◽  
Yinan Cheng ◽  
...  

<abstract> <p>Enhancer is a non-coding DNA fragment that can be bound with proteins to activate transcription of a gene, hence play an important role in regulating gene expression. Enhancer identification is very challenging and more complicated than other genetic factors due to their position variation and free scattering. In addition, it has been proved that genetic variation in enhancers is related to human diseases. Therefore, identification of enhancers and their strength has important biological meaning. In this paper, a novel model named iEnhancer-MFGBDT is developed to identify enhancer and their strength by fusing multiple features and gradient boosting decision tree (GBDT). Multiple features include k-mer and reverse complement k-mer nucleotide composition based on DNA sequence, and second-order moving average, normalized Moreau-Broto auto-cross correlation and Moran auto-cross correlation based on dinucleotide physical structural property matrix. Then we use GBDT to select features and perform classification successively. The accuracies reach 78.67% and 66.04% for identifying enhancers and their strength on the benchmark dataset, respectively. Compared with other models, the results show that our model is useful and effective intelligent tool to identify enhancers and their strength, of which the datasets and source codes are available at https://github.com/shengli0201/iEnhancer-MFGBDT1.</p> </abstract>

Algorithms ◽  
2021 ◽  
Vol 14 (6) ◽  
pp. 179
Author(s):  
Chunxia Wang ◽  
Jun Bi ◽  
Qiuyue Sai ◽  
Zun Yuan

With the development of the sharing economy, carsharing is a major achievement in the current mode of transportation in sharing economies. Carsharing can effectively alleviate traffic congestion and reduce the travel cost of residents. However, due to the randomness of users’ travel demand, carsharing operators are faced with problems, such as imbalance in vehicle demand at stations. Therefore, scientific prediction of users’ travel demand is important to ensure the efficient operation of carsharing. The main purpose of this study is to use gradient boosting decision tree to predict the travel demand of station-based carsharing users. The case study is conducted in Lanzhou City, Gansu Province, China. To improve the accuracy, gradient boosting decision tree is designed to predict the demands of users at different stations at various times based on the actual operating data of carsharing. The prediction results are compared with results of the autoregressive integrated moving average. The conclusion shows that gradient boosting decision tree has higher prediction accuracy. This study can provide a reference value for user demand prediction in practical application.


2021 ◽  
Vol 11 (4) ◽  
pp. 1378
Author(s):  
Seung Hyun Lee ◽  
Jaeho Son

It has been pointed out that the act of carrying a heavy object that exceeds a certain weight by a worker at a construction site is a major factor that puts physical burden on the worker’s musculoskeletal system. However, due to the nature of the construction site, where there are a large number of workers simultaneously working in an irregular space, it is difficult to figure out the weight of the object carried by the worker in real time or keep track of the worker who carries the excess weight. This paper proposes a prototype system to track the weight of heavy objects carried by construction workers by developing smart safety shoes with FSR (Force Sensitive Resistor) sensors. The system consists of smart safety shoes with sensors attached, a mobile device for collecting initial sensing data, and a web-based server computer for storing, preprocessing and analyzing such data. The effectiveness and accuracy of the weight tracking system was verified through the experiments where a weight was lifted by each experimenter from +0 kg to +20 kg in 5 kg increments. The results of the experiment were analyzed by a newly developed machine learning based model, which adopts effective classification algorithms such as decision tree, random forest, gradient boosting algorithm (GBM), and light GBM. The average accuracy classifying the weight by each classification algorithm showed similar, but high accuracy in the following order: random forest (90.9%), light GBM (90.5%), decision tree (90.3%), and GBM (89%). Overall, the proposed weight tracking system has a significant 90.2% average accuracy in classifying how much weight each experimenter carries.


2011 ◽  
Vol 243-249 ◽  
pp. 6292-6295 ◽  
Author(s):  
Rong Yau Huang ◽  
Li Hsu Yeh ◽  
Hao Hsien Chen ◽  
Jyh Dong Lin ◽  
Ping Fu Chen ◽  
...  

This study examines construction waste generation and management in Taiwan. We verify the factors probable affecting the output of construction wastes by using data for the output of declared construction wastes produced from demolition projects in Taiwan in the last year, expert interviews, and research achievements in the past, and find “ on-site separation” is the factor with effects on the output of construction wastes via cross-correlation by algorithms such as K-Means and Decision Tree C5.0. It can be seen that the output (0.092(t/M3) with on-site separation or 0.329(t/M3) without on-site separation is highly correlated with the composition ratio of construction wastes and referred to as a valid conclusion.


2019 ◽  
Vol 15 (2) ◽  
pp. 201-214 ◽  
Author(s):  
Mahmoud Elish

Purpose Effective and efficient software security inspection is crucial as the existence of vulnerabilities represents severe risks to software users. The purpose of this paper is to empirically evaluate the potential application of Stochastic Gradient Boosting Trees (SGBT) as a novel model for enhanced prediction of vulnerable Web components compared to common, popular and recent machine learning models. Design/methodology/approach An empirical study was conducted where the SGBT and 16 other prediction models have been trained, optimized and cross validated using vulnerability data sets from multiple versions of two open-source Web applications written in PHP. The prediction performance of these models have been evaluated and compared based on accuracy, precision, recall and F-measure. Findings The results indicate that the SGBT models offer improved prediction over the other 16 models and thus are more effective and reliable in predicting vulnerable Web components. Originality/value This paper proposed a novel application of SGBT for enhanced prediction of vulnerable Web components and showed its effectiveness.


Sign in / Sign up

Export Citation Format

Share Document