Prediction of Power Outage Quantity of Distribution Network Users under Typhoon Disaster Based on Random Forest and Important Variables

Mathematical Problems in Engineering ◽

10.1155/2021/6682242 ◽

2021 ◽

Vol 2021 ◽

pp. 1-14

Author(s):

Min Li ◽

Hui Hou ◽

Jufang Yu ◽

Hao Geng ◽

Ling Zhu ◽

...

Keyword(s):

Random Forest ◽

Explanatory Variable ◽

Model Simulation ◽

Distribution Network ◽

Important Variable ◽

Support Vector ◽

Variable Model ◽

Power Outage ◽

Global Variable ◽

Typhoon Disaster

Typhoons can have disastrous effects on power systems and may cause a large number of power outages for distribution network users. Therefore, this paper establishes a model to predict the power outage quantity of distribution network users under a typhoon disaster. Firstly, twenty-six explanatory variables (called global variables) covering meteorological factors, geographical factors, and power grid factors are considered as the input variables. On this basis, the correlation between each explanatory variable and the response variable is analyzed. Secondly, we establish a global variable model to predict the power outage quantity of distribution network users based on the Random Forest (RF) algorithm. Then the importance of each explanatory variable is mined to extract the most important variables. To reduce the complexity of the model and ease the burden of data collection, eight variables are eventually selected as important variables. Afterward, we predict the power outage quantity of distribution network users again using the eight important variables. Thirdly, we compare the prediction accuracy of a previously used model called the No-model, Linear Regression (LR), Support Vector Regression (SVR), Decision Tree Regression (DTR), the RF-global variable model, and the RF-important variable model. Simulation results show that the RF-important variable model proposed in this paper performs better. Since fewer variables save prediction time and simplify the model, the RF-important variable model is recommended.
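
The two-stage procedure described in this abstract (fit a Random Forest on all 26 variables, rank them by importance, refit on the top eight) might be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the use of scikit-learn's impurity-based `feature_importances_` is an assumption (the abstract does not say which importance measure the authors used), and all variable names are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (assumption): 26 explanatory variables, of which
# only the first three actually drive the response.
rng = np.random.default_rng(0)
n_samples, n_vars, n_keep = 400, 26, 8
X = rng.normal(size=(n_samples, n_vars))
y = (5.0 * X[:, 0] + 3.0 * X[:, 1] + 2.0 * X[:, 2]
     + rng.normal(scale=0.1, size=n_samples))

# Stage 1: "global variable" model fitted on all 26 variables.
global_model = RandomForestRegressor(n_estimators=200, random_state=0)
global_model.fit(X, y)

# Rank variables by impurity-based importance and keep the top eight.
ranking = np.argsort(global_model.feature_importances_)[::-1]
important = sorted(ranking[:n_keep].tolist())

# Stage 2: "important variable" model refitted on the reduced input set.
reduced_model = RandomForestRegressor(n_estimators=200, random_state=0)
reduced_model.fit(X[:, important], y)

print("selected variable indices:", important)
```

In practice the two models would be compared on held-out data rather than on the training set; that step is omitted here for brevity.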


A Machine Learning Approach to Predict Customer Usage of a Home Workout Platform

Applied Sciences ◽

10.3390/app11219927 ◽

2021 ◽

Vol 11 (21) ◽

pp. 9927

Author(s):

Qiuying Chen ◽

SangJoon Lee

Keyword(s):

Machine Learning ◽

Random Forest ◽

Nearest Neighbor ◽

Important Variable ◽

Receiver Operating Curve ◽

Support Vector ◽

Learning Approach ◽

K Nearest Neighbor ◽

Machine Learning Approach ◽

Supervised Learning Algorithms

Health authorities have recommended the use of digital tools for home workouts to stay active and healthy during the COVID-19 pandemic. In this paper, a machine learning approach is proposed to assess the activity of users on a home workout platform. Keep is a home workout application dedicated to providing one-stop exercise solutions such as fitness teaching, cycling, running, yoga, and fitness diet guidance. We used a data crawler to collect the total training set data of 7734 Keep users and compared four supervised learning algorithms: support vector machine, k-nearest neighbor, random forest, and logistic regression. The receiver operating curve analysis indicated that the overall discrimination verification power of random forest was better than that of the other three models. The random forest model was used to classify 850 test samples, and a correct rate of 88% was obtained. This approach can predict the continuous usage of users after installing the home workout application. We considered 18 variables on Keep that were expected to affect the determination of continuous participation. Keep certification is the most important variable that affected the results of this study. Keep certification refers to someone who has verified their identity information and can, therefore, obtain the Keep certification logo. The results show that the platform still needs to be improved in terms of real identity privacy information and other aspects.
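
The abstract compares classifiers by receiver operating curve analysis. A common single-number summary of such a curve is the area under it (AUC), which equals the probability that a randomly chosen positive sample is scored above a randomly chosen negative one. The sketch below computes that quantity directly; the labels and scores are hypothetical and are not taken from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical scores from two classifiers on the same six users
# (1 = continued using the app, 0 = stopped).
labels = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
model_b = [0.6, 0.4, 0.3, 0.7, 0.5, 0.2]

auc_a = roc_auc(labels, model_a)
auc_b = roc_auc(labels, model_b)
print(f"AUC model A = {auc_a:.3f}, AUC model B = {auc_b:.3f}")
```

A model with the larger AUC separates the two classes better across all decision thresholds, which is the sense in which one model's discrimination power can be called better than another's.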


Estimating the Compressive Strength of Cement-Based Materials with Mining Waste Using Support Vector Machine, Decision Tree, and Random Forest Models

Advances in Civil Engineering ◽

10.1155/2021/6629466 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Hongxia Ma ◽

Jiandong Liu ◽

Jia Zhang ◽

Jiandong Huang

Keyword(s):

Support Vector Machine ◽

Compressive Strength ◽

Random Forest ◽

Decision Tree ◽

Experimental Studies ◽

Fine Sand ◽

Important Variable ◽

Mining Waste ◽

Support Vector ◽

Cement Based Materials

To estimate the compressive strength of cement-based materials with mining waste, a dataset based on a series of experimental studies was constructed. The support vector machine (SVM), decision tree (DT), and random forest (RF) models were developed and compared. The beetle antennae search (BAS) algorithm was employed to tune the hyperparameters of the developed machine learning models. The predictive performances of the three models were compared using the correlation coefficient (R) and the root mean square error (RMSE). The results showed that the BAS algorithm can effectively tune these artificial intelligence models. The SVM model can obtain the minimum RMSE, while the BAS algorithm is inefficient in DT and RF models. The SVM, DT, and RF models can effectively and accurately predict the compressive strength of cement-based materials that use solid mining waste as aggregate, with high R values and low RMSE values. The RF algorithm can obtain the highest value of R and the lowest value of RMSE, demonstrating the highest accuracy. The solid mining waste to cement ratio is the most important variable affecting the compressive strength. Curing time was also an important parameter in the compressive strength of cemented materials, followed by the water-solid ratio of mining waste and fine sand ratio.
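
The two evaluation metrics named in this abstract, the correlation coefficient (R) and the root mean square error (RMSE), can be computed as in the sketch below. The strength values are hypothetical and chosen only to show that the two metrics measure different things: predictions with a constant offset are perfectly correlated with the measurements (R = 1) yet still have a nonzero RMSE.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_t = sum(y_true) / len(y_true)
    mean_p = sum(y_pred) / len(y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical measured vs. predicted compressive strengths (MPa).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]  # constant +0.5 offset

print("R    =", pearson_r(measured, predicted))
print("RMSE =", rmse(measured, predicted))
```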


Investigating the use of random forest, gradient boosting machine, support vector machine and their ensemble applied to fault detection

10.26678/abcm.cobem2017.cob17-1600 ◽

2017 ◽

Author(s):

Luis Felipe Nogoseke ◽

Gabriel Herman Bernardim Andrade ◽

Marco Boaretto ◽

Leandro Coelho

Keyword(s):

Support Vector Machine ◽

Random Forest ◽

Fault Detection ◽

Gradient Boosting ◽

Support Vector ◽

Gradient Boosting Machine


Extraction of Arecanut Planting Distribution Based on the Feature Space Optimization of PlanetScope Imagery

Agriculture ◽

10.3390/agriculture11040371 ◽

2021 ◽

Vol 11 (4) ◽

pp. 371

Author(s):

Yu Jin ◽

Jiawei Guo ◽

Huichun Ye ◽

Jinling Zhao ◽

Wenjiang Huang ◽

...

Keyword(s):

Random Forest ◽

Satellite Imagery ◽

Feature Space ◽

Kappa Coefficient ◽

Classification Model ◽

Support Vector ◽

Textural Feature ◽

Monitoring Accuracy ◽

Areca Catechu ◽

High Level

The remote sensing extraction of large areas of arecanut (Areca catechu L.) planting plays an important role in investigating the distribution of arecanut planting area and the subsequent adjustment and optimization of regional planting structures. Satellite imagery has previously been used to investigate and monitor the agricultural and forestry vegetation in Hainan. However, the monitoring accuracy is affected by the cloudy and rainy climate of this region, as well as the high level of land fragmentation. In this paper, we used PlanetScope imagery at a 3 m spatial resolution over the Hainan arecanut planting area to investigate the high-precision extraction of the arecanut planting distribution based on feature space optimization. First, spectral and textural feature variables were selected to form the initial feature space, followed by the implementation of the random forest algorithm to optimize the feature space. Arecanut planting area extraction models based on the support vector machine (SVM), BP neural network (BPNN), and random forest (RF) classification algorithms were then constructed. The overall classification accuracies of the SVM, BPNN, and RF models optimized by the RF features were determined as 74.82%, 83.67%, and 88.30%, with Kappa coefficients of 0.680, 0.795, and 0.853, respectively. The RF model with optimized features exhibited the highest overall classification accuracy and kappa coefficient. The overall accuracy of the SVM, BPNN, and RF models following feature optimization was improved by 3.90%, 7.77%, and 7.45%, respectively, compared with the corresponding unoptimized classification model. The kappa coefficient also improved. The results demonstrate the ability of PlanetScope satellite imagery to extract the planting distribution of arecanut. Furthermore, the RF is proven to effectively optimize the initial feature space, composed of spectral and textural feature variables, further improving the extraction accuracy of the arecanut planting distribution. This work can act as a theoretical and technical reference for the agricultural and forestry industries.
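
The abstract reports both overall classification accuracy and the Kappa coefficient. As a reminder of how the two relate, the sketch below derives both from a confusion matrix: overall accuracy is the observed agreement, and Cohen's kappa rescales it by the agreement expected by chance. The matrix is hypothetical and is not the paper's data.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    total = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(len(confusion))) / total
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected if predictions were independent of the true class.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class result: arecanut vs. non-arecanut pixels.
matrix = [
    [45, 5],   # true arecanut: 45 correct, 5 missed
    [10, 40],  # true non-arecanut: 10 false alarms, 40 correct
]
overall_accuracy, kappa = accuracy_and_kappa(matrix)
print(f"overall accuracy = {overall_accuracy:.2f}, kappa = {kappa:.2f}")
```

Because kappa discounts chance agreement, it is always at or below the overall accuracy, which is consistent with the pairs of values quoted in the abstract.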


The transferability of random forest and support vector machine for estimating daily global solar radiation using sunshine duration over different climate zones

Theoretical and Applied Climatology ◽

10.1007/s00704-021-03726-6 ◽

2021 ◽

Author(s):

Wei Wu ◽

Mao-Fen Li ◽

Xia Xu ◽

Xiao-Ping Tang ◽

Chao Yang ◽

...

Keyword(s):

Support Vector Machine ◽

Random Forest ◽

Solar Radiation ◽

Sunshine Duration ◽

Global Solar Radiation ◽

Support Vector ◽

Climate Zones


Machine Learning-Based Prediction of Air Quality

Applied Sciences ◽

10.3390/app10249151 ◽

2020 ◽

Vol 10 (24) ◽

pp. 9151

Author(s):

Yun-Chia Liang ◽

Yona Maimury ◽

Angela Hsiang-Ling Chen ◽

Josue Rodolfo Cuevas Juarez

Keyword(s):

Machine Learning ◽

Air Quality ◽

Random Forest ◽

Prediction Models ◽

Superior Performance ◽

Support Vector ◽

Economic Activities ◽

Adaptive Boosting ◽

Series Of Experiments ◽

Artificial Neural Network (ANN)

Air, an essential natural resource, has been compromised in terms of quality by economic activities. Considerable research has been devoted to predicting instances of poor air quality, but most studies are limited by insufficient longitudinal data, making it difficult to account for seasonal and other factors. Several prediction models have been developed using an 11-year dataset collected by Taiwan’s Environmental Protection Administration (EPA). Machine learning methods, including adaptive boosting (AdaBoost), artificial neural network (ANN), random forest, stacking ensemble, and support vector machine (SVM), produce promising results for air quality index (AQI) level predictions. A series of experiments, using datasets for three different regions to obtain the best prediction performance from the stacking ensemble, AdaBoost, and random forest, found that the stacking ensemble delivers consistently superior performance for R² and RMSE, while AdaBoost provides the best results for MAE.


BioSignal modelling for prediction of cardiac diseases using intra group selection method

Intelligent Decision Technologies ◽

10.3233/idt-200058 ◽

2021 ◽

Vol 15 (1) ◽

pp. 151-160

Author(s):

Hemant P. Kasturiwale ◽

Sujata N. Kale

Keyword(s):

Machine Learning ◽

Nervous System ◽

Random Forest ◽

Normal Sinus Rhythm ◽

Heart Defects ◽

Support Vector ◽

Autonomous Nervous System ◽

Cardiac Diseases ◽

Vast Number ◽

Proposed Model

The Autonomous Nervous System (ANS) controls the nervous system, and Heart Rate Variability (HRV) can be used as a diagnostic tool to diagnose heart defects. HRV can be classified into linear and nonlinear HRV indices, which are used mostly to measure the efficiency of the model. For prediction of cardiac diseases, the feature selection and extraction of a machine learning model are effective. The models available to date are based on HRV indices to predict cardiac diseases accurately, but they shed little light on the specifics of the indices, the selection process, and the stability of the model. The proposed model is developed considering all facets: electrocardiogram (ECG) amplitude, frequency components, sampling frequency, extraction methods, and acquisition techniques. The machine learning based model and its performance are tested using the standard BioSignal method, both on the data available and on the data obtained by the author. This is a unique model developed by considering a vast number of mixture sets and more than four complex cardiac classes. The statistical analysis is performed on a variety of databases such as MIT/BIH Normal Sinus Rhythm (NSR), MIT/BIH Arrhythmia (AR), MIT/BIH Atrial Fibrillation (AF), and Peripheral Pulse Analyser using feature compatibility techniques. The classifiers are trained for prediction with approximately 40000 sets of parameters. The proposed model reaches an average accuracy of 97.87 percent and is sensitive and precise. The best features are chosen from the different HRV features that will be used for classification. The present model was checked under all possible subject scenarios, such as the raw database and the non-ECG signal. In this sense, robustness is defined not only by the specificity parameter, but also by other measuring output parameters. Support Vector Machine (SVM), K-nearest Neighbour (KNN), and Ensemble Adaboost (EAB), along with Random Forest (RF), are tested in a 5% higher precision band and a lower band configuration. The Random Forest has produced better results, and its robustness has been established.


A Methodology Based on FT-IR Data Combined with Random Forest Model to Generate Spectralprints for the Characterization of High-Quality Vinegars

Foods ◽

10.3390/foods10061411 ◽

2021 ◽

Vol 10 (6) ◽

pp. 1411

Author(s):

José Luis P. Calle ◽

Marta Ferreiro-González ◽

Ana Ruiz-Rodríguez ◽

Gerardo F. Barbero ◽

José Á. Álvarez ◽

...

Keyword(s):

Random Forest ◽

Raw Materials ◽

Principal Component ◽

Hierarchical Cluster ◽

Raw Material ◽

Support Vector ◽

Protected Designation Of Origin ◽

FT-IR

Sherry wine vinegar is a Spanish gourmet product under Protected Designation of Origin (PDO). Before a vinegar can be labeled as Sherry vinegar, the product must meet certain requirements as established by its PDO, which, in this case, means that it has been produced following the traditional solera and criadera ageing system. The quality of the vinegar is determined by many factors such as the raw material, the acetification process or the aging system. For this reason, mainly producers, but also consumers, would benefit from the employment of effective analytical tools that allow precisely determining the origin and quality of vinegar. In the present study, a total of 48 Sherry vinegar samples manufactured from three different starting wines (Palomino Fino, Moscatel, and Pedro Ximénez wine) were analyzed by Fourier-transform infrared (FT-IR) spectroscopy. The spectroscopic data were combined with unsupervised exploratory techniques such as hierarchical cluster analysis (HCA) and principal component analysis (PCA), as well as other nonparametric supervised techniques, namely, support vector machine (SVM) and random forest (RF), for the characterization of the samples. The HCA and PCA results present a clear grouping trend of the vinegar samples according to their raw materials. SVM in combination with leave-one-out cross-validation (LOOCV) successfully classified 100% of the samples, according to the type of wine used for their production. The RF method allowed selecting the most important variables to develop the characteristic fingerprint (“spectralprint”) of the vinegar samples according to their starting wine. Furthermore, the RF model reached 100% accuracy for both LOOCV and out-of-bag (OOB) sets.
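
Leave-one-out cross-validation (LOOCV), used in this abstract to evaluate the classifiers, holds out each sample in turn, trains on the rest, and checks the prediction for the held-out sample. The sketch below illustrates the loop only. It swaps in a simple 1-nearest-neighbour classifier in place of the SVM and RF models used in the paper, and the two-dimensional feature values are hypothetical stand-ins for spectral features.

```python
import math


def nearest_neighbor_predict(train_x, train_y, query):
    """Label of the training sample closest to `query` (Euclidean distance)."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


def loocv_accuracy(samples, labels):
    """Fraction of samples classified correctly when each is held out once."""
    correct = 0
    for i in range(len(samples)):
        # Train on everything except sample i, then test on sample i.
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if nearest_neighbor_predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


# Hypothetical 2-D features for nine vinegar samples, three per starting wine.
samples = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0),        # Palomino Fino
    (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),  # Moscatel
    (20.0, 0.0), (20.0, 1.0), (21.0, 0.0),     # Pedro Ximénez
]
labels = ["Palomino Fino"] * 3 + ["Moscatel"] * 3 + ["Pedro Ximénez"] * 3

print("LOOCV accuracy:", loocv_accuracy(samples, labels))
```

LOOCV suits small datasets such as the 48 samples described above because every sample is used for both training and testing, at the cost of fitting the model once per sample.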


Feasibility study of typhoon disaster economic loss assessment based on random forest

IOP Conference Series: Earth and Environmental Science ◽

10.1088/1755-1315/546/3/032004 ◽

2020 ◽

Vol 546 ◽

pp. 032004

Author(s):

Fei Wang

Keyword(s):

Random Forest ◽

Feasibility Study ◽

Economic Loss ◽

Loss Assessment ◽

Typhoon Disaster


Prostate Cancer Classification Using Random Forest and Support Vector Machines

Journal of Physics: Conference Series ◽

10.1088/1742-6596/1752/1/012043 ◽

2021 ◽

Vol 1752 (1) ◽

pp. 012043

Author(s):

Z Rustam ◽

N Angie

Keyword(s):

Prostate Cancer ◽

Support Vector Machines ◽

Random Forest ◽

Cancer Classification ◽

Support Vector ◽

Vector Machines ◽

Prostate Cancer Classification
