scholarly journals Energy Consumption Load Forecasting Using a Level-Based Random Forest Classifier

Symmetry ◽  
2019 ◽  
Vol 11 (8) ◽  
pp. 956 ◽  
Author(s):  
Yu-Tung Chen ◽  
Eduardo Piedad ◽  
Cheng-Chien Kuo

Energy consumers may not know whether their next-hour forecasted load is either high or low based on the actual value predicted from their historical data. A conventional method of level prediction with a pattern recognition approach was performed by first predicting the actual numerical values using typical pattern-based regression models, hen classifying them into pattern levels (e.g., low, average, and high). A proposed prediction with pattern recognition scheme was developed to directly predict the desired levels using simpler classifier models without undergoing regression. The proposed pattern recognition classifier was compared to its regression method using a similar algorithm applied to a real-world energy dataset. A random forest (RF) algorithm which outperformed other widely used machine learning (ML) techniques in previous research was used in both methods. Both schemes used similar parameters for training and testing simulations. After 10-time cross training validation and five averaged repeated runs with random permutation per data splitting, the proposed classifier shows better computation speed and higher classification accuracy than the conventional method. However, when the number of its desired levels increases, its prediction accuracy seems to decrease and approaches the accuracy of the conventional method. The developed energy level prediction, which is computationally inexpensive and has a good classification performance, can serve as an alternative forecasting scheme.

Sensors ◽  
2021 ◽  
Vol 21 (18) ◽  
pp. 6020
Author(s):  
Katarzyna Harezlak ◽  
Michal Blasiak ◽  
Pawel Kasprowski

The paper presents studies on biometric identification methods based on the eye movement signal. New signal features were investigated for this purpose. They included its representation in the frequency domain and the largest Lyapunov exponent, which characterizes the dynamics of the eye movement signal seen as a nonlinear time series. These features, along with the velocities and accelerations used in the previously conducted works, were determined for 100-ms eye movement segments. 24 participants took part in the experiment, composed of two sessions. The users’ task was to observe a point appearing on the screen in 29 locations. The eye movement recordings for each point were used to create a feature vector in two variants: one vector for one point and one vector including signal for three consecutive locations. Two approaches for defining the training and test sets were applied. In the first one, 75% of randomly selected vectors were used as the training set, under a condition of equal proportions for each participant in both sets and the disjointness of the training and test sets. Among four classifiers: kNN (k = 5), decision tree, naïve Bayes, and random forest, good classification performance was obtained for decision tree and random forest. The efficiency of the last method reached 100%. The outcomes were much worse in the second scenario when the training and testing sets when defined based on recordings from different sessions; the possible reasons are discussed in the paper.


Author(s):  
Yuejun Liu ◽  
Yifei Xu ◽  
Xiangzheng Meng ◽  
Xuguang Wang ◽  
Tianxu Bai

Background: Medical imaging plays an important role in the diagnosis of thyroid diseases. In the field of machine learning, multiple dimensional deep learning algorithms are widely used in image classification and recognition, and have achieved great success. Objective: The method based on multiple dimensional deep learning is employed for the auxiliary diagnosis of thyroid diseases based on SPECT images. The performances of different deep learning models are evaluated and compared. Methods: Thyroid SPECT images are collected with three types, they are hyperthyroidism, normal and hypothyroidism. In the pre-processing, the region of interest of thyroid is segmented and the amount of data sample is expanded. Four CNN models, including CNN, Inception, VGG16 and RNN, are used to evaluate deep learning methods. Results: Deep learning based methods have good classification performance, the accuracy is 92.9%-96.2%, AUC is 97.8%-99.6%. VGG16 model has the best performance, the accuracy is 96.2% and AUC is 99.6%. Especially, the VGG16 model with a changing learning rate works best. Conclusion: The standard CNN, Inception, VGG16, and RNN four deep learning models are efficient for the classification of thyroid diseases with SPECT images. The accuracy of the assisted diagnostic method based on deep learning is higher than that of other methods reported in the literature.


Author(s):  
Farrikh Alzami ◽  
Erika Devi Udayanti ◽  
Dwi Puji Prabowo ◽  
Rama Aria Megantara

Sentiment analysis in terms of polarity classification is very important in everyday life, with the existence of polarity, many people can find out whether the respected document has positive or negative sentiment so that it can help in choosing and making decisions. Sentiment analysis usually done manually. Therefore, an automatic sentiment analysis classification process is needed. However, it is rare to find studies that discuss extraction features and which learning models are suitable for unstructured sentiment analysis types with the Amazon food review case. This research explores some extraction features such as Word Bags, TF-IDF, Word2Vector, as well as a combination of TF-IDF and Word2Vector with several machine learning models such as Random Forest, SVM, KNN and Naïve Bayes to find out a combination of feature extraction and learning models that can help add variety to the analysis of polarity sentiments. By assisting with document preparation such as html tags and punctuation and special characters, using snowball stemming, TF-IDF results obtained with SVM are suitable for obtaining a polarity classification in unstructured sentiment analysis for the case of Amazon food review with a performance result of 87,3 percent.


Energies ◽  
2021 ◽  
Vol 14 (7) ◽  
pp. 1809
Author(s):  
Mohammed El Amine Senoussaoui ◽  
Mostefa Brahami ◽  
Issouf Fofana

Machine learning is widely used as a panacea in many engineering applications including the condition assessment of power transformers. Most statistics attribute the main cause of transformer failure to insulation degradation. Thus, a new, simple, and effective machine-learning approach was proposed to monitor the condition of transformer oils based on some aging indicators. The proposed approach was used to compare the performance of two machine-learning classifiers: J48 decision tree and random forest. The service-aged transformer oils were classified into four groups: the oils that can be maintained in service, the oils that should be reconditioned or filtered, the oils that should be reclaimed, and the oils that must be discarded. From the two algorithms, random forest exhibited a better performance and high accuracy with only a small amount of data. Good performance was achieved through not only the application of the proposed algorithm but also the approach of data preprocessing. Before feeding the classification model, the available data were transformed using the simple k-means method. Subsequently, the obtained data were filtered through correlation-based feature selection (CFsSubset). The resulting features were again retransformed by conducting the principal component analysis and were passed through the CFsSubset filter. The transformation and filtration of the data improved the classification performance of the adopted algorithms, especially random forest. Another advantage of the proposed method is the decrease in the number of the datasets required for the condition assessment of transformer oils, which is valuable for transformer condition monitoring.


2012 ◽  
Vol 8 (2) ◽  
pp. 44-63 ◽  
Author(s):  
Baoxun Xu ◽  
Joshua Zhexue Huang ◽  
Graham Williams ◽  
Qiang Wang ◽  
Yunming Ye

The selection of feature subspaces for growing decision trees is a key step in building random forest models. However, the common approach using randomly sampling a few features in the subspace is not suitable for high dimensional data consisting of thousands of features, because such data often contains many features which are uninformative to classification, and the random sampling often doesn’t include informative features in the selected subspaces. Consequently, classification performance of the random forest model is significantly affected. In this paper, the authors propose an improved random forest method which uses a novel feature weighting method for subspace selection and therefore enhances classification performance over high-dimensional data. A series of experiments on 9 real life high dimensional datasets demonstrated that using a subspace size of features where M is the total number of features in the dataset, our random forest model significantly outperforms existing random forest models.


2003 ◽  
Vol 36 (5) ◽  
pp. 861-866 ◽  
Author(s):  
A. Marciniak ◽  
C.D. Bocăială ◽  
R. Louro ◽  
J. Sa da Costa ◽  
J. Korbicz

Sign in / Sign up

Export Citation Format

Share Document