Comprehensive and Comparative Analysis of GAM-Based PV Power Forecasting Models Using Multidimensional Tensor Product Splines against Machine Learning Techniques

Takuji Matsumoto; Yuji Yamada

doi:10.3390/en14217146

Comprehensive and Comparative Analysis of GAM-Based PV Power Forecasting Models Using Multidimensional Tensor Product Splines against Machine Learning Techniques

Energies ◽

10.3390/en14217146 ◽

2021 ◽

Vol 14 (21) ◽

pp. 7146

Author(s):

Takuji Matsumoto ◽

Yuji Yamada

Keyword(s):

Machine Learning ◽

Power Generation ◽

Tensor Product ◽

Machine Learning Techniques ◽

Series Data ◽

Support Vector ◽

Forecasting Model ◽

Wide Range ◽

Tensor Product Splines ◽

Power Forecasting

In recent years, as photovoltaic (PV) power generation has rapidly increased on a global scale, there is a growing need for a highly accurate power generation forecasting model that is easy to implement for a wide range of electric utilities. Against this background, this study proposes a PV power forecasting model based on the generalized additive model (GAM) and compares its forecasting accuracy with four popular machine learning methods: k-nearest neighbor, artificial neural networks, support vector regression, and random forest. The empirical analysis provides an intuitive interpretation of the multidimensional smooth trends estimated by the GAM as tensor product splines and confirms the validity of the proposed modeling structure. The effectiveness of GAM is particularly evident in trend completion for missing data, where it is able to flexibly express the tangled trend structure inherent in time series data, and thus has an advantage not only in interpretability but also in improving forecast accuracy.
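
As a rough illustration of what a tensor product spline term in a GAM looks like, the following sketch writes a two-dimensional smooth as a sum of products of one-dimensional basis functions. The covariates (hour of day h and day of year d), the basis functions a_i and b_j, and the basis sizes I and J are assumptions chosen for illustration; the abstract does not state the variables or bases used in the study.

```latex
\[
y_t = \beta_0 + f(h_t, d_t) + \varepsilon_t,
\qquad
f(h, d) = \sum_{i=1}^{I} \sum_{j=1}^{J} \beta_{ij}\, a_i(h)\, b_j(d)
\]
```

Here each coefficient \beta_{ij} weights one product of marginal basis functions, which is what lets the fitted surface vary smoothly in both dimensions at once.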

Evaluation of Machine Learning Approaches for Automated Diagnosis of COVID-19 using X-Ray images (Preprint)

10.2196/preprints.18947 ◽

2020 ◽

Author(s):

Mazin Mohammed ◽

Karrar Hameed Abdulkareem ◽

Mashael S. Maashi ◽

Salama A. Mostafa ◽

Abdullah Baz ◽

...

Keyword(s):

Machine Learning ◽

Computational Method ◽

Learning Performance ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Approaches ◽

Data Set ◽

X Ray ◽

Wide Range ◽

Artificial Neural Network Ann

BACKGROUND: Recently, the coronavirus disease (COVID-19) has caused global concern and is considered a global health threat because of its rapid spread. Machine learning (ML) is a computational method that can be used to automatically learn from experience and improve the accuracy of predictions. OBJECTIVE: In this study, machine learning was applied to a coronavirus dataset of 50 X-ray images to enable the development of directions and detection modalities with risk causes. The dataset contains a wide range of samples of COVID-19 cases alongside SARS, MERS, and ARDS. The experiment was carried out using a total of 50 X-ray images, of which 25 were positive COVID-19 cases and the other 25 were normal cases. METHODS: The Orange tool was used for data manipulation. To classify patients as carriers or non-carriers of the coronavirus, this tool was employed to develop and analyse seven types of predictive models. Models such as artificial neural network (ANN), support vector machine (SVM), linear kernel and radial basis function (RBF), k-nearest neighbour (k-NN), Decision Tree (DT), and CN2 rule inducer were used in this study. Furthermore, the standard InceptionV3 model was used for feature extraction. RESULTS: The various machine learning techniques were trained on the coronavirus disease 2019 (COVID-19) dataset with improved parameters. The dataset was divided into two parts, training and testing: the model was trained using 70% of the dataset, while the remaining 30% was used to test the model. The results show that the improved SVM achieved an F1 of 97% and an accuracy of 98%. CONCLUSIONS: In this study, seven models have been developed to aid the detection of coronavirus. In such cases, the learning performance can be improved through knowledge transfer, whereby time-consuming data labelling efforts are not required. The evaluations of all the models are done in terms of different parameters. It can be concluded that all the models performed well, but the SVM demonstrated the best result for the accuracy metric. Future work will compare classical approaches with deep learning ones and try to obtain better results. CLINICALTRIAL: None
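
The 70%/30% split and the F1 score mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the study's Orange workflow: the helper names `train_test_split` and `f1_score`, the random seed, and the integer image identifiers are all hypothetical.

```python
import random


def train_test_split(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of items and cut it into train and test parts."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical identifiers standing in for the 50 X-ray images.
image_ids = list(range(50))
train_ids, test_ids = train_test_split(image_ids)
print(len(train_ids), len(test_ids))
```

With 50 items and a 0.7 training fraction, the split yields 35 training and 15 test items.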

Machine Learning Techniques for Code Smells Detection: A Systematic Mapping Study

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s021819401950013x ◽

2019 ◽

Vol 29 (02) ◽

pp. 285-316 ◽

Cited By ~ 7

Author(s):

Frederico Luiz Caram ◽

Bruno Rafael De Oliveira Rodrigues ◽

Amadeu Silveira Campanelli ◽

Fernando Silva Parreiras

Keyword(s):

Machine Learning ◽

Empirical Studies ◽

Machine Learning Techniques ◽

Support Vector ◽

Systematic Mapping Study ◽

Code Smells ◽

Mapping Study ◽

Learning Techniques ◽

Wide Range ◽

High Level

Code smells or bad smells are an accepted approach to identify design flaws in the source code. Although they have been explored by researchers, the interpretation of programmers is rather subjective. One way to deal with this subjectivity is to use machine learning techniques. This paper provides the reader with an overview of machine learning techniques and code smells found in the literature, aiming at determining which methods and practices are used when applying machine learning for code smells identification and which machine learning techniques have been used for code smells identification. A mapping study was used to identify the techniques used for each smell. We found that Bloaters were the main kind of smell studied, addressed by 35% of the papers. The most commonly used technique was Genetic Algorithms (GA), used by 22.22% of the papers. Regarding the smells addressed by each technique, there was a high level of redundancy, in that the smells are covered by a wide range of algorithms. Nevertheless, Feature Envy stood out, being targeted by 63% of the techniques. When it comes to performance, the best average was provided by Decision Tree, followed by Random Forest, Semi-supervised and Support Vector Machine Classifier techniques. Five of the 25 analyzed smells were not handled by any machine learning technique. Most of them focus on several code smells and in general there is no outperforming technique, except for a few specific smells. We also found a lack of comparable results due to the heterogeneity of the data sources and of the provided results. We recommend the pursuit of further empirical studies to assess the performance of these techniques in a standardized dataset to improve the comparison reliability and replicability.

Practical foundations of machine learning for addiction research. Part I. Methods and techniques

10.31234/osf.io/ast53 ◽

2021 ◽

Author(s):

Pablo Cresta Morgado ◽

Martín Carusso ◽

Laura Alonso Alemany ◽

Laura Acion

Keyword(s):

Machine Learning ◽

Linear Models ◽

Principal Component ◽

Research Field ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Tools ◽

Wide Range ◽

Methods And Techniques ◽

Research Problems

Machine learning assembles a broad set of methods and techniques to solve a wide range of problems, such as identifying individuals with substance use disorders (SUD), finding patterns in neuroimages, understanding SUD prognostic factors and their association, or determining addiction genetic underpinnings. However, machine learning use in the addiction research field continues to be insufficient. This two-part review focuses on machine learning tools and concepts and provides insights into their capabilities to facilitate their understanding and acquisition by addiction researchers. In this first part, we present supervised and unsupervised methods and techniques such as linear models, naive Bayes, support vector machines, artificial neural networks, k-means, or principal component analysis and examples of how these tools are already in use in addiction research. We also provide open-source programming tools to apply these techniques. Throughout this work, we link machine learning techniques to applied statistics. Machine learning tools and techniques can be applied to many addiction research problems and can improve addiction research.

Prediction of LQ45 Index in Indonesia Stock Exchange: A Comparative Study of Machine Learning Techniques

International Journal of Intelligent Engineering and Systems ◽

10.22266/ijies2021.0228.42 ◽

2021 ◽

Vol 14 (1) ◽

pp. 453-463

Author(s):

Abdul Syukur ◽

Deden Istiawan

Keyword(s):

Machine Learning ◽

Stock Market ◽

Stock Exchange ◽

Market Price ◽

Machine Learning Techniques ◽

Support Vector ◽

Classification Models ◽

Linear Discriminant ◽

Learning Techniques ◽

Wide Range

LQ45 is an Indonesia Stock Exchange (ISX) index comprising 45 companies that meet certain criteria and is aimed at investors selecting stocks. Predicting the direction of stock prices is a major issue in the financial world. The implementation of machine learning and other algorithms for market price analysis and forecasting is a very promising field. Different types of classification algorithms have been used to predict the stock market. However, when individual studies are considered separately, there is no clear consensus on which algorithms work best. In this research, a comparison framework is proposed, which aims to benchmark the performance of a wide range of classification models and use them to predict the LQ45 index. The data in this research, which contain the transaction level and capitalization size, were obtained from the Indonesian Stock Exchange (ISX). For analysis purposes, we set out 10 classifiers that can be used to build classification models and test their performance on the LQ45 dataset. The performance criteria chosen are accuracy, recall, and precision. The results showed that the random forest algorithm had the best performance for predicting the LQ45 index, while the classification and regression trees, C4.5, support vector machine, and logistic regression algorithms also performed well. In contrast, the models based on traditional statistical learners, namely Naïve Bayes and linear discriminant analysis, seem to underperform in predicting the LQ45 index. These results not only enrich the machine learning literature but also have a significant influence on stock market prediction in terms of the ability to predict the LQ45 index.
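
The three criteria named in the abstract (accuracy, recall, and precision) can be computed from predicted and actual class labels as in the sketch below. The labels are invented for illustration (1 for an upward move, 0 for a downward move, an assumed encoding) and are not data from the study; the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of actual positives that were found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical labels: 1 = index moved up, 0 = index moved down.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(accuracy(tp, fp, fn, tn), precision(tp, fp), recall(tp, fn))
```

Reporting all three together matters because a classifier can score well on accuracy while missing many positives (low recall) or raising many false alarms (low precision).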

A Comparison of the Performance of Supervised Learning Algorithms for Solar Power Prediction

Energies ◽

10.3390/en14154424 ◽

2021 ◽

Vol 14 (15) ◽

pp. 4424

Author(s):

Leidy Gutiérrez ◽

Julian Patiño ◽

Eduardo Duque-Grisales

Keyword(s):

Machine Learning ◽

Power Generation ◽

Large Scale ◽

Fossil Fuels ◽

Machine Learning Techniques ◽

Support Vector ◽

Power Prediction ◽

Electric Networks ◽

K Nearest Neighbors ◽

Supervised Learning Algorithms

Science seeks strategies to mitigate global warming and reduce the negative impacts of the long-term use of fossil fuels for power generation. In this sense, implementing and promoting renewable energy in different ways becomes one of the most effective solutions. The inaccuracy in the prediction of power generation from photovoltaic (PV) systems is a significant concern for the planning and operational stages of interconnected electric networks and the promotion of large-scale PV installations. This study proposes the use of Machine Learning techniques to model the photovoltaic power production for a system in Medellín, Colombia. Four forecasting models were generated from techniques compatible with Machine Learning and Artificial Intelligence methods: K-Nearest Neighbors (KNN), Linear Regression (LR), Artificial Neural Networks (ANN) and Support Vector Machines (SVM). The results obtained indicate that the four methods produced adequate estimations of photovoltaic energy generation. However, the best estimate according to RMSE and MAE is the ANN forecasting model. The proposed Machine Learning-based models were demonstrated to be practical and effective solutions to forecast PV power generation in Medellín.
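
The two error measures used to rank the models, RMSE and MAE, can be sketched as below. The power values are illustrative placeholders in kW (an assumed unit), not measurements from the Medellín system, and the function names are hypothetical.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: the average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared residual."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical PV output in kW (illustrative values only).
actual = [0.0, 1.0, 3.0, 2.0]
predicted = [0.0, 2.0, 3.0, 4.0]
print(mae(actual, predicted), rmse(actual, predicted))
```

Because RMSE squares each residual before averaging, it penalizes large misses more heavily than MAE, so the two measures can rank models differently.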

Retention Modeling at Scholastic Travel Company (A)

Darden Business Publishing Cases ◽

10.1108/case.darden.2021.000063 ◽

2017 ◽

pp. 1-7

Author(s):

Anton Ovchinnikov ◽

Scotiabank Scholar

Keyword(s):

Machine Learning ◽

Data Science ◽

Analysis Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Business Analytics ◽

Learning Techniques ◽

Wide Range ◽

Upper Level ◽

Undergraduate Programs

This case, along with its B case (UVA-QA-0865), is an effective vehicle for introducing students to the use of machine-learning techniques for classification. The specific context is predicting customer retention based on a wide range of customer attributes/features. The specific techniques could include (but are not limited to): regressions (linear and logistic), variable selection (forward/backward and stepwise), regularizations (e.g., LASSO), classification and regression trees (CART), random forests, gradient boosted trees (xgboost), neural networks, and support vector machines (SVM). The case is suitable for an advanced data analysis (data science, machine learning, and artificial intelligence) class at all levels: upper-level business undergraduate, MBA, EMBA, as well as specialized graduate or undergraduate programs in analytics (e.g., masters of science in business analytics [MSBA] and masters of management analytics [MMA]) and/or in management (e.g., masters of science in management [MScM] and masters in management [MiM, MM]). The teaching note for the case contains the pedagogy and the analyses, alongside the detailed explanations of the various techniques and their implementations in R (code provided in Exhibits and supplementary files). Python code, as well as the spreadsheet implementation in XLMiner, are available upon request.

Using Machine Learning Algorithms on Prediction of Stock Price

Journal of Modeling and Optimization ◽

10.32732/jmo.2020.12.2.84 ◽

2020 ◽

Vol 12 (2) ◽

pp. 84-99

Author(s):

Li-Pang Chen

Keyword(s):

Machine Learning ◽

Stock Price ◽

Short Term Memory ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Short Term ◽

Learning Techniques ◽

Historical Database ◽

Long Short Term Memory

In this paper, we investigate the analysis and prediction of time-dependent data. We focus our attention on four different stocks selected from the Yahoo Finance historical database. To build models and predict the future stock price, we consider three different machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN) and Support Vector Regression (SVR). By treating close price, open price, daily low, daily high, adjusted close price, and volume of trades as predictors in machine learning methods, it can be shown that the prediction accuracy is improved.

A Comparative Study of Different Machine Learning Algorithms for Disease Prediction

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse/v7i7/0177 ◽

2017 ◽

Vol 7 (7) ◽

pp. 172

Author(s):

Anantvir Singh Romana

Keyword(s):

Machine Learning ◽

Subsequent Treatment ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Disease Prediction ◽

Classification Problems ◽

Learning Techniques ◽

Neural Network Classifiers ◽

Diagnostic Detection

Accurate diagnostic detection of the disease in a patient is critical and may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently being used in various classification problems due to their accurate prediction performance. Various techniques may provide different accuracies, and it is therefore imperative to use the most suitable method, which provides the best results. This research seeks to provide a comparative analysis of Support Vector Machine, Naïve Bayes, J48 Decision Tree and neural network classifiers on breast cancer and diabetes datasets.

Structure-Based Virtual Screening of Perfluoroalkyl and Polyfluoroalkyl Substances (PFASs) as Endocrine Disruptors of Androgen Receptor Activity Using Molecular Docking and Machine Learning

10.26434/chemrxiv.11886702.v1 ◽

2020 ◽

Author(s):

Azhagiya Singam Ettayapuram Ramaprasad ◽

Phum Tachachartvanich ◽

Denis Fourches ◽

Anatoly Soshilov ◽

Jennifer C.Y. Hsieh ◽

...

Keyword(s):

Machine Learning ◽

Molecular Docking ◽

Androgen Receptor ◽

Endocrine Disruptors ◽

Hormone Receptors ◽

Steroid Hormone Receptors ◽

Machine Learning Techniques ◽

Support Vector ◽

Polyfluoroalkyl Substances ◽

Perfluoroalkyl And Polyfluoroalkyl Substances

Perfluoroalkyl and Polyfluoroalkyl Substances (PFASs) pose a substantial threat as endocrine disruptors, and thus early identification of those that may interact with steroid hormone receptors, such as the androgen receptor (AR), is critical. In this study we screened 5,206 PFASs from the CompTox database against the different binding sites on the AR using both molecular docking and machine learning techniques. We developed support vector machine models trained on Tox21 data to classify the active and inactive PFASs for AR using different chemical fingerprints as features. The maximum accuracy was 95.01% and the Matthews correlation coefficient (MCC) was 0.76, based on MACCS fingerprints (MACCSFP). The combination of docking-based screening and machine learning models identified 29 PFASs that have strong potential for activity against the AR and should be considered priority chemicals for biological toxicity testing.
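
The abstract reports both accuracy and the Matthews correlation coefficient (MCC). The sketch below shows how MCC is computed from confusion-matrix counts and why it can sit well below accuracy when the classes are imbalanced. The counts are invented for illustration and are not taken from the study; the function name `mcc` is hypothetical.

```python
import math


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient from binary confusion counts.

    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical imbalanced screen: 12 active and 88 inactive compounds.
tp, fn = 8, 4    # actives found / actives missed
fp, tn = 1, 87   # inactives flagged / inactives correctly rejected
accuracy = (tp + tn) / (tp + fp + fn + tn)
print(accuracy, mcc(tp, fp, fn, tn))
```

In this made-up case the accuracy is 0.95 while the MCC is noticeably lower, because MCC accounts for the four missed actives in the small positive class.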

Efficient Prediction of Structural and Electronic Properties of Hybrid 2D Materials Using DFT and Machine Learning

10.26434/chemrxiv.6254756.v1 ◽

2018 ◽

Author(s):

Sherif Tawfik ◽

Olexandr Isayev ◽

Catherine Stampfl ◽

Joseph Shapter ◽

David Winkler ◽

...

Keyword(s):

Machine Learning ◽

Band Gap ◽

Density Functional ◽

2D Materials ◽

Van Der Waals ◽

Building Blocks ◽

Machine Learning Techniques ◽

Interlayer Distance ◽

Computational Screening ◽

Wide Range

Materials constructed from different van der Waals two-dimensional (2D) heterostructures offer a wide range of benefits, but these systems have been little studied because of their experimental and computational complexity, and because of the very large number of possible combinations of 2D building blocks. The simulation of the interface between two different 2D materials is computationally challenging due to the lattice mismatch problem, which sometimes necessitates the creation of very large simulation cells for performing density-functional theory (DFT) calculations. Here we use a combination of DFT, linear regression and machine learning techniques in order to rapidly determine the interlayer distance between two different 2D heterostructures that are stacked in a bilayer heterostructure, as well as the band gap of the bilayer. Our work provides an excellent proof of concept by quickly and accurately predicting a structural property (the interlayer distance) and an electronic property (the band gap) for a large number of hybrid 2D materials. This work paves the way for rapid computational screening of the vast parameter space of van der Waals heterostructures to identify new hybrid materials with useful and interesting properties.
