scholarly journals Ensemble modeling approach for rainfall/groundwater balancing

2007 ◽  
Vol 9 (2) ◽  
pp. 95-106 ◽  
Author(s):  
D. Laucelli ◽  
O. Giustolisi ◽  
V. Babovic ◽  
M. Keijzer

This paper introduces an application of machine learning, on real data. It deals with Ensemble Modeling, a simple averaging method for obtaining more reliable approximations using symbolic regression. Considerations on the contribution of bias and variance to the total error, and ensemble methods to reduce errors due to variance, have been tackled together with a specific application of ensemble modeling to hydrological forecasts. This work provides empirical evidence that genetic programming can greatly benefit from this approach in forecasting and simulating physical phenomena. Further considerations have been taken into account, such as the influence of Genetic Programming parameter settings on the model's performance.

2021 ◽  
Vol 11 (12) ◽  
pp. 5468
Author(s):  
Elizaveta Shmalko ◽  
Askhat Diveev

The problem of control synthesis is considered as machine learning control. The paper proposes a mathematical formulation of machine learning control, discusses approaches of supervised and unsupervised learning by symbolic regression methods. The principle of small variation of the basic solution is presented to set up the neighbourhood of the search and to increase search efficiency of symbolic regression methods. Different symbolic regression methods such as genetic programming, network operator, Cartesian and binary genetic programming are presented in details. It is shown on the computational example the possibilities of symbolic regression methods as unsupervised machine learning control technique to the solution of MLC problem of control synthesis for obtaining the stabilization system for a mobile robot.


2021 ◽  
Vol 14 (1) ◽  
pp. 71-88
Author(s):  
Adane Nega Tarekegn ◽  
Tamir Anteneh Alemu ◽  
Alemu Kumlachew Tegegne

Tuberculosis (TB) remains a global health concern. It commonly spreads through the air and attacks low immune bodies. TB is the most common and known health problem in low and middle-income countries. Genetic programming (GP) is a machine learning model for discovering useful relationships among the variables in complex clinical data. It is more appropriate in a circumstance when the form of the solution model is unknown a priori. The main objective of this study is to develop a model that can detect positive cases of TB suspected patients using genetic programming approach. In this paper, Genetic Programming (GP) is exploited to identify the presence of positive cases of tuberculosis from the real data set of TB suspects and hospitalized patients. First, the dataset is pre-processed, and target variables are identified using cluster analysis. This data-driven cluster analysis identifies two distinct clusters of patients, representing TB positive and TB negative. Then, GP is trained using the training datasets to construct a prediction model and tested with a separate new dataset. With the 30 runs, the median performance of GP on test data was good (sensitivity=0.78, specificity=0.95, accuracy=0.89, AUC=0.91). We find that GP shows better performance in predicting TB compared to other machine learning models. The study demonstrates that the GP model might be used to support clinicians to screen TB patients.


Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4618
Author(s):  
Francisco Oliveira ◽  
Miguel Luís ◽  
Susana Sargento

Unmanned Aerial Vehicle (UAV) networks are an emerging technology, useful not only for the military, but also for public and civil purposes. Their versatility provides advantages in situations where an existing network cannot support all requirements of its users, either because of an exceptionally big number of users, or because of the failure of one or more ground base stations. Networks of UAVs can reinforce these cellular networks where needed, redirecting the traffic to available ground stations. Using machine learning algorithms to predict overloaded traffic areas, we propose a UAV positioning algorithm responsible for determining suitable positions for the UAVs, with the objective of a more balanced redistribution of traffic, to avoid saturated base stations and decrease the number of users without a connection. The tests performed with real data of user connections through base stations show that, in less restrictive network conditions, the algorithm to dynamically place the UAVs performs significantly better than in more restrictive conditions, reducing significantly the number of users without a connection. We also conclude that the accuracy of the prediction is a very important factor, not only in the reduction of users without a connection, but also on the number of UAVs deployed.


1993 ◽  
Vol 18 (2-4) ◽  
pp. 209-220
Author(s):  
Michael Hadjimichael ◽  
Anita Wasilewska

We present here an application of Rough Set formalism to Machine Learning. The resulting Inductive Learning algorithm is described, and its application to a set of real data is examined. The data consists of a survey of voter preferences taken during the 1988 presidential election in the U.S.A. Results include an analysis of the predictive accuracy of the generated rules, and an analysis of the semantic content of the rules.


2021 ◽  
Vol 10 (1) ◽  
pp. 42
Author(s):  
Kieu Anh Nguyen ◽  
Walter Chen ◽  
Bor-Shiun Lin ◽  
Uma Seeboonruang

Although machine learning has been extensively used in various fields, it has only recently been applied to soil erosion pin modeling. To improve upon previous methods of quantifying soil erosion based on erosion pin measurements, this study explored the possible application of ensemble machine learning algorithms to the Shihmen Reservoir watershed in northern Taiwan. Three categories of ensemble methods were considered in this study: (a) Bagging, (b) boosting, and (c) stacking. The bagging method in this study refers to bagged multivariate adaptive regression splines (bagged MARS) and random forest (RF), and the boosting method includes Cubist and gradient boosting machine (GBM). Finally, the stacking method is an ensemble method that uses a meta-model to combine the predictions of base models. This study used RF and GBM as the meta-models, decision tree, linear regression, artificial neural network, and support vector machine as the base models. The dataset used in this study was sampled using stratified random sampling to achieve a 70/30 split for the training and test data, and the process was repeated three times. The performance of six ensemble methods in three categories was analyzed based on the average of three attempts. It was found that GBM performed the best among the ensemble models with the lowest root-mean-square error (RMSE = 1.72 mm/year), the highest Nash-Sutcliffe efficiency (NSE = 0.54), and the highest index of agreement (d = 0.81). This result was confirmed by the spatial comparison of the absolute differences (errors) between model predictions and observations using GBM and RF in the study area. In summary, the results show that as a group, the bagging method and the boosting method performed equally well, and the stacking method was third for the erosion pin dataset considered in this study.


Mathematics ◽  
2021 ◽  
Vol 9 (9) ◽  
pp. 936
Author(s):  
Jianli Shao ◽  
Xin Liu ◽  
Wenqing He

Imbalanced data exist in many classification problems. The classification of imbalanced data has remarkable challenges in machine learning. The support vector machine (SVM) and its variants are popularly used in machine learning among different classifiers thanks to their flexibility and interpretability. However, the performance of SVMs is impacted when the data are imbalanced, which is a typical data structure in the multi-category classification problem. In this paper, we employ the data-adaptive SVM with scaled kernel functions to classify instances for a multi-class population. We propose a multi-class data-dependent kernel function for the SVM by considering class imbalance and the spatial association among instances so that the classification accuracy is enhanced. Simulation studies demonstrate the superb performance of the proposed method, and a real multi-class prostate cancer image dataset is employed as an illustration. Not only does the proposed method outperform the competitor methods in terms of the commonly used accuracy measures such as the F-score and G-means, but also successfully detects more than 60% of instances from the rare class in the real data, while the competitors can only detect less than 20% of the rare class instances. The proposed method will benefit other scientific research fields, such as multiple region boundary detection.


Sign in / Sign up

Export Citation Format

Share Document