Validation of machine learning techniques: decision trees and finite training set

1998 ◽  
Vol 7 (1) ◽  
pp. 94 ◽  
Author(s):  
Geoffrey A. W. West
Author(s):  
M. Carr ◽  
V. Ravi ◽  
G. Sridharan Reddy ◽  
D. Veranna

This paper profiles mobile banking users using machine learning techniques, viz. Decision Tree, Logistic Regression, Multilayer Perceptron, and SVM, to test a research model with fourteen independent variables and one dependent variable (adoption). A survey was conducted and the results were analysed using these techniques. Using Decision Trees, the profile of the mobile banking adopter was identified. A comparison of the techniques found that Decision Trees outperformed Logistic Regression, Multilayer Perceptron, and SVM. Of all the techniques, Decision Tree is recommended for profiling studies because, apart from producing highly accurate results, it also yields ‘if–then’ classification rules. The classification rules provided here can be used to target potential customers and encourage them to adopt mobile banking by offering them appropriate incentives.
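The 'if–then' rules mentioned in this abstract correspond to the root-to-leaf paths of a decision tree. The following is a minimal sketch of that idea only: the tree, the feature names (`perceived_usefulness`, `perceived_risk`) and the thresholds are hypothetical and are not taken from the paper's survey or results.

```python
# Hypothetical toy tree: feature names, thresholds and labels are invented
# for illustration and are NOT the paper's fitted model.
toy_tree = {
    "feature": "perceived_usefulness",
    "threshold": 3.5,
    "left": {"label": "non-adopter"},           # branch taken when feature <= threshold
    "right": {                                   # branch taken when feature > threshold
        "feature": "perceived_risk",
        "threshold": 2.0,
        "left": {"label": "adopter"},
        "right": {"label": "non-adopter"},
    },
}


def extract_rules(node, conditions=()):
    """Return one 'IF ... THEN ...' string per leaf of the tree."""
    if "label" in node:
        # Leaf: the conditions collected along the path form the rule premise.
        premise = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {premise} THEN {node['label']}"]
    name, thr = node["feature"], node["threshold"]
    rules = extract_rules(node["left"], conditions + (f"{name} <= {thr}",))
    rules += extract_rules(node["right"], conditions + (f"{name} > {thr}",))
    return rules


rules = extract_rules(toy_tree)
for rule in rules:
    print(rule)
```

Each leaf produces exactly one rule, which is why a shallow tree can be read directly as a short list of customer profiles.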


2017 ◽  
Vol 2017 ◽  
pp. 1-21 ◽  
Author(s):  
Carlos Fernández ◽  
David Fernández-Llorca ◽  
Miguel A. Sotelo

A hybrid vision-map system is presented to solve the road detection problem in urban scenarios. The standardized use of machine learning techniques in classification problems has been merged with digital navigation map information to increase system robustness. The objective of this paper is to create a new environment perception method to detect the road in urban environments, fusing stereo vision with digital maps by detecting road appearance and road limits such as lane markings or curbs. Deep learning approaches make the system hard-coupled to the training set. Even though our approach is based on machine learning techniques, the features are calculated from different sources (GPS, map, curbs, etc.), making our system less dependent on the training set.


Crystals ◽  
2021 ◽  
Vol 11 (10) ◽  
pp. 1218
Author(s):  
Natasha Dropka ◽  
Klaus Böttcher ◽  
Martin Holena

The aim of this study was to assess the ability of various data mining and supervised machine learning techniques (correlation analysis, k-means clustering, principal component analysis, and decision trees for regression and classification) to derive, optimize and understand the factors influencing VGF-GaAs growth. Training data were generated by Computational Fluid Dynamics (CFD) simulations and consisted of 130 datasets with 6 inputs (growth rate and power of 5 heaters) and 5 outputs (interface position and deflection, and temperatures at various positions in GaAs). Data mining results confirmed a good dispersion of the training data without the feasibility of a dimensionality reduction. Data clustering was observed in relation to the position of the crystallization front relative to the side heaters. Based on the statistical performance criteria and training results, decision trees identified the most decisive inputs and their ranges for a favorable interface shape and to keep GaAs temperature beyond limits for heavy arsenic evaporation. Decision trees are a recommendable machine learning technique with short training times and acceptable predictive accuracy based on a small volume of CFD training data, capable of providing guidelines for understanding the crystal growth process, which is a prerequisite for the growth of low-cost, high-quality bulk crystals.


2021 ◽  
Vol 10 (2) ◽  
pp. 35
Author(s):  
M. Encarnación Beato Gutiérrez ◽  
Montserrat Mateos Sánchez ◽  
Roberto Berjón Gallinas ◽  
Ana M. Fermoso García

Capacity control in indoor spaces is currently critical because of the pandemic. In this work, we propose a new solution using machine learning techniques with BLE technology. This study presents a real experiment in a university environment and compares three prediction models built with machine learning techniques: logistic regression, decision trees and artificial neural networks. The study concludes that machine learning techniques, in particular decision trees, together with BLE technology, provide a solution to the problem. The prediction model obtained is capable of detecting when the COVID capacity of an enclosed space is exceeded. In addition, it ensures that no false negatives are produced, i.e., all the people inside the laboratory will be correctly counted.


2020 ◽  
Vol 499 (4) ◽  
pp. 6009-6017
Author(s):  
Y-L Mong ◽  
K Ackley ◽  
D K Galloway ◽  
T Killestein ◽  
J Lyman ◽  
...  

The amount of observational data produced by time-domain astronomy is exponentially increasing. Human inspection alone is not an effective way to identify genuine transients from the data. An automatic real-bogus classifier is needed and machine learning techniques are commonly used to achieve this goal. Building a training set with a sufficiently large number of verified transients is challenging, due to the requirement of human verification. We present an approach for creating a training set by using all detections in the science images to be the sample of real detections and all detections in the difference images, which are generated by the process of difference imaging to detect transients, to be the samples of bogus detections. This strategy effectively minimizes the labour involved in the data labelling for supervised machine learning methods. We demonstrate the utility of the training set by using it to train several classifiers utilizing as the feature representation the normalized pixel values in 21 × 21 pixel stamps centred at the detection position, observed with the Gravitational-wave Optical Transient Observer (GOTO) prototype. The real-bogus classifier trained with this strategy can provide up to 95 per cent prediction accuracy on the real detections at a false alarm rate of 1 per cent.
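A figure such as "accuracy on the real detections at a false alarm rate of 1 per cent" is usually obtained by fixing a score threshold from the bogus sample and then measuring how many real detections pass it. The sketch below illustrates that calculation under stated assumptions: the classifier is assumed to output a score where higher means "more likely real", the function names are hypothetical, and the scores are synthetic rather than GOTO data.

```python
def threshold_for_false_alarm_rate(bogus_scores, max_far):
    """Return a threshold t such that the fraction of bogus scores > t is <= max_far."""
    ranked = sorted(bogus_scores, reverse=True)
    # Number of bogus detections we may let through; the small epsilon
    # guards against floating-point products such as 6.999999999.
    allowed = int(max_far * len(ranked) + 1e-9)
    if allowed >= len(ranked):
        return float("-inf")  # every bogus detection may pass
    # With a strict '>' rule, at most `allowed` bogus scores exceed this value.
    return ranked[allowed]


def fraction_above(scores, threshold):
    """Fraction of scores classified as 'real' (score strictly above threshold)."""
    return sum(s > threshold for s in scores) / len(scores)


# Synthetic scores for illustration only (assumption, not survey data).
bogus = [i / 200 for i in range(100)]            # 0.000 ... 0.495
real = [0.2, 0.6, 0.7, 0.8, 0.9, 0.95, 0.99, 0.45, 0.5, 0.85]

threshold = threshold_for_false_alarm_rate(bogus, max_far=0.01)
false_alarm_rate = fraction_above(bogus, threshold)
recovered_real = fraction_above(real, threshold)
print(threshold, false_alarm_rate, recovered_real)
```

Fixing the threshold on the bogus sample first keeps the false alarm rate under control regardless of how the real detections are distributed.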


2019 ◽  
Author(s):  
Wilson Castro ◽  
Jimy Oblitas ◽  
Miguel De-la-Torre ◽  
Carlos Cotrina ◽  
Karen Bazán ◽  
...  

The classification of fresh fruits according to their ripeness is typically a subjective and tedious task; consequently, there is growing interest in the use of non-contact techniques such as those based on computer vision and machine learning. In this paper, we propose the use of non-intrusive techniques for the classification of Cape gooseberry fruits. The proposal is based on the use of machine learning techniques combined with different color spaces. Given the success of techniques such as artificial neural networks, support vector machines, decision trees, and K-nearest neighbors in addressing classification problems, we decided to use these approaches in this research work. A sample of 926 Cape gooseberry fruits was obtained, and fruits were classified manually according to their level of ripeness into seven different classes. Images of each fruit were acquired in the RGB format through a system developed for this purpose. These images were preprocessed, filtered and segmented until the fruits were identified. For each piece of fruit, the median color parameter values in the RGB space were obtained, and these results were subsequently transformed into the HSV and L*a*b* color spaces. The values of each piece of fruit in the three color spaces and their corresponding degrees of ripeness were arranged for use in the creation, testing, and comparison of the developed classification models. The classification of Cape gooseberry fruits by ripening level was found to be sensitive to both the color space used and the classification technique, e.g., the models based on decision trees are the most accurate, and the models based on the L*a*b* color space obtain the best mean accuracy. However, the model that best classifies the Cape gooseberry fruits based on ripeness level is that resulting from the combination of the SVM technique and the RGB color space.
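The feature-extraction step described in this abstract (median RGB values per fruit, then conversion to another color space) can be sketched with the Python standard library. This is an illustration under assumptions, not the authors' pipeline: the median is assumed to be taken per channel, the pixel values are invented, and only the HSV conversion is shown because an L*a*b* conversion additionally requires choosing a reference white.

```python
import colorsys
from statistics import median


def median_rgb(pixels):
    """Per-channel median of a list of (R, G, B) tuples with 0-255 values."""
    reds, greens, blues = zip(*pixels)
    return median(reds), median(greens), median(blues)


def rgb255_to_hsv(rgb):
    """Convert a 0-255 RGB triple to HSV, each component in the range [0, 1]."""
    r, g, b = (channel / 255.0 for channel in rgb)
    return colorsys.rgb_to_hsv(r, g, b)


# Invented pixel values standing in for one segmented fruit (assumption).
fruit_pixels = [
    (250, 160, 20),
    (255, 170, 0),
    (240, 150, 10),
    (255, 165, 0),
    (245, 168, 5),
]

rgb = median_rgb(fruit_pixels)
h, s, v = rgb255_to_hsv(rgb)
print(rgb, (h, s, v))
```

Using the median rather than the mean makes the per-fruit color value less sensitive to highlights or leftover background pixels after segmentation.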


Water ◽  
2020 ◽  
Vol 12 (6) ◽  
pp. 1703 ◽  
Author(s):  
Joost P. den Bieman ◽  
Josefine M. Wilms ◽  
Henk F. P. van den Boogaard ◽  
Marcel R. A. van Gent

Wave overtopping is an important design criterion for coastal structures such as dikes, breakwaters and promenades. Hence, the prediction of the expected wave overtopping discharge is an important research topic. Existing prediction tools consist of empirical overtopping formulae, machine learning techniques like neural networks, and numerical models. In this paper, an innovative machine learning method—gradient boosting decision trees—is applied to the prediction of mean wave overtopping discharges. This new machine learning model is trained using the CLASH wave overtopping database. Optimizations to its performance are realized by using feature engineering and hyperparameter tuning. The model is shown to outperform an existing neural network model by reducing the error on the prediction of the CLASH database by a factor of 2.8. The model predictions follow physically realistic trends for variations of important features, and behave regularly in regions of the input parameter space with little or no data coverage.
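For readers unfamiliar with gradient boosting decision trees, the core loop is short: start from a constant prediction, then repeatedly fit a small tree to the current residuals and add a shrunken copy of it to the model. The sketch below shows this for squared-error regression with one-split trees (stumps) on a single synthetic feature. It is a teaching sketch under those assumptions, not the model, library or data used in the paper, and all function names are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression tree to the residuals by minimizing squared error."""
    best = None
    values = sorted(set(xs))
    for lo, hi in zip(values, values[1:]):
        split = (lo + hi) / 2.0
        left = [r for x, r in zip(xs, residuals) if x <= split]
        right = [r for x, r in zip(xs, residuals) if x > split]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, split, left_mean, right_mean)
    return best[1], best[2], best[3]


def fit_boosted_stumps(xs, ys, n_rounds=50, learning_rate=0.1):
    """Gradient boosting for squared loss: each stump is fit to the current residuals."""
    base = sum(ys) / len(ys)              # initial constant prediction
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        split, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((split, left_mean, right_mean))
        preds = [
            p + learning_rate * (left_mean if x <= split else right_mean)
            for x, p in zip(xs, preds)
        ]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    base, learning_rate, stumps = model
    return base + sum(
        learning_rate * (lm if x <= split else rm) for split, lm, rm in stumps
    )


def mean_squared_error(model, xs, ys):
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Synthetic one-feature data (assumption): y = x^2 on a small grid.
xs = [i / 10 for i in range(20)]
ys = [x ** 2 for x in xs]

model = fit_boosted_stumps(xs, ys)
baseline_mse = sum((sum(ys) / len(ys) - y) ** 2 for y in ys) / len(ys)
boosted_mse = mean_squared_error(model, xs, ys)
print(baseline_mse, boosted_mse)
```

The number of rounds, the learning rate and the tree depth are the kind of hyperparameters that the abstract refers to when it mentions tuning.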


2007 ◽  
Vol 16 (04) ◽  
pp. 683-706 ◽  
Author(s):  
ARNAUD LALLOUET ◽  
ANDREI LEGTCHENKO

Partially Defined Constraints can be used to model the incomplete knowledge of a concept or a relation. Instead of only computing with the known part of the constraint, we propose to complete its definition by using Machine Learning techniques. Since constraints are actively used during solving for pruning domains, building a classifier for instances is not enough: we need a solver able to reduce variable domains. Our technique is composed of two steps: first we learn a classifier for each constraint projection, and then we transform the classifiers into a propagator. The first contribution is a generic meta-technique for classifier improvement showing performance comparable to boosting. The second lies in the ability to use the learned concept in constraint-based decision or optimization problems. We present results using Decision Trees and Artificial Neural Networks for constraint learning and propagation. This opens a new way of integrating Machine Learning in Decision Support Systems.

