Kernel-Based Ensemble Learning in Python

Benjamin Guedj; Bhargav Srinivasa Desikan

doi:10.3390/info11020063

Kernel-Based Ensemble Learning in Python

Information ◽

10.3390/info11020063 ◽

2020 ◽

Vol 11 (2) ◽

pp. 63 ◽

Cited By ~ 1

Author(s):

Benjamin Guedj ◽

Bhargav Srinivasa Desikan

Keyword(s):

Numerical Experiments ◽

Learning Strategy ◽

Learning Algorithm ◽

Real Life ◽

Training Data ◽

Smoothing Kernel ◽

Regression Problems ◽

Classification And Regression ◽

Synthetic Datasets ◽

Python Package

We propose a new supervised learning algorithm for classification and regression problems where two or more preliminary predictors are available. We introduce KernelCobra, a non-linear learning strategy for combining an arbitrary number of initial predictors. KernelCobra builds on the COBRA algorithm introduced by Biau et al. (2016), which combined estimators based on a notion of proximity of predictions on the training data. While the COBRA algorithm used a binary threshold to declare which training data were close and to be used, we generalise this idea by using a kernel to better encapsulate the proximity information. Such a smoothing kernel provides more representative weights to each of the training points which are used to build the aggregate and final predictor, and KernelCobra systematically outperforms the COBRA algorithm. While COBRA is intended for regression, KernelCobra deals with classification and regression. KernelCobra is included as part of the open source Python package Pycobra (0.2.4 and onward), introduced by Srinivasa Desikan (2018). Numerical experiments were undertaken to assess the performance (in terms of pure prediction and computational complexity) of KernelCobra on real-life and synthetic datasets.

Download Full-text

The Proposal of Service Oriented Data Mining System for Solving Real-Life Classification and Regression Problems

Technological Innovation for Sustainability - IFIP Advances in Information and Communication Technology ◽

10.1007/978-3-642-19170-1_9 ◽

2011 ◽

pp. 83-90 ◽

Cited By ~ 9

Author(s):

Agnieszka Prusiewicz ◽

Maciej Zięba

Keyword(s):

Data Mining ◽

Real Life ◽

Mining System ◽

Service Oriented ◽

Regression Problems ◽

Classification And Regression ◽

Data Mining System

Download Full-text

Scalable Approach to High Coverages on Oxides via Iterative Training of a Machine-Learning Algorithm

10.26434/chemrxiv.10288514.v1 ◽

2019 ◽

Author(s):

Andrew Medford ◽

Shengchun Yang ◽

Fuzhu Liu

Keyword(s):

Machine Learning ◽

Chemical Potential ◽

Learning Algorithm ◽

Absolute Error ◽

Low Energy ◽

Training Data ◽

High Coverage ◽

Metal Compounds ◽

Adsorption Energies ◽

The Stability

Understanding the interaction of multiple types of adsorbate molecules on solid surfaces is crucial to establishing the stability of catalysts under various chemical environments. Computational studies on the high coverage and mixed coverages of reaction intermediates are still challenging, especially for transition-metal compounds. In this work, we present a framework to predict differential adsorption energies and identify low-energy structures under high- and mixed-adsorbate coverages on oxide materials. The approach uses Gaussian process machine-learning models with quantified uncertainty in conjunction with an iterative training algorithm to actively identify the training set. The framework is demonstrated for the mixed adsorption of CHx, NHx and OHx species on the oxygen vacancy and pristine rutile TiO2(110) surface sites. The results indicate that the proposed algorithm is highly efficient at identifying the most valuable training data, and is able to predict differential adsorption energies with a mean absolute error of ~0.3 eV based on <25% of the total DFT data. The algorithm is also used to identify 76% of the low-energy structures based on <30% of the total DFT data, enabling construction of surface phase diagrams that account for high and mixed coverage as a function of the chemical potential of C, H, O, and N. Furthermore, the computational scaling indicates the algorithm scales nearly linearly (N1.12) as the number of adsorbates increases. This framework can be directly extended to metals, metal oxides, and other materials, providing a practical route toward the investigation of the behavior of catalysts under high-coverage conditions.

Download Full-text

On-Shelf Utility Mining of Sequence Data

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3457570 ◽

2021 ◽

Vol 16 (2) ◽

pp. 1-31

Author(s):

Chunkai Zhang ◽

Zilin Du ◽

Yuting Yang ◽

Wensheng Gan ◽

Philip S. Yu

Keyword(s):

High Efficiency ◽

Sequence Data ◽

Real Life ◽

Search Space ◽

Upper Bounds ◽

Utility Mining ◽

Limited Memory ◽

Time Periods ◽

High Utility ◽

Synthetic Datasets

Utility mining has emerged as an important and interesting topic owing to its wide application and considerable popularity. However, conventional utility mining methods have a bias toward items that have longer on-shelf time as they have a greater chance to generate a high utility. To eliminate the bias, the problem of on-shelf utility mining (OSUM) is introduced. In this article, we focus on the task of OSUM of sequence data, where the sequential database is divided into several partitions according to time periods and items are associated with utilities and several on-shelf time periods. To address the problem, we propose two methods, OSUM of sequence data (OSUMS) and OSUMS + , to extract on-shelf high-utility sequential patterns. For further efficiency, we also design several strategies to reduce the search space and avoid redundant calculation with two upper bounds time prefix extension utility ( TPEU ) and time reduced sequence utility ( TRSU ). In addition, two novel data structures are developed for facilitating the calculation of upper bounds and utilities. Substantial experimental results on certain real and synthetic datasets show that the two methods outperform the state-of-the-art algorithm. In conclusion, OSUMS may consume a large amount of memory and is unsuitable for cases with limited memory, while OSUMS + has wider real-life applications owing to its high efficiency.

Download Full-text

Improved SinGAN Integrated with an Attentional Mechanism for Remote Sensing Image Classification

Remote Sensing ◽

10.3390/rs13091713 ◽

2021 ◽

Vol 13 (9) ◽

pp. 1713

Author(s):

Songwei Gu ◽

Rui Zhang ◽

Hongxia Luo ◽

Mengyao Li ◽

Huamei Feng ◽

...

Keyword(s):

Remote Sensing ◽

Real Life ◽

Attention Mechanism ◽

Training Data ◽

Generative Adversarial Networks ◽

Natural Image ◽

Remote Sensing Images ◽

Training Time ◽

Adversarial Networks ◽

Remote Sensing Image Classification

Deep learning is an important research method in the remote sensing field. However, samples of remote sensing images are relatively few in real life, and those with markers are scarce. Many neural networks represented by Generative Adversarial Networks (GANs) can learn from real samples to generate pseudosamples, rather than traditional methods that often require more time and man-power to obtain samples. However, the generated pseudosamples often have poor realism and cannot be reliably used as the basis for various analyses and applications in the field of remote sensing. To address the abovementioned problems, a pseudolabeled sample generation method is proposed in this work and applied to scene classification of remote sensing images. The improved unconditional generative model that can be learned from a single natural image (Improved SinGAN) with an attention mechanism can effectively generate enough pseudolabeled samples from a single remote sensing scene image sample. Pseudosamples generated by the improved SinGAN model have stronger realism and relatively less training time, and the extracted features are easily recognized in the classification network. The improved SinGAN can better identify sub-jects from images with complex ground scenes compared with the original network. This mechanism solves the problem of geographic errors of generated pseudosamples. This study incorporated the generated pseudosamples into training data for the classification experiment. The result showed that the SinGAN model with the integration of the attention mechanism can better guarantee feature extraction of the training data. Thus, the quality of the generated samples is improved and the classification accuracy and stability of the classification network are also enhanced.

Download Full-text

Information-Theoretic Generalization Bounds for Meta-Learning and Applications

Entropy ◽

10.3390/e23010126 ◽

2021 ◽

Vol 23 (1) ◽

pp. 126

Author(s):

Sharu Theresa Jose ◽

Osvaldo Simeone

Keyword(s):

Learning Algorithm ◽

Broad Class ◽

Performance Measure ◽

Training Data ◽

Learning To Learn ◽

Data Set ◽

Information Theoretic ◽

Meta Learning ◽

Task Training ◽

Test Sets

Meta-learning, or “learning to learn”, refers to techniques that infer an inductive bias from data corresponding to multiple related tasks with the goal of improving the sample efficiency for new, previously unobserved, tasks. A key performance measure for meta-learning is the meta-generalization gap, that is, the difference between the average loss measured on the meta-training data and on a new, randomly selected task. This paper presents novel information-theoretic upper bounds on the meta-generalization gap. Two broad classes of meta-learning algorithms are considered that use either separate within-task training and test sets, like model agnostic meta-learning (MAML), or joint within-task training and test sets, like reptile. Extending the existing work for conventional learning, an upper bound on the meta-generalization gap is derived for the former class that depends on the mutual information (MI) between the output of the meta-learning algorithm and its input meta-training data. For the latter, the derived bound includes an additional MI between the output of the per-task learning procedure and corresponding data set to capture within-task uncertainty. Tighter bounds are then developed for the two classes via novel individual task MI (ITMI) bounds. Applications of the derived bounds are finally discussed, including a broad class of noisy iterative algorithms for meta-learning.

Download Full-text

Investigating the Potential of Network Optimization for a Constrained Object Detection Problem

Journal of Imaging ◽

10.3390/jimaging7040064 ◽

2021 ◽

Vol 7 (4) ◽

pp. 64

Author(s):

Tanguy Ophoff ◽

Cédric Gullentops ◽

Kristof Van Beeck ◽

Toon Goedemé

Keyword(s):

Computational Complexity ◽

Object Detection ◽

Network Optimization ◽

Real Life ◽

Optimization Techniques ◽

Training Data ◽

Single Shot ◽

Standard Object ◽

Number Of Classes

Object detection models are usually trained and evaluated on highly complicated, challenging academic datasets, which results in deep networks requiring lots of computations. However, a lot of operational use-cases consist of more constrained situations: they have a limited number of classes to be detected, less intra-class variance, less lighting and background variance, constrained or even fixed camera viewpoints, etc. In these cases, we hypothesize that smaller networks could be used without deteriorating the accuracy. However, there are multiple reasons why this does not happen in practice. Firstly, overparameterized networks tend to learn better, and secondly, transfer learning is usually used to reduce the necessary amount of training data. In this paper, we investigate how much we can reduce the computational complexity of a standard object detection network in such constrained object detection problems. As a case study, we focus on a well-known single-shot object detector, YoloV2, and combine three different techniques to reduce the computational complexity of the model without reducing its accuracy on our target dataset. To investigate the influence of the problem complexity, we compare two datasets: a prototypical academic (Pascal VOC) and a real-life operational (LWIR person detection) dataset. The three optimization steps we exploited are: swapping all the convolutions for depth-wise separable convolutions, perform pruning and use weight quantization. The results of our case study indeed substantiate our hypothesis that the more constrained a problem is, the more the network can be optimized. On the constrained operational dataset, combining these optimization techniques allowed us to reduce the computational complexity with a factor of 349, as compared to only a factor 9.8 on the academic dataset. When running a benchmark on an Nvidia Jetson AGX Xavier, our fastest model runs more than 15 times faster than the original YoloV2 model, whilst increasing the accuracy by 5% Average Precision (AP).

Download Full-text

Estimating Simultaneous Equation Models through an Entropy-Based Incremental Variational Bayes Learning Algorithm

Entropy ◽

10.3390/e23040384 ◽

2021 ◽

Vol 23 (4) ◽

pp. 384

Author(s):

Rocío Hernández-Sanjaime ◽

Martín González ◽

Antonio Peñalver ◽

Jose J. López-Espín

Keyword(s):

Statistical Theory ◽

Learning Algorithm ◽

Real Life ◽

Simulated Data ◽

Simultaneous Equation ◽

Variational Bayes ◽

Parameter Estimates ◽

Step Method ◽

The One ◽

Simultaneous Equation Models

The presence of unaccounted heterogeneity in simultaneous equation models (SEMs) is frequently problematic in many real-life applications. Under the usual assumption of homogeneity, the model can be seriously misspecified, and it can potentially induce an important bias in the parameter estimates. This paper focuses on SEMs in which data are heterogeneous and tend to form clustering structures in the endogenous-variable dataset. Because the identification of different clusters is not straightforward, a two-step strategy that first forms groups among the endogenous observations and then uses the standard simultaneous equation scheme is provided. Methodologically, the proposed approach is based on a variational Bayes learning algorithm and does not need to be executed for varying numbers of groups in order to identify the one that adequately fits the data. We describe the statistical theory, evaluate the performance of the suggested algorithm by using simulated data, and apply the two-step method to a macroeconomic problem.

Download Full-text

New Multi-View Classification Method with Uncertain Data

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3458282 ◽

2021 ◽

Vol 16 (1) ◽

pp. 1-23

Author(s):

Bo Liu ◽

Haowen Zhong ◽

Yanshan Xiao

Keyword(s):

Learning Strategy ◽

State Of The Art ◽

Uncertain Data ◽

Real Life ◽

Support Vector ◽

Classification Methods ◽

Complementary Information ◽

Novel Approach ◽

Svm Model ◽

Iterative Framework

Multi-view classification aims at designing a multi-view learning strategy to train a classifier from multi-view data, which are easily collected in practice. Most of the existing works focus on multi-view classification by assuming the multi-view data are collected with precise information. However, we always collect the uncertain multi-view data due to the collection process is corrupted with noise in real-life application. In this case, this article proposes a novel approach, called uncertain multi-view learning with support vector machine (UMV-SVM) to cope with the problem of multi-view learning with uncertain data. The method first enforces the agreement among all the views to seek complementary information of multi-view data and takes the uncertainty of the multi-view data into consideration by modeling reachability area of the noise. Then it proposes an iterative framework to solve the proposed UMV-SVM model such that we can obtain the multi-view classifier for prediction. Extensive experiments on real-life datasets have shown that the proposed UMV-SVM can achieve a better performance for uncertain multi-view classification in comparison to the state-of-the-art multi-view classification methods.

Download Full-text

Artifact-Free Single Image Defogging

Atmosphere ◽

10.3390/atmos12050577 ◽

2021 ◽

Vol 12 (5) ◽

pp. 577

Author(s):

Gabriele Graffieti ◽

Davide Maltoni

Keyword(s):

Learning Strategy ◽

Real Data ◽

Training Procedure ◽

Single Image ◽

Paired Data ◽

Manual Inspection ◽

Synthetic Datasets ◽

Visibility Enhancement ◽

Difficult Cases ◽

Good Contrast

In this paper, we present a novel defogging technique, named CurL-Defog, with the aim of minimizing the insertion of artifacts while maintaining good contrast restoration and visibility enhancement. Many learning-based defogging approaches rely on paired data, where fog is artificially added to clear images; this usually provides good results on mildly fogged images but is not effective for difficult cases. On the other hand, the models trained with real data can produce visually impressive results, but unwanted artifacts are often present. We propose a curriculum learning strategy and an enhanced CycleGAN model to reduce the number of produced artifacts, where both synthetic and real data are used in the training procedure. We also introduce a new metric, called HArD (Hazy Artifact Detector), to numerically quantify the number of artifacts in the defogged images, thus avoiding the tedious and subjective manual inspection of the results. HArD is then combined with other defogging indicators to produce a solid metric that is not deceived by the presence of artifacts. The proposed approach compares favorably with state-of-the-art techniques on both real and synthetic datasets.

Download Full-text

k-Nearest Neighbor Learning with Graph Neural Networks

Mathematics ◽

10.3390/math9080830 ◽

2021 ◽

Vol 9 (8) ◽

pp. 830

Author(s):

Seokho Kang

Keyword(s):

Neural Network ◽

Nearest Neighbor ◽

Learning Algorithm ◽

Weighting Function ◽

High Sensitivity ◽

Training Data ◽

K Nearest Neighbor ◽

Main Challenge ◽

Benchmark Datasets ◽

Graph Neural Networks

k-nearest neighbor (kNN) is a widely used learning algorithm for supervised learning tasks. In practice, the main challenge when using kNN is its high sensitivity to its hyperparameter setting, including the number of nearest neighbors k, the distance function, and the weighting function. To improve the robustness to hyperparameters, this study presents a novel kNN learning method based on a graph neural network, named kNNGNN. Given training data, the method learns a task-specific kNN rule in an end-to-end fashion by means of a graph neural network that takes the kNN graph of an instance to predict the label of the instance. The distance and weighting functions are implicitly embedded within the graph neural network. For a query instance, the prediction is obtained by performing a kNN search from the training data to create a kNN graph and passing it through the graph neural network. The effectiveness of the proposed method is demonstrated using various benchmark datasets for classification and regression tasks.

Download Full-text