Data augmentation methods for machine-learning-based classification of bio-signals

Author(s):  
Asuka Sakai ◽  
Yuki Minoda ◽  
Koji Morikawa
2019 ◽  
Vol 43 (4) ◽  
pp. 677-691
Author(s):  
A.A. Sirota ◽  
A.O. Donskikh ◽  
A.V. Akimov ◽  
D.A. Minakov

A problem of non-parametric multivariate density estimation for machine learning and data augmentation is considered. A new mixed density estimation method based on calculating the convolution of independently obtained kernel density estimates for unknown distributions of informative features and a known (or independently estimated) density for non-informative interference occurring during measurements is proposed. Properties of the mixed density estimates obtained using this method are analyzed. The method is compared with a conventional Parzen-Rosenblatt window method applied directly to the training data. The equivalence of the mixed kernel density estimator and the data augmentation procedure based on the known (or estimated) statistical model of interference is theoretically and experimentally proven. The applicability of the mixed density estimators for training of machine learning algorithms for the classification of biological objects (elements of grain mixtures) based on spectral measurements in the visible and near-infrared regions is evaluated.
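
The equivalence claimed in this abstract can be illustrated with a minimal one-dimensional sketch. It assumes a Gaussian kernel and zero-mean Gaussian interference, which are choices made here for illustration and are not stated in the abstract. Under those assumptions, convolving each kernel with the interference density only widens the kernel, so the mixed estimate has a closed form that can be compared with a plain kernel estimate built on noise-augmented data. The names `kde`, `mixed_kde`, and `augment` are hypothetical helpers, not the authors' code.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def kde(x, samples, bandwidth):
    """Parzen-Rosenblatt estimate at x with a Gaussian kernel."""
    return sum(gaussian_pdf(x, s, bandwidth) for s in samples) / len(samples)


def mixed_kde(x, samples, bandwidth, noise_std):
    """KDE of the clean features convolved with N(0, noise_std^2) interference.

    Assumption: both kernel and interference are Gaussian, so the convolution
    of N(s, h^2) with N(0, sigma^2) is N(s, h^2 + sigma^2).
    """
    widened = math.sqrt(bandwidth ** 2 + noise_std ** 2)
    return kde(x, samples, widened)


def augment(samples, noise_std, copies, rng):
    """Data augmentation: add simulated interference to every training sample."""
    return [s + rng.gauss(0.0, noise_std) for s in samples for _ in range(copies)]


rng = random.Random(0)
clean = [rng.gauss(0.0, 1.0) for _ in range(50)]  # synthetic "informative feature"
h, sigma = 0.3, 0.5                               # illustrative bandwidth and noise level

augmented = augment(clean, sigma, copies=400, rng=rng)
x0 = 0.25
analytic = mixed_kde(x0, clean, h, sigma)   # convolution-based mixed estimate
empirical = kde(x0, augmented, h)           # ordinary KDE on augmented data
print(f"mixed estimate: {analytic:.4f}, KDE on augmented data: {empirical:.4f}")
```

As the number of augmented copies grows, the two values should agree more closely, which is the sense in which the augmentation procedure and the mixed estimator coincide in this simplified setting.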


2020 ◽  
Vol 10 (23) ◽  
pp. 8481
Author(s):  
Cesar Federico Caiafa ◽  
Jordi Solé-Casals ◽  
Pere Marti-Puig ◽  
Sun Zhe ◽  
Toshihisa Tanaka

In many machine learning applications, measurements are incomplete or noisy, resulting in missing features. In other cases, and for different reasons, the datasets are originally small, and therefore more data samples are required to derive useful supervised or unsupervised classification methods. Correct handling of incomplete, noisy or small datasets in machine learning is a fundamental and classic challenge. In this article, we provide a unified review of recently proposed methods based on signal decomposition for missing-feature imputation (data completion), classification of noisy samples and artificial generation of new data samples (data augmentation). We illustrate the application of these signal decomposition methods in diverse selected practical machine learning examples including brain-computer interfaces, classification of epileptic intracranial electroencephalogram signals, face recognition/verification and water network data analysis. We show that a signal decomposition approach can provide valuable tools to improve machine learning performance with low-quality datasets.


Author(s):  
Sridharan Naveen Venkatesh ◽  
Vaithiyanathan Sugumaran

Fault diagnosis plays a significant role in enhancing the useful lifetime, power output, and reliability of photovoltaic modules (PVM). Visual faults such as burn marks, delamination, discoloration, glass breakage, and snail trails make detection of faults difficult under harsh environmental conditions. Researchers have made several attempts to identify visual faults in a PVM. However, most previous studies centered on the identification and analysis of a limited number of faults. This article presents the use of a deep convolutional neural network (CNN) to extract image features and perform an effective classification of faults by machine learning (ML) algorithms. In contrast to present-day work, five different fault conditions were considered in the study. The proposed solution consists of three phases to effectively analyze various PVM defects. First, the module images are acquired using unmanned aerial vehicles (UAVs) and data augmentation is performed to generate a uniform dataset. Afterward, a pre-trained deep CNN is adopted for image feature extraction. Finally, the extracted image features are classified with the help of various ML classifiers. The final results show the effectiveness of the pre-trained deep CNN and the accurate performance of the ML classifiers. The best-in-class ML classifier for multiple fault classification is suggested based on the performance comparison.


2021 ◽  
Author(s):  
Arif Jahangir

Traumatic Brain Injury is the primary cause of death and disability all over the world. Monitoring the intracranial pressure (ICP) and classifying it for hypertension signals is of crucial importance. This thesis explores the possibility of a better classification of the ICP signal and detection of hypertensive signals prior to the actual occurrence of the hypertensive episodes. This study differs from other approaches in that the time series is converted into images by Gramian angular field and Markov transition matrix and the dataset is augmented. Due to unbalanced data, the effect of the SMOTE extended nearest neighbour algorithm for balancing the data is examined. We use various machine learning algorithms to classify the ICP signals. The results obtained show that AdaBoost performs best among the compared algorithms. The F1 score of AdaBoost is 0.95 on the original dataset and 0.9967 on the balanced and augmented dataset. The Quadratic Discriminant Analysis F1 score is 1 when the data are augmented and balanced.
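
The time-series-to-image step mentioned in this abstract can be sketched as follows. This is a minimal illustration of the summation form of the Gramian angular field, assuming min-max rescaling to [-1, 1]; the abstract does not say which variant or rescaling the thesis used, and the sample values are made up rather than real ICP readings. The function names are hypothetical.

```python
import math


def rescale(series):
    """Min-max rescale a series to the interval [-1, 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled")
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in series]


def gramian_angular_field(series):
    """Summation Gramian angular field: entry (i, j) is cos(phi_i + phi_j),
    where phi_i = arccos of the rescaled value at time step i."""
    scaled = rescale(series)
    # Clamp before arccos to guard against floating-point overshoot.
    phi = [math.acos(max(-1.0, min(1.0, v))) for v in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]


# Made-up pressure-like readings, for illustration only.
icp_like = [10.0, 12.0, 15.0, 11.0, 20.0]
image = gramian_angular_field(icp_like)

for row in image:
    print(" ".join(f"{value:6.2f}" for value in row))
```

The result is a symmetric square matrix with one row and column per time step, which can then be treated as a single-channel image by an image classifier.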


2021 ◽  
Author(s):  
Luiz Felipe Cavalcanti ◽  
Lilian Berton

Image classification has been applied to several real problems. However, getting labeled data is a costly task, since it demands time, resources and experts. Furthermore, some domains like disease detection suffer from unbalanced classes. These scenarios are challenging and degrade the performance of machine learning algorithms. In these cases, we can use Data Augmentation (DA) approaches to increase the number of labeled examples in a dataset. The objective of this work is to analyze the use of Generative Adversarial Networks (GANs) as DA, which are capable of synthesizing artificial data from the original data, under an adversarial process of two neural networks. The GANs are applied in the classification of unbalanced Covid-19 radiological images. Increasing the number of images led to better accuracy for all the GANs tested, especially in the multi-label dataset, mitigating the bias for unbalanced classes.


2020 ◽  
Vol 43 ◽  
Author(s):  
Myrthe Faber

Gilead et al. state that abstraction supports mental travel, and that mental travel critically relies on abstraction. I propose an important addition to this theoretical framework, namely that mental travel might also support abstraction. Specifically, I argue that spontaneous mental travel (mind wandering), much like data augmentation in machine learning, provides variability in mental content and context necessary for abstraction.

