Data augmentation methods for machine-learning-based classification of bio-signals

Multivariate mixed kernel density estimators and their application in machine learning for classification of biological objects based on spectral measurements

Computer Optics ◽

10.18287/2412-6179-2019-43-4-677-691 ◽

2019 ◽

Vol 43 (4) ◽

pp. 677-691

Author(s):

A.A. Sirota ◽

A.O. Donskikh ◽

A.V. Akimov ◽

D.A. Minakov

Keyword(s):

Machine Learning ◽

Density Estimation ◽

Data Augmentation ◽

Kernel Density ◽

Machine Learning Algorithms ◽

Spectral Measurements ◽

Density Estimates ◽

Biological Objects ◽

Density Estimators

A problem of non-parametric multivariate density estimation for machine learning and data augmentation is considered. A new mixed density estimation method based on calculating the convolution of independently obtained kernel density estimates for unknown distributions of informative features and a known (or independently estimated) density for non-informative interference occurring during measurements is proposed. Properties of the mixed density estimates obtained using this method are analyzed. The method is compared with a conventional Parzen-Rosenblatt window method applied directly to the training data. The equivalence of the mixed kernel density estimator and the data augmentation procedure based on the known (or estimated) statistical model of interference is theoretically and experimentally proven. The applicability of the mixed density estimators for training of machine learning algorithms for the classification of biological objects (elements of grain mixtures) based on spectral measurements in the visible and near-infrared regions is evaluated.
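The equivalence stated in this abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation: the Gaussian kernel, the zero-mean Gaussian interference model, the synthetic data, and all function names are assumptions made for illustration. Under those assumptions, convolving the kernel estimate with the noise density simply widens the bandwidth to sqrt(h^2 + sigma^2), and a Parzen-Rosenblatt estimate built on noise-augmented copies of the training data should approach the same curve.

```python
import numpy as np

def gaussian_kde_1d(x_eval, samples, bandwidth):
    """Parzen-Rosenblatt estimate with a Gaussian kernel, evaluated at x_eval."""
    z = (x_eval[:, None] - samples[None, :]) / bandwidth
    norm = samples.size * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z**2).sum(axis=1) / norm

def mixed_kde_1d(x_eval, samples, bandwidth, noise_std):
    """Kernel estimate convolved with an assumed N(0, noise_std^2) interference density.

    For a Gaussian kernel and Gaussian noise the convolution is again a
    Gaussian kernel estimate with a widened bandwidth.
    """
    return gaussian_kde_1d(x_eval, samples, np.sqrt(bandwidth**2 + noise_std**2))

def augment_with_noise(samples, noise_std, copies, rng):
    """Data augmentation: replicate every sample and add simulated interference."""
    repeated = np.repeat(samples, copies)
    return repeated + rng.normal(0.0, noise_std, size=repeated.size)

rng = np.random.default_rng(0)
clean = rng.normal(loc=1.0, scale=0.5, size=50)  # synthetic stand-in for an informative feature
grid = np.linspace(-3.0, 5.0, 201)
h, sigma = 0.2, 0.6                               # assumed bandwidth and noise level

mixed = mixed_kde_1d(grid, clean, h, sigma)
augmented = augment_with_noise(clean, sigma, copies=200, rng=rng)
kde_on_augmented = gaussian_kde_1d(grid, augmented, h)

# Largest pointwise difference between the two estimates on the grid
max_gap = float(np.max(np.abs(mixed - kde_on_augmented)))
```

The gap between the two curves is Monte Carlo error from the finite number of augmented copies, so it should shrink as `copies` grows.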


Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets

Applied Sciences ◽

10.3390/app10238481 ◽

2020 ◽

Vol 10 (23) ◽

pp. 8481

Author(s):

Cesar Federico Caiafa ◽

Jordi Solé-Casals ◽

Pere Marti-Puig ◽

Sun Zhe ◽

Toshihisa Tanaka

Keyword(s):

Machine Learning ◽

Data Augmentation ◽

Unsupervised Classification ◽

Decomposition Methods ◽

Signal Decomposition ◽

Learning Performance ◽

Decomposition Approach ◽

Data Completion ◽

Machine Learning Applications

In many machine learning applications, measurements are sometimes incomplete or noisy resulting in missing features. In other cases, and for different reasons, the datasets are originally small, and therefore, more data samples are required to derive useful supervised or unsupervised classification methods. Correct handling of incomplete, noisy or small datasets in machine learning is a fundamental and classic challenge. In this article, we provide a unified review of recently proposed methods based on signal decomposition for missing features imputation (data completion), classification of noisy samples and artificial generation of new data samples (data augmentation). We illustrate the application of these signal decomposition methods in diverse selected practical machine learning examples including: brain computer interface, epileptic intracranial electroencephalogram signals classification, face recognition/verification and water networks data analysis. We show that a signal decomposition approach can provide valuable tools to improve machine learning performance with low quality datasets.
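As a rough illustration of decomposition-based data completion, the sketch below fills missing entries by repeatedly projecting the data matrix onto a low-rank approximation. Iterative truncated-SVD imputation is used here as a generic stand-in and is not claimed to be one of the methods reviewed in the article; the function name, the chosen rank, and the synthetic data are assumptions.

```python
import numpy as np

def lowrank_impute(data, observed, rank, n_iter=200):
    """Fill unobserved entries using an iterated rank-`rank` SVD approximation.

    data     : 2-D array; values at unobserved positions are ignored (may be NaN)
    observed : boolean mask of the same shape, True where data is known
    """
    known = np.where(observed, data, 0.0)
    # Start from column means computed over the observed entries only
    col_means = known.sum(axis=0) / np.maximum(observed.sum(axis=0), 1)
    estimate = np.where(observed, data, col_means)
    for _ in range(n_iter):
        u, s, vt = np.linalg.svd(estimate, full_matrices=False)
        approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
        # Keep measured values, replace only the missing ones
        estimate = np.where(observed, data, approx)
    return estimate

rng = np.random.default_rng(1)
truth = rng.normal(size=(40, 2)) @ rng.normal(size=(2, 12))  # synthetic rank-2 data
observed = rng.random(truth.shape) > 0.2                     # roughly 20% missing
incomplete = np.where(observed, truth, np.nan)

completed = lowrank_impute(incomplete, observed, rank=2)

# Compare against plain column-mean imputation on the missing entries
mean_filled = np.where(observed, truth, np.nanmean(incomplete, axis=0))
err_lowrank = float(np.linalg.norm((completed - truth)[~observed]))
err_mean = float(np.linalg.norm((mean_filled - truth)[~observed]))
```

The comparison against column-mean filling only shows the idea on exactly low-rank synthetic data; real measurements would need a rank chosen by validation.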


A combined approach of convolutional neural networks and machine learning for visual fault classification in photovoltaic modules

Proceedings of the Institution of Mechanical Engineers Part O Journal of Risk and Reliability ◽

10.1177/1748006x211020305 ◽

2021 ◽

pp. 1748006X2110203

Author(s):

Sridharan Naveen Venkatesh ◽

Vaithiyanathan Sugumaran

Keyword(s):

Machine Learning ◽

Data Augmentation ◽

Performance Comparison ◽

Image Features ◽

Image Feature ◽

Fault Classification ◽

Photovoltaic Modules ◽

Accurate Performance ◽

Deep Cnn

Fault diagnosis plays a significant role in enhancing the useful lifetime, power output, and reliability of photovoltaic modules (PVM). Visual faults such as burn marks, delamination, discoloration, glass breakage, and snail trails make detection of faults difficult under harsh environmental conditions. Various researchers have made several attempts to identify visual faults in a PVM. However, most previous studies centered on the identification and analysis of a limited number of faults. This article presents the use of a deep convolutional neural network (CNN) to extract image features and perform an effective classification of faults by machine learning (ML) algorithms. In contrast to the present-day work, five different fault conditions were considered in the study. The proposed solution consists of three phases to effectively analyze various PVM defects. First, the module images are acquired using unmanned aerial vehicles (UAVs) and data augmentation is performed to generate a uniform dataset. Afterward, a pre-trained deep CNN is adopted for image feature extraction. Finally, the extracted image features are classified with the help of various ML classifiers. The final results show the effectiveness of the pre-trained deep CNN and the accurate performance of the ML classifiers. The best-in-class ML classifier for multiple fault classification is suggested based on the performance comparison.


A Comparative Study of the Impact of Data Augmentation in Machine Learning Based Classification Accuracy

10.32920/ryerson.14661300 ◽

2021 ◽

Author(s):

Arif Jahangir

Keyword(s):

Machine Learning ◽

Data Augmentation ◽

Machine Learning Algorithms ◽

Quadratic Discriminant Analysis ◽

Crucial Importance ◽

Markov Transition Matrix ◽

Original Dataset ◽

Markov Transition ◽

The Impact

Traumatic Brain Injury is the primary cause of death and disability all over the world. Monitoring the intracranial pressure (ICP) and classifying it for hypertension signals is of crucial importance. This thesis explores the possibility of a better classification of the ICP signal and detection of hypertensive signals prior to the actual occurrence of the hypertensive episodes. This study differs from other approaches in that the time series is converted into images by Gramian angular field and Markov transition matrix and augmented with data. Due to unbalanced data, the effect of the SMOTE extended nearest neighbour algorithm for balancing the data is examined. We use various machine learning algorithms to classify the ICP signals. The results obtained show that AdaBoost performs best among the compared algorithms. The F1 score of AdaBoost is 0.95 on the original dataset and 0.9967 on the balanced and augmented dataset. The Quadratic Discriminant Analysis F1 score is 1 when the data is augmented and balanced.
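The two time-series-to-image encodings named in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis code: the summation variant of the Gramian angular field, quantile binning for the Markov transition matrix, the bin count, and the synthetic test signal are all choices made here for the example.

```python
import numpy as np

def gramian_angular_field(series):
    """Gramian angular summation field: cos(phi_i + phi_j), phi = arccos(rescaled x)."""
    x = np.asarray(series, dtype=float)
    span = x.max() - x.min()
    if span == 0:
        raise ValueError("series must not be constant")
    scaled = np.clip(2.0 * (x - x.min()) / span - 1.0, -1.0, 1.0)  # rescale to [-1, 1]
    phi = np.arccos(scaled)
    return np.cos(phi[:, None] + phi[None, :])

def markov_transition_matrix(series, n_bins=4):
    """Row-normalised transition counts between quantile bins of consecutive samples."""
    x = np.asarray(series, dtype=float)
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])  # interior bin edges
    states = np.digitize(x, edges)                                   # bin index 0..n_bins-1
    counts = np.zeros((n_bins, n_bins))
    np.add.at(counts, (states[:-1], states[1:]), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows for states that are never left stay all-zero
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

t = np.linspace(0.0, 4.0 * np.pi, 64)
signal = np.sin(t) + 0.1 * t          # synthetic stand-in for an ICP segment
gaf = gramian_angular_field(signal)   # 64 x 64 image
mtm = markov_transition_matrix(signal, n_bins=4)
```

Each encoding turns a one-dimensional segment into a two-dimensional array that an image classifier can consume; the Gramian field keeps temporal order along both axes, while the transition matrix summarises dynamics between amplitude levels.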


A Comparative Study of the Impact of Data Augmentation in Machine Learning Based Classification Accuracy

10.32920/ryerson.14661300.v1 ◽

2021 ◽

Author(s):

Arif Jahangir

Keyword(s):

Machine Learning ◽

Data Augmentation ◽

Machine Learning Algorithms ◽

Quadratic Discriminant Analysis ◽

Crucial Importance ◽

Markov Transition Matrix ◽

Original Dataset ◽

Markov Transition ◽

The Impact

Traumatic Brain Injury is the primary cause of death and disability all over the world. Monitoring the intracranial pressure (ICP) and classifying it for hypertension signals is of crucial importance. This thesis explores the possibility of a better classification of the ICP signal and detection of hypertensive signals prior to the actual occurrence of the hypertensive episodes. This study differs from other approaches in that the time series is converted into images by Gramian angular field and Markov transition matrix and augmented with data. Due to unbalanced data, the effect of the SMOTE extended nearest neighbour algorithm for balancing the data is examined. We use various machine learning algorithms to classify the ICP signals. The results obtained show that AdaBoost performs best among the compared algorithms. The F1 score of AdaBoost is 0.95 on the original dataset and 0.9967 on the balanced and augmented dataset. The Quadratic Discriminant Analysis F1 score is 1 when the data is augmented and balanced.


CLASSIFICATION OF USER COMMENTS IN A MOBILE APPLICATION USING DATA AUGMENTATION WITH MACHINE LEARNING TECHNIQUES

Mühendislik Bilimleri ve Tasarım Dergisi ◽

10.21923/jesd.906211 ◽

2021 ◽

Vol 9 (4) ◽

pp. 1398-1407

Author(s):

Özer ÇELİK ◽

Gürkan KAPLAN

Keyword(s):

Machine Learning ◽

Mobile Application ◽

Data Augmentation ◽

Machine Learning Techniques ◽

User Comments ◽

Learning Techniques ◽

Using Data


Comparison of GANs for Covid-19 X-ray classification

10.5753/eniac.2021.18238 ◽

2021 ◽

Author(s):

Luiz Felipe Cavalcanti ◽

Lilian Berton

Keyword(s):

Machine Learning ◽

Data Augmentation ◽

Original Data ◽

Machine Learning Algorithms ◽

Generative Adversarial Networks ◽

Artificial Data ◽

X Ray ◽

Adversarial Networks ◽

Radiological Images

Image classification has been applied to several real problems. However, getting labeled data is a costly task, since it demands time, resources and experts. Furthermore, some domains like disease detection suffer from unbalanced classes. These scenarios are challenging and degrade the performance of machine learning algorithms. In these cases, we can use Data Augmentation (DA) approaches to increase the number of labeled examples in a dataset. The objective of this work is to analyze the use of Generative Adversarial Networks (GANs) as DA, which are capable of synthesizing artificial data from the original data, under an adversarial process of two neural networks. The GANs are applied in the classification of unbalanced Covid-19 radiological images. Increasing the number of images led to better accuracy for all the GANs tested, especially in the multi-label dataset, mitigating the bias for unbalanced classes.


Mind wandering as data augmentation: How mental travel supports abstraction

Behavioral and Brain Sciences ◽

10.1017/s0140525x1900311x ◽

2020 ◽

Vol 43 ◽

Author(s):

Myrthe Faber

Keyword(s):

Machine Learning ◽

Data Augmentation ◽

Mental Content ◽

Mind Wandering ◽

Theoretical Framework ◽

Important Addition

Gilead et al. state that abstraction supports mental travel, and that mental travel critically relies on abstraction. I propose an important addition to this theoretical framework, namely that mental travel might also support abstraction. Specifically, I argue that spontaneous mental travel (mind wandering), much like data augmentation in machine learning, provides variability in mental content and context necessary for abstraction.


Machine Learning Classification of Spinal Lesions: Compared Accuracy of Texture Parameters Extracted by Different Software

10.1055/s-0039-1692578 ◽

2019 ◽

Author(s):

V. Chianca ◽

D. Albano ◽

R. Cuocolo ◽

C. Messina ◽

S. Gitto ◽

...

Keyword(s):

Machine Learning ◽

Machine Learning Classification ◽

Spinal Lesions ◽

Texture Parameters


Machine Learning Classification of Low-grade and High-grade Chondrosarcomas Based on MRI-based Texture Analysis

10.1055/s-0039-1692575 ◽

2019 ◽

Author(s):

S. Gitto ◽

D. Albano ◽

V. Chianca ◽

R. Cuocolo ◽

L. Ugga ◽

...

Keyword(s):

Machine Learning ◽

Texture Analysis ◽

Low Grade ◽

High Grade ◽

Machine Learning Classification
