Double-Criteria Active Learning for Multiclass Brain-Computer Interfaces

2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Qingshan She ◽  
Kang Chen ◽  
Zhizeng Luo ◽  
Thinh Nguyen ◽  
Thomas Potter ◽  
...  

Recent technological advances have enabled researchers to collect large amounts of electroencephalography (EEG) signals in labeled and unlabeled datasets. However, collecting labeled EEG data for use in brain-computer interface (BCI) systems is expensive and time-consuming. In this paper, a novel active learning method is proposed to minimize the amount of labeled, subject-specific EEG data required for effective classifier training by combining measures of uncertainty and representativeness within an extreme learning machine (ELM). Following this approach, an ELM classifier was first used to select a relatively large batch of unlabeled examples, whose uncertainty was measured through the best-versus-second-best (BvSB) strategy. The diversity of each sample was then measured between the limited labeled training data and the previously selected unlabeled samples, and similarity was measured among the previously selected samples. Finally, a tradeoff parameter was introduced to control the balance between informative and representative samples, and these samples were then used to construct a powerful ELM classifier. Extensive experiments were conducted on benchmark and multiclass motor imagery EEG datasets to evaluate the efficacy of the proposed method. The experimental results show that the performance of the new algorithm exceeds or matches that of several state-of-the-art active learning algorithms. The proposed method thereby improves classifier performance and reduces the need for training samples in BCI applications.
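
The BvSB uncertainty criterion mentioned in this abstract can be sketched as follows (an illustrative Python snippet, not the authors' code; function names are ours). A sample is uncertain when the margin between its two highest class probabilities is small:

```python
import numpy as np

def bvsb_uncertainty(probs):
    """Best-versus-second-best (BvSB) uncertainty: the smaller the margin
    between the top two class probabilities, the more uncertain the sample."""
    srt = np.sort(probs, axis=1)           # ascending per row
    margin = srt[:, -1] - srt[:, -2]       # best minus second best
    return 1.0 - margin                    # high value = high uncertainty

def select_uncertain(probs, batch_size):
    """Return indices of the batch_size most uncertain samples."""
    scores = bvsb_uncertainty(probs)
    return np.argsort(scores)[::-1][:batch_size]

# Three unlabeled samples with predicted probabilities over 3 classes
probs = np.array([
    [0.34, 0.33, 0.33],   # very uncertain: top-two margin is only 0.01
    [0.90, 0.05, 0.05],   # confident
    [0.50, 0.45, 0.05],   # moderately uncertain
])
picked = select_uncertain(probs, 2)   # indices of the two most uncertain rows
```

In the paper this selection is only the first stage; the selected batch is further filtered by the diversity and similarity measures before labeling.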

2021 ◽  
Vol 69 (4) ◽  
pp. 297-306
Author(s):  
Julius Krause ◽  
Maurice Günder ◽  
Daniel Schulz ◽  
Robin Gruna

Abstract. The selection of training data determines the quality of a chemometric calibration model. To cover the entire parameter space of known influencing parameters, an experimental design is usually created. Nevertheless, even with a carefully prepared Design of Experiments (DoE), redundant reference analyses are often performed during the analysis of agricultural products. Because the number of possible reference analyses is usually very limited, the presented active learning approaches are intended to provide a tool for better selection of training samples.


2021 ◽  
Author(s):  
Khalil Boukthir ◽  
Abdulrahman M. Qahtani ◽  
Omar Almutiry ◽  
Habib Dhahri ◽  
Adel Alimi

- A novel approach is presented to reduce annotation effort, based on Deep Active Learning, for Arabic text detection in natural scene images.
- A new Arabic text image dataset (7k images), named TSVD, collected using the Google Street View service.
- A new semi-automatic method for generating natural scene text images from the streets.
- Training samples are reduced to 1/5 of the original training size on average.
- Much less training data is needed to achieve a better Dice index: 0.84.
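
The Dice index reported above is a standard overlap measure for detection/segmentation masks; a minimal sketch (illustrative Python, not the authors' evaluation code):

```python
import numpy as np

def dice_index(pred, target):
    """Dice coefficient between two binary masks: 2|A∩B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    denom = pred.sum() + target.sum()
    return 2.0 * intersection / denom if denom else 1.0

# Toy 1x4 masks: one true-positive pixel, one false-positive pixel
pred   = np.array([[1, 1, 0, 0]])
target = np.array([[1, 0, 0, 0]])
score = dice_index(pred, target)   # 2*1 / (2 + 1)
```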



2018 ◽  
Vol 2018 ◽  
pp. 1-9 ◽  
Author(s):  
Mengxi Dai ◽  
Dezhi Zheng ◽  
Shucong Liu ◽  
Pengju Zhang

Motor-imagery-based brain-computer interfaces (BCIs) commonly use the common spatial pattern (CSP) as a preprocessing step before classification. The CSP method is a supervised algorithm and therefore requires a large amount of time-consuming labeled training data to build the model. To address this issue, one promising approach is transfer learning, which generalizes a learning model so that it can extract discriminative information from other subjects for the target classification task. To this end, we propose a transfer kernel CSP (TKCSP) approach that learns a domain-invariant kernel by directly matching the distributions of source subjects and target subjects. Dataset IVa of BCI Competition III is used to demonstrate the validity of the proposed method. In the experiment, we compare the classification performance of TKCSP against CSP, CSP for subject-to-subject transfer (CSP SJ-to-SJ), regularized CSP (RCSP), stationary subspace CSP (ssCSP), multitask CSP (mtCSP), and the combined mtCSP and ssCSP (ss + mtCSP) method. The results indicate that TKCSP achieves a superior mean classification performance of 81.14%, especially when the source subjects have fewer training samples. Comprehensive experimental evidence on the dataset verifies the effectiveness and efficiency of the proposed TKCSP approach over several state-of-the-art methods.
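
The classic two-class CSP step that TKCSP builds on can be sketched in a few lines (an illustrative numpy implementation of standard CSP, not the authors' transfer-kernel variant): whiten with the composite covariance, then diagonalize the whitened class covariance; filters at the two ends of the eigenvalue spectrum maximize the between-class variance ratio.

```python
import numpy as np

def csp_filters(cov_a, cov_b, n_filters=2):
    """Classic two-class common spatial patterns from class covariances."""
    composite = cov_a + cov_b
    evals, evecs = np.linalg.eigh(composite)
    whitener = evecs @ np.diag(evals ** -0.5) @ evecs.T   # composite^(-1/2)
    d, u = np.linalg.eigh(whitener @ cov_a @ whitener.T)  # ascending eigenvalues
    w = u.T @ whitener            # rows = spatial filters, sorted by eigenvalue
    # Keep filters from both ends of the spectrum (extreme variance ratios)
    idx = np.concatenate([np.arange(n_filters // 2),
                          np.arange(len(d) - n_filters // 2, len(d))])
    return w[idx]

# Toy covariances: class a is strong on channel 0, class b on channel 1
f = csp_filters(np.diag([4.0, 1.0, 1.0]), np.diag([1.0, 4.0, 1.0]))
# f[0] concentrates on the class-b-dominant channel, f[1] on the class-a one
```

Log-variances of the filtered signals then serve as features for the classifier; TKCSP replaces the covariance estimates with distribution-matched, kernelized versions.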


Author(s):  
Y. Hamrouni ◽  
É. Paillassa ◽  
V. Chéret ◽  
C. Monteil ◽  
D. Sheeren

Abstract. The current context of availability of Earth Observation satellite data at high spatial and temporal resolutions makes it possible to map large areas. Although supervised classification is the most widely adopted approach, its performance is highly dependent on the availability and quality of training data. However, gathering samples from field surveys or through photo interpretation is often expensive and time-consuming, especially when the area to be classified is large. In this paper we propose an active learning-based technique to address this issue by reducing the labelling effort required for supervised classification while increasing the generalisation capabilities of the classifier across space. Experiments were conducted to identify poplar plantations in three different sites in France using Sentinel-2 time series. In order to characterise the age of the identified poplar stands, temporal means of Sentinel-1 backscatter coefficients were computed. The results are promising and show the good capacity of the active learning-based approach to achieve performance similar (poplar F-score ≥ 90%) to traditional passive learning (i.e., with random selection of samples) with up to 50% fewer training samples. Sentinel-1 annual means have demonstrated their potential to differentiate two stand ages with an overall accuracy of 83%, regardless of the cultivar considered.
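
The pool-based query loop underlying this kind of active learning can be sketched as follows (an illustrative numpy snippet with a toy nearest-centroid classifier standing in for the paper's actual classifier; all names are ours): train on the labeled set, score the unlabeled pool by uncertainty, query the most uncertain sample, and repeat.

```python
import numpy as np

def nearest_centroid_fit(X, y):
    """Tiny stand-in classifier: one centroid per class."""
    classes = np.unique(y)
    return classes, np.array([X[y == c].mean(axis=0) for c in classes])

def predict(model, X):
    classes, centroids = model
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(d, axis=1)]

def margin_scores(model, X):
    """Margin between the two nearest centroids; small margin = uncertain."""
    _, centroids = model
    d = np.sort(np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2), axis=1)
    return d[:, 1] - d[:, 0]

def active_learning(X_pool, y_pool, X_seed, y_seed, n_queries):
    """Pool-based loop: query the most uncertain sample, label it, retrain."""
    X_lab, y_lab = X_seed.copy(), y_seed.copy()
    pool_idx = list(range(len(X_pool)))
    for _ in range(n_queries):
        model = nearest_centroid_fit(X_lab, y_lab)
        scores = margin_scores(model, X_pool[pool_idx])
        q = pool_idx.pop(int(np.argmin(scores)))     # query most uncertain
        X_lab = np.vstack([X_lab, X_pool[q:q + 1]])  # oracle provides the label
        y_lab = np.append(y_lab, y_pool[q])
    return nearest_centroid_fit(X_lab, y_lab)

# Two synthetic clusters standing in for labeled satellite pixels
rng = np.random.default_rng(0)
X_pool = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(3.0, 0.5, (50, 2))])
y_pool = np.repeat([0, 1], 50)
X_seed, y_seed = X_pool[[0, 50]], y_pool[[0, 50]]
model = active_learning(X_pool, y_pool, X_seed, y_seed, n_queries=10)
acc = (predict(model, X_pool) == y_pool).mean()
```

Passive learning replaces the `argmin(scores)` query with a random draw from the pool; the paper's comparison is between these two selection rules.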


2020 ◽  
Vol 10 (18) ◽  
pp. 6155
Author(s):  
Byung Ok Kang ◽  
Hyeong Bae Jeon ◽  
Jeon Gue Park

We propose two approaches to handling speech recognition for task domains with sparse matched training data. The first is an active learning method that selects training data for the target domain from another general domain that already has a significant amount of labeled speech data. This method uses attribute-disentangled latent variables. For the active learning process, we designed an integrated system consisting of a variational autoencoder, whose encoder infers latent variables with disentangled attributes from the input speech, and a classifier that selects training data with attributes matching the target domain. The second method combines data augmentation for generating matched target-domain speech data with transfer learning based on teacher/student learning. To evaluate the proposed methods, we experimented with various task domains with sparse matched training data. The experimental results show that the proposed method has qualitative characteristics suitable for the desired purpose, outperforms random selection, and is comparable to using an equal amount of additional target-domain data.


2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Dayu Xu ◽  
Xuyao Zhang ◽  
Junguo Hu ◽  
Jiahao Chen

This paper mainly discusses the hybrid application of ensemble learning, classification, and feature selection (FS) algorithms, combined with training data balancing, to help the proposed credit scoring model perform more effectively. The approach comprises four major stages. First, the collected credit data are preprocessed. Second, an efficient feature selection algorithm based on the adaptive elastic net is employed to remove weakly related or uncorrelated variables and obtain high-quality training data. Third, a novel ensemble strategy is proposed to balance the imbalanced training data set for each extreme learning machine (ELM) classifier. Finally, a new weighting method for the single ELM classifiers in the ensemble model is established with respect to their classification accuracy, based on generalized fuzzy soft sets (GFSS) theory. A novel cosine-based distance measurement algorithm for GFSS is also proposed to calculate the weight of each ELM classifier. To confirm the efficiency of the proposed ensemble credit scoring model, we conducted comparative experiments with real-world credit data sets. The analysis, outcomes, and statistical tests show that the proposed model improves classification effectiveness in terms of average accuracy, area under the curve (AUC), H-measure, and Brier's score compared to all other single classifiers and ensemble approaches.
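
The final weighting stage can be illustrated with a deliberately simplified sketch (Python; a plain accuracy-proportional weighting stands in for the paper's GFSS cosine-distance scheme, and all names are ours): each ensemble member votes with a weight derived from its validation accuracy.

```python
import numpy as np

def accuracy_weights(val_preds, y_val):
    """Normalized validation-accuracy weights — a simplified stand-in for
    the GFSS-based weighting described in the paper."""
    acc = np.array([(p == y_val).mean() for p in val_preds])
    return acc / acc.sum()

def weighted_vote(test_preds, weights, n_classes):
    """Weighted majority vote over predicted class labels."""
    votes = np.zeros((len(test_preds[0]), n_classes))
    for p, w in zip(test_preds, weights):
        votes[np.arange(len(p)), p] += w
    return np.argmax(votes, axis=1)

# Three classifiers evaluated on a 4-sample validation set
y_val = np.array([0, 1, 1, 0])
val_preds = [np.array([0, 1, 1, 0]),   # accuracy 1.00
             np.array([0, 1, 0, 0]),   # accuracy 0.75
             np.array([1, 0, 0, 1])]   # accuracy 0.00 -> weight 0
w = accuracy_weights(val_preds, y_val)
test_preds = [np.array([1, 0]), np.array([1, 0]), np.array([0, 1])]
final = weighted_vote(test_preds, w, n_classes=2)
```

The always-wrong third classifier gets zero weight, so the two reliable members decide both test samples.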


2015 ◽  
Vol 2015 ◽  
pp. 1-17 ◽  
Author(s):  
Euler Guimarães Horta ◽  
Cristiano Leite de Castro ◽  
Antônio Pádua Braga

Big Data problems demand data models with the ability to handle time-varying, massive, and high-dimensional data. In this context, Active Learning emerges as an attractive technique for the development of high-performance models using few data. The importance of Active Learning for Big Data becomes more evident when labeling cost is high and data are presented to the learner via data streams. This paper presents a novel Active Learning method based on Extreme Learning Machines (ELMs) and Hebbian Learning. Linearization of the input data by a large ELM hidden layer makes our method relatively insensitive to parameter settings. Overfitting is inherently controlled via the Hebbian Learning crosstalk term. We also demonstrate that a simple convergence test can be used as an effective labeling criterion, since it indicates the number of labels necessary for learning. The proposed method has inherent properties that make it highly attractive for handling Big Data: incremental learning via data streams, elimination of redundant patterns, and learning from a reduced informative training set. Experimental results have shown that our method is competitive with some large-margin Active Learning strategies and also with a linear SVM.
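
The ELM that several entries on this page rely on admits a compact sketch (illustrative Python, not the authors' code): the hidden-layer weights are random and never trained, and only the output weights are solved in closed form by least squares.

```python
import numpy as np

def elm_train(X, y_onehot, n_hidden, rng):
    """Extreme learning machine: random fixed hidden layer, plus a
    closed-form least-squares solution for the output weights."""
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)               # random nonlinear feature map
    beta = np.linalg.pinv(H) @ y_onehot  # Moore-Penrose least-squares fit
    return W, b, beta

def elm_predict(model, X):
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

# XOR: not linearly separable, but easily fit through the random hidden layer
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 1, 1, 0])
model = elm_train(X, np.eye(2)[y], n_hidden=20, rng=np.random.default_rng(0))
```

Because training reduces to one pseudo-inverse, retraining after each newly labeled batch is cheap, which is what makes ELMs attractive inside active learning loops.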


2020 ◽  
Vol 1 (3) ◽  
pp. 319-332
Author(s):  
Koga Kobayashi ◽  
Kei Wakabayashi

Active learning is a promising approach to alleviating the expensive annotation cost of creating training data for named entity recognition (NER) tasks. However, since existing active learning methods for NER implicitly assume a full annotation scheme, in which the unit of an annotation request is the whole sentence, the efficiency of data instance selection is limited. In this paper, we propose a new active learning method based on a partial annotation scheme, which selects parts of sentences to be annotated and asks human annotators to label only those specific parts of the target sentences. In the experiment, we show that the partial annotation scheme trains the proposed pointwise prediction model more quickly than the existing active learning methods for NER tasks.
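
The contrast with full-sentence selection can be made concrete with a small sketch (illustrative Python, not the authors' method; names are ours): instead of ranking whole sentences, rank individual tokens by model confidence and request labels only for the least confident ones.

```python
import numpy as np

def select_tokens(token_probs, budget):
    """Pick the `budget` least-confident tokens across all sentences,
    returning (sentence_index, token_index) pairs — the unit of the
    annotation request is a token, not a full sentence."""
    flat = [(1.0 - probs.max(), s, t)
            for s, sent in enumerate(token_probs)
            for t, probs in enumerate(sent)]
    flat.sort(reverse=True)                  # most uncertain first
    return [(s, t) for _, s, t in flat[:budget]]

# Two sentences; per-token predicted probabilities over 3 entity labels
token_probs = [
    [np.array([0.90, 0.05, 0.05]), np.array([0.40, 0.35, 0.25])],
    [np.array([0.34, 0.33, 0.33])],
]
queries = select_tokens(token_probs, budget=2)
```

Here the confidently predicted first token of sentence 0 is never sent to the annotator, which is exactly the saving the partial annotation scheme targets.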

