Open Set Audio Classification Using Autoencoders Trained on Few Data

Sensors, 2020, Vol 20 (13), pp. 3741
Author(s): Javier Naranjo-Alcazar, Sergi Perez-Castanos, Pedro Zuccarello, Fabio Antonacci, Maximo Cobos

Open-set recognition (OSR) is a challenging machine learning problem that appears when classifiers are faced with test instances from classes not seen during training. It can be summarized as the problem of correctly identifying instances from a known class (seen during training) while rejecting any unknown or unwanted samples (those belonging to unseen classes). Another problem arising in practical scenarios is few-shot learning (FSL), which arises when a large number of positive samples is not available for training a recognition system. Taking these two limitations into account, a new dataset for OSR and FSL for audio data was recently released to promote research on solutions aimed at addressing both limitations. This paper proposes an audio OSR/FSL system divided into three steps: a high-level audio representation, feature embedding using two different autoencoder architectures, and a multi-layer perceptron (MLP) trained on latent space representations to detect known classes and reject unwanted ones. An extensive set of experiments is carried out considering multiple combinations of openness factors (OSR condition) and number of shots (FSL condition), showing the validity of the proposed approach and confirming superior performance with respect to a baseline system based on transfer learning.
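The "openness factor" mentioned in this abstract measures how many unseen classes appear at test time. The sketch below assumes the definition commonly attributed to Scheirer et al. in the open-set literature; the paper itself may use a variant, and the function and parameter names are illustrative only.

```python
import math

def openness(n_train: int, n_target: int, n_test: int) -> float:
    """Openness under the assumed definition:
    1 - sqrt(2 * n_train / (n_target + n_test)).

    n_train:  number of classes seen during training
    n_target: number of classes the system must recognize
    n_test:   number of classes present at test time (known + unknown)
    """
    if min(n_train, n_target, n_test) <= 0:
        raise ValueError("class counts must be positive")
    return 1.0 - math.sqrt(2.0 * n_train / (n_target + n_test))

# A closed-set problem (all three counts equal) has zero openness.
print(openness(10, 10, 10))
# Adding unseen classes at test time increases the openness value.
print(round(openness(5, 5, 15), 3))
```

Under this definition a value of 0 corresponds to the closed-set case, and the value grows toward 1 as the share of unseen test classes increases.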

Author(s): Sudipto Mukherjee, Himanshu Asnani, Eugene Lin, Sreeram Kannan

Generative Adversarial Networks (GANs) have achieved remarkable success in many unsupervised learning tasks, and clustering is unarguably an important unsupervised learning problem. While one can potentially exploit the latent-space back-projection in GANs to cluster, we demonstrate that the cluster structure is not retained in the GAN latent space. In this paper, we propose ClusterGAN as a new mechanism for clustering using GANs. By sampling latent variables from a mixture of one-hot encoded variables and continuous latent variables, coupled with an inverse network (which projects the data to the latent space) trained jointly with a clustering-specific loss, we are able to achieve clustering in the latent space. Our results show a remarkable phenomenon: GANs can preserve latent space interpolation across categories, even though the discriminator is never exposed to such vectors. We compare our results with various clustering baselines and demonstrate superior performance on both synthetic and real datasets.
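The latent sampling described in this abstract, a continuous part joined with a one-hot part, can be sketched as follows. This is only an illustration: the Gaussian prior for the continuous part, the default `sigma` value, and all names are assumptions, not details taken from the abstract.

```python
import random

def sample_latent(n_continuous, n_clusters, sigma=0.1, rng=None):
    """Draw one latent vector: continuous noise followed by a one-hot code.

    sigma and the Gaussian prior are assumed values for this sketch.
    Returns the vector and the index of the sampled cluster.
    """
    rng = rng or random.Random()
    z_continuous = [rng.gauss(0.0, sigma) for _ in range(n_continuous)]
    cluster = rng.randrange(n_clusters)
    z_one_hot = [1.0 if i == cluster else 0.0 for i in range(n_clusters)]
    return z_continuous + z_one_hot, cluster

z, k = sample_latent(n_continuous=4, n_clusters=3, rng=random.Random(0))
print(len(z), k)
```

The one-hot block makes the cluster identity explicit in the latent vector, which is what lets an inverse network recover a cluster assignment from a data point.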


2021, Vol 18 (5), pp. 6620-6637
Author(s): Yan Tang, Zhijin Zhao, Chun Li, Xueyi Ye, ...

Because existing Closed Set Recognition (CSR) methods mistakenly identify unknown jamming signals as a known class, a Conditional Gaussian Encoder (CG-Encoder) for 1-dimensional signal Open Set Recognition (OSR) is designed. The network retains the original form of the signal as much as possible, and a deep neural network is used to extract useful information. CG-Encoder adopts a residual network structure, and a new Kullback-Leibler (KL) divergence is defined. In the training phase, the known classes are approximated by different Gaussian distributions in the latent space, and the discrimination between classes is increased to improve the recognition performance on the known classes. In the testing phase, a specific and effective OSR algorithm flow is designed. Simulation experiments are carried out on 9 jamming types. The results show that the CSR and OSR performance of CG-Encoder is better than that of the other three network structures. When the openness is at its maximum, the open set average accuracy of CG-Encoder is more than 70%, which is about 30% higher than the worst algorithm and about 20% higher than the better one. When the openness is at its minimum, the average accuracy of OSR is more than 95%.
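The abstract does not give the form of its new KL divergence, so the sketch below shows only the standard closed-form KL divergence between a diagonal Gaussian and a unit-variance Gaussian centred on a class mean. It is meant to illustrate the general idea of pulling each known class toward its own Gaussian in the latent space; it is not the paper's definition, and the names are hypothetical.

```python
import math

def kl_to_class_gaussian(mu, var, class_mean):
    """KL( N(mu, diag(var)) || N(class_mean, I) ), summed over dimensions.

    Standard closed form per dimension:
    0.5 * (var + (mu - class_mean)^2 - 1 - ln(var)).
    """
    return sum(
        0.5 * (v + (m - c) ** 2 - 1.0 - math.log(v))
        for m, v, c in zip(mu, var, class_mean)
    )

# The divergence is zero when the encoding matches the class Gaussian exactly.
print(kl_to_class_gaussian([0.0, 0.0], [1.0, 1.0], [0.0, 0.0]))
# It grows as the encoded mean moves away from the class mean.
print(kl_to_class_gaussian([1.0, 0.0], [1.0, 1.0], [0.0, 0.0]))
```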


2021, Vol 11 (2), pp. 23
Author(s): Duy-Anh Nguyen, Xuan-Tu Tran, Francesca Iacopi

Deep Learning (DL) has contributed to the success of many applications in recent years. The applications range from simple ones such as recognizing tiny images or simple speech patterns to ones with a high level of complexity such as playing the game of Go. However, this superior performance comes at a high computational cost, which made porting DL applications to conventional hardware platforms a challenging task. Many approaches have been investigated, and Spiking Neural Network (SNN) is one of the promising candidates. SNN is the third generation of Artificial Neural Networks (ANNs), where each neuron in the network uses discrete spikes to communicate in an event-based manner. SNNs have the potential advantage of achieving better energy efficiency than their ANN counterparts. While generally there will be a loss of accuracy on SNN models, new algorithms have helped to close the accuracy gap. For hardware implementations, SNNs have attracted much attention in the neuromorphic hardware research community. In this work, we review the basic background of SNNs, the current state and challenges of the training algorithms for SNNs and the current implementations of SNNs on various hardware platforms.
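To make the idea of spike-based, event-driven communication concrete, here is a minimal sketch of a leaky integrate-and-fire neuron in discrete time. The neuron model, the `threshold` and `decay` values, and the function name are assumptions chosen for illustration; the review covers many other neuron models and training schemes.

```python
def lif_spike_train(inputs, threshold=1.0, decay=0.9):
    """Simulate one leaky integrate-and-fire neuron over discrete time steps.

    The membrane potential leaks by `decay` each step, accumulates the
    input current, and emits a spike (1) and resets to zero once it
    reaches `threshold`.
    """
    potential = 0.0
    spikes = []
    for current in inputs:
        potential = decay * potential + current
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0  # reset after firing
        else:
            spikes.append(0)
    return spikes

print(lif_spike_train([0.6, 0.6, 0.6]))
```

Downstream neurons only need to do work when a spike arrives, which is the source of the potential energy savings the abstract mentions.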


Author(s): Kristin Krahl, Mark W. Scerbo

The present study examined team performance on an adaptive pursuit tracking task with human-human and human-computer teams. The participants were randomly assigned to one of three team conditions where their partner was either a computer novice, computer expert, or human. Participants began the experiment with control over either the horizontal or vertical axis, but had the option of taking control of their teammate's axis if they achieved superior performance on the previous trial. A control condition was also run where a single participant controlled both axes. Performance was assessed by RMSE scores over 100 trials. The results showed that performance along the horizontal axis improved over the session regardless of the experimental condition, but the degree of improvement was dependent upon group assignment. Individuals working alone or paired with an expert computer maintained a high level of performance throughout the experiment. Those paired with a computer-novice or another human performed poorly initially, but eventually reached the level of those in the other conditions. The results showed that team training can be as effective as individual training, but that the quality of training is moderated by the skill level of one's teammate. Moreover, these findings suggest that task partitioning of high performance skills between a human and a computer is not only possible but may be considered a viable option in the design of adaptive systems.
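The RMSE score used to assess tracking performance can be computed as in the sketch below. The sample positions are made up for illustration and are not data from the study.

```python
import math

def rmse(target, actual):
    """Root-mean-square error between two equally long sequences."""
    if len(target) != len(actual) or not target:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(t - a) ** 2 for t, a in zip(target, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical target and cursor positions along one axis for a single trial.
target_positions = [0.0, 1.0, 2.0, 3.0]
cursor_positions = [0.5, 1.5, 1.5, 2.5]
print(rmse(target_positions, cursor_positions))
```

A lower RMSE means the cursor stayed closer to the target over the trial.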


Author(s): Pooja R Moolchandani, Anirban Mazumdar, Aaron Young

In this study, we developed an offline, hierarchical intent recognition system for inferring the timing and direction of motion intent of a human operator when operating in an unstructured environment. There has been an increasing demand for robot agents to assist in these dynamic, rapid motions that are constantly evolving and require quick, accurate estimation of a user's direction of travel. An experiment was conducted in a motion capture space with six subjects performing threat evasion in 8 directions, and their mechanical and neuromuscular signals were recorded for use in our intent recognition system (XGBoost). Compared against current analytical methods, our system demonstrated superior performance, with direction of travel estimation occurring 140 ms earlier in the movement and an 11.6-degree reduction in error. The results showed that we could even predict movement start 100 ms prior to the actual start, thus allowing any physical systems to start up. Our direction estimation had an optimal performance of 8.8 degrees, or 2.4% of the 360-degree range of travel, using 3-axis kinetic data. The performance of other sensors and their combinations indicates that there are additional possibilities for obtaining low estimation error. These findings are promising as they can be used to inform the design of a wearable robot aimed at assisting users in dynamic motions while in environments with oncoming threats.
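Direction errors such as the 8.8 degrees quoted in this abstract have to account for wraparound at 360 degrees. The sketch below shows one common way to compute such an error and to express it as a share of the full circle; the function names are illustrative, and the abstract does not say how the authors computed their metric.

```python
def angular_error(predicted_deg, true_deg):
    """Smallest absolute difference between two headings, in degrees."""
    return abs((predicted_deg - true_deg + 180.0) % 360.0 - 180.0)

def percent_of_circle(error_deg):
    """Express an angular error as a percentage of the 360-degree range."""
    return 100.0 * error_deg / 360.0

# Headings of 350 and 10 degrees are 20 degrees apart, not 340.
print(angular_error(350.0, 10.0))
# The reported 8.8-degree error as a share of the full range of travel.
print(round(percent_of_circle(8.8), 1))
```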


2021
Author(s): Shuyuan Xu, Linsen Li, Hangjun Yang, Junhua Tang

2020, Vol 10 (21), pp. 7619
Author(s): Jucheol Moon, Nhat Anh Le, Nelson Hebert Minaya, Sang-Il Choi

A person’s gait is a behavioral trait that is uniquely associated with each individual and can be used to recognize the person. As information about the human gait can be captured by wearable devices, a few studies have proposed methods to process gait information for identification purposes. Despite recent advances in gait recognition, the open set gait recognition problem presents challenges to current approaches. To address the open set gait recognition problem, a system should be able to deal with unseen subjects who are not included in the training dataset. In this paper, we propose a system that learns a mapping from a multimodal time series collected using an insole to a latent (embedding vector) space to address the open set gait recognition problem. The distance between two embedding vectors in the latent space corresponds to the similarity between two multimodal time series. Using the characteristics of the human gait pattern, multimodal time series are sliced into unit steps. The system maps unit steps to embedding vectors using an ensemble consisting of a convolutional neural network and a recurrent neural network. To recognize each individual, the system learns a decision function using a one-class support vector machine from a few embedding vectors of the person in the latent space; the system then determines whether an unknown unit step is recognized as belonging to a known individual. Our experiments demonstrate that the proposed framework recognizes individuals with high accuracy regardless of whether they have been registered. If all people were wearing the insole, the framework could be used widely for user verification.
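The accept-or-reject decision in the embedding space can be illustrated with a deliberately simplified stand-in. The paper uses a one-class support vector machine; the sketch below instead accepts a query embedding when it lies within a fixed radius of the centroid of a person's enrolled embeddings. The radius, the sample vectors, and all names are hypothetical.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equally long embedding vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def accepts(enrolled, query, radius):
    """Simplified stand-in for a one-class decision function:
    accept the query if it is within `radius` of the enrolled centroid."""
    return math.dist(centroid(enrolled), query) <= radius

# A few enrolled unit-step embeddings for one (hypothetical) person.
enrolled_steps = [[0.0, 0.0], [2.0, 0.0]]
print(accepts(enrolled_steps, [1.0, 0.5], radius=1.0))  # close to the centroid
print(accepts(enrolled_steps, [5.0, 5.0], radius=1.0))  # far from the centroid
```

A one-class SVM learns a more flexible boundary than a single sphere, but the decision it produces has the same shape: a query step either falls inside the region learned for a known person or is rejected as unknown.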

