An Innovative Multi-Model Neural Network Approach for Feature Selection in Emotion Recognition Using Deep Feature Clustering

Sensors ◽  
2020 ◽  
Vol 20 (13) ◽  
pp. 3765 ◽  
Author(s):  
Muhammad Adeel Asghar ◽  
Muhammad Jamil Khan ◽  
Muhammad Rizwan ◽  
Raja Majid Mehmood ◽  
Sun-Hee Kim

Emotional awareness perception is a rapidly growing field that allows for more natural interactions between people and machines. Electroencephalography (EEG) has emerged as a convenient way to measure and track a user’s emotional state. The non-linear characteristic of the EEG signal produces a high-dimensional feature vector, resulting in high computational cost. In this paper, characteristics of multiple neural networks are combined using Deep Feature Clustering (DFC) to select high-quality attributes, as opposed to traditional feature selection methods. The DFC method shortens the training time of the network by omitting unusable attributes. First, Empirical Mode Decomposition (EMD) is applied to decompose the raw EEG signal into a series of frequency components. The spatiotemporal component of the decomposed EEG signal is expressed as a two-dimensional spectrogram before the feature extraction process using Analytic Wavelet Transform (AWT). Four pre-trained Deep Neural Networks (DNN) are used to extract deep features. Dimensional reduction and feature selection are achieved utilising the differential entropy-based EEG channel selection and the DFC technique, which calculates a range of vocabularies using k-means clustering. The histogram characteristic is then determined from a series of visual vocabulary items. The classification performance on the SEED, DEAP and MAHNOB datasets, combined with the capabilities of DFC, shows that the proposed method improves the performance of emotion recognition in short processing time and is more competitive than the latest emotion recognition methods.
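
The vocabulary and histogram step described above can be illustrated with a minimal sketch. It is not the authors' DFC implementation: the random feature matrix stands in for deep features from the pre-trained networks, and the function names, the vocabulary size of 8 and the plain Lloyd's k-means are assumptions chosen for illustration.

```python
import numpy as np

def kmeans(features, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct feature vectors (fancy indexing copies).
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every vector to every centroid.
        dists = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids, labels

def vocabulary_histogram(sample_features, centroids):
    """Normalised count of how many vectors map to each vocabulary word."""
    dists = ((sample_features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)
    counts = np.bincount(words, minlength=len(centroids)).astype(float)
    return counts / counts.sum()

# Hypothetical data: 200 deep-feature vectors of dimension 16.
rng = np.random.default_rng(0)
deep_features = rng.normal(size=(200, 16))

centroids, _ = kmeans(deep_features, k=8)          # the "vocabulary"
hist = vocabulary_histogram(deep_features[:25], centroids)
print(hist.shape)
```

Each sample is then represented by a fixed-length histogram whose length equals the vocabulary size rather than the full deep-feature dimension, which is the kind of reduction the abstract attributes to DFC.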

Biomimetics ◽  
2019 ◽  
Vol 5 (1) ◽  
pp. 1 ◽  
Author(s):  
Michelle Gutiérrez-Muñoz ◽  
Astryd González-Salazar ◽  
Marvin Coto-Jiménez

Speech signals are degraded in real-life environments, as a product of background noise or other factors. The processing of such signals for voice recognition and voice analysis systems presents important challenges. One of the conditions that make adverse quality difficult to handle in those systems is reverberation, produced by sound wave reflections that travel from the source to the microphone in multiple directions. To enhance signals in such adverse conditions, several deep learning-based methods have been proposed and proven to be effective. Recently, recurrent neural networks, especially those with long short-term memory (LSTM), have presented surprising results in tasks related to time-dependent processing of signals, such as speech. One of the most challenging aspects of LSTM networks is the high computational cost of the training procedure, which has limited extended experimentation in several cases. In this work, we present a proposal to evaluate the hybrid models of neural networks to learn different reverberation conditions without any previous information. The results show that some combinations of LSTM and perceptron layers produce good results in comparison to those from pure LSTM networks, given a fixed number of layers. The evaluation was made based on quality measurements of the signal’s spectrum, the training time of the networks, and statistical validation of results. In total, 120 artificial neural networks of eight different types were trained and compared. The results help to affirm the fact that hybrid networks represent an important solution for speech signal enhancement, given that reduction in training time is on the order of 30%, in processes that can normally take several days or weeks, depending on the amount of data. The results also present advantages in efficiency, but without a significant drop in quality.
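
One way to see why replacing LSTM layers with perceptron layers can lower training cost is to compare trainable parameter counts. The sketch below uses the standard formulas for an LSTM layer without peepholes and for a fully connected layer. The layer widths and the three-layer stacks are hypothetical and are not the architectures evaluated in the paper.

```python
def lstm_params(n_in, n_units):
    # Four gate/candidate blocks (input, forget, cell, output), each with
    # input weights, recurrent weights and a bias vector.
    return 4 * n_units * (n_in + n_units + 1)

def dense_params(n_in, n_units):
    # Weight matrix plus bias vector.
    return n_units * (n_in + 1)

def count_params(layer_types, n_in, n_units):
    """Total parameters of a stack where every layer has n_units outputs."""
    total, width = 0, n_in
    for kind in layer_types:
        if kind == "lstm":
            total += lstm_params(width, n_units)
        else:
            total += dense_params(width, n_units)
        width = n_units
    return total

# Hypothetical sizes: 40 input features, 100 units per layer.
pure = count_params(["lstm", "lstm", "lstm"], n_in=40, n_units=100)
hybrid = count_params(["lstm", "dense", "lstm"], n_in=40, n_units=100)
print(pure, hybrid)  # 217200 146900
```

Parameter count is only a proxy for training time, which also depends on sequence length and hardware, so the figures here do not reproduce the reduction of about 30% reported in the abstract.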


Author(s):  
Michelle Gutiérrez-Muñoz ◽  
Astryd González-Salazar ◽  
Marvin Coto-Jiménez

Speech signals are degraded in real-life environments, as a product of background noise or other factors. The processing of such signals for voice recognition and voice analysis systems presents important challenges. One of the conditions that make adverse quality difficult to handle in those systems is reverberation, produced by sound wave reflections that travel from the source to the microphone in multiple directions. To enhance signals in such adverse conditions, several deep learning-based methods have been proposed and proven to be effective. Recently, recurrent neural networks, especially those with long short-term memory (LSTM), have presented surprising results in tasks related to time-dependent processing of signals, such as speech. One of the most challenging aspects of LSTM networks is the high computational cost of the training procedure, which has limited extended experimentation in several cases. In this work, we present a proposal to evaluate the hybrid models of neural networks to learn different reverberation conditions without any previous information. The results show that some combinations of LSTM and perceptron layers produce good results in comparison to those from pure LSTM networks, given a fixed number of layers. The evaluation has been made based on quality measurements of the signal's spectrum, training time of the networks, and statistical validation of results. The results help to affirm the fact that hybrid networks represent an important solution for speech signal enhancement, with advantages in efficiency, but without a significant drop in quality.


2021 ◽  
Author(s):  
Hugo Mitre-Hernandez ◽  
Rodolfo Ferro-Perez ◽  
Francisco Gonzalez-Hernandez

BACKGROUND Mental health effects during the COVID-19 quarantine need to be addressed because patients, relatives, and healthcare workers are living with negative emotional behaviors. The clinical disorders of depression and anxiety evoke anger, fear, sadness, and disgust, and reduce happiness. Therefore, tracking emotions with the help of psychologists in online consultations (to reduce the risk of contagion) will go a long way in assisting with mental health. Human micro-expressions can reveal people's genuine emotions and can be captured by Deep Neural Network (DNN) models. The challenge is to deploy such models on the low-performance computers and slow internet connections available to part of society. OBJECTIVE This study aimed to create a useful and usable web application that records emotions in a patient’s card in real time, with a small data transfer and a Convolutional Neural Network (CNN) model with a low computational cost. METHODS To validate the low computational cost premise, we first compare the results of DNN architectures, collecting the floating-point operations per second (FLOPS), the Number of Parameters (NP), and the accuracy of MobileNet, PeleeNet, Extended Deep Neural Network (EDNN), Inception-Based Deep Neural Network (IDNN), and our proposed Residual mobile-based Network (ResmoNet) model. Second, we compare the trained models' results in terms of Main Memory Utilization (MMU) and Response Time to complete the Emotion recognition (RTE). Finally, we design a data transfer that includes the raw emotion data and the patient's basic text information. The web application was evaluated with the System Usability Scale (SUS) and a utility questionnaire by psychologists and psychiatrists (experts). RESULTS All CNN models were trained and tested for 150 epochs, and the result for each variable in ResmoNet was compared with that of the best model. ResmoNet has 115,976 fewer NP than MobileNet, 243,901 fewer FLOPS than MobileNet, and 5% lower accuracy than EDNN (95%). Moreover, ResmoNet used less MMU than any other model, and only EDNN outperformed ResmoNet on RTE, by 0.01 seconds. With our model, we developed a web application to collect emotions in real time during a psychological consultation. For data transfer, the patient’s card and raw emotional data take approximately 2 kb with UTF-8 encoding. Finally, according to the experts, the web application has good usability (73.8 of 100) and utility (3.94 of 5). CONCLUSIONS A usable and useful web application for psychologists and psychiatrists is presented. This tool includes an efficient and light facial emotion recognition model. Its purpose is to be a complementary tool for diagnostic processes.


Author(s):  
Chunyuan Li ◽  
Changyou Chen ◽  
Yunchen Pu ◽  
Ricardo Henao ◽  
Lawrence Carin

Learning probability distributions on the weights of neural networks has recently proven beneficial in many applications. Bayesian methods such as Stochastic Gradient Markov Chain Monte Carlo (SG-MCMC) offer an elegant framework to reason about model uncertainty in neural networks. However, these advantages usually come with a high computational cost. We propose accelerating SG-MCMC under the master-worker framework: workers asynchronously and in parallel share responsibility for gradient computations, while the master collects the final samples. To reduce communication overhead, two protocols (downpour and elastic) are developed to allow periodic interaction between the master and workers. We provide a theoretical analysis on the finite-time estimation consistency of posterior expectations, and establish connections to sample thinning. Our experiments on various neural networks demonstrate that the proposed algorithms can greatly reduce training time while achieving comparable (or better) test accuracy/log-likelihood levels, relative to traditional SG-MCMC. When applied to reinforcement learning, the approach naturally provides exploration for asynchronous policy optimization, with encouraging performance improvement.
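
For readers unfamiliar with SG-MCMC, the sketch below shows stochastic gradient Langevin dynamics (SGLD), one common SG-MCMC sampler, on a toy problem: inferring the mean of a unit-variance Gaussian. It runs a single sequential chain and keeps every tenth sample to illustrate thinning. It does not implement the master-worker protocols (downpour, elastic) from the abstract, and the model, step size and prior variance are assumptions chosen for illustration.

```python
import numpy as np

def sgld_samples(data, n_steps=5000, batch_size=32, step=1e-3,
                 prior_var=100.0, thin=10, seed=0):
    """Sample the posterior of a Gaussian mean (known unit variance) with SGLD."""
    rng = np.random.default_rng(seed)
    n = len(data)
    theta = 0.0
    kept = []
    for t in range(n_steps):
        batch = data[rng.integers(0, n, size=batch_size)]
        # Stochastic gradient of the log posterior: the prior term plus the
        # minibatch likelihood term rescaled to the full dataset size.
        grad = -theta / prior_var + (n / batch_size) * np.sum(batch - theta)
        # Langevin update: half a step along the gradient plus Gaussian
        # noise whose variance equals the step size.
        theta += 0.5 * step * grad + rng.normal(0.0, np.sqrt(step))
        if t % thin == 0:  # thinning: keep every `thin`-th sample
            kept.append(theta)
    return np.array(kept)

# Hypothetical data: 500 draws from N(2, 1).
data = np.random.default_rng(1).normal(loc=2.0, scale=1.0, size=500)
samples = sgld_samples(data)
posterior_mean_estimate = samples[50:].mean()  # drop early samples as burn-in
print(len(samples))
```

The only expensive operation in the loop is the minibatch gradient, which is the part the abstract proposes to distribute across asynchronous workers.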


2020 ◽  
Vol 12 (12) ◽  
pp. 219
Author(s):  
Pin Yang ◽  
Huiyu Zhou ◽  
Yue Zhu ◽  
Liang Liu ◽  
Lei Zhang

The emergence of a large amount of new malicious code poses a serious threat to network security, and most of it consists of derivative versions of existing malicious code. The classification of malicious code helps to analyze the evolutionary trend of malicious code families and trace the source of cybercrime. The existing methods of malware classification emphasize the depth of the neural network, which leads to long training times and large computational cost. In this work, we propose the shallow neural network-based malware classifier (SNNMAC), a malware classification model based on shallow neural networks and static analysis. Our approach bridges the gap between precise but slow methods and fast but less precise methods in existing works. For each sample, we first generate n-grams from the opcode sequences of the binary file with a decompiler. An improved n-gram algorithm based on control transfer instructions is designed to reduce the n-gram dataset. Then, the SNNMAC exploits a shallow neural network, replacing the full connection layer and softmax with the average pooling layer and hierarchical softmax, to learn from the dataset and perform classification. We perform experiments on the Microsoft malware dataset. The evaluation result shows that the SNNMAC outperforms most of the related works with 99.21% classification precision and reduces the training time by more than half when compared with the methods using DNN (Deep Neural Networks).
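
The n-gram step can be sketched as follows. The abstract does not say how control transfer instructions are used to reduce the n-gram dataset. This example assumes one plausible reading, in which the opcode sequence is cut into blocks at each control transfer instruction and n-grams never span a block boundary. The mnemonic set and the sample sequence are hypothetical.

```python
from collections import Counter

# Hypothetical set of control transfer mnemonics used as block boundaries.
CONTROL_TRANSFER = {"jmp", "je", "jne", "call", "ret"}

def opcode_ngrams(opcodes, n):
    """All contiguous n-grams of an opcode sequence, as tuples."""
    return [tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)]

def split_at_control_transfer(opcodes):
    """Cut the sequence into blocks, each ending at a control transfer."""
    blocks, current = [], []
    for op in opcodes:
        current.append(op)
        if op in CONTROL_TRANSFER:
            blocks.append(current)
            current = []
    if current:  # trailing instructions with no closing control transfer
        blocks.append(current)
    return blocks

def blockwise_ngram_counts(opcodes, n):
    """Count n-grams within blocks only, never across a block boundary."""
    counts = Counter()
    for block in split_at_control_transfer(opcodes):
        counts.update(opcode_ngrams(block, n))
    return counts

sample = ["push", "mov", "call", "mov", "add", "jne", "pop", "ret"]
plain = Counter(opcode_ngrams(sample, 2))
blockwise = blockwise_ngram_counts(sample, 2)
print(len(plain), len(blockwise))  # 7 5
```

Under this assumption the bigrams that straddle a boundary, such as `("call", "mov")`, are dropped, so the n-gram set shrinks from 7 to 5 entries for the sample sequence.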


2019 ◽  
Vol 44 (3) ◽  
pp. 303-330 ◽  
Author(s):  
Shallu Sharma ◽  
Rajesh Mehra

Convolutional neural networks (CNNs) are a contemporary technique for computer vision applications, in which pooling is an integral part of the deep CNN. Pooling provides the ability to learn invariant features and also acts as a regularizer that further reduces the problem of overfitting. Additionally, pooling techniques significantly reduce the computational cost and training time of networks, which are equally important to consider. Here, the performances of pooling strategies on different datasets are analyzed and discussed qualitatively. This study presents a detailed review of the conventional and the latest strategies, which would help in apprising readers of the upsides and downsides of each strategy. We have also identified four fundamental factors, namely network architecture, activation function, overlapping and regularization approaches, which strongly affect the performance of pooling operations. It is believed that this work would help in extending the scope of understanding the significance of CNN along with pooling regimes for solving computer vision problems.
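
As a concrete reference for the two conventional strategies, the sketch below implements max and average pooling over a single 2D feature map with NumPy. The function name and the toy 4x4 input are assumptions for illustration. Setting the stride smaller than the window size gives the overlapping variant mentioned among the factors above.

```python
import numpy as np

def pool2d(x, size=2, stride=2, mode="max"):
    """Max or average pooling over a single 2D feature map (no padding)."""
    h, w = x.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    out = np.empty((out_h, out_w), dtype=float)
    reduce = np.max if mode == "max" else np.mean
    for i in range(out_h):
        for j in range(out_w):
            window = x[i * stride:i * stride + size,
                       j * stride:j * stride + size]
            out[i, j] = reduce(window)
    return out

feature_map = np.arange(16, dtype=float).reshape(4, 4)
max_pooled = pool2d(feature_map, mode="max")         # [[5, 7], [13, 15]]
avg_pooled = pool2d(feature_map, mode="avg")         # [[2.5, 4.5], [10.5, 12.5]]
overlapping = pool2d(feature_map, size=3, stride=1)  # [[10, 11], [14, 15]]
print(max_pooled)
```

With a 2x2 window and stride 2, the 4x4 map shrinks to 2x2, so later layers process a quarter of the activations. This is the source of the cost reduction the abstract describes.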

