A Novel DBN Feature Fusion Model for Cross-Corpus Speech Emotion Recognition

2016 ◽  
Vol 2016 ◽  
pp. 1-11 ◽  
Author(s):  
Zou Cairong ◽  
Zhang Xinran ◽  
Zha Cheng ◽  
Zhao Li

Feature fusion from separate sources is a key technical difficulty in cross-corpus speech emotion recognition. The purpose of this paper is to use the emotional information hidden in the speech spectrum diagram (spectrogram) as image features, based on Deep Belief Nets (DBN) from deep learning, and then fuse them with traditional emotion features. First, based on spectrogram analysis with the STB/Itti model, new spectrogram features are extracted from the color, brightness, and orientation channels, respectively; then two alternative DBN models fuse the traditional and spectrogram features, which increases the scale of the feature subset and its ability to characterize emotion. In experiments on the ABC database and Chinese corpora, the new feature subset distinctly improves cross-corpus recognition by 8.8% compared with traditional speech emotion features. The proposed method provides a new idea for feature fusion in emotion recognition.
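The pipeline the abstract describes (spectrogram as image, image-style statistics, early fusion with traditional features) can be sketched roughly as follows. This is a minimal illustration, not the paper's method: the STB/Itti saliency model is replaced here by simple brightness and gradient-orientation statistics, and the DBN fusion stage is approximated by plain concatenation; all function names and sizes are hypothetical.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via a simple windowed STFT."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))   # (n_frames, frame_len//2 + 1)

def image_style_features(spec):
    """Brightness and orientation statistics of the spectrogram 'image'
    (a crude stand-in for the Itti-style channels in the paper)."""
    img = np.log1p(spec)
    gy, gx = np.gradient(img)                    # local orientation field
    orientation = np.arctan2(gy, gx)
    hist, _ = np.histogram(orientation, bins=8, range=(-np.pi, np.pi),
                           weights=np.hypot(gx, gy))
    brightness = np.array([img.mean(), img.std()])
    return np.concatenate([brightness, hist / (hist.sum() + 1e-12)])

def fuse(traditional_feats, spec_feats):
    """Early fusion: concatenate both views into one enlarged feature subset
    (the paper instead learns the fusion with two alternative DBN models)."""
    return np.concatenate([traditional_feats, spec_feats])

rng = np.random.default_rng(0)
sig = rng.standard_normal(16000)                 # 1 s of synthetic audio at 16 kHz
spec = spectrogram(sig)
fused = fuse(rng.standard_normal(32), image_style_features(spec))
print(fused.shape)                               # (42,)
```

The point of the sketch is the data flow: two feature views of the same utterance are brought into one vector, which a classifier can then consume.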

2020 ◽  
Vol 17 (8) ◽  
pp. 3786-3789
Author(s):  
P. Gayathri ◽  
P. Gowri Priya ◽  
L. Sravani ◽  
Sandra Johnson ◽  
Visanth Sampath

Recognition of emotions is an aspect of speech recognition that is gaining increasing attention, and the need for it is growing enormously. Although there are methods to identify emotion using machine learning techniques, we assume in this paper that calculating deltas and delta-deltas for customized features not only preserves effective emotional information but also reduces the impact of emotionally irrelevant factors, leading to fewer misclassifications. Furthermore, Speech Emotion Recognition (SER) often suffers from silent frames and emotionally irrelevant frames. Meanwhile, the attention mechanism has demonstrated exceptional performance in learning task-relevant feature representations. Inspired by this, we propose an Attention-based Convolutional Recurrent Neural Network (ACRNN) to learn discriminative features for SER, where the Mel-spectrogram with deltas and delta-deltas is used as input. Finally, experimental results show the feasibility of the proposed method, which attains state-of-the-art performance in terms of unweighted average recall.
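The delta and delta-delta computation mentioned above is a standard regression over neighboring frames; a minimal sketch is below. The Mel-spectrogram itself is replaced by random data here, and the `width=2` window is an assumption (the common HTK-style default), not a parameter taken from the paper.

```python
import numpy as np

def deltas(feats, width=2):
    """Regression-based delta features over a window of +/- `width` frames."""
    # pad at the edges so the output keeps the same number of frames
    padded = np.pad(feats, ((width, width), (0, 0)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, width + 1))
    return sum(n * (padded[width + n:len(feats) + width + n]
                    - padded[width - n:len(feats) + width - n])
               for n in range(1, width + 1)) / denom

rng = np.random.default_rng(1)
mel = rng.standard_normal((100, 40))        # stand-in Mel-spectrogram: 100 frames x 40 bands
d = deltas(mel)                             # first-order dynamics
dd = deltas(d)                              # second-order dynamics
stacked = np.stack([mel, d, dd], axis=-1)   # 3-channel input, as the abstract describes
print(stacked.shape)                        # (100, 40, 3)
```

The three channels (static, delta, delta-delta) then feed the convolutional front end of the ACRNN as an image-like tensor.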


2014 ◽  
Vol 543-547 ◽  
pp. 2192-2195 ◽  
Author(s):  
Chen Chen Huang ◽  
Wei Gong ◽  
Wen Long Fu ◽  
Dong Yu Feng

As the most important medium of communication in human life, speech carries abundant emotional information. In recent years, how to automatically recognize a speaker's emotional state from speech has attracted extensive attention from researchers in various fields. In this paper, we study a method for speech emotion recognition. We collected a total of 360 sentences from four speakers uttering emotional statements of happiness, anger, surprise, and sadness, and extracted eight emotional characteristics from these voice data. A contribution analysis method is proposed to determine the values of the emotion characteristic parameters. We also use weighted Euclidean distance template matching to identify the speech emotion, obtaining an average emotion recognition rate of more than 80%.
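The weighted Euclidean distance template matching described above can be sketched in a few lines: each emotion gets a template vector of the eight characteristics, and a sample is assigned to the nearest template under a per-feature weighting. The templates, weights, and sample values below are made-up placeholders (in the paper the weights would come from the contribution analysis), used only to show the classification rule.

```python
import numpy as np

def weighted_euclidean(x, template, weights):
    """Euclidean distance with per-feature weights."""
    return np.sqrt(np.sum(weights * (x - template) ** 2))

def classify(x, templates, weights):
    """Assign x to the emotion whose template is nearest under the weighted metric."""
    return min(templates, key=lambda e: weighted_euclidean(x, templates[e], weights))

# hypothetical 8-dimensional templates, one per emotion class in the paper
templates = {
    "happiness": np.full(8, 1.0),
    "anger":     np.full(8, 3.0),
    "surprise":  np.full(8, 5.0),
    "sadness":   np.full(8, 7.0),
}
weights = np.ones(8) / 8            # equal weights as a placeholder
sample = np.full(8, 2.8)            # a made-up feature vector to classify
print(classify(sample, templates, weights))  # anger
```

The design choice worth noting is that the weights let features that contribute more to emotional discrimination dominate the distance, which is exactly what the contribution analysis is meant to supply.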

