A Robust Method for Speech Emotion Recognition Based on Infinite Student’s t-Mixture Model

2015 ◽  
Vol 2015 ◽  
pp. 1-10
Author(s):  
Xinran Zhang ◽  
Huawei Tao ◽  
Cheng Zha ◽  
Xinzhou Xu ◽  
Li Zhao

The speech emotion classification method proposed in this paper is based on a Student’s t-mixture model with an infinite number of components (iSMM) and can directly recognize various kinds of speech emotion samples. Compared with the traditional Gaussian mixture model (GMM), a speech emotion model based on the Student’s t-mixture can effectively handle the speech sample outliers that exist in the emotion feature space. Moreover, the t-mixture model remains robust to atypical emotion test data. To address the high data complexity caused by the high-dimensional space and the problem of insufficient training samples, a global latent space is added to the emotion model. This approach allows the number of components to be infinite and forms an iSMM emotion model, which can automatically determine the best number of components with lower complexity when classifying various kinds of emotion feature data. In experiments on one spontaneous (FAU Aibo Emotion Corpus) and two acted (DES and EMO-DB) universal speech emotion databases, which have high-dimensional feature samples and diverse data distributions, the iSMM achieves better recognition performance than the compared methods. The iSMM emotion model is thus verified as a robust method that is effective and generalizes to outliers and high-dimensional emotion features.
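
The robustness argument can be illustrated with a small sketch. It is not the paper's iSMM; it only compares the log-density of a single univariate Gaussian with that of a Student's t distribution sharing the same location and scale. The function names and the choice of 3 degrees of freedom are assumptions made for illustration.

```python
import math


def gaussian_logpdf(x, mean=0.0, scale=1.0):
    """Log-density of a univariate Gaussian with the given mean and scale."""
    z = (x - mean) / scale
    return -0.5 * math.log(2.0 * math.pi * scale ** 2) - 0.5 * z * z


def student_t_logpdf(x, mean=0.0, scale=1.0, dof=3.0):
    """Log-density of a univariate Student's t (location, scale, dof).

    dof=3.0 is an illustrative assumption, not a value from the paper.
    """
    z = (x - mean) / scale
    return (
        math.lgamma((dof + 1.0) / 2.0)
        - math.lgamma(dof / 2.0)
        - 0.5 * math.log(dof * math.pi * scale ** 2)
        - 0.5 * (dof + 1.0) * math.log1p(z * z / dof)
    )


if __name__ == "__main__":
    # Compare the two log-densities at a typical point and at an outlier.
    for x in (0.0, 2.0, 10.0):
        print(x, gaussian_logpdf(x), student_t_logpdf(x))
```

Far from the center the Gaussian log-density falls off quadratically while the Student's t one falls off only logarithmically, so a single atypical sample lowers the likelihood of a t component far less than that of a Gaussian component.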

2015 ◽  
Vol 781 ◽  
pp. 551-554 ◽  
Author(s):  
Chaidiaw Thiangtham ◽  
Jakkree Srinonchat

Speech emotion recognition has been widely researched and applied in applications such as communication with robots, e-learning systems, and emergency calls. Speech emotion feature extraction is key to achieving speech emotion recognition, which can also be used to classify personal identity. Speech emotion features are extracted as several kinds of coefficients, such as Linear Predictive Coefficients (LPCs), Line Spectral Frequencies (LSF), Zero-Crossing (ZC), and Mel-Frequency Cepstral Coefficients (MFCC) [1-6]. Several research works have addressed speech emotion recognition. A study of zero-crossing with peak amplitudes in speech emotion classification is introduced in [4]. The results show that it provides a technique for extracting emotion features in the time domain, although it still suffers from amplitude shifting. Emotion recognition from speech is described in [5]. It used a Gaussian Mixture Model (GMM) as the speech feature extractor. The GMM gave good results in reducing background noise; however, random noise in the GMM still needs attention in the recognition model. Speech emotion recognition using a hidden Markov model and a support vector machine is explained in [6]. The results show that the average performance of the recognition system, according to the speech emotion features, still contains errors. Thus, the recognition performance reported in [1-6] still requires more focus on speech features.
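
As a rough illustration of the simplest time-domain feature named above, the zero-crossing rate of one frame can be sketched as follows. This is a generic definition, not the method of [4]; the function name and the convention of treating zero as non-negative are assumptions.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs in the frame whose signs differ.

    Samples equal to zero are treated as non-negative (an assumed convention).
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1
        for prev, cur in zip(frame, frame[1:])
        if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(frame) - 1)
```

Because only the sign of each sample is used, this count does not change when the frame is scaled, but adding a constant offset to the samples can change it.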


2015 ◽  
Vol 2015 ◽  
pp. 1-13 ◽  
Author(s):  
Hariharan Muthusamy ◽  
Kemal Polat ◽  
Sazali Yaacob

Recently, researchers have paid increasing attention to studying the emotional state of an individual from his/her speech signals, as speech is the fastest and most natural method of communication between individuals. In this work, a new feature enhancement method using a Gaussian mixture model (GMM) is proposed to enhance the discriminatory power of the features extracted from speech and glottal signals. Three different emotional speech databases were used to evaluate the proposed methods. Extreme learning machine (ELM) and k-nearest neighbor (kNN) classifiers were employed to classify the different types of emotions. Several experiments were conducted, and the results show that the proposed methods significantly improved speech emotion recognition performance compared with research works published in the literature.
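
For readers unfamiliar with the kNN classifier mentioned above, a minimal sketch is given here. It uses Euclidean distance and majority voting, which are common defaults and not necessarily the configuration used in the paper; the function name and the default k=3 are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among its k nearest neighbors."""
    # Pair every training point with its Euclidean distance to the query.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    # Count the labels of the k closest points and return the most frequent one.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

In a setting like the one described, each training point would be an enhanced feature vector and each label an emotion class.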


Symmetry ◽  
2020 ◽  
Vol 13 (1) ◽  
pp. 19
Author(s):  
Hsiuying Wang

The high-dimensional data recognition problem based on the Gaussian mixture model (GMM) has useful applications in many areas, such as audio signal recognition, image analysis, and biological evolution. The expectation-maximization algorithm is a popular approach to deriving the maximum likelihood estimators of the GMM. An alternative solution is to adopt a generalized Bayes estimator for parameter estimation. In this study, an estimator based on the generalized Bayes approach is established. A simulation study shows that the proposed approach performs competitively with the conventional method in high-dimensional GMM recognition. We use a musical data example to illustrate this recognition problem. Suppose that we have audio data of a piece of music and know that the music is from one of four compositions, but we do not know exactly which composition it comes from. The generalized Bayes method shows a higher average recognition rate than the conventional method. This result shows that the generalized Bayes method is a competitor to the conventional method in this real application.
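
The conventional baseline mentioned above, expectation-maximization for a GMM, can be sketched in one dimension as follows. This is not the generalized Bayes estimator proposed in the study; the function names, the unit initial variances, the equal initial weights, and the variance floor are all assumptions chosen to keep the example short.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, init_means, n_iter=100):
    """Fit a 1-D Gaussian mixture by EM, starting from the given means."""
    k = len(init_means)
    means = list(init_means)
    variances = [1.0] * k      # assumed initial variances
    weights = [1.0 / k] * k    # assumed equal initial weights
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])
        # M-step: re-estimate weights, means, and variances from the responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # floor to avoid a collapsed component
    return weights, means, variances
```

For example, calling `em_gmm_1d(samples, [-1.0, 1.0])` on samples drawn from two well-separated Gaussians should return means close to the two true centers.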


Aerospace ◽  
2021 ◽  
Vol 8 (12) ◽  
pp. 374
Author(s):  
Langfu Cui ◽  
Chaoqi Zhang ◽  
Qingzhen Zhang ◽  
Junle Wang ◽  
Yixuan Wang ◽  
...  

Anomaly detection in the aero-engine gas path faces problems such as uncertain thresholds, high-dimensional monitoring parameters, and unclear relationships between parameters. These problems make it difficult to achieve high accuracy in anomaly detection. To improve the accuracy of aero-engine gas path anomaly detection, a method based on the Markov Transition Field and LSTM is proposed in this paper. The correlation among high-dimensional QAR data is obtained using the Markov Transition Field and hierarchical clustering. Based on this correlation analysis, a multi-input, multi-output LSTM network is constructed to perform one-step rolling prediction. A Gaussian mixture model of the residuals between the predicted and true values is constructed. The three-sigma rule is then applied to detect outliers based on the Gaussian mixture model of the residuals. The experimental results show that the proposed method has high accuracy for aero-engine gas path anomaly detection.
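
The final detection step can be sketched in simplified form. The paper applies the three-sigma rule to a Gaussian mixture model of the residuals; the sketch below assumes a single Gaussian instead, so it only illustrates the rule itself. The function name is hypothetical.

```python
import statistics


def three_sigma_outliers(residuals):
    """Return the indices of residuals more than three standard deviations from the mean.

    Simplifying assumption: the residuals are modeled by one Gaussian,
    not by the Gaussian mixture used in the paper.
    """
    mean = statistics.fmean(residuals)
    std = statistics.pstdev(residuals)
    if std == 0.0:
        return []  # all residuals are identical, so nothing can be flagged
    return [i for i, r in enumerate(residuals) if abs(r - mean) > 3.0 * std]
```

Because the mean and standard deviation are estimated from the same residuals that are being tested, a large outlier inflates the threshold; estimating them from residuals of known-normal data is a common alternative.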


Author(s):  
Yan Li ◽  
Simon Williams ◽  
Bill Moran ◽  
Allison Kealy ◽  
Guenther Retscher

The extensive deployment of wireless infrastructure provides a low-cost way to track mobile users in indoor environments. This paper demonstrates a prototype of an accurate and reliable room location awareness system in a real public environment, where three typical problems arise. First, a massive number of access points (APs) can be sensed, leading to a high-dimensional classification problem. Second, heterogeneous devices record different received signal strength (RSS) levels due to variations in chipset and antenna attenuation. Third, APs are not necessarily visible in every scanning cycle, leading to missing data. This paper presents a probabilistic Wi-Fi fingerprinting method in a hidden Markov model (HMM) framework for mobile user tracking. Considering the spatial correlation of the signal strengths from multiple APs, a Multivariate Gaussian Mixture Model (MVGMM) is fitted to model the probability distribution of RSS measurements in each cell. Furthermore, the unseen property of invisible APs is investigated and shown to be effective in differentiating between cells. The proposed system is able to achieve comparable localization performance. The field test results show a reliable room-level localization accuracy of 97% for multiple mobile users in a real university campus WiFi network, without any prior knowledge of the environment.
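
The HMM tracking idea can be sketched with a forward filter over rooms. This is a heavily simplified illustration: each room emits a single RSS value from one univariate Gaussian, whereas the paper fits a multivariate Gaussian mixture per cell and also handles missing APs. All names, and any parameter values passed in, are assumptions.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def forward_filter(observations, prior, transition, emission_params):
    """Return the posterior room probabilities after each RSS observation.

    prior[i]            : initial probability of room i
    transition[i][j]    : probability of moving from room i to room j
    emission_params[j]  : (mean, std) of the RSS observed in room j (assumed model)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for rss in observations:
        # Predict: propagate the current belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update: weight each room by the likelihood of the observed RSS.
        updated = [predicted[j] * gaussian_pdf(rss, *emission_params[j]) for j in range(n)]
        total = sum(updated)
        belief = [u / total for u in updated]
        history.append(belief)
    return history
```

The transition model is what lets the filter smooth over a single noisy scan: a reading that briefly resembles another room shifts the belief only partially unless later readings agree.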

