One speaker recognition method based on feature fusion

Speaker Recognition Based on Fusion of a Deep and Shallow Recombination Gaussian Supervector

Electronics ◽

10.3390/electronics10010020 ◽

2020 ◽

Vol 10 (1) ◽

pp. 20

Author(s):

Linhui Sun ◽

Yunyi Bu ◽

Bo Zou ◽

Sheng Fu ◽

Pingan Li

Keyword(s):

Speaker Recognition ◽

Feature Fusion ◽

Recognition Rate ◽

Gaussian Mixture ◽

Recognition Method ◽

Different Types ◽

Feature Based ◽

Mel Frequency Cepstral Coefficient ◽

Fusion Feature ◽

Weight Coefficients

Extracting a speaker’s personalized feature parameters is vital for speaker recognition, but a single kind of feature cannot fully reflect the speaker’s personality information. To represent the speaker’s identity more comprehensively and improve the speaker recognition rate, we propose a speaker recognition method based on the fusion feature of a deep and shallow recombination Gaussian supervector. In this method, deep bottleneck features are first extracted by a Deep Neural Network (DNN) and used as the input of a Gaussian Mixture Model (GMM) to obtain the deep Gaussian supervector. In parallel, the Mel-Frequency Cepstral Coefficients (MFCC) are input to the GMM directly to extract the traditional Gaussian supervector. Finally, the two categories of features are combined in the form of horizontal dimension augmentation. In addition, to prevent the system recognition rate from falling sharply when the number of speakers to be recognized increases, we introduce an optimization algorithm to find the optimal weights before the feature fusion. The experimental results indicate that the speaker recognition rate based on the directly fused feature can reach 98.75%, which is 5% and 0.62% higher than the traditional feature and the deep bottleneck feature, respectively. When the number of speakers increases, the fusion feature based on optimized weight coefficients can improve the recognition rate by 0.81%. These results validate that the proposed fusion method effectively exploits the complementarity of the different types of features and improves the speaker recognition rate.
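The fusion step described in this abstract, horizontal dimension augmentation with optional weight coefficients, can be illustrated with a minimal sketch. The function name `fuse_supervectors`, the toy vectors, and the weight values are assumptions made for illustration; the abstract does not specify the optimization algorithm, so the weights here are hand-picked placeholders, not optimized values.

```python
from typing import List, Sequence


def fuse_supervectors(
    deep: Sequence[float],
    shallow: Sequence[float],
    w_deep: float = 1.0,
    w_shallow: float = 1.0,
) -> List[float]:
    """Scale each supervector by its weight, then concatenate them.

    With both weights equal to 1.0 this is plain concatenation
    ("direct" fusion); other weights give a weighted fusion.
    """
    return [w_deep * x for x in deep] + [w_shallow * x for x in shallow]


# Toy stand-ins for a deep (bottleneck-based) and a shallow (MFCC-based)
# Gaussian supervector; real supervectors are far longer.
deep_sv = [0.25, -0.5, 1.0]
mfcc_sv = [2.0, 4.0]

fused_direct = fuse_supervectors(deep_sv, mfcc_sv)
# Placeholder weights; the paper finds its weights with an optimizer.
fused_weighted = fuse_supervectors(deep_sv, mfcc_sv, w_deep=2.0, w_shallow=0.5)
```

The fused vector has the combined length of both inputs, so a downstream classifier sees both feature types side by side.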

Speaker Identity Recognition by Acoustic and Visual Data Fusion through Personal Privacy for Smart Care and Service Applications

Journal of Imaging Science and Technology ◽

10.2352/j.imagingsci.technol.2020.64.4.040404 ◽

2020 ◽

Vol 64 (4) ◽

pp. 40404-1-40404-16

Author(s):

I.-J. Ding ◽

C.-M. Ruan

Keyword(s):

Face Detection ◽

Speaker Recognition ◽

Visual Information ◽

Classification Tree ◽

Gaussian Mixture ◽

Recognition Method ◽

Indoor Space ◽

Identity Recognition ◽

Visual Identity ◽

Speaker Classification

With rapid developments in techniques related to the internet of things, smart service applications such as voice-command-based speech recognition and smart care applications such as context-aware-based emotion recognition will gain much attention and potentially be a requirement in smart home or office environments. In such intelligence applications, identity recognition of the specific member in indoor spaces will be a crucial issue. In this study, a combined audio-visual identity recognition approach was developed. In this approach, visual information obtained from face detection was incorporated into acoustic Gaussian likelihood calculations for constructing speaker classification trees to significantly enhance the Gaussian mixture model (GMM)-based speaker recognition method. This study considered the privacy of the monitored person and reduced the degree of surveillance. Moreover, the popular Kinect sensor device containing a microphone array was adopted to obtain acoustic voice data from the person. The proposed audio-visual identity recognition approach deploys only two cameras in a specific indoor space for conveniently performing face detection and quickly determining the total number of people in the specific space. Such information pertaining to the number of people in the indoor space obtained using face detection was utilized to effectively regulate the accurate GMM speaker classification tree design. Two face-detection-regulated speaker classification tree schemes are presented for the GMM speaker recognition method in this study: the binary speaker classification tree (GMM-BT) and the non-binary speaker classification tree (GMM-NBT). The proposed GMM-BT and GMM-NBT methods achieve excellent identity recognition rates of 84.28% and 83%, respectively; both values are higher than the rate of the conventional GMM approach (80.5%). Moreover, as the extremely complex calculations of face recognition in general audio-visual speaker recognition tasks are not required, the proposed approach is rapid and efficient with only a slight increment of 0.051 s in the average recognition time.

Feature Fusion Based Hand Gesture Recognition Method for Automotive Interfaces

Chinese Journal of Electronics ◽

10.1049/cje.2020.06.008 ◽

2020 ◽

Vol 29 (6) ◽

pp. 1153-1164

Author(s):

Qianyi Xu ◽

Guihe Qin ◽

Minghui Sun ◽

Jie Yan ◽

Huiming Jiang ◽

...

Keyword(s):

Gesture Recognition ◽

Feature Fusion ◽

Hand Gesture Recognition ◽

Hand Gesture ◽

Recognition Method

Target Recognition Method based on Feature Fusion

Proceedings of the 2019 2nd International Conference on Algorithms, Computing and Artificial Intelligence ◽

10.1145/3377713.3377734 ◽

2019 ◽

Author(s):

Chunqian He ◽

Dongsheng Li ◽

Siqi Wang

Keyword(s):

Feature Fusion ◽

Target Recognition ◽

Recognition Method

Music emotion recognition method based on multi feature fusion

International Journal of Arts and Technology ◽

10.1504/ijart.2021.10043883 ◽

2021 ◽

Vol 13 (4) ◽

pp. 1

Author(s):

Yali Zhang

Keyword(s):

Emotion Recognition ◽

Feature Fusion ◽

Recognition Method

A Proposed Speaker Recognition Method Based on Long-Term Voice Features and Fuzzy Logic

Engineering and Technology Journal ◽

10.30684/etj.v39i1b.343 ◽

2021 ◽

Vol 39 (1B) ◽

pp. 1-10

Author(s):

Iman H. Hadi ◽

Alia K. Abdul-Hassan

Keyword(s):

Fuzzy Logic ◽

Speaker Recognition ◽

Recognition Accuracy ◽

Inner Product ◽

Maximum Frequency ◽

Recognition Method ◽

Data Set ◽

Zero Crossing ◽

Zero Crossing Rate

Speaker recognition depends on specific predefined steps, the most important of which are feature extraction and feature matching. In addition, the category of the speaker voice features has an impact on the recognition process. The proposed speaker recognition method makes use of biometric (voice) attributes to recognize the identity of the speaker. Long-term features such as maximum frequency, pitch, and zero crossing rate (ZCR) were used. In the feature matching step, the fuzzy inner product between feature vectors was used to compute the matching value between a claimed speaker's voice utterance and test voice utterances. The experiments were implemented using the ELSDSR data set. These experiments showed that the recognition accuracy is 100% when using text-dependent speaker recognition.
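Of the long-term features named in this abstract, the zero crossing rate is the simplest to illustrate. The sketch below uses one common definition (the fraction of adjacent sample pairs whose signs differ); the abstract does not state which definition the authors use, and the function name `zero_crossing_rate` is an assumption made for illustration.

```python
from typing import Sequence


def zero_crossing_rate(samples: Sequence[float]) -> float:
    """Return the fraction of adjacent sample pairs that change sign.

    Zero is treated as non-negative, so a move from 0.0 to a negative
    value counts as a crossing.
    """
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


# A signal that flips sign at every step has the maximum rate of 1.0.
alternating = zero_crossing_rate([1.0, -1.0, 1.0, -1.0])
# A signal that never changes sign has a rate of 0.0.
steady = zero_crossing_rate([1.0, 2.0, 3.0])
```

In practice the rate is usually computed per frame or averaged over an utterance before being placed in a feature vector.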

Unmanned vehicle dynamic obstacle detection, tracking and recognition method based on laser sensor

International Journal of Intelligent Computing and Cybernetics ◽

10.1108/ijicc-10-2020-0143 ◽

2021 ◽

Vol 14 (2) ◽

pp. 239-251

Author(s):

Hualei Zhang ◽

Mohammad Asif Ikbal

Keyword(s):

Feature Fusion ◽

Obstacle Detection ◽

Geometric Features ◽

Recognition Method ◽

Content Type ◽

Detection And Tracking ◽

Spatio Temporal ◽

Dynamic Obstacle ◽

Real Vehicle ◽

Temporal Feature

Purpose: In response to these shortcomings, this paper proposes a dynamic obstacle detection and tracking method based on multi-feature fusion and a dynamic obstacle recognition method based on spatio-temporal feature vectors. Design/methodology/approach: The existing dynamic obstacle detection and tracking methods based on geometric features have a high false detection rate. The recognition methods based on the geometric features and motion status of dynamic obstacles are greatly affected by distance and scanning angle, and cannot meet the requirements of real traffic scene applications. Findings: First, in addition to the geometric features of dynamic obstacles, the echo pulse width feature is used to improve the accuracy of obstacle detection and tracking; second, a spatio-temporal feature vector is constructed from the time-dimension and space-dimension information of the obstacle, and the support vector machine method is then used to recognize dynamic obstacles, improving the accuracy of obstacle recognition. Finally, the accuracy and effectiveness of the proposed method are verified by real vehicle tests. Originality/value: The paper proposes a dynamic obstacle detection and tracking method based on multi-feature fusion and a dynamic obstacle recognition method based on spatio-temporal feature vectors. The accuracy and effectiveness of the proposed method are verified by real vehicle tests.

Accurate recognition method of plant leaves based on multi-feature fusion

10.1117/12.2611757 ◽

2021 ◽

Author(s):

Ruikai Lin ◽

Junwei Ma ◽

Huiling Yu ◽

Yizhuo Zhang

Keyword(s):

Feature Fusion ◽

Plant Leaves ◽

Recognition Method

AN EFFECTIVE COLOR FACE RECOGNITION BASED ON BEST COLOR FEATURE SELECTION ALGORITHM USING WEIGHTED FEATURES FUSION SYSTEM

INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY ◽

10.24297/ijct.v8i2.3386 ◽

2013 ◽

Vol 8 (2) ◽

pp. 787-795

Author(s):

Sasi Kumar Balasundaram ◽

J. Umadevi ◽

B. Sankara Gomathi

Keyword(s):

Feature Selection ◽

Face Recognition ◽

Feature Fusion ◽

Recognition Performance ◽

Feature Selection Method ◽

Recognition Method ◽

Color Feature ◽

Color Component ◽

Pose Variation ◽

Color Face Recognition

This paper aims to achieve the best color face recognition performance. The newly introduced feature selection method uses a novel boosting learning framework to find the optimal set of color-component features for achieving the best face recognition result. The proposed color face recognition method consists of two parts: color-component feature selection with boosting, and color face recognition using the selected color-component features. The selected features, drawn from various color models, are combined into a single concatenated color feature using weighted feature fusion. The method outperforms existing color face recognition methods on face images affected by illumination changes, pose variation, and low resolution. Its effectiveness has been evaluated on public face databases.

Nonparametric Speaker Recognition Method Using Earth Mover's Distance

IEICE Transactions on Information and Systems ◽

10.1093/ietisy/e89-d.3.1074 ◽

2006 ◽

Vol E89-D (3) ◽

pp. 1074-1081 ◽

Cited By ~ 4

Author(s):

S. KUROIWA

Keyword(s):

Speaker Recognition ◽

Earth Mover's Distance ◽

Recognition Method
