Non-negative matrix factorization based compensation of music for automatic speech recognition

Source separation is a popular problem for which English datasets are used by default. Moreover, source separation, or speech enhancement, is an important pre-processing step for downstream applications such as automatic speech recognition, automatic answering machines, and hearing aids. However, source separation experiments on Vietnamese data remain limited, and standard Vietnamese datasets for source separation are lacking. To address these issues, we build a Vietnamese dataset for source separation by collecting utterances of broadcasters from VTV’s official website. In addition, a novel method based on sparse non-negative matrix factorization and graph regularization is proposed. Experiments showed that the proposed method outperforms the baseline.

Non-negative Matrix Factorization Based Noise Reduction for Noise Robust Automatic Speech Recognition

Latent Variable Analysis and Signal Separation - Lecture Notes in Computer Science ◽

10.1007/978-3-642-28551-6_42 ◽

2012 ◽

pp. 338-346 ◽

Cited By ~ 7

Author(s):

Seon Man Kim ◽

Ji Hun Park ◽

Hong Kook Kim ◽

Sung Joo Lee ◽

Yun Keun Lee

Keyword(s):

Speech Recognition ◽

Noise Reduction ◽

Automatic Speech Recognition ◽

Matrix Factorization ◽

Noise Robust ◽

Non Negative Matrix Factorization

Multi-channel Non-negative Matrix Factorization Initialized with Full-rank and Rank-1 Spatial Correlation Matrix for Speech Recognition

2018 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS) ◽

10.1109/ispacs.2018.8923534 ◽

2018 ◽

Author(s):

Yuuki Tachioka

Keyword(s):

Speech Recognition ◽

Spatial Correlation ◽

Matrix Factorization ◽

Correlation Matrix ◽

Full Rank ◽

Non Negative Matrix Factorization

Speech enhancement using beamforming and non negative matrix factorization for robust speech recognition in the CHiME-3 challenge

2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU) ◽

10.1109/asru.2015.7404826 ◽

2015 ◽

Cited By ~ 3

Author(s):

Thanh T Vu ◽

Benjamin Bigot ◽

Eng Siong Chng

Keyword(s):

Speech Recognition ◽

Speech Enhancement ◽

Matrix Factorization ◽

Robust Speech Recognition ◽

Non Negative Matrix Factorization

Ego-Noise Suppression for Robots Based on Semi-Blind Infinite Non-Negative Matrix Factorization

Journal of Robotics and Mechatronics ◽

10.20965/jrm.2017.p0114 ◽

2017 ◽

Vol 29 (1) ◽

pp. 114-124

Author(s):

Kazuhiro Nakadai ◽

Taiki Tezuka ◽

Takami Yoshida

Keyword(s):

Speech Recognition ◽

Matrix Factorization ◽

Noise Suppression ◽

Single Channel ◽

Signal To Noise Ratio ◽

Humanoid Robots ◽

Motion Information ◽

Target Signal ◽

Non Negative Matrix Factorization

Figure: Ego-noise suppression achieves speech recognition even during motion.

This paper addresses ego-motion noise suppression for a robot. Many ego-motion noise suppression methods use motion information such as the position, velocity, and acceleration of each joint to infer ego-motion noise. However, such inferences are not reliable, since motion information and ego-motion noise are not always correlated. We propose a new framework for ego-motion noise suppression based on single-channel processing using only acoustic signals captured with a microphone. In the proposed framework, ego-motion noise features and their number are automatically estimated in advance from an ego-motion noise input using Infinite Non-negative Matrix Factorization (INMF), which is a non-parametric Bayesian model that does not use explicit motion information. After that, the proposed Semi-Blind INMF (SB-INMF) is applied to an input signal that consists of both the target and ego-motion noise signals. Ego-motion noise features, which are obtained with INMF, are used as inputs to the SB-INMF and are treated as fixed features for extracting the target signal. Finally, the target signal is extracted with SB-INMF using these newly estimated features. The proposed framework was applied to ego-motion noise suppression on two types of humanoid robots. Experimental results showed that ego-motion noise was effectively and efficiently suppressed in terms of both signal-to-noise ratio and automatic speech recognition performance compared to a conventional template-based ego-motion noise suppression method using motion information. Thus, the proposed method worked properly on a robot without a motion information interface.

* This work is an extension of our publication “Taiki Tezuka, Takami Yoshida, Kazuhiro Nakadai: Ego-motion noise suppression for robots based on Semi-Blind Infinite Non-negative Matrix Factorization, ICRA 2014, pp. 6293-6298, 2014.”
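The two-stage idea in this abstract (learn noise features from a noise-only recording, then hold them fixed while fitting the remaining features on the mixture) can be illustrated with a minimal sketch. This is not the authors' SB-INMF: it swaps in ordinary NMF with Euclidean multiplicative updates and a hand-picked number of bases, whereas INMF estimates the number of bases with a non-parametric Bayesian model. The function names `nmf` and `semi_blind_nmf`, the ranks, the synthetic data, and the soft-mask reconstruction are all assumptions made for illustration; inputs are assumed to be non-negative magnitude spectrograms.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Plain NMF, V ~= W @ H, with Euclidean multiplicative updates."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def semi_blind_nmf(V, W_noise, target_rank, n_iter=200, seed=0, eps=1e-9):
    """Fit V ~= [W_noise, W_target] @ H while keeping W_noise fixed.

    Returns a soft-masked estimate of the target spectrogram, the learned
    target bases, and the full activation matrix.
    """
    rng = np.random.default_rng(seed)
    k_n = W_noise.shape[1]
    W_t = rng.random((V.shape[0], target_rank)) + eps
    H = rng.random((k_n + target_rank, V.shape[1])) + eps
    for _ in range(n_iter):
        W = np.hstack([W_noise, W_t])           # noise bases are never updated
        H *= (W.T @ V) / (W.T @ W @ H + eps)    # update all activations
        H_t = H[k_n:]
        W_t *= (V @ H_t.T) / (W @ H @ H_t.T + eps)  # update target bases only
    W = np.hstack([W_noise, W_t])
    mask = (W_t @ H[k_n:]) / (W @ H + eps)      # target share of each bin, in [0, 1]
    return V * mask, W_t, H

# Synthetic stand-ins for magnitude spectrograms (hypothetical data).
rng = np.random.default_rng(1)
n_freq, n_frames = 64, 120
noise_bases_true = rng.random((n_freq, 3))
noise_only = noise_bases_true @ rng.random((3, n_frames))
speech = rng.random((n_freq, 4)) @ rng.random((4, n_frames))
mixture = speech + noise_bases_true @ rng.random((3, n_frames))

# Stage 1: learn noise bases from the noise-only recording.
W_noise, _ = nmf(noise_only, rank=3)
# Stage 2: separate the mixture with the noise bases held fixed.
target_est, W_target, H = semi_blind_nmf(mixture, W_noise, target_rank=4)
```

Keeping `W_noise` fixed is what makes the second stage "semi-blind": only the target bases and the activations are learned from the mixture, so no motion information is needed at separation time.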
