Non-negative matrix factorization based compensation of music for automatic speech recognition

Author(s):  
Bhiksha Raj ◽  
Tuomas Virtanen ◽  
Sourish Chaudhuri ◽  
Rita Singh
2016 ◽  
Vol 140 (4) ◽  
pp. 3450-3450
Author(s):  
Iori Miura ◽  
Yuuki Tachioka ◽  
Tomohiro Narita ◽  
Jun Ishii ◽  
Fuminori Yoshiyama ◽  
...  

Author(s):  
Tuan Pham

Source separation is a popular problem for which English datasets are used by default. Source separation, or speech enhancement, is also an important pre-processing step for downstream tasks such as automatic speech recognition, automatic answering machines, and hearing aids. However, source separation experiments on Vietnamese data remain scarce, and standard Vietnamese datasets for source separation are lacking. To address these issues, we build a Vietnamese dataset for source separation by collecting utterances of broadcasters from VTV’s official website. We also propose a novel method that combines sparse non-negative matrix factorization with graph regularization. Experiments show that the proposed method outperforms the baseline.


2017 ◽  
Vol 29 (1) ◽  
pp. 114-124
Author(s):  
Kazuhiro Nakadai ◽  
Taiki Tezuka ◽  
Takami Yoshida ◽  

[Figure: Ego-noise suppression achieves speech recognition even during motion]

This paper addresses ego-motion noise suppression for a robot. Many ego-motion noise suppression methods use motion information such as position, velocity, and the acceleration of each joint to infer ego-motion noise. However, such inferences are not reliable, since motion information and ego-motion noise are not always correlated. We propose a new framework for ego-motion noise suppression based on single channel processing using only acoustic signals captured with a microphone. In the proposed framework, ego-motion noise features and their numbers are automatically estimated in advance from an ego-motion noise input using Infinite Non-negative Matrix Factorization (INMF), which is a non-parametric Bayesian model that does not use explicit motion information. After that, the proposed Semi-Blind INMF (SB-INMF) is applied to an input signal that consists of both the target and ego-motion noise signals. Ego-motion noise features, which are obtained with INMF, are used as inputs to the SB-INMF, and are treated as the fixed features for extracting the target signal. Finally, the target signal is extracted with SB-INMF using these newly-estimated features. The proposed framework was applied to ego-motion noise suppression on two types of humanoid robots. Experimental results showed that ego-motion noise was effectively and efficiently suppressed in terms of both signal-to-noise ratio and performance of automatic speech recognition compared to a conventional template-based ego-motion noise suppression method using motion information. Thus, the proposed method worked properly on a robot without a motion information interface.

This work is an extension of our publication “Taiki Tezuka, Takami Yoshida, Kazuhiro Nakadai: Ego-motion noise suppression for robots based on Semi-Blind Infinite Non-negative Matrix Factorization, ICRA 2014, pp.6293-6298, 2014.”
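The two-stage idea in the abstract above (learn noise features from a noise-only recording, then hold them fixed while fitting extra features for the target) can be illustrated with a minimal sketch. This is not the authors' SB-INMF: it swaps in ordinary fixed-rank NMF with Euclidean multiplicative updates, so the number of noise features is chosen by hand rather than inferred by a non-parametric Bayesian model. The function names `nmf` and `suppress_noise`, the ranks, and the soft-mask reconstruction are all assumptions made for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=200, W_fixed=None, seed=0, eps=1e-9):
    """Euclidean NMF (V ~ W @ H) with multiplicative updates.

    If W_fixed is given, its columns are prepended to W and never updated;
    only the `rank` additional free columns are learned (semi-blind case).
    Returns W, H and the number of fixed columns.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    k_fixed = 0 if W_fixed is None else W_fixed.shape[1]
    W_free = rng.random((n_freq, rank)) + eps
    W = W_free if W_fixed is None else np.hstack([W_fixed, W_free])
    H = rng.random((k_fixed + rank, n_frames)) + eps
    for _ in range(n_iter):
        # Update all activations.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Update only the free (non-fixed) basis columns.
        update = (V @ H.T) / (W @ H @ H.T + eps)
        W[:, k_fixed:] *= update[:, k_fixed:]
    return W, H, k_fixed

def suppress_noise(V_mix, W_noise, target_rank=4):
    """Estimate the target magnitude spectrogram with noise bases held fixed."""
    W, H, k = nmf(V_mix, target_rank, W_fixed=W_noise)
    noise = W[:, :k] @ H[:k]
    target = W[:, k:] @ H[k:]
    # Soft mask (an assumed reconstruction choice, not taken from the paper).
    mask = target / (target + noise + 1e-9)
    return mask * V_mix
```

In use, `W_noise` would come from calling `nmf` on a magnitude spectrogram of noise recorded alone, and `suppress_noise` would then be applied to the spectrogram of the noisy input; both spectrograms are assumed to be non-negative arrays of shape (frequency bins, frames).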

