Improved noise robustness of word HMMs based on weighted variance expansion for noisy speech recognition

Sukeyasu Kanno; Tetsuo Funada

doi:10.1002/scj.20290

Systems and Computers in Japan ◽

10.1002/scj.20290 ◽

2005 ◽

Vol 36 (13) ◽

pp. 57-68

Author(s):

Sukeyasu Kanno ◽

Tetsuo Funada

Keyword(s):

Speech Recognition ◽

Noise Robustness ◽

Noisy Speech ◽

Noisy Speech Recognition

End-to-End Noisy Speech Recognition Using Fourier and Hilbert Spectrum Features

Electronics ◽

10.3390/electronics9071157 ◽

2020 ◽

Vol 9 (7) ◽

pp. 1157 ◽

Cited By ~ 1

Author(s):

Daria Vazhenina ◽

Konstantin Markov

Keyword(s):

Speech Recognition ◽

Speech Processing ◽

Data Augmentation ◽

Recognition System ◽

Noise Robustness ◽

Hilbert Spectrum ◽

Noisy Speech ◽

Hilbert Huang Transform ◽

Noisy Speech Recognition ◽

Non Stationary Signal

Despite the progress of deep neural networks over the last decade, state-of-the-art speech recognizers are still far from satisfactory performance in noisy environments. Methods to improve noise robustness usually add components to the recognition system, and these components often need optimization. For this reason, data augmentation of the input features derived from the Short-Time Fourier Transform (STFT) has become a popular approach. However, for many speech processing tasks, there is evidence that combining STFT-based and Hilbert–Huang transform (HHT)-based features improves overall performance. The Hilbert spectrum can be obtained using adaptive mode decomposition (AMD) techniques, which are noise-robust and suitable for non-linear and non-stationary signal analysis. In this study, we developed a DeepSpeech2-based recognition system that takes a combination of STFT and HHT spectrum-based features as input. We propose several ways to combine those features at different levels of the neural network. All evaluations were performed using the WSJ and CHiME-4 databases. Experimental results show that combining STFT and HHT spectra leads to a 5–7% relative improvement in noisy speech recognition.
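
A "relative improvement" is measured against the baseline score rather than as an absolute difference. The minimal Python sketch below shows the arithmetic, assuming the metric is an error rate such as word error rate (the abstract does not name the metric). The function name and the numbers are hypothetical and are not taken from the paper.

```python
def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Return the relative error reduction in percent.

    Assumes a lower-is-better metric (e.g. word error rate, in percent).
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical error rates for illustration only (not results from the paper).
baseline = 20.0   # assumed error rate of an STFT-only system, in percent
combined = 18.8   # assumed error rate of an STFT + HHT system, in percent

absolute_gain = baseline - combined
relative_gain = relative_improvement(baseline, combined)
print(f"absolute: {absolute_gain:.1f} points, relative: {relative_gain:.1f}%")
```

With these illustrative numbers, a 1.2-point absolute reduction corresponds to a 6% relative improvement, which is the sense in which a figure such as 5–7% is usually reported.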

Joint Optimization of Denoising Autoencoder and DNN Acoustic Model Based on Multi-Target Learning for Noisy Speech Recognition

10.21437/interspeech.2016-388 ◽

2016 ◽

Cited By ~ 10

Author(s):

Masato Mimura ◽

Shinsuke Sakai ◽

Tatsuya Kawahara

Keyword(s):

Speech Recognition ◽

Joint Optimization ◽

Acoustic Model ◽

Denoising Autoencoder ◽

Noisy Speech ◽

Model Based ◽

Noisy Speech Recognition

An improved parallel model combination method for noisy speech recognition

2009 IEEE Workshop on Automatic Speech Recognition & Understanding ◽

10.1109/asru.2009.5373332 ◽

2009 ◽

Cited By ~ 2

Author(s):

Hadi Veisi ◽

Hossein Sameti

Keyword(s):

Speech Recognition ◽

Combination Method ◽

Parallel Model ◽

Model Combination ◽

Noisy Speech ◽

Noisy Speech Recognition

Noisy speech recognition based on HMMs, Wiener filters and re-evaluation of most likely candidates

IEEE International Conference on Acoustics Speech and Signal Processing ◽

10.1109/icassp.1993.319241 ◽

1993 ◽

Cited By ~ 2

Author(s):

S.V. Vaseghi ◽

B.P. Milner

Keyword(s):

Speech Recognition ◽

Noisy Speech ◽

Wiener Filters ◽

Noisy Speech Recognition

A novel classifier modification approach to missing data problem for noisy speech recognition

7'th International Symposium on Telecommunications (IST'2014) ◽

10.1109/istel.2014.7000747 ◽

2014 ◽

Cited By ~ 1

Author(s):

Kian Ebrahim Kafoori ◽

Seyed Mohammad Ahadi

Keyword(s):

Speech Recognition ◽

Missing Data ◽

Noisy Speech ◽

Missing Data Problem ◽

Data Problem ◽

Noisy Speech Recognition

Word Graph Based Feature Enhancement for Noisy Speech Recognition

2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07 ◽

10.1109/icassp.2007.366927 ◽

2007 ◽

Cited By ~ 5

Author(s):

Zhi-Jie Yan ◽

Frank K. Soong ◽

Ren-Hua Wang

Keyword(s):

Speech Recognition ◽

Noisy Speech ◽

Feature Enhancement ◽

Noisy Speech Recognition

Unsupervised Beamforming Based on Multichannel Nonnegative Matrix Factorization for Noisy Speech Recognition

2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2018.8462642 ◽

2018 ◽

Cited By ~ 4

Author(s):

Kazuki Shimada ◽

Yoshiaki Bando ◽

Masato Mimura ◽

Katsutoshi Itoyama ◽

Kazuyoshi Yoshii ◽

...

Keyword(s):

Speech Recognition ◽

Matrix Factorization ◽

Nonnegative Matrix Factorization ◽

Nonnegative Matrix ◽

Noisy Speech ◽

Noisy Speech Recognition

Cepstral Normalization Combined with CSFN for Noisy Speech Recognition

Journal of Korea Multimedia Society ◽

10.9717/kmms.2011.14.10.1221 ◽

2011 ◽

Vol 14 (10) ◽

pp. 1221-1228

Author(s):

Sook-Nam Choi ◽

Guang-Hu Shen ◽

Hyun-Yeol Chung

Keyword(s):

Speech Recognition ◽

Noisy Speech ◽

Noisy Speech Recognition

Discriminative approach to dynamic variance adaptation for noisy speech recognition

2011 Joint Workshop on Hands-free Speech Communication and Microphone Arrays ◽

10.1109/hscma.2011.5942414 ◽

2011 ◽

Cited By ~ 1

Author(s):

Marc Delcroix ◽

Shinji Watanabe ◽

Tomohiro Nakatani ◽

Atsushi Nakamura

Keyword(s):

Speech Recognition ◽

Noisy Speech ◽

Discriminative Approach ◽

Noisy Speech Recognition

Complex spectrum circle centroid for microphone-array-based noisy speech recognition

10.21437/interspeech.2004-306 ◽

2004 ◽

Author(s):

Shigeki Sagayama ◽

Okajima Takashi ◽

Kamamoto Yutaka ◽

Nishimoto Takuya

Keyword(s):

Speech Recognition ◽

Microphone Array ◽

Complex Spectrum ◽

Noisy Speech ◽

Noisy Speech Recognition
