Single-channel speech enhancement using spectral subtraction in the short-time modulation domain

Kuldip Paliwal; Kamil Wójcicki; Belinda Schwerin

doi:10.1016/j.specom.2010.02.004


Speech Communication ◽

10.1016/j.specom.2010.02.004 ◽

2010 ◽

Vol 52 (5) ◽

pp. 450-475 ◽

Cited By ~ 80

Author(s):

Kuldip Paliwal ◽

Kamil Wójcicki ◽

Belinda Schwerin

Keyword(s):

Speech Enhancement ◽

Single Channel ◽

Spectral Subtraction ◽

Time Modulation ◽

Modulation Domain ◽

Short Time


Single channel speech enhancement using MMSE estimation of short-time modulation magnitude spectrum

10.21437/interspeech.2011-425 ◽

2011 ◽

Author(s):

Kuldip Paliwal ◽

Belinda Schwerin ◽

Kamil Wójcicki

Keyword(s):

Speech Enhancement ◽

Single Channel ◽

Magnitude Spectrum ◽

Time Modulation ◽

MMSE Estimation ◽

Short Time


Single-channel speech enhancement based on improved frame-iterative spectral subtraction in the modulation domain

China Communications ◽

10.23919/jcc.2021.09.009 ◽

2021 ◽

Vol 18 (9) ◽

pp. 100-115

Author(s):

Chao Li ◽

Ting Jiang ◽

Sheng Wu

Keyword(s):

Speech Enhancement ◽

Single Channel ◽

Spectral Subtraction ◽

Modulation Domain


Speech intelligibility enhancement for Thai-speaking cochlear implant listeners

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v13.i3.pp866-875 ◽

2019 ◽

Vol 13 (3) ◽

pp. 866

Author(s):

Siriporn Dachasilaruk ◽

Niphat Jantharamin ◽

Apichai Rungruang

Keyword(s):

Cochlear Implant ◽

Speech Enhancement ◽

Speech Intelligibility ◽

English Language ◽

Single Channel ◽

Spectral Subtraction ◽

Monosyllabic Words ◽

Listening Environments ◽

Babble Noise ◽

Vocoded Speech

Cochlear implant (CI) listeners encounter difficulties in communicating with others in noisy listening environments. However, most CI research has been carried out using the English language. In this study, single-channel speech enhancement (SE) strategies as a pre-processing approach for the CI system were investigated in terms of Thai speech intelligibility improvement. Two SE algorithms, namely the multi-band spectral subtraction (MBSS) and Wiener filter (WF) algorithms, were evaluated. Speech signals consisting of monosyllabic and bisyllabic Thai words were degraded by speech-shaped noise and babble noise at SNR levels of 0, 5, and 10 dB. The noisy words were then enhanced using the SE algorithms. The enhanced words were fed into the CI system to synthesize vocoded speech, which was presented to twenty normal-hearing listeners. The results indicated that speech intelligibility was marginally improved by the MBSS algorithm and significantly improved by the WF algorithm in some conditions. The enhanced bisyllabic words showed a noticeably higher intelligibility improvement than the enhanced monosyllabic words in all conditions, particularly in speech-shaped noise. Such outcomes may be beneficial to Thai-speaking CI listeners.
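
The abstract above describes degrading clean words with noise at fixed SNR levels of 0, 5, and 10 dB. As a rough illustration of how such stimuli are commonly prepared, the sketch below scales a noise signal so that the speech-to-noise power ratio hits a target value before mixing. It is not taken from the study: the function names `mix_at_snr` and `measured_snr_db`, the synthetic sine "speech", and the Gaussian noise are all assumptions made for the example.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that the speech-to-noise power ratio equals
    `snr_db`, then add it to `speech` sample by sample."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_speech == 0.0 or p_noise == 0.0:
        raise ValueError("speech and noise must both have non-zero power")
    # Target noise power is p_speech / 10^(snr_db / 10); the amplitude
    # gain is the square root of the required power ratio.
    gain = math.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, noisy):
    """Recover the SNR of a mixture when the clean signal is known."""
    residual = [y - s for s, y in zip(speech, noisy)]
    p_speech = sum(s * s for s in speech)
    p_noise = sum(r * r for r in residual)
    return 10.0 * math.log10(p_speech / p_noise)


# Hypothetical stand-ins: a 220 Hz tone for speech, white noise for the masker.
rng = random.Random(0)
speech = [math.sin(2.0 * math.pi * 220.0 * t / 8000.0) for t in range(8000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8000)]

for target in (0, 5, 10):  # the SNR levels named in the abstract
    noisy = mix_at_snr(speech, noise, target)
    print(target, round(measured_snr_db(speech, noisy), 6))
```

In a real experiment the noise would be speech-shaped or babble noise and the power would typically be measured over speech-active regions only; the scaling step itself stays the same.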


Single-channel speech enhancement: Using recurrent neuro-fuzzy voice activity detector and spectral subtraction algorithms

2008 IEEE International Conference on Systems, Man and Cybernetics ◽

10.1109/icsmc.2008.4811764 ◽

2008 ◽

Author(s):

Fang-Chen Chuang ◽

Jeen-Shing Wang ◽

Li-Ying Wu

Keyword(s):

Speech Enhancement ◽

Single Channel ◽

Spectral Subtraction ◽

Voice Activity Detector ◽

Neuro Fuzzy ◽

Voice Activity


New Results in Modulation-Domain Single-Channel Speech Enhancement

IEEE/ACM Transactions on Audio Speech and Language Processing ◽

10.1109/taslp.2017.2747082 ◽

2017 ◽

Vol 25 (11) ◽

pp. 2125-2137 ◽

Cited By ~ 2

Author(s):

Pejman Mowlaee ◽

Martin Blass ◽

W. Bastiaan Kleijn

Keyword(s):

Speech Enhancement ◽

Single Channel ◽

Modulation Domain


Single-channel speech enhancement using Kalman filtering in the modulation domain

10.21437/interspeech.2010-330 ◽

2010 ◽

Author(s):

Stephen So ◽

Kamil K. Wójcicki ◽

Kuldip K. Paliwal

Keyword(s):

Speech Enhancement ◽

Kalman Filtering ◽

Single Channel ◽

Modulation Domain


Modulation domain spectral subtraction for speech enhancement

10.21437/interspeech.2009-413 ◽

2009 ◽

Author(s):

Kuldip Paliwal ◽

Belinda Schwerin ◽

Kamil Wójcicki

Keyword(s):

Speech Enhancement ◽

Spectral Subtraction ◽

Modulation Domain


Single-Channel Speech Enhancement Using Critical-Band Rate Scale Based Improved Multi-Band Spectral Subtraction

Journal of Signal and Information Processing ◽

10.4236/jsip.2013.43040 ◽

2013 ◽

Vol 04 (03) ◽

pp. 314-326 ◽

Cited By ~ 1

Author(s):

Navneet Upadhyay ◽

Abhijit Karmakar

Keyword(s):

Speech Enhancement ◽

Single Channel ◽

Critical Band ◽

Spectral Subtraction ◽

Multi Band


Phase-Aware Single-Channel Speech Enhancement With Modulation-Domain Kalman Filtering

IEEE/ACM Transactions on Audio Speech and Language Processing ◽

10.1109/taslp.2018.2800525 ◽

2018 ◽

Vol 26 (5) ◽

pp. 937-950 ◽

Cited By ~ 9

Author(s):

Nikolaos Dionelis ◽

Mike Brookes

Keyword(s):

Speech Enhancement ◽

Kalman Filtering ◽

Single Channel ◽

Modulation Domain


Phase-Sensitive Decision-Directed SNR Estimator for Single-Channel Speech Enhancement

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001417580034 ◽

2017 ◽

Vol 31 (08) ◽

pp. 1758003

Author(s):

Shifeng Ou ◽

Peng Song ◽

Ying Gao

Keyword(s):

Speech Enhancement ◽

Speech Processing ◽

Single Channel ◽

Signal To Noise Ratio ◽

A Priori ◽

Processing System ◽

Phase Information ◽

Amplitude Spectra ◽

Phase Sensitive ◽

Short Time

The a priori signal-to-noise ratio (SNR) plays an essential role in many speech enhancement systems. Most existing approaches to estimating the a priori SNR exploit only the amplitude spectra and neglect the phase. Because incorporating phase information into a speech processing system can significantly improve speech quality, this paper proposes a phase-sensitive decision-directed (DD) approach to a priori SNR estimation. By representing the short-time discrete Fourier transform (STFT) signal spectra geometrically in the complex plane, the proposed approach estimates the a priori SNR using both magnitude and phase information while making no assumptions about the phase difference between the clean speech and noise spectra. Objective evaluations in terms of spectrograms, segmental SNR, log-spectral distance (LSD), and short-time objective intelligibility (STOI) measures are presented to demonstrate the superiority of the proposed approach over several competitive methods under different noise conditions and input SNR levels.
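
For orientation, the sketch below shows the classic amplitude-only decision-directed recursion that the abstract takes as its starting point. It is not the phase-sensitive estimator proposed in the paper. The function name `decision_directed`, the smoothing factor `alpha=0.98`, the floor `xi_min`, the Wiener gain used to form the previous clean-speech estimate, and the first-frame initialisation are all assumptions chosen for illustration.

```python
def decision_directed(noisy_power, noise_power, alpha=0.98, xi_min=1e-3):
    """Amplitude-only decision-directed a priori SNR estimate.

    noisy_power: list of frames, each a list of |Y(k)|^2 per frequency bin.
    noise_power: list of noise power estimates, one per frequency bin.
    Returns (xi_frames, gain_frames), both shaped like `noisy_power`.
    """
    xi_frames, gain_frames = [], []
    prev_clean = None  # estimated clean-speech power of the previous frame
    for frame in noisy_power:
        xi_row, gain_row, clean_row = [], [], []
        for k, (y2, d2) in enumerate(zip(frame, noise_power)):
            gamma = y2 / d2               # a posteriori SNR
            ml = max(gamma - 1.0, 0.0)    # instantaneous (ML) estimate
            if prev_clean is None:
                xi = ml                   # assumed first-frame initialisation
            else:
                # Weighted sum of the previous frame's clean-speech estimate
                # and the current instantaneous estimate.
                xi = alpha * prev_clean[k] / d2 + (1.0 - alpha) * ml
            xi = max(xi, xi_min)          # floor to limit musical noise
            gain = xi / (1.0 + xi)        # Wiener gain (one possible choice)
            xi_row.append(xi)
            gain_row.append(gain)
            clean_row.append(gain * gain * y2)
        prev_clean = clean_row
        xi_frames.append(xi_row)
        gain_frames.append(gain_row)
    return xi_frames, gain_frames


# Toy input: two bins, unit noise power, the second bin 11x above the noise.
noise_power = [1.0, 1.0]
noisy_power = [[1.0, 11.0], [1.0, 11.0]]
xi, gains = decision_directed(noisy_power, noise_power)
print(xi)
print(gains)
```

Only the power spectra enter this recursion, which is the limitation the abstract points to: the phase relationship between speech and noise is ignored when forming both the instantaneous and the smoothed estimates.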
