Speech enhancement based on soft audible noise masking and noise power estimation

Rongshan Yu

doi:10.1016/j.specom.2013.05.006

Speech Enhancement Based on Adaptive Noise Power Estimation Using Spectral Difference

IEICE Transactions on Fundamentals of Electronics Communications and Computer Sciences, 2011, Vol E94-A (10), pp. 2031-2034

doi:10.1587/transfun.e94.a.2031

Cited By ~ 2

Author(s): Jae-Hun CHOI, Joon-Hyuk CHANG, Dong Kook KIM, Suhyun KIM

Keyword(s): Speech Enhancement, Power Estimation, Noise Power, Spectral Difference, Adaptive Noise

Adaptive noise power estimation using spectral difference for robust speech enhancement

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012

doi:10.1109/icassp.2012.6288955

Author(s): Jae-Hun Choi, Sang-Kyun Kim, Joon-Hyuk Chang

Keyword(s): Speech Enhancement, Power Estimation, Noise Power, Spectral Difference, Adaptive Noise

A Probabilistic Combination Method of Minimum Statistics and Soft Decision for Robust Noise Power Estimation in Speech Enhancement

IEEE Signal Processing Letters, 2008, Vol 15, pp. 95-98

doi:10.1109/lsp.2007.910309

Cited By ~ 7

Author(s): Yun-Sik Park, Joon-Hyuk Chang

Keyword(s): Speech Enhancement, Power Estimation, Combination Method, Noise Power, Soft Decision, Minimum Statistics

Low-complexity artificial noise suppression methods for deep learning-based speech enhancement algorithms

EURASIP Journal on Audio Speech and Music Processing, 2021, Vol 2021 (1)

doi:10.1186/s13636-021-00204-9

Author(s): Yuxuan Ke, Andong Li, Chengshi Zheng, Renhua Peng, Xiaodong Li

Keyword(s): Deep Learning, Speech Enhancement, Noise Suppression, Signal To Noise Ratio, Low Complexity, Speech Quality, Artificial Noise, Noise Power, Noise Masking, Residual Noise

Abstract: Deep learning-based speech enhancement algorithms have shown their powerful ability in removing both stationary and non-stationary noise components from noisy speech observations. But they often introduce artificial residual noise, especially when the training target does not contain the phase information, e.g., ideal ratio mask, or the clean speech magnitude and its variations. It is well-known that once the power of the residual noise components exceeds the noise masking threshold of the human auditory system, the perceptual speech quality may degrade. One intuitive way is to further suppress the residual noise components by a postprocessing scheme. However, the highly non-stationary nature of this kind of residual noise makes the noise power spectral density (PSD) estimation a challenging problem. To solve this problem, the paper proposes three strategies to estimate the noise PSD frame by frame, and then the residual noise can be removed effectively by applying a gain function based on the decision-directed approach. The objective measurement results show that the proposed postfiltering strategies outperform the conventional postfilter in terms of segmental signal-to-noise ratio (SNR) as well as speech quality improvement. Moreover, the AB subjective listening test shows that the preference percentages of the proposed strategies are over 60%.
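
The abstract above mentions a gain function based on the decision-directed approach. As a rough illustration only, the following Python sketch shows the classic decision-directed a priori SNR estimate for a single frequency bin, combined with a Wiener-type gain. The function name, the smoothing factor `alpha`, the floor `xi_min`, and the choice of a Wiener gain are assumptions made for this example; the listing does not say which gain function or which of the three noise PSD strategies the paper uses, so the noise PSD is simply taken as an input here.

```python
def decision_directed_gains(noisy_power, noise_psd, alpha=0.98, xi_min=1e-3):
    """Illustrative per-frame gains for one frequency bin.

    noisy_power: list of |Y|^2 values, one per frame.
    noise_psd:   list of noise PSD estimates, one per frame (same length).
    alpha, xi_min: hypothetical smoothing factor and a priori SNR floor.
    """
    gains = []
    prev_clean_power = 0.0  # |gain * Y|^2 from the previous frame
    for frame, (y_pow, n_psd) in enumerate(zip(noisy_power, noise_psd)):
        n_psd = max(n_psd, 1e-12)          # guard against division by zero
        gamma = y_pow / n_psd              # a posteriori SNR
        ml_term = max(gamma - 1.0, 0.0)    # instantaneous SNR estimate
        if frame == 0:
            xi = ml_term                   # no history for the first frame
        else:
            # Decision-directed mix of the previous clean-speech estimate
            # and the current instantaneous estimate.
            xi = alpha * prev_clean_power / n_psd + (1.0 - alpha) * ml_term
        xi = max(xi, xi_min)               # floor the a priori SNR
        gain = xi / (1.0 + xi)             # Wiener-type gain (assumed)
        gains.append(gain)
        prev_clean_power = gain * gain * y_pow
    return gains


if __name__ == "__main__":
    # A bin dominated by speech keeps a gain close to 1,
    # a bin at the noise level is strongly attenuated.
    print(decision_directed_gains([100.0, 100.0], [1.0, 1.0]))
    print(decision_directed_gains([1.0, 1.0, 1.0], [1.0, 1.0, 1.0]))
```

In a postfilter of the kind described, such gains would be computed per frequency bin and multiplied with the enhanced spectrum frame by frame; the quality of the result depends mainly on how well the noise PSD input tracks the non-stationary residual noise.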

Degree and Noise Power Estimation from Noisy Polynomial Data via AR Modelling

Digital Signal Processing, 2021, pp. 103071

doi:10.1016/j.dsp.2021.103071

Author(s): Asoke K. Nandi

Keyword(s): Power Estimation, Noise Power

Noise masking method based on an effective ratio mask estimation in Gammatone channels

APSIPA Transactions on Signal and Information Processing, 2018, Vol 7

doi:10.1017/atsip.2018.7

Cited By ~ 1

Author(s): Feng Bao, Waleed H. Abdulla

Keyword(s): Signal To Noise Ratio, Estimation Method, Wiener Filter, Power Spectra, Auditory Scene Analysis, Accurate Estimation, Noise Power, Noise Masking, Time Frequency, Mask Estimation

Abstract: In computational auditory scene analysis, accurate estimation of the binary mask or ratio mask plays a key role in noise masking. An inaccurate estimate often leads to artifacts and temporal discontinuities in the synthesized speech. To overcome this problem, we propose a new ratio mask estimation method based on Wiener filtering in each Gammatone channel. In reconstructing the Wiener filter, we use the relationship between the speech and noise power spectra in each Gammatone channel to build the objective function for the convex optimization of speech power. To improve the accuracy of estimation, the estimated ratio mask is further modified based on its adjacent time–frequency units, and then smoothed by interpolating with the estimated binary masks. Objective tests, including signal-to-noise ratio improvement, spectral distortion, and intelligibility, together with a subjective listening test, demonstrate the superiority of the proposed method over the reference methods.
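
To make the mask terminology in this abstract concrete, the Python sketch below computes a Wiener-type ratio mask and a binary mask from per-unit speech and noise powers, then blends the two by linear interpolation. It is only an illustration under stated assumptions: the speech and noise powers are taken as known inputs, whereas the paper estimates speech power through convex optimization in each Gammatone channel, and the function names, the blending `weight`, and the 0 dB binary threshold are hypothetical choices not taken from the source.

```python
def wiener_ratio_mask(speech_power, noise_power, eps=1e-12):
    """Ratio mask S / (S + N) for each time-frequency unit.

    speech_power, noise_power: lists of rows (channels) of per-frame powers.
    """
    return [
        [s / (s + n + eps) for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_power, noise_power)
    ]


def binary_mask(speech_power, noise_power):
    """Binary mask: 1 where speech power exceeds noise power (assumed 0 dB threshold)."""
    return [
        [1.0 if s > n else 0.0 for s, n in zip(s_row, n_row)]
        for s_row, n_row in zip(speech_power, noise_power)
    ]


def blend_masks(ratio, binary, weight=0.5):
    """Linear interpolation between a ratio mask and a binary mask.

    weight is a hypothetical interpolation factor in [0, 1];
    weight=1 returns the ratio mask, weight=0 the binary mask.
    """
    return [
        [weight * r + (1.0 - weight) * b for r, b in zip(r_row, b_row)]
        for r_row, b_row in zip(ratio, binary)
    ]


if __name__ == "__main__":
    speech = [[3.0, 1.0]]  # one channel, two frames
    noise = [[1.0, 3.0]]
    ratio = wiener_ratio_mask(speech, noise)
    hard = binary_mask(speech, noise)
    print(blend_masks(ratio, hard))
```

The blended mask stays in the range 0 to 1 and would be applied multiplicatively to the time-frequency representation of the noisy signal before resynthesis.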
