Wearable Hearing Device Spectral Enhancement Driven by Non-Negative Sparse Coding-Based Residual Noise Reduction

Sensors ◽  
2020 ◽  
Vol 20 (20) ◽  
pp. 5751
Author(s):  
Seon Man Kim

This paper proposes a novel technique to improve a spectral statistical filter for speech enhancement, to be applied in wearable hearing devices such as hearing aids. The proposed method is implemented considering a 32-channel uniform polyphase discrete Fourier transform filter bank, for which the overall algorithm processing delay is 8 ms in accordance with the hearing device requirements. The proposed speech enhancement technique, which exploits the concepts of both non-negative sparse coding (NNSC) and spectral statistical filtering, provides an online unified framework to overcome the problem of residual noise in spectral statistical filters under noisy environments. First, the spectral gain attenuator of the statistical Wiener filter is obtained using the a priori signal-to-noise ratio (SNR) estimated through a decision-directed approach. Next, the spectrum estimated using the Wiener spectral gain attenuator is decomposed by applying the NNSC technique to the target speech and residual noise components. These components are used to develop an NNSC-based Wiener spectral gain attenuator to achieve enhanced speech. The performance of the proposed NNSC–Wiener filter was evaluated through a perceptual evaluation of the speech quality scores under various noise conditions with SNRs ranging from -5 to 20 dB. The results indicated that the proposed NNSC–Wiener filter can outperform the conventional Wiener filter and NNSC-based speech enhancement methods at all SNRs.
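The decision-directed step at the core of the statistical Wiener stage can be sketched as follows (a minimal per-bin NumPy sketch; the NNSC decomposition and the polyphase filter-bank front end are omitted, and the function names are illustrative):

```python
import numpy as np

def dd_a_priori_snr(noisy_power, noise_power, prev_gain, prev_noisy_power, alpha=0.98):
    """Decision-directed a priori SNR estimate for one frame (per frequency bin)."""
    post_snr = noisy_power / np.maximum(noise_power, 1e-12)   # a posteriori SNR
    ml_part = np.maximum(post_snr - 1.0, 0.0)                 # maximum-likelihood term
    dd_part = (prev_gain ** 2) * prev_noisy_power / np.maximum(noise_power, 1e-12)
    return alpha * dd_part + (1.0 - alpha) * ml_part

def wiener_gain(xi):
    """Statistical Wiener spectral gain attenuator G = xi / (1 + xi)."""
    return xi / (1.0 + xi)
```

The spectrum produced by applying `wiener_gain` to the noisy spectrum is what the paper then decomposes with NNSC into speech and residual-noise components.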

2016 ◽  
Vol 2016 ◽  
pp. 1-7 ◽  
Author(s):  
Soojeong Lee ◽  
Gangseong Lee

This paper proposes a noise-bias compensation of the minimum statistics (MS) method using a nonlinear function and a priori speech absence probability (SAP) for speech enhancement in highly nonstationary noisy environments. The MS method is a well-known technique for noise power estimation in nonstationary noisy environments; however, it tends to bias the noise estimate below the true noise level. The proposed method is combined with an adaptive parameter based on a sigmoid function and a priori SAP for residual noise reduction. Additionally, our method uses an autoparameter to control the trade-off between speech distortion and residual noise. We evaluate the estimation of noise power in highly nonstationary and varying noise environments. The improvement can be confirmed in terms of signal-to-noise ratio (SNR) and the Itakura-Saito distortion measure (ISDM).
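As a rough illustration of the bias-compensation idea (not the paper's exact rule; the sigmoid parameters `b_max`, `slope`, and `center` are invented for this sketch):

```python
import numpy as np

def compensated_noise(ms_noise_power, post_snr, sap, b_max=2.0, slope=1.0, center=3.0):
    """Illustrative bias compensation for a minimum-statistics noise estimate.

    MS underestimates the true noise power, so it is scaled up by a bias factor
    that shrinks toward 1 when speech is likely present (high a posteriori SNR
    or low speech absence probability).
    """
    sigm = 1.0 / (1.0 + np.exp(slope * (post_snr - center)))  # ~1 in noise, ~0 in speech
    bias = 1.0 + (b_max - 1.0) * sigm * sap
    return bias * ms_noise_power
```

With `sap = 1` and a low a posteriori SNR the estimate is scaled by nearly `b_max`; during clear speech activity the bias factor falls back to 1.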


2021 ◽  
Vol 11 (6) ◽  
pp. 2816
Author(s):  
Hansol Kim ◽  
Jong Won Shin

The transfer function-generalized sidelobe canceller (TF-GSC) is one of the most popular structures for the adaptive beamformer used in multi-channel speech enhancement. Although the TF-GSC has shown decent performance, a certain amount of steering error is inevitable, which causes leakage of speech components through the blocking matrix (BM) and distortion in the fixed beamformer (FBF) output. In this paper, we propose to suppress the leaked signal in the output of the BM and restore the desired signal in the FBF output of the TF-GSC. To reduce the risk of attenuating speech in the adaptive noise canceller (ANC), the speech component in the output of the BM is suppressed by applying a gain function similar to the square-root Wiener filter, assuming that a certain portion of the desired speech leaks into the BM output. Additionally, we propose to restore the attenuated desired signal in the FBF output by adding back some of the microphone signal components, depending on how the microphone signals are related to the FBF and BM outputs. The experimental results showed that the proposed TF-GSC outperformed the conventional TF-GSC in terms of perceptual evaluation of speech quality (PESQ) scores under various noise conditions and directions of arrival for the desired and interfering sources.
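The square-root-Wiener-like suppression of leaked speech in the BM output might look like this (a hedged sketch; the leaked-speech power estimate is assumed given here, whereas the paper derives it from the beamformer internals):

```python
import numpy as np

def suppress_bm_leakage(bm_spec, leak_power_est, eps=1e-12):
    """Attenuate leaked speech in the blocking-matrix output with a
    square-root-Wiener-like gain before it reaches the ANC.

    bm_spec: complex BM output spectrum (per bin).
    leak_power_est: estimated power of the leaked desired speech (per bin).
    """
    bm_power = np.abs(bm_spec) ** 2
    noise_power = np.maximum(bm_power - leak_power_est, 0.0)  # noise-only part
    gain = np.sqrt(noise_power / np.maximum(bm_power, eps))   # sqrt-Wiener form
    return gain * bm_spec
```

When the leakage estimate is zero the BM output passes through unchanged; when the bin is dominated by leaked speech the gain drives it toward zero, protecting the ANC from cancelling the desired signal.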


Signals ◽  
2020 ◽  
Vol 1 (2) ◽  
pp. 138-156
Author(s):  
Raghad Yaseen Lazim ◽  
Zhu Yun ◽  
Xiaojun Wu

In hearing aid devices, speech enhancement techniques are a critical component to enable users with hearing loss to attain improved speech quality under noisy conditions. Recently, the deep denoising autoencoder (DDAE) was adopted successfully for recovering the desired speech from noisy observations. However, a single DDAE cannot extract contextual information sufficiently, due to poor generalization at unseen signal-to-noise ratios (SNRs), local minima, and the fact that the enhanced output retains residual noise and some level of discontinuity. In this paper, we propose a hybrid approach for hearing aid applications based on two stages: (1) the Wiener filter, which attenuates the noise component and generates a clean speech signal; (2) a composite of three DDAEs with different window lengths, each of which is specialized for a specific enhancement task. Two typical high-frequency hearing loss audiograms were used to test the performance of the approach: Audiogram 1 = (0, 0, 0, 60, 80, 90) and Audiogram 2 = (0, 15, 30, 60, 80, 85). The hearing-aid speech perception index, the hearing-aid speech quality index, and the perceptual evaluation of speech quality were used to evaluate the performance. The experimental results show that the proposed method achieved significantly better results compared with the Wiener filter or a single deep denoising autoencoder alone.
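The two-stage composite structure can be sketched schematically as below (the Wiener stage and the trained DDAEs are passed in as callables, since the models themselves are outside this sketch; fusing the three specialists by simple averaging is an assumption of the sketch, not necessarily the paper's rule):

```python
import numpy as np

def hybrid_enhance(noisy, wiener_stage, ddaes, window_lens):
    """Stage 1: Wiener pre-filter. Stage 2: composite of three DDAEs, each
    operating on frames of a different window length; outputs are fused."""
    pre = wiener_stage(noisy)                      # stage 1: attenuate noise
    outputs = []
    for ddae, win in zip(ddaes, window_lens):      # stage 2: three specialists
        n = (len(pre) // win) * win                # trim to a whole number of frames
        frames = pre[:n].reshape(-1, win)
        outputs.append(np.concatenate([ddae(f) for f in frames]))
    m = min(map(len, outputs))
    return np.mean([o[:m] for o in outputs], axis=0)  # fuse the specialists
```

With identity stand-ins for all four callables, the pipeline reduces to a pass-through, which makes the framing logic easy to check.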


2019 ◽  
Vol 8 (3) ◽  
pp. 3509-3516

The primary aim of this paper is to examine the application of a binary mask to improve intelligibility in the most unfavorable conditions, where hearing-impaired and normal-hearing listeners find it difficult to understand what is being said. Most existing noise reduction algorithms are known to improve speech quality, but they hardly improve speech intelligibility. The approach proposed by Gibak Kim and Philipos C. Loizou uses the Wiener gain function for improving speech intelligibility. In this paper, we propose to apply the same approach in the magnitude spectrum using the parametric Wiener filter in order to study its effects on overall speech intelligibility. Subjective and objective tests were conducted to evaluate the performance of the enhanced speech for various types of noise. The results clearly indicate an improvement in average segmental signal-to-noise ratio for speech corrupted at -5 dB, 0 dB, 5 dB, and 10 dB SNR for random noise, babble noise, car noise, and helicopter noise. This technique can be used in real-time applications such as mobile phones, hearing aids, and speech-activated machines.
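A parametric Wiener gain applied to the magnitude spectrum, as described above, can be written compactly (here `beta` and `p` are the trade-off parameters; `beta = p = 1` recovers the standard Wiener filter):

```python
import numpy as np

def parametric_wiener_gain(a_priori_snr, beta=1.0, p=1.0):
    """Parametric Wiener gain G = (xi / (xi + beta))**p.

    Larger beta or p gives more aggressive noise reduction at the cost of
    more speech distortion.
    """
    xi = np.asarray(a_priori_snr, dtype=float)
    return (xi / (xi + beta)) ** p

def enhance_magnitude(noisy_mag, a_priori_snr, beta=1.0, p=1.0):
    """Apply the gain in the magnitude-spectrum domain."""
    return parametric_wiener_gain(a_priori_snr, beta, p) * noisy_mag
```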


Author(s):  
Dima Shaheen ◽  
Oumayma Al Dakkak ◽  
Mohiedin Wainakh

Speech enhancement is one of the many challenging tasks in signal processing, especially in the case of nonstationary, speech-like noise. In this paper, a new incoherent discriminative dictionary learning algorithm is proposed to model both speech and noise, where the cost function accounts for both "source confusion" and "source distortion" errors, with a regularization term that penalizes the coherence between the speech and noise sub-dictionaries. At the enhancement stage, we use sparse coding on the learnt dictionary to find estimates of both the clean speech and noise amplitude spectra. In the final phase, the Wiener filter is used to refine the clean speech estimate. Experiments on the Noizeus dataset, using two objective speech enhancement measures, frequency-weighted segmental SNR and the Perceptual Evaluation of Speech Quality (PESQ), demonstrate that the proposed algorithm outperforms the other speech enhancement methods tested.
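The enhancement stage (sparse coding on the concatenated learnt dictionary, followed by Wiener refinement) can be sketched as follows; the ISTA-style non-negative solver here is a stand-in for whatever sparse coder the authors actually use, and `lam`, `iters`, `lr` are illustrative parameters:

```python
import numpy as np

def sparse_enhance(noisy_mag, D_speech, D_noise, lam=0.1, iters=200, lr=0.01):
    """Estimate speech/noise magnitude spectra by sparse coding on the
    concatenated learnt dictionary, then refine with a Wiener gain."""
    D = np.hstack([D_speech, D_noise])             # joint dictionary [Ds | Dn]
    h = np.zeros(D.shape[1])
    for _ in range(iters):                         # proximal gradient, h >= 0
        grad = D.T @ (D @ h - noisy_mag)
        h = np.maximum(h - lr * (grad + lam), 0.0)
    ks = D_speech.shape[1]
    s_hat = D_speech @ h[:ks]                      # speech amplitude estimate
    n_hat = D_noise @ h[ks:]                       # noise amplitude estimate
    gain = s_hat ** 2 / np.maximum(s_hat ** 2 + n_hat ** 2, 1e-12)  # Wiener refinement
    return gain * noisy_mag
```

The incoherence penalty during training is what keeps `D_speech` and `D_noise` from explaining each other's source, so the split of `h` above stays discriminative.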


This paper introduces technology to improve sound quality, which serves the needs of media and entertainment. A major challenge in speech processing applications such as mobile phones, hands-free phones, car communication, teleconference systems, hearing aids, voice coders, automatic speech recognition, and forensics is eliminating background noise. Speech enhancement algorithms are widely used in these applications to remove noise from degraded speech in noisy environments. However, conventional noise reduction methods introduce residual noise and speech distortion: the noise reduction process is effective in improving speech quality, but it can degrade the intelligibility of the clean speech signal. In this paper, we introduce a new coherence-based noise reduction method for complex noise environments in which target speech coexists with coherent surrounding noise. Speech presence probability information is added to the coherence model to track noise variation more accurately, and the adaptive coherence-based method is adjusted separately during speech-present and speech-absent periods. The performance of the suggested method is evaluated under diffuse and real street noise, and it improves speech quality with less speech distortion and residual noise.
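The coherence model itself can be sketched as follows (recursively smoothed magnitude-squared coherence between two microphone channels, used directly as a gain; the speech-presence-probability adaptation described in the abstract is not included in this sketch):

```python
import numpy as np

def coherence_gain(X1, X2, alpha=0.9, eps=1e-12):
    """Per-frame magnitude-squared coherence (MSC) between two microphone
    spectra (frames x bins). Speech from a point source is coherent across
    microphones while diffuse noise is not, so MSC ~ 1 marks speech bins
    and MSC ~ 0 marks diffuse-noise bins."""
    bins = X1.shape[1]
    p11 = np.full(bins, eps)                 # smoothed auto-spectra
    p22 = np.full(bins, eps)
    p12 = np.zeros(bins, dtype=complex)      # smoothed cross-spectrum
    gains = []
    for x1, x2 in zip(X1, X2):
        p11 = alpha * p11 + (1 - alpha) * np.abs(x1) ** 2
        p22 = alpha * p22 + (1 - alpha) * np.abs(x2) ** 2
        p12 = alpha * p12 + (1 - alpha) * x1 * np.conj(x2)
        gains.append(np.abs(p12) ** 2 / np.maximum(p11 * p22, eps))
    return np.array(gains)                   # values in [0, 1]
```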


Author(s):  
Shifeng Ou ◽  
Peng Song ◽  
Ying Gao

The a priori signal-to-noise ratio (SNR) plays an essential role in many speech enhancement systems. Most of the existing approaches to estimate the a priori SNR exploit only the amplitude spectra while neglecting the phase. Considering the fact that incorporating phase information into a speech processing system can significantly improve the speech quality, this paper proposes a phase-sensitive decision-directed (DD) approach for the a priori SNR estimate. By representing the short-time discrete Fourier transform (STFT) signal spectra geometrically in a complex plane, the proposed approach estimates the a priori SNR using both the magnitude and phase information while making no assumptions about the phase difference between the clean speech and noise spectra. Objective evaluations in terms of spectrograms, segmental SNR, log-spectral distance (LSD), and short-time objective intelligibility (STOI) measures are presented to demonstrate the superiority of the proposed approach compared to several competitive methods under different noise conditions and input SNR levels.
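With complex noisy and noise spectra in hand, a phase-aware instantaneous a priori SNR follows directly from the complex-plane geometry, since |Y - N|² = |Y|² + |N|² - 2|Y||N|cos(θ_Y - θ_N). This is a simplified stand-in for the paper's phase-sensitive DD rule, shown only to make the geometric idea concrete:

```python
import numpy as np

def phase_aware_a_priori_snr(Y, N_hat, eps=1e-12):
    """Instantaneous a priori SNR using both magnitude and phase.

    Y: complex noisy spectrum; N_hat: complex noise spectrum estimate.
    No assumption about the speech/noise phase difference is needed,
    because the subtraction happens in the complex plane.
    """
    X_power = np.abs(Y - N_hat) ** 2                 # speech power from geometry
    return X_power / np.maximum(np.abs(N_hat) ** 2, eps)
```

A magnitude-only rule would discard the cross term carried by the phase difference; here it is retained automatically by the complex subtraction.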


2020 ◽  
Vol 39 (5) ◽  
pp. 6881-6889
Author(s):  
Jie Wang ◽  
Linhuang Yan ◽  
Jiayi Tian ◽  
Minmin Yuan

In this paper, a bilateral spectrogram filtering (BSF)-based optimally modified log-spectral amplitude (OMLSA) estimator for single-channel speech enhancement is proposed. It significantly improves the performance of OMLSA, especially in highly non-stationary noise environments, by taking advantage of bilateral filtering (BF), a technology widely used in image and visual processing, to preprocess the spectrogram of the noisy speech. When the speech spectrogram is treated as an image, BSF is capable not only of sharpening details and removing unwanted textures or background noise from the noisy speech spectrogram, but also of preserving edges. The a posteriori signal-to-noise ratio (SNR) of the OMLSA algorithm is estimated after applying BSF to the noisy speech. Besides, in order to reduce computing costs, a fast and accurate BF is adopted to reduce the algorithm complexity to O(1) per time-frequency bin. Finally, the proposed algorithm is compared with the original OMLSA and other classic denoising methods using various types of noise at different signal-to-noise ratios, in terms of objective evaluation metrics such as segmental signal-to-noise ratio improvement and perceptual evaluation of speech quality. The results show the validity of the improved BSF-based OMLSA algorithm.
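The BSF preprocessing step can be illustrated with a direct (non-fast) bilateral filter over a log-spectrogram; the O(1) fast variant the paper adopts is replaced here by the brute-force O(r²)-per-bin form for clarity, and the parameters are illustrative:

```python
import numpy as np

def bilateral_filter_spec(S, radius=2, sigma_s=1.5, sigma_r=0.1):
    """Bilateral filter over a (time x frequency) spectrogram treated as an
    image: spatially close bins with similar values are averaged (smoothing
    textures/noise), while large value jumps (edges, e.g. speech onsets and
    harmonics) are preserved."""
    T, F = S.shape
    out = np.empty_like(S)
    for t in range(T):
        for f in range(F):
            t0, t1 = max(t - radius, 0), min(t + radius + 1, T)
            f0, f1 = max(f - radius, 0), min(f + radius + 1, F)
            patch = S[t0:t1, f0:f1]
            tt, ff = np.meshgrid(np.arange(t0, t1), np.arange(f0, f1), indexing="ij")
            w = np.exp(-((tt - t) ** 2 + (ff - f) ** 2) / (2 * sigma_s ** 2)
                       - (patch - S[t, f]) ** 2 / (2 * sigma_r ** 2))
            out[t, f] = np.sum(w * patch) / np.sum(w)
    return out
```

OMLSA's a posteriori SNR is then computed from the filtered spectrogram rather than the raw noisy one.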


2014 ◽  
Vol 1044-1045 ◽  
pp. 1463-1468
Author(s):  
Xiao Cui ◽  
Wu Qing Zhang

In order to suppress noise, improve the equipment's ability to further process information, and improve voice quality, speech enhancement is often an important part of speech signal preprocessing. A contrastive analysis shows that the clean speech signal's coefficients in an over-complete discrete cosine dictionary are much sparser than its traditional discrete cosine transform coefficients. Under noisy conditions, by setting an iterative threshold for the orthogonal matching pursuit (OMP) algorithm, the clean speech can be recovered, thus realizing speech enhancement. Simulation results show that the waveform and spectrogram of the signal enhanced by the proposed algorithm are very similar to those of the original signal; comparative experiments also indicate that the signal-to-noise ratio (SNR) and perceptual evaluation of speech quality (PESQ) score of the processed signal are superior to those obtained with the traditional discrete cosine transform (DCT).
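The thresholded OMP recovery the abstract describes can be sketched as follows (the dictionary `D` is assumed given, e.g. an over-complete DCT dictionary; the residual-norm threshold is the stopping rule, so the few selected atoms capture the sparse clean speech while most of the noise stays in the residual):

```python
import numpy as np

def omp_denoise(y, D, residual_thresh):
    """Orthogonal matching pursuit with a residual threshold as stopping rule.

    y: noisy signal segment; D: dictionary (columns are unit-norm atoms).
    Returns the sparse reconstruction, i.e. the clean-speech estimate.
    """
    residual = y.copy()
    support = []
    x_hat = np.zeros_like(y)
    while np.linalg.norm(residual) > residual_thresh and len(support) < D.shape[1]:
        k = int(np.argmax(np.abs(D.T @ residual)))      # best-matching atom
        if k in support:
            break
        support.append(k)
        Ds = D[:, support]
        coef, *_ = np.linalg.lstsq(Ds, y, rcond=None)   # re-fit on the support
        x_hat = Ds @ coef
        residual = y - x_hat
    return x_hat
```

Setting the threshold near the expected noise level stops the pursuit before it starts fitting noise, which is exactly the denoising mechanism the paper relies on.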

