scholarly journals The Effect of a Voice Activity Detector on the Speech Enhancement Performance of the Binaural Multichannel Wiener Filter

2010 ◽  
Vol 2010 ◽  
pp. 1-12 ◽  
Author(s):  
Jasmina Catic ◽  
Torsten Dau ◽  
Jörg M. Buchholz ◽  
Fredrik Gran
Electronics ◽  
2019 ◽  
Vol 8 (8) ◽  
pp. 897 ◽  
Author(s):  
Hilman Pardede ◽  
Kalamullah Ramli ◽  
Yohan Suryanto ◽  
Nur Hayati ◽  
Alfan Presekal

The encryption process for secure voice communication may degrade the speech quality when it is applied to the speech signals before encoding them through a conventional communication system such as GSM or radio trunking. This is because the encryption process usually includes a randomization of the speech signals, and hence, when the speech is decrypted, it may perceptibly be distorted, so satisfactory speech quality for communication is not achieved. To deal with this, we could apply a speech enhancement method to improve the quality of decrypted speech. However, many speech enhancement methods work by assuming noise is present all the time, so the voice activity detector (VAD) is applied to detect the non-speech period to update the noise estimate. Unfortunately, this assumption is not valid for the decrypted speech. Since the encryption process is applied only when speech is detected, distortions from the secure communication system are characteristically different. They exist when speech is present. Therefore, a noise estimator that is able to update noise even when speech is present is needed. However, most noise estimator techniques only adapt to slow changes of noise to avoid over-estimation of noise, making them unsuitable for this task. In this paper, we propose a speech enhancement technique to improve the quality of speech from secure communication. We use a combination of the Wiener filter and spectral subtraction for the noise estimator, so our method is better at tracking fast changes of noise without over-estimating them. Our experimental results on various communication channels indicate that our method is better than other popular noise estimators and speech enhancement methods.


2014 ◽  
Vol 2014 ◽  
pp. 1-8 ◽  
Author(s):  
Yan Zhang ◽  
Zhen-min Tang ◽  
Yan-ping Li ◽  
Yang Luo

Accurate and effective voice activity detection (VAD) is a fundamental step for robust speech or speaker recognition. In this study, we proposed a hierarchical framework approach for VAD and speech enhancement. The modified Wiener filter (MWF) approach is utilized for noise reduction in the speech enhancement block. For the feature selection and voting block, several discriminating features were employed in a voting paradigm for the consideration of reliability and discriminative power. Effectiveness of the proposed approach is compared and evaluated to other VAD techniques by using two well-known databases, namely, TIMIT database and NOISEX-92 database. Experimental results show that the proposed method performs well under a variety of noisy conditions.


2021 ◽  
Vol 11 (6) ◽  
pp. 2816
Author(s):  
Hansol Kim ◽  
Jong Won Shin

The transfer function-generalized sidelobe canceller (TF-GSC) is one of the most popular structures for the adaptive beamformer used in multi-channel speech enhancement. Although the TF-GSC has shown decent performance, a certain amount of steering error is inevitable, which causes leakage of speech components through the blocking matrix (BM) and distortion in the fixed beamformer (FBF) output. In this paper, we propose to suppress the leaked signal in the output of the BM and restore the desired signal in the FBF output of the TF-GSC. To reduce the risk of attenuating speech in the adaptive noise canceller (ANC), the speech component in the output of the BM is suppressed by applying a gain function similar to the square-root Wiener filter, assuming that a certain portion of the desired speech should be leaked into the BM output. Additionally, we propose to restore the attenuated desired signal in the FBF output by adding some of the microphone signal components back, depending on how microphone signals are related to the FBF and BM outputs. The experimental results showed that the proposed TF-GSC outperformed conventional TF-GSC in terms of the perceptual evaluation of speech quality (PESQ) scores under various noise conditions and the direction of arrivals for the desired and interfering sources.


Sign in / Sign up

Export Citation Format

Share Document