Concurrent processing of voice activity detection and noise reduction using empirical mode decomposition and modulation spectrum analysis

Author(s):  
Yasuaki Kanai ◽  
Shota Morita ◽  
Masashi Unoki
2012 ◽  
Vol 198-199 ◽  
pp. 1560-1566
Author(s):  
Wen Lian Zhan ◽  
Jing Fang Wang

The Hilbert-Huang transform, developed in recent years, is a complete local time-frequency method for analyzing nonlinear, non-stationary signals; the recurrence plot method reconstructs the recurrent nonlinear dynamic behavior of a time series. This paper combines the empirical mode decomposition (EMD) of the Hilbert-Huang transform with the recurrence plot (RP) method to propose a new voice activity detection algorithm. First, speech and noise are decomposed by EMD, and the intrinsic mode functions (IMFs), which have different multi-scale features, are filtered on a time scale. The nonlinear dynamic behavior is then characterized with the recurrence plot method, and an uncertainty statistic from quantitative recurrence analysis is used for endpoint detection. Simulation results show that the method has a strong capability for analyzing non-stationary dynamics and that, in low-SNR environments, it extracts the start and end points of the speech signal more accurately and more robustly than the traditional method.
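As a rough illustration of the recurrence plot idea mentioned in this abstract, the sketch below builds a recurrence matrix from a time-delay embedding and computes the recurrence rate, one common statistic in recurrence quantification analysis. The abstract does not say which statistic, embedding dimension, delay, or distance threshold the authors use, so the function names and the parameters `dim`, `delay`, and `eps` are assumptions chosen for illustration only.

```python
import numpy as np

def recurrence_matrix(x, dim=3, delay=1, eps=0.1):
    """Boolean recurrence matrix of a 1-D series (illustrative parameters)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * delay  # number of embedded state vectors
    if n <= 0:
        raise ValueError("series too short for this embedding")
    # Time-delay embedding: row k is (x[k], x[k+delay], ..., x[k+(dim-1)*delay]).
    emb = np.stack([x[i * delay : i * delay + n] for i in range(dim)], axis=1)
    # Pairwise Euclidean distances between all embedded states.
    dist = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=2)
    # Two states are "recurrent" when they lie within eps of each other.
    return dist <= eps

def recurrence_rate(rp):
    """Fraction of recurrent points in the recurrence matrix."""
    return float(rp.mean())
```

In a detector along these lines, such a statistic would be computed per frame (for example on a selected IMF) and compared against a threshold; how the paper actually does this is not stated in the abstract.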


2013 ◽  
Vol 22 (3) ◽  
pp. 269-282
Author(s):  
M.S. Rudramurthy ◽  
V. Kamakshi Prasad ◽  
R. Kumaraswamy

In this article, a new adaptive data-driven strategy for voice activity detection (VAD) using empirical mode decomposition (EMD) is proposed. Speech data are decomposed in the time domain using an a posteriori, adaptive, data-driven EMD to yield a set of physically meaningful intrinsic mode functions (IMFs). Each IMF preserves the nonlinear and nonstationary properties of the speech utterance. Among the set of IMFs, the one that predominantly contains source information, called the characteristic IMF (CIMF), can be identified and extracted by designing a zero-frequency filter-assisted peaking resonator. The energy of the detected CIMF is computed using short-term processing, and, with a properly chosen threshold, voiced regions in speech utterances are detected from the frame energy. The proposed framework has been studied on both clean and noisy speech utterances (0-dB white noise) and shows encouraging results in the presence of white noise up to 0 dB.
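The final stage described in this abstract, short-term energy followed by a threshold, can be sketched as below. This is a minimal sketch, not the authors' implementation: it assumes the CIMF has already been extracted and is passed in as `signal`, and the frame length, hop size, and the relative threshold `rel_threshold` are hypothetical values, since the abstract does not give them.

```python
import numpy as np

def frame_energy_vad(signal, frame_len=400, hop=160, rel_threshold=0.1):
    """Return per-frame energy and a boolean activity flag per frame.

    frame_len, hop and rel_threshold are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        return np.zeros(0), np.zeros(0, dtype=bool)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Short-term energy: sum of squared samples in each frame.
    energy = np.array([
        np.sum(signal[i * hop : i * hop + frame_len] ** 2)
        for i in range(n_frames)
    ])
    # A frame is marked active when its energy exceeds a fraction of the peak.
    threshold = rel_threshold * energy.max()
    return energy, energy > threshold
```

A threshold relative to the peak frame energy is only one simple choice; a fixed or noise-adaptive threshold would fit the same structure.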


2004 ◽  
Author(s):  
Javier Ramirez ◽  
José Carlos Segura ◽  
Carmen Benitez ◽  
Angel de la Torre ◽  
Antonio Rubio
