A new time-frequency binary mask estimation method based on convex optimization of speech power

2018 ◽  
Vol 97 ◽  
pp. 51-65 ◽  
Author(s):  
Feng Bao ◽  
Waleed H. Abdulla

In computational auditory scene analysis, accurate estimation of the binary mask or ratio mask plays a key role in noise masking. Inaccurate estimation often introduces artifacts and temporal discontinuities in the synthesized speech. To overcome this problem, we propose a new ratio mask estimation method based on Wiener filtering in each Gammatone channel. To reconstruct the Wiener filter, we use the relationship between the speech and noise power spectra in each Gammatone channel to build the objective function for the convex optimization of speech power. To improve estimation accuracy, the estimated ratio mask is further modified based on its adjacent time–frequency units and then smoothed by interpolating with the estimated binary masks. Objective tests, including signal-to-noise ratio improvement, spectral distortion, and intelligibility, together with a subjective listening test, demonstrate the superiority of the proposed method over the reference methods.
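The masking ideas named in the abstract can be illustrated with a minimal sketch. It assumes the speech and noise power in each time–frequency unit are already known (the paper estimates speech power by convex optimization, which is not reproduced here). The function names, the 3x3 neighbourhood average, the 0 dB local criterion, and the blending weight `alpha` are all illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def wiener_ratio_mask(speech_power, noise_power, eps=1e-12):
    """Wiener-type gain per time-frequency unit: Ps / (Ps + Pn)."""
    ps = np.asarray(speech_power, dtype=float)
    pn = np.asarray(noise_power, dtype=float)
    return ps / (ps + pn + eps)

def binary_mask(speech_power, noise_power, lc_db=0.0, eps=1e-12):
    """1 where the local SNR exceeds the criterion lc_db (assumed 0 dB), else 0."""
    ps = np.asarray(speech_power, dtype=float)
    pn = np.asarray(noise_power, dtype=float)
    snr_db = 10.0 * np.log10((ps + eps) / (pn + eps))
    return (snr_db > lc_db).astype(float)

def smooth_with_neighbours(mask):
    """Average each unit with its adjacent units (3x3 window, edge-padded).

    This is a stand-in for the paper's neighbour-based modification.
    """
    mask = np.asarray(mask, dtype=float)
    padded = np.pad(mask, 1, mode="edge")
    rows, cols = mask.shape
    out = np.zeros_like(mask)
    for dr in range(3):
        for dc in range(3):
            out += padded[dr:dr + rows, dc:dc + cols]
    return out / 9.0

def blend_masks(ratio, binary, alpha=0.5):
    """Linear interpolation between a ratio mask and a binary mask (assumed form)."""
    return alpha * np.asarray(ratio, dtype=float) + (1.0 - alpha) * np.asarray(binary, dtype=float)
```

A typical call chain would be `blend_masks(smooth_with_neighbours(wiener_ratio_mask(ps, pn)), binary_mask(ps, pn))`, where `ps` and `pn` are arrays of shape (channels, frames).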


Energies ◽  
2021 ◽  
Vol 14 (5) ◽  
pp. 1437
Author(s):  
Mahfoud Drouaz ◽  
Bruno Colicchio ◽  
Ali Moukadem ◽  
Alain Dieterlen ◽  
Djafar Ould-Abdeslam

A crucial step in nonintrusive load monitoring (NILM) is feature extraction, which applies signal processing techniques to extract features from voltage and current signals. This paper presents a new time-frequency feature based on the Stockwell transform. The extracted features aim to describe the shape of the current transient signal by applying an energy measure to the fundamental and harmonic frequency voices. To validate the proposed methodology, classical machine learning tools (k-NN and decision tree classifiers) are applied to two existing datasets: the Controlled On/Off Loads Library (COOLL) and the Home Equipment Laboratory Dataset (HELD1). The classification rates achieved are clearly higher than those reported in related studies in the literature, at 99.52% and 96.92% for the COOLL and HELD1 datasets, respectively.
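As a rough illustration of the idea of measuring energy on Stockwell-transform voices, the sketch below computes a single frequency voice with the standard FFT-based discrete S-transform and sums its squared magnitude at the fundamental and its harmonics. The function names, the 50 Hz default fundamental, the number of harmonics, and the choice of summed squared magnitude as the energy measure are assumptions for illustration; the paper's exact feature definition may differ.

```python
import numpy as np

def stockwell_voice(x, n):
    """Return the S-transform voice of x at integer frequency bin n."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    if n == 0:
        # The zero-frequency voice is the signal mean at every time index.
        return np.full(N, np.mean(x), dtype=complex)
    X = np.fft.fft(x)
    m = np.fft.fftfreq(N) * N            # wrapped frequency indices
    gauss = np.exp(-2.0 * np.pi**2 * m**2 / n**2)
    shifted = np.roll(X, -n)             # shifted[m] = X[(m + n) mod N]
    return np.fft.ifft(shifted * gauss)

def harmonic_energy_features(current, fs, f0=50.0, n_harmonics=3):
    """Energy of the voices at f0, 2*f0, ..., n_harmonics*f0 (assumed measure)."""
    x = np.asarray(current, dtype=float)
    N = len(x)
    feats = []
    for k in range(1, n_harmonics + 1):
        n = int(round(k * f0 * N / fs))  # nearest frequency bin
        if n >= N // 2:
            raise ValueError("harmonic lies at or above the Nyquist frequency")
        voice = stockwell_voice(x, n)
        feats.append(np.sum(np.abs(voice) ** 2))
    return np.array(feats)
```

The resulting feature vector could then be passed to any standard classifier, such as k-NN or a decision tree, as the abstract describes.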

