Auditory motivated front-end for noisy speech using spectro-temporal modulation filtering

2014 · Vol 136 (5) · pp. EL343-EL349
Author(s): Sriram Ganapathy, Mohamed Omar

2020
Author(s): Emmanuel Ponsot, Léo Varnet, Nicolas Wallaert, Elza Daoud, Shihab A. Shamma, ...

Abstract: Spectrotemporal modulations (STMs) offer a unified framework to probe suprathreshold auditory processing. Here, we introduce a novel methodological framework based on psychophysical reverse correlation, deployed in the modulation space, to characterize how STMs are detected by the auditory system and how cochlear hearing loss impacts this processing. Our results show that young normal-hearing (NH) and older hearing-impaired (HI) individuals rely on a comparable non-linear processing architecture involving non-directional band-pass modulation filtering. We demonstrate that a temporal-modulation filter-bank model can capture the strategy of the NH group and that a broader tuning of cochlear filters is sufficient to explain the HI group's overall shift toward temporal modulations. Yet, idiosyncratic behaviors exposed within each group highlight the contribution of, and the need to consider, additional mechanisms. This integrated experimental-computational approach offers a principled way to assess the suprathreshold auditory processing distortions of each individual.
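The core idea of psychophysical reverse correlation described in this abstract can be sketched numerically: present noise fields in a stimulus space, record yes/no detection responses, and estimate the observer's perceptual weighting as the mean difference between noise fields on "yes" versus "no" trials (a classification image). The sketch below simulates this in a hypothetical rate-scale modulation grid; the grid dimensions, template shape, and noise levels are illustrative assumptions, not the paper's actual stimuli or fitted filters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical modulation-space grid: temporal rates (Hz) x spectral scales (cyc/oct)
n_rate, n_scale = 16, 8
n_trials = 20000

# Assumed internal template of a simulated observer: band-pass in rate,
# low-pass in scale (illustrative only, not a fitted model from the study).
rate = np.linspace(1, 32, n_rate)[:, None]
scale = np.linspace(0.25, 4, n_scale)[None, :]
template = np.exp(-((rate - 8) / 6) ** 2) * np.exp(-scale / 2)
template /= np.linalg.norm(template)

def reverse_correlate(template, n_trials, noise_sd=1.0, criterion=0.0):
    """Estimate a perceptual filter by classification-image reverse correlation:
    average the noise fields that elicited 'yes' minus those that elicited 'no'."""
    noises = rng.normal(0.0, noise_sd, size=(n_trials, n_rate, n_scale))
    # Simulated decision: inner product of each noise field with the template,
    # plus internal (decision) noise, compared against a criterion.
    evidence = (noises * template).sum(axis=(1, 2)) + rng.normal(0, 0.5, n_trials)
    yes = evidence > criterion
    return noises[yes].mean(axis=0) - noises[~yes].mean(axis=0)

kernel = reverse_correlate(template, n_trials)
# With enough trials, the recovered classification image is proportional
# to the underlying template, so their correlation should be high.
r = np.corrcoef(kernel.ravel(), template.ravel())[0, 1]
print(round(r, 2))
```

The same logic carries over to real experiments: the "evidence" step is replaced by a listener's responses, and the recovered kernel reveals which modulation regions drive detection.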


Author(s): Hyeongju Kim, Hyeonseung Lee, Woo Hyun Kang, Hyung Yong Kim, Nam Soo Kim

For multi-channel speech recognition, speech enhancement techniques such as denoising or dereverberation are conventionally applied as a front-end processor. Deep-learning-based front-ends using such techniques require aligned clean and noisy speech pairs, which are generally obtained via data simulation. Recently, several joint optimization techniques have been proposed to train the front-end without parallel data within an end-to-end automatic speech recognition (ASR) scheme. However, the ASR objective alone is sub-optimal and insufficient for fully training the front-end, which leaves room for improvement. In this paper, we propose a novel approach that incorporates flow-based density estimation to train a robust front-end using non-parallel clean and noisy speech. Experimental results on the CHiME-4 dataset show that the proposed method outperforms conventional techniques in which the front-end is trained only with the ASR objective.
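The principle behind flow-based density estimation in this setting can be sketched without parallel data: fit an invertible flow to clean speech features by exact maximum likelihood, then score how "clean-like" any features are via the flow's log-density, which can serve as a front-end training signal. The toy below uses a single affine flow on 2-D Gaussian "features"; the real system would presumably use deep invertible networks on spectral features, so everything here (data, flow, learning rates) is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for clean speech features: a 2-D Gaussian cloud
clean = rng.normal([1.0, -2.0], [0.5, 2.0], size=(5000, 2))

class AffineFlow:
    """Minimal invertible flow z = (x - mu) / sigma with exact log-density
    via change of variables: log p(x) = log N(z; 0, I) - sum(log sigma)."""
    def __init__(self, dim):
        self.mu = np.zeros(dim)
        self.log_sigma = np.zeros(dim)

    def log_prob(self, x):
        z = (x - self.mu) * np.exp(-self.log_sigma)
        log_base = -0.5 * (z ** 2 + np.log(2 * np.pi)).sum(axis=1)
        return log_base - self.log_sigma.sum()

    def fit(self, x, lr=0.1, steps=200):
        # Gradient ascent on the exact log-likelihood of the affine flow
        for _ in range(steps):
            z = (x - self.mu) * np.exp(-self.log_sigma)
            self.mu += lr * (z * np.exp(-self.log_sigma)).mean(axis=0)
            self.log_sigma += lr * ((z ** 2).mean(axis=0) - 1.0)

flow = AffineFlow(2)
flow.fit(clean)  # trained on clean data only: no noisy/clean pairing needed

# Density-based scoring: features degraded by noise should be less likely
# under the clean-speech density, giving a higher negative log-likelihood.
noisy = clean + rng.normal(0, 3.0, size=clean.shape)
nll_clean = -flow.log_prob(clean).mean()
nll_noisy = -flow.log_prob(noisy).mean()
print(nll_clean < nll_noisy)
```

In the full approach, a front-end would be trained to map noisy features toward low NLL under such a clean-speech flow, replacing the need for simulated parallel pairs.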


1990 · Vol 137 (1) · pp. 57
Author(s): M. Steyaert, Z. Chang
