The role of auditory spectro-temporal modulation filtering and the decision metric for speech intelligibility prediction

2014 ◽  
Vol 135 (6) ◽  
pp. 3502-3512 ◽  
Author(s):  
Alexandre Chabot-Leclerc ◽  
Søren Jørgensen ◽  
Torsten Dau

2021 ◽  
Vol 25 ◽  
pp. 233121652097802
Author(s):  
Emmanuel Ponsot ◽  
Léo Varnet ◽  
Nicolas Wallaert ◽  
Elza Daoud ◽  
Shihab A. Shamma ◽  
...  

Spectrotemporal modulations (STM) are essential features of speech signals that make them intelligible. While their encoding has been widely investigated in neurophysiology, we still lack a full understanding of how STMs are processed at the behavioral level and how cochlear hearing loss impacts this processing. Here, we introduce a novel methodological framework based on psychophysical reverse correlation deployed in the modulation space to characterize the mechanisms underlying STM detection in noise. We derive perceptual filters for young normal-hearing and older hearing-impaired individuals performing a detection task of an elementary target STM (a given product of temporal and spectral modulations) embedded in other masking STMs. Analyzed with computational tools, our data show that both groups rely on a comparable linear (band-pass)–nonlinear processing cascade, which can be well accounted for by a temporal modulation filter bank model combined with cross-correlation against the target representation. Our results also suggest that the modulation mistuning observed for the hearing-impaired group results primarily from broader cochlear filters. Yet, we find idiosyncratic behaviors that cannot be captured by cochlear tuning alone, highlighting the need to consider variability originating from additional mechanisms. Overall, this integrated experimental-computational approach offers a principled way to assess suprathreshold processing distortions in each individual and could thus be used to further investigate interindividual differences in speech intelligibility.
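
As a rough illustration of the reverse-correlation idea mentioned in this abstract, the sketch below estimates a perceptual kernel from yes/no responses to random noise. It is a generic classification-image toy and not the authors' procedure: the simulated observer, the five-component "modulation space", and the names `simulate_trials` and `classification_image` are all assumptions made for the example.

```python
import random


def simulate_trials(template, n_trials, internal_sd=0.5, seed=0):
    """Simulate a hypothetical linear observer responding to random noise.

    Each trial draws one Gaussian noise value per component. The observer
    answers "yes" when the template-weighted sum of the noise, plus some
    internal noise, is positive.
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        noise = [rng.gauss(0.0, 1.0) for _ in template]
        decision_variable = sum(w * x for w, x in zip(template, noise))
        decision_variable += rng.gauss(0.0, internal_sd)
        trials.append((noise, decision_variable > 0.0))
    return trials


def classification_image(trials):
    """Estimate the kernel as mean noise on "yes" trials minus mean on "no" trials."""
    n_dims = len(trials[0][0])
    sums = {True: [0.0] * n_dims, False: [0.0] * n_dims}
    counts = {True: 0, False: 0}
    for noise, said_yes in trials:
        counts[said_yes] += 1
        for i, x in enumerate(noise):
            sums[said_yes][i] += x
    mean_yes = [s / counts[True] for s in sums[True]]
    mean_no = [s / counts[False] for s in sums[False]]
    return [y - n for y, n in zip(mean_yes, mean_no)]
```

Calling `classification_image(simulate_trials([0.0, 0.0, 1.0, 0.0, 0.0], 2000))` should return a kernel whose largest weight sits on the third component, the only one the simulated observer uses.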


2021 ◽  
Author(s):  
Vibha Viswanathan ◽  
Barbara G. Shinn-Cunningham ◽  
Michael G. Heinz

To understand the mechanisms of speech perception in everyday listening environments, it is important to elucidate the relative contributions of different acoustic cues in transmitting phonetic content. Previous studies suggest that the energy envelopes of speech convey most speech content, while the temporal fine structure (TFS) can aid in segregating target speech from background noise. Despite the vast literature on TFS and speech intelligibility, the role of TFS in conveying additional speech content over what envelopes convey in complex acoustic scenes is poorly understood. The present study addresses this question using online psychophysical experiments to measure consonant identification in multi-talker babble for intelligibility-matched intact and 64-channel envelope-vocoded stimuli. Consonant confusion patterns revealed that listeners had a greater tendency in the vocoded (versus intact) condition to be biased towards reporting that they heard an unvoiced consonant, despite envelope and place cues being largely preserved. This result was replicated when babble instances were varied across independent experiments, suggesting that TFS conveys important voicing cues over what envelopes convey in multi-talker babble, a masker that is ubiquitous in everyday environments. This finding has implications for assistive listening devices that do not currently provide TFS cues, such as cochlear implants.


2020 ◽  
Author(s):  
Emmanuel Ponsot ◽  
Léo Varnet ◽  
Nicolas Wallaert ◽  
Elza Daoud ◽  
Shihab A. Shamma ◽  
...  

Abstract Spectrotemporal modulations (STMs) offer a unified framework to probe suprathreshold auditory processing. Here, we introduce a novel methodological framework based on psychophysical reverse-correlation deployed in the modulation space to characterize how STMs are detected by the auditory system and how cochlear hearing loss impacts this processing. Our results show that young normal-hearing (NH) and older hearing-impaired (HI) individuals rely on a comparable non-linear processing architecture involving non-directional band-pass modulation filtering. We demonstrate that a temporal-modulation filter-bank model can capture the strategy of the NH group and that a broader tuning of cochlear filters is sufficient to explain the overall shift toward temporal modulations of the HI group. Yet, idiosyncratic behaviors exposed within each group highlight the contribution and the need to consider additional mechanisms. This integrated experimental-computational approach offers a principled way to assess supra-threshold auditory processing distortions of each individual.


2019 ◽  
Vol 23 ◽  
pp. 233121651985459 ◽  
Author(s):  
Jan Rennies ◽  
Virginia Best ◽  
Elin Roverud ◽  
Gerald Kidd

Speech perception in complex sound fields can greatly benefit from different unmasking cues to segregate the target from interfering voices. This study investigated the role of three unmasking cues (spatial separation, gender differences, and masker time reversal) on speech intelligibility and perceived listening effort in normal-hearing listeners. Speech intelligibility and categorically scaled listening effort were measured for a female target talker masked by two competing talkers with no unmasking cues or one to three unmasking cues. In addition to natural stimuli, all measurements were also conducted with glimpsed speech—which was created by removing the time–frequency tiles of the speech mixture in which the maskers dominated the mixture—to estimate the relative amounts of informational and energetic masking as well as the effort associated with source segregation. The results showed that all unmasking cues as well as glimpsing improved intelligibility and reduced listening effort and that providing more than one cue was beneficial in overcoming informational masking. The reduction in listening effort due to glimpsing corresponded to increases in signal-to-noise ratio of 8 to 18 dB, indicating that a significant amount of listening effort was devoted to segregating the target from the maskers. Furthermore, the benefit in listening effort for all unmasking cues extended well into the range of positive signal-to-noise ratios at which speech intelligibility was at ceiling, suggesting that listening effort is a useful tool for evaluating speech-on-speech masking conditions at typical conversational levels.
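
The glimpsing manipulation described in this abstract, removing the time-frequency tiles in which the maskers dominate, can be sketched as a binary mask applied to the mixture. This is a minimal illustration under assumptions not stated in the abstract: per-tile powers are given as nested lists (rows are frequency bands, columns are time frames), a tile counts as target-dominated when the local target-to-masker ratio exceeds `criterion_db` (0 dB here), and the function names are hypothetical.

```python
def glimpse_mask(target_power, masker_power, criterion_db=0.0):
    """Return a binary mask: 1 where the target dominates a tile, else 0.

    A tile is kept when target power exceeds masker power by more than
    criterion_db (an assumed criterion; the abstract does not give one).
    """
    ratio = 10 ** (criterion_db / 10)
    return [
        [1 if t > ratio * m else 0 for t, m in zip(t_row, m_row)]
        for t_row, m_row in zip(target_power, masker_power)
    ]


def apply_mask(mixture, mask):
    """Zero out the mixture tiles that the mask marks as masker-dominated."""
    return [
        [x * keep for x, keep in zip(x_row, k_row)]
        for x_row, k_row in zip(mixture, mask)
    ]
```

In practice the tiles would come from a short-time spectral analysis of the separate target and masker signals, and the masked mixture would be resynthesized to a waveform; both steps are left out of this sketch.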


2019 ◽  
Vol 71 (6) ◽  
pp. 1985-2004 ◽  
Author(s):  
Menghan Sun ◽  
Pham Anh Tuan ◽  
Marta S Izydorczyk ◽  
Belay T Ayele

Abstract This study aimed to gain insights into the molecular mechanisms underlying the role of ethylene in regulating germination and seedling growth in wheat by combining pharmacological, molecular, and metabolomics approaches. Our study showed that ethylene does not affect radicle protrusion but controls post-germination endospermic starch degradation through transcriptional regulation of specific α-amylase and α-glucosidase genes, and this effect is mediated by alteration of endospermic bioactive gibberellin (GA) levels, and GA sensitivity via expression of the GA signaling gene, TaGAMYB. Our data implicated ethylene as a positive regulator of embryo axis and coleoptile growth through transcriptional regulation of specific TaEXPA genes. These effects were associated with modulation of GA levels and sensitivity, through expression of GA metabolism (TaGA20ox1, TaGA3ox2, and TaGA2ox6) and signaling (TaGAMYB) genes, respectively, and/or the abscisic acid (ABA) level and sensitivity, via expression of specific ABA metabolism (TaNCED2 or TaCYP707A1) and signaling (TaABI3) genes, respectively. Ethylene appeared to regulate the expression of TaEXPA3 and thereby root growth through its control of coleoptile ABA metabolism, and root ABA signaling via expression of TaABI3 and TaABI5. These results show that spatiotemporal modulation of ABA/GA balance mediates the role of ethylene in regulating post-germination storage starch degradation and seedling growth in wheat.


2011 ◽  
Vol 53 (3) ◽  
pp. 327-339 ◽  
Author(s):  
Kuldip Paliwal ◽  
Belinda Schwerin ◽  
Kamil Wójcicki
