The role of auditory spectro-temporal modulation filtering and the decision metric for speech intelligibility prediction

Alexandre Chabot-Leclerc; Søren Jørgensen; Torsten Dau

doi:10.1121/1.4873517

Mechanisms of Spectrotemporal Modulation Detection for Normal- and Hearing-Impaired Listeners

Trends in Hearing ◽

10.1177/2331216520978029 ◽

2021 ◽

Vol 25 ◽

pp. 233121652097802

Author(s):

Emmanuel Ponsot ◽

Léo Varnet ◽

Nicolas Wallaert ◽

Elza Daoud ◽

Shihab A. Shamma ◽

...

Keyword(s):

Speech Intelligibility ◽

Hearing Impaired ◽

Temporal Modulation ◽

Computational Tools ◽

Cochlear Tuning ◽

Cochlear Hearing Loss ◽

Nonlinear Processing ◽

Linear Band ◽

Band Pass ◽

Modulation Filter

Spectrotemporal modulations (STMs) are essential features of speech signals that make them intelligible. While their encoding has been widely investigated in neurophysiology, we still lack a full understanding of how STMs are processed at the behavioral level and how cochlear hearing loss impacts this processing. Here, we introduce a novel methodological framework based on psychophysical reverse correlation deployed in the modulation space to characterize the mechanisms underlying STM detection in noise. We derive perceptual filters for young normal-hearing and older hearing-impaired individuals performing a detection task of an elementary target STM (a given product of temporal and spectral modulations) embedded in other masking STMs. Analyzed with computational tools, our data show that both groups rely on a comparable linear (band-pass)–nonlinear processing cascade, which can be well accounted for by a temporal modulation filter bank model combined with cross-correlation against the target representation. Our results also suggest that the modulation mistuning observed for the hearing-impaired group results primarily from broader cochlear filters. Yet, we find idiosyncratic behaviors that cannot be captured by cochlear tuning alone, highlighting the need to consider variability originating from additional mechanisms. Overall, this integrated experimental-computational approach offers a principled way to assess suprathreshold processing distortions in each individual and could thus be used to further investigate interindividual differences in speech intelligibility.
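
The reverse-correlation idea in this abstract can be illustrated with a toy simulation. The sketch below is not the authors' procedure: it assumes a hypothetical linear observer with a hidden template over a small grid of (temporal rate, spectral scale) bins, Gaussian masker energy in each bin, and a yes/no response with the criterion at zero. The function name, grid size, and all parameter values are illustrative assumptions. The perceptual filter is estimated as the mean masker map on "yes" trials minus the mean masker map on "no" trials.

```python
import numpy as np

def simulate_reverse_correlation(template, n_trials=5000, internal_noise=1.0, seed=0):
    """Estimate a perceptual filter (classification image) for a simulated observer.

    All modelling choices are assumptions for illustration only.
    """
    rng = np.random.default_rng(seed)
    # One random masker map per trial: noise energy in each modulation bin.
    maskers = rng.standard_normal((n_trials,) + template.shape)
    # Hypothetical observer: template match plus internal noise.
    decision = (maskers * template).sum(axis=(1, 2))
    decision = decision + internal_noise * rng.standard_normal(n_trials)
    responses = decision > 0  # True means the observer answered "yes"
    # Classification image: average masker on "yes" trials minus "no" trials.
    return maskers[responses].mean(axis=0) - maskers[~responses].mean(axis=0)

# Hidden template: a band-pass bump centred on one (rate, scale) bin.
rates = np.arange(6)
scales = np.arange(5)
template = np.exp(-((rates[:, None] - 3) ** 2 + (scales[None, :] - 2) ** 2) / 2.0)

kernel = simulate_reverse_correlation(template)
# Correlation between the recovered kernel and the hidden template.
similarity = np.corrcoef(kernel.ravel(), template.ravel())[0, 1]
print(f"kernel/template correlation: {similarity:.2f}")
```

With enough trials, the recovered kernel should approximate the hidden template up to a scale factor, which is why such kernels can be read as perceptual filters under the linear-observer assumption.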

Temporal fine structure influences voicing confusions for consonant identification in multi-talker babble

10.1101/2021.05.11.443678 ◽

2021 ◽

Author(s):

Vibha Viswanathan ◽

Barbara G. Shinn-Cunningham ◽

Michael G. Heinz

Keyword(s):

Fine Structure ◽

Cochlear Implants ◽

Background Noise ◽

Speech Intelligibility ◽

Temporal Fine Structure ◽

Vast Literature ◽

Listening Environments ◽

Intact Condition ◽

Speech Content

To understand the mechanisms of speech perception in everyday listening environments, it is important to elucidate the relative contributions of different acoustic cues in transmitting phonetic content. Previous studies suggest that the energy envelopes of speech convey most speech content, while the temporal fine structure (TFS) can aid in segregating target speech from background noise. Despite the vast literature on TFS and speech intelligibility, the role of TFS in conveying additional speech content over what envelopes convey in complex acoustic scenes is poorly understood. The present study addresses this question using online psychophysical experiments to measure consonant identification in multi-talker babble for intelligibility-matched intact and 64-channel envelope-vocoded stimuli. Consonant confusion patterns revealed that listeners had a greater tendency in the vocoded (versus intact) condition to be biased towards reporting that they heard an unvoiced consonant, despite envelope and place cues being largely preserved. This result was replicated when babble instances were varied across independent experiments, suggesting that TFS conveys important voicing cues over what envelopes convey in multi-talker babble, a masker that is ubiquitous in everyday environments. This finding has implications for assistive listening devices that do not currently provide TFS cues, such as cochlear implants.

Mechanisms of spectrotemporal modulation detection for normal- and hearing-impaired listeners

10.1101/2020.01.03.894667 ◽

2020 ◽

Author(s):

Emmanuel Ponsot ◽

Léo Varnet ◽

Nicolas Wallaert ◽

Elza Daoud ◽

Shihab A. Shamma ◽

...

Keyword(s):

Auditory Processing ◽

Filter Bank ◽

Hearing Impaired ◽

Temporal Modulation ◽

Unified Framework ◽

Cochlear Hearing Loss ◽

Band Pass ◽

Modulation Filtering ◽

Modulation Filter ◽

Processing Architecture

Spectrotemporal modulations (STMs) offer a unified framework to probe suprathreshold auditory processing. Here, we introduce a novel methodological framework based on psychophysical reverse-correlation deployed in the modulation space to characterize how STMs are detected by the auditory system and how cochlear hearing loss impacts this processing. Our results show that young normal-hearing (NH) and older hearing-impaired (HI) individuals rely on a comparable non-linear processing architecture involving non-directional band-pass modulation filtering. We demonstrate that a temporal-modulation filter-bank model can capture the strategy of the NH group and that a broader tuning of cochlear filters is sufficient to explain the overall shift toward temporal modulations of the HI group. Yet, idiosyncratic behaviors exposed within each group highlight the contribution of additional mechanisms and the need to consider them. This integrated experimental-computational approach offers a principled way to assess supra-threshold auditory processing distortions of each individual.

Auditory motivated front-end for noisy speech using spectro-temporal modulation filtering

The Journal of the Acoustical Society of America ◽

10.1121/1.4896406 ◽

2014 ◽

Vol 136 (5) ◽

pp. EL343-EL349 ◽

Cited By ~ 2

Author(s):

Sriram Ganapathy ◽

Mohamed Omar

Keyword(s):

Temporal Modulation ◽

Noisy Speech ◽

Front End ◽

Modulation Filtering

Energetic and Informational Components of Speech-on-Speech Masking in Binaural Speech Intelligibility and Perceived Listening Effort

Trends in Hearing ◽

10.1177/2331216519854597 ◽

2019 ◽

Vol 23 ◽

pp. 233121651985459 ◽

Cited By ~ 8

Author(s):

Jan Rennies ◽

Virginia Best ◽

Elin Roverud ◽

Gerald Kidd

Keyword(s):

Speech Intelligibility ◽

Signal To Noise Ratio ◽

Spatial Separation ◽

Signal To Noise ◽

Listening Effort ◽

Complex Sound ◽

Time Frequency ◽

Sound Fields ◽

Energetic Masking

Speech perception in complex sound fields can greatly benefit from different unmasking cues to segregate the target from interfering voices. This study investigated the role of three unmasking cues (spatial separation, gender differences, and masker time reversal) on speech intelligibility and perceived listening effort in normal-hearing listeners. Speech intelligibility and categorically scaled listening effort were measured for a female target talker masked by two competing talkers with no unmasking cues or one to three unmasking cues. In addition to natural stimuli, all measurements were also conducted with glimpsed speech—which was created by removing the time–frequency tiles of the speech mixture in which the maskers dominated the mixture—to estimate the relative amounts of informational and energetic masking as well as the effort associated with source segregation. The results showed that all unmasking cues as well as glimpsing improved intelligibility and reduced listening effort and that providing more than one cue was beneficial in overcoming informational masking. The reduction in listening effort due to glimpsing corresponded to increases in signal-to-noise ratio of 8 to 18 dB, indicating that a significant amount of listening effort was devoted to segregating the target from the maskers. Furthermore, the benefit in listening effort for all unmasking cues extended well into the range of positive signal-to-noise ratios at which speech intelligibility was at ceiling, suggesting that listening effort is a useful tool for evaluating speech-on-speech masking conditions at typical conversational levels.
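
The glimpsing manipulation described in this abstract, removing the time-frequency tiles in which the maskers dominate, can be sketched as a binary mask over a time-frequency representation. This is a minimal illustration rather than the study's processing chain: the function names, the toy power values, and the 0 dB local criterion are assumptions, and the computation of the time-frequency representation itself is left out.

```python
import numpy as np

def glimpse_mask(target_power, masker_power, local_criterion_db=0.0):
    """Return True for tiles where the target exceeds the masker.

    The 0 dB default criterion is an assumption; the abstract does not state one.
    """
    eps = np.finfo(float).tiny  # avoids division by zero and log of zero
    local_snr_db = 10.0 * np.log10((target_power + eps) / (masker_power + eps))
    return local_snr_db > local_criterion_db

def apply_glimpsing(mixture_tf, mask):
    """Zero out the mixture tiles in which the maskers dominate."""
    return np.where(mask, mixture_tf, 0.0)

# Toy power values on a 2 x 2 grid (frequency x time), for illustration only.
target_power = np.array([[4.0, 1.0], [0.5, 9.0]])
masker_power = np.array([[1.0, 2.0], [2.0, 1.0]])
mixture_power = target_power + masker_power  # assumes uncorrelated sources

mask = glimpse_mask(target_power, masker_power)
glimpsed = apply_glimpsing(mixture_power, mask)
print(mask)
print(glimpsed)
```

Because the mask needs separate access to the target and the maskers, this kind of processing is only possible offline, when the sources are available before mixing.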

The role of combined consonant duration and amplitude processing on speech intelligibility in noise

The Journal of the Acoustical Society of America ◽

10.1121/1.2935735 ◽

2008 ◽

Vol 123 (5) ◽

pp. 3865-3865

Author(s):

Jeffrey J. Digiovanni ◽

Jessica A. Wolfanger

Keyword(s):

Speech Intelligibility

Ethylene regulates post-germination seedling growth in wheat through spatial and temporal modulation of ABA/GA balance

Journal of Experimental Botany ◽

10.1093/jxb/erz566 ◽

2019 ◽

Vol 71 (6) ◽

pp. 1985-2004 ◽

Cited By ~ 2

Author(s):

Menghan Sun ◽

Pham Anh Tuan ◽

Marta S Izydorczyk ◽

Belay T Ayele

Keyword(s):

Transcriptional Regulation ◽

Seedling Growth ◽

Molecular Mechanisms ◽

Starch Degradation ◽

Aba Signaling ◽

Temporal Modulation ◽

Aba Metabolism ◽

Radicle Protrusion ◽

Storage Starch

This study aimed to gain insights into the molecular mechanisms underlying the role of ethylene in regulating germination and seedling growth in wheat by combining pharmacological, molecular, and metabolomics approaches. Our study showed that ethylene does not affect radicle protrusion but controls post-germination endospermic starch degradation through transcriptional regulation of specific α-amylase and α-glucosidase genes, and this effect is mediated by alteration of endospermic bioactive gibberellin (GA) levels, and GA sensitivity via expression of the GA signaling gene, TaGAMYB. Our data implicated ethylene as a positive regulator of embryo axis and coleoptile growth through transcriptional regulation of specific TaEXPA genes. These effects were associated with modulation of GA levels and sensitivity, through expression of GA metabolism (TaGA20ox1, TaGA3ox2, and TaGA2ox6) and signaling (TaGAMYB) genes, respectively, and/or the abscisic acid (ABA) level and sensitivity, via expression of specific ABA metabolism (TaNCED2 or TaCYP707A1) and signaling (TaABI3) genes, respectively. Ethylene appeared to regulate the expression of TaEXPA3 and thereby root growth through its control of coleoptile ABA metabolism, and root ABA signaling via expression of TaABI3 and TaABI5. These results show that spatiotemporal modulation of ABA/GA balance mediates the role of ethylene in regulating post-germination storage starch degradation and seedling growth in wheat.