Auditory Models of Suprathreshold Distortion and Speech Intelligibility in Persons with Impaired Hearing

2013 ◽  
Vol 24 (04) ◽  
pp. 307-328 ◽  
Author(s):  
Joshua G.W. Bernstein ◽  
Van Summers ◽  
Elena Grassi ◽  
Ken W. Grant

Background: Hearing-impaired (HI) individuals with similar ages and audiograms often demonstrate substantial differences in speech-reception performance in noise. Traditional models of speech intelligibility focus primarily on average performance for a given audiogram and therefore cannot account for these individual differences. Improved prediction accuracy might be achieved by simulating differences in the distortion that speech may undergo when processed through an impaired ear. Although some attempts to model particular suprathreshold distortions can explain general speech-reception deficits not accounted for by audibility limitations, little has been done to model suprathreshold distortion and predict speech-reception performance for individual HI listeners. Auditory-processing models incorporating individualized measures of auditory distortion, along with audiometric thresholds, could provide a more complete understanding of the speech-reception deficits of HI individuals. A computational model capable of predicting individual differences in speech-recognition performance would be a valuable tool in the development and evaluation of hearing-aid signal-processing algorithms for enhancing speech intelligibility. Purpose: This study investigated whether biologically inspired models simulating peripheral auditory processing for individual HI listeners produce more accurate predictions of speech-recognition performance than audiogram-based models. Research Design: Psychophysical data on spectral and temporal acuity were incorporated into individualized auditory-processing models consisting of three stages: a peripheral stage, customized to reflect individual audiograms and spectral and temporal acuity; a cortical stage, which extracts spectral and temporal modulations relevant to speech; and an evaluation stage, which predicts speech-recognition performance by comparing the modulation content of clean and noisy speech. 
To investigate the impact of different aspects of peripheral processing on speech predictions, individualized details (absolute thresholds, frequency selectivity, spectrotemporal modulation [STM] sensitivity, compression) were incorporated progressively, culminating in a model simulating level-dependent spectral resolution and dynamic-range compression. Study Sample: Psychophysical and speech-reception data from 11 HI and six normal-hearing listeners were used to develop the models. Data Collection and Analysis: Eleven individualized HI models were constructed and validated against psychophysical measures of threshold, frequency resolution, compression, and STM sensitivity. Speech-intelligibility predictions were compared with measured performance in stationary speech-shaped noise at signal-to-noise ratios (SNRs) of −6, −3, 0, and 3 dB. Prediction accuracy for the individualized HI models was compared to the traditional audibility-based Speech Intelligibility Index (SII). Results: Models incorporating individualized measures of STM sensitivity yielded significantly more accurate within-SNR predictions than the SII. Additional individualized characteristics (frequency selectivity, compression) improved the predictions only marginally. A nonlinear model including individualized level-dependent cochlear-filter bandwidths, dynamic-range compression, and STM sensitivity predicted performance more accurately than the SII but was no more accurate than a simpler linear model. Predictions of speech-recognition performance simultaneously across SNRs and individuals were also significantly better for some of the auditory-processing models than for the SII. Conclusions: A computational model simulating individualized suprathreshold auditory-processing abilities produced more accurate speech-intelligibility predictions than the audibility-based SII. Most of this advantage was realized by a linear model incorporating audiometric and STM-sensitivity information. 
Although more consistent with known physiological aspects of auditory processing, modeling level-dependent changes in frequency selectivity and gain did not result in more accurate predictions of speech-reception performance.
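The audibility-based SII that serves as the baseline model above can be sketched as an importance-weighted sum of band audibilities, where each band's audibility grows linearly with its SNR over a 30-dB range. This is a simplified reading of the ANSI S3.5 procedure: the −15 to +15 dB band-SNR limits follow the standard, but the four-band example and its equal importance weights are illustrative assumptions, not the study's parameters.

```python
def band_audibility(snr_db):
    """Map a band SNR (dB) to an audibility value in [0, 1].

    Under the simplified SII rule, a band contributes nothing below
    -15 dB SNR, fully above +15 dB, and linearly in between.
    """
    return min(1.0, max(0.0, (snr_db + 15.0) / 30.0))


def sii(band_snrs_db, band_importance):
    """Simplified SII: importance-weighted sum of band audibilities."""
    assert abs(sum(band_importance) - 1.0) < 1e-9  # weights must sum to 1
    return sum(w * band_audibility(snr)
               for snr, w in zip(band_snrs_db, band_importance))


# Hypothetical four-band example with equal importance weights:
# audibilities are 0.3, 0.5, 0.6, and 1.0, giving an SII of 0.6.
print(sii([-6.0, 0.0, 3.0, 20.0], [0.25, 0.25, 0.25, 0.25]))
```

Because the SII depends only on band SNRs and hearing thresholds, it predicts identical performance for listeners with identical audiograms, which is exactly the limitation the individualized models above address.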

1992 ◽  
Vol 336 (1278) ◽  
pp. 295-306 ◽  

The past 30 years have seen a remarkable development in our understanding of how the auditory system - particularly the peripheral system - processes complex sounds. Perhaps the most significant advance has been in our understanding of the mechanisms underlying auditory frequency selectivity and their importance for normal and impaired auditory processing. Physiologically vulnerable cochlear filtering can account for many aspects of our normal and impaired psychophysical frequency selectivity, with important consequences for the perception of complex sounds. For normal hearing, remarkable mechanisms in the organ of Corti, involving enhancement of mechanical tuning (in mammals probably by feedback of electro-mechanically generated energy from the hair cells), produce exquisite tuning, reflected in the tuning properties of cochlear nerve fibres. Recent comparisons of physiological (cochlear nerve) and psychophysical frequency selectivity in the same species indicate that the ear’s overall frequency selectivity can be accounted for by this cochlear filtering, at least in bandwidth terms. Because this cochlear filtering is physiologically vulnerable, it deteriorates under deleterious cochlear conditions - hypoxia, disease, drugs, noise overexposure, mechanical disturbance - and this deterioration is reflected in impaired psychophysical frequency selectivity. This is a fundamental feature of sensorineural hearing loss of cochlear origin, and is of diagnostic value. This cochlear filtering, particularly as reflected in the temporal patterns of cochlear fibres to complex sounds, is remarkably robust over a wide range of stimulus levels. Furthermore, cochlear filtering properties are a prime determinant of the ‘place’ and ‘time’ coding of frequency at the cochlear nerve level, both of which appear to be involved in pitch perception. The problem of how the place and time coding of complex sounds is effected over the ear’s remarkably wide dynamic range is briefly addressed. 
In the auditory brainstem, particularly the dorsal cochlear nucleus, are inhibitory mechanisms responsible for enhancing the spectral and temporal contrasts in complex sounds. These mechanisms are now being dissected neuropharmacologically. At the cortical level, mechanisms are evident that are capable of abstracting biologically relevant features of complex sounds. Fundamental studies of how the auditory system encodes and processes complex sounds are vital to promising recent applications in the diagnosis and rehabilitation of the hearing impaired.


1995 ◽  
Vol 38 (1) ◽  
pp. 211-221 ◽  
Author(s):  
Ronald A. van Buuren ◽  
Joost M. Festen ◽  
Reinier Plomp

The long-term average frequency spectrum of speech was modified to 25 target frequency spectra in order to determine the effect of each of these spectra on speech intelligibility in noise and on sound quality. Speech intelligibility was evaluated using the test developed by Plomp and Mimpen (1979), whereas sound quality was examined through judgments of loudness, sharpness, clearness, and pleasantness of speech fragments. Subjects had different degrees of sensorineural hearing loss and sloping audiograms, but not all of them were hearing aid users. The 25 frequency spectra were defined such that the entire dynamic range of each listener, from dB above threshold to 5 dB below UCL, was covered. Frequency shaping of the speech was carried out on-line by means of Finite Impulse Response (FIR) filters. The tests on speech reception in noise indicated that the Speech-Reception Thresholds (SRTs) did not differ significantly for the majority of spectra. Spectra with high levels, especially at low frequencies (probably causing significant upward spread of masking), and also those with steep negative slopes resulted in significantly higher SRTs. Sound quality judgments led to conclusions virtually identical to those from the SRT data: frequency spectra with an unacceptably low sound quality were in most cases also significantly worse on the SRT test. Because the SRT did not vary significantly among the majority of frequency spectra, it was concluded that a wide range of spectra between the threshold and UCL levels of listeners with hearing losses is suitable for the presentation of speech energy. This is very useful in everyday listening, where the frequency spectrum of speech may vary considerably.


2005 ◽  
Vol 48 (3) ◽  
pp. 702-714 ◽  
Author(s):  
Peninah S. Rosengard ◽  
Karen L. Payton ◽  
Louis D. Braida

The purpose of this study was twofold: (a) to determine the extent to which 4-channel, slow-acting wide dynamic range amplitude compression (WDRC) can counteract the perceptual effects of reduced auditory dynamic range and (b) to examine the relation between objective measures of speech intelligibility and categorical ratings of speech quality for sentences processed with slow-acting WDRC. Multiband expansion was used to simulate the effects of elevated thresholds and loudness recruitment in normal-hearing listeners. While some previous studies have shown that WDRC can improve both speech intelligibility and quality, others have found no benefit. The current experiment shows that moderate amounts of compression can provide a small but significant improvement in speech intelligibility, relative to linear amplification, for simulated-loss listeners with small dynamic ranges (i.e., flat, moderate hearing loss). This benefit was found for speech at conversational levels, both in quiet and in a background of babble. Simulated-loss listeners with large dynamic ranges (i.e., sloping, mild-to-moderate hearing loss) did not show any improvement. Comparison of speech intelligibility scores and subjective ratings of intelligibility showed that listeners with simulated hearing loss could accurately judge the overall intelligibility of speech. However, in all listeners, ratings of pleasantness decreased as the compression ratio increased. These findings suggest that subjective measures of speech quality should be used in conjunction with either objective or subjective measures of speech intelligibility to ensure that participant-selected hearing aid parameters optimize both comfort and intelligibility.
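The static input-output behavior of a single WDRC channel can be sketched as a kneepoint-plus-ratio rule: linear gain below the kneepoint, reduced level growth above it. This is a generic textbook compression rule, not the study's 4-channel implementation; the gain, kneepoint, and 2:1 ratio values are illustrative assumptions.

```python
def wdrc_output_level(input_db, gain_db=20.0, kneepoint_db=50.0, ratio=2.0):
    """Static WDRC input/output rule for one channel (levels in dB SPL).

    Below the kneepoint the channel applies linear gain; above it, each
    1-dB increase in input yields only 1/ratio dB more output, squeezing
    a wide input range into a listener's narrow residual dynamic range.
    """
    if input_db <= kneepoint_db:
        return input_db + gain_db                       # linear region
    excess = input_db - kneepoint_db
    return kneepoint_db + gain_db + excess / ratio      # compressed region


# With a 2:1 ratio, a 20-dB rise in input above the kneepoint
# becomes only a 10-dB rise in output (70 dB -> 80 dB out).
print(wdrc_output_level(50.0), wdrc_output_level(70.0))
```

This is also why the abstract's dynamic-range result is plausible: the smaller the residual dynamic range, the more of the speech level distribution falls above the kneepoint, where compression can help.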


2003 ◽  
Vol 129 (3) ◽  
pp. 248-254 ◽  
Author(s):  
Jack J. Wazen ◽  
Jaclyn B. Spitzer ◽  
Soha N. Ghossaini ◽  
José N. Fayad ◽  
John K. Niparko ◽  
...  

OBJECTIVES: The purpose of this study was to evaluate the effectiveness of the Bone-Anchored Cochlear Stimulator (BAHA) in transcranial routing of signal by implanting the deaf ear. STUDY DESIGN AND SETTINGS: Eighteen patients with unilateral deafness were included in a multisite study. They had a 1-month preimplantation trial with a contralateral routing of signal (CROS) hearing aid. Their performance with the BAHA was compared with the CROS device using speech reception thresholds, speech recognition performance in noise, and the Abbreviated Profile of Hearing Aid Benefit and Single-Sided Deafness questionnaires. RESULTS: Patients reported a significant improvement in speech intelligibility in noise and greater benefit from the BAHA compared with CROS hearing aids. Patients were satisfied with the device and its impact on their quality of life. No major complications were reported. CONCLUSION AND SIGNIFICANCE: The BAHA is effective in unilateral deafness. Auditory stimuli from the deaf side can be transmitted to the good ear, avoiding the limitations inherent in CROS amplification.


2005 ◽  
Vol 114 (11) ◽  
pp. 886-893 ◽  
Author(s):  
Li Xu ◽  
Teresa A. Zwolan ◽  
Catherine S. Thompson ◽  
Bryan E. Pfingst

Objectives: The present study was performed to evaluate the efficacy and clinical feasibility of using monopolar stimulation with the Clarion Simultaneous Analog Stimulation (SAS) strategy in patients with cochlear implants. Methods: Speech recognition by 10 Clarion cochlear implant users was evaluated by means of 4 different speech processing strategy/electrode configuration combinations; i.e., SAS and Continuous Interleaved Sampling (CIS) strategies were each used with monopolar (MP) and bipolar (BP) electrode configurations. The test measures included consonants, vowels, consonant-nucleus-consonant words, and Hearing in Noise Test sentences at a +10 dB signal-to-noise ratio. Additionally, subjective judgments of sound quality were obtained for each strategy/configuration combination. Results: All subjects but 1 demonstrated open-set speech recognition with the SAS/MP combination. The group mean Hearing in Noise Test sentence score for the SAS/MP combination was 31.6% (range, 0% to 92%) correct, as compared to 25.0%, 46.7%, and 37.8% correct for the CIS/BP, CIS/MP, and SAS/BP combinations, respectively. Intersubject variability was high, and there were no significant differences in mean speech recognition scores or mean preference ratings among the 4 strategy/configuration combinations tested. Individually, the best speech recognition performance was with the subject's everyday strategy/configuration combination in 72% of the applicable cases. If the everyday strategy was excluded from the analysis, the subjects performed best with the SAS/MP combination in 37.5% of the remaining cases. Conclusions: The SAS processing strategy with an MP electrode configuration gave reasonable speech recognition in most subjects, even though subjects had minimal previous experience with this strategy/configuration combination. 
The SAS/MP combination might be particularly appropriate for patients for whom a full dynamic range of electrical hearing could not be achieved with a BP configuration.


2018 ◽  
Author(s):  
D Lesenfants ◽  
J Vanthornhout ◽  
E Verschueren ◽  
L Decruy ◽  
T Francart

ABSTRACT
Objective: To objectively measure the speech intelligibility of individual subjects from the EEG, based on cortical tracking of different representations of speech: low-level acoustical, higher-level discrete, or a combination; and to compare each model's prediction of the speech reception threshold (SRT) for each individual with the behaviorally measured SRT.
Methods: Nineteen participants listened to Flemish Matrix sentences presented at different signal-to-noise ratios (SNRs), corresponding to different levels of speech understanding. For each EEG frequency band (delta, theta, alpha, beta, or low gamma), a model was built to predict the EEG signal from various speech representations: envelope, spectrogram, phonemes, phonetic features, or a combination of phonetic features and spectrogram (FS). The same model was used for all subjects. The model predictions were then compared to the actual EEG of each subject at the different SNRs, and the prediction accuracy as a function of SNR was used to predict the SRT.
Results: The model based on the FS speech representation and the theta EEG band yielded the best SRT predictions, with a difference between the behavioral and objective SRT below 1 dB for 53% and below 2 dB for 89% of the subjects.
Conclusion: A model including low- and higher-level speech features allows the speech reception threshold to be predicted from the EEG of people listening to natural speech. It has potential applications in diagnostics of the auditory system.
Search Terms: cortical speech tracking, objective measure, speech intelligibility, auditory processing, speech representations.
Highlights: Objective EEG-based measure of speech intelligibility. Improved prediction of speech intelligibility by combining speech representations. Cortical tracking of speech in the delta EEG band monotonically increased with SNR. Cortical responses in the theta EEG band best predicted the speech reception threshold.
Disclosure: The authors report no disclosures relevant to the manuscript.
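The last step of this pipeline, turning an accuracy-versus-SNR function into an SRT estimate, can be sketched as interpolating the SNR at which prediction accuracy crosses a criterion. The linear-interpolation rule, the criterion value, and the data points below are made-up illustrations; the paper's actual fitting procedure may differ.

```python
def srt_from_accuracy(snrs_db, accuracies, criterion):
    """Interpolate the SNR at which EEG prediction accuracy reaches a criterion.

    snrs_db must be sorted ascending, with accuracies assumed to grow
    with SNR. Returns the linearly interpolated crossing point in dB.
    """
    points = list(zip(snrs_db, accuracies))
    for (s0, a0), (s1, a1) in zip(points, points[1:]):
        if a0 <= criterion <= a1:
            frac = (criterion - a0) / (a1 - a0)
            return s0 + frac * (s1 - s0)
    raise ValueError("criterion not bracketed by the measured accuracies")


# Hypothetical accuracy curve: a criterion of 0.15 falls midway between
# the -6 dB and -3 dB points, so the estimated SRT is -4.5 dB.
print(srt_from_accuracy([-9, -6, -3, 0], [0.05, 0.10, 0.20, 0.25], 0.15))
```

The appeal of this formulation is that it mirrors the behavioral SRT definition (the SNR yielding a criterion level of performance), so the objective and behavioral thresholds land on the same scale and can be compared in dB.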


ASHA Leader ◽  
2013 ◽  
Vol 18 (8) ◽  
pp. 37-37

Wide Dynamic Range Compression Improves Speech Recognition


2014 ◽  
Vol 25 (06) ◽  
pp. 529-540 ◽  
Author(s):  
Erin C. Schafer ◽  
Danielle Bryant ◽  
Katie Sanders ◽  
Nicole Baldus ◽  
Katherine Algier ◽  
...  

Background: Several recent investigations support the use of frequency modulation (FM) systems in children with normal hearing and auditory processing or listening disorders such as those diagnosed with auditory processing disorders, autism spectrum disorders, attention-deficit hyperactivity disorder, Friedreich ataxia, and dyslexia. The American Academy of Audiology (AAA) published suggested procedures, but these guidelines do not cite research evidence to support the validity of the recommended procedures for fitting and verifying nonoccluding open-ear FM systems on children with normal hearing. Documenting the validity of these fitting procedures is critical to maximize the potential FM-system benefit in the abovementioned populations of children with normal hearing and those with auditory-listening problems. Purpose: The primary goal of this investigation was to determine the validity of the AAA real-ear approach to fitting FM systems on children with normal hearing. The secondary goal of this study was to examine speech-recognition performance in noise and loudness ratings without and with FM systems in children with normal hearing sensitivity. Research Design: A two-group, cross-sectional design was used in the present study. Study Sample: Twenty-six typically functioning children, ages 5–12 yr, with normal hearing sensitivity participated in the study. Intervention: Participants used a nonoccluding open-ear FM receiver during laboratory-based testing. Data Collection and Analysis: Participants completed three laboratory tests: (1) real-ear measures, (2) speech recognition performance in noise, and (3) loudness ratings. 
Four real-ear measures were conducted to (1) verify that measured output met prescribed-gain targets across the 1000–4000 Hz frequency range for speech stimuli, (2) confirm that the FM-receiver volume did not exceed predicted uncomfortable loudness levels, and (3 and 4) measure changes to the real-ear unaided response when placing the FM receiver in the child’s ear. After completion of the fitting, speech recognition in noise at a –5 dB signal-to-noise ratio and loudness ratings at a +5 dB signal-to-noise ratio were measured in four conditions: (1) no FM system, (2) FM receiver on the right ear, (3) FM receiver on the left ear, and (4) bilateral FM system. Results: The results of this study suggested that the slightly modified AAA real-ear measurement procedures resulted in a valid fitting of one FM system on children with normal hearing. On average, prescriptive targets were met for 1000, 2000, 3000, and 4000 Hz within 3 dB, and maximum output of the FM system never exceeded and was significantly lower than predicted uncomfortable loudness levels for the children. There was a minimal change in the real-ear unaided response when the open-ear FM receiver was placed into the ear. Use of the FM system on one or both ears resulted in significantly better speech recognition in noise relative to a no-FM condition, and the unilateral and bilateral FM receivers resulted in a comfortably loud signal when listening in background noise. Conclusions: Real-ear measures are critical for obtaining an appropriate fit of an FM system on children with normal hearing.
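The first verification step above, checking measured real-ear output against prescriptive targets within a tolerance, reduces to a simple per-frequency comparison. The target and measured levels below are made-up example values; the 3-dB tolerance follows the result reported in the abstract.

```python
def targets_met(targets_db, measured_db, tol_db=3.0):
    """Return True if every measured output is within tol_db of its target.

    targets_db and measured_db each map frequency (Hz) to level (dB SPL).
    """
    return all(abs(measured_db[f] - t) <= tol_db
               for f, t in targets_db.items())


# Hypothetical fitting across 1000-4000 Hz: all deviations are <= 2 dB,
# so the fit passes the 3-dB criterion.
targets = {1000: 65.0, 2000: 62.0, 3000: 60.0, 4000: 58.0}
measured = {1000: 66.5, 2000: 60.0, 3000: 61.0, 4000: 57.0}
print(targets_met(targets, measured))
```

A real verification workflow would read these levels from a probe-microphone system rather than hard-coding them, but the pass/fail logic is the same.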


2018 ◽  
Vol 27 (4) ◽  
pp. 581-593 ◽  
Author(s):  
Lisa Brody ◽  
Yu-Hsiang Wu ◽  
Elizabeth Stangl

Purpose: The aim of this study was to compare the benefit of self-adjusted personal sound amplification products (PSAPs) to audiologist-fitted hearing aids based on speech recognition, listening effort, and sound quality in ecologically relevant test conditions to estimate real-world effectiveness. Method: Twenty-five older adults with bilateral mild-to-moderate hearing loss completed the single-blinded, crossover study. Participants underwent aided testing using 3 PSAPs and a traditional hearing aid, as well as unaided testing. PSAPs were adjusted based on participant preference, whereas the hearing aid was configured using best-practice verification protocols. Audibility provided by the devices was quantified using the Speech Intelligibility Index (American National Standards Institute, 2012). Outcome measures assessing speech recognition, listening effort, and sound quality were administered in ecologically relevant laboratory conditions designed to represent real-world speech listening situations. Results: All devices significantly improved the Speech Intelligibility Index compared to unaided listening, with the hearing aid providing more audibility than all PSAPs. Results further revealed that, in general, the hearing aid improved speech recognition performance and reduced listening effort significantly more than all PSAPs. Few differences in sound quality were observed between devices. All PSAPs improved speech recognition and listening effort compared to unaided testing. Conclusions: Hearing aids fitted using best-practice verification protocols were capable of providing more aided audibility, better speech recognition performance, and lower listening effort compared to the PSAPs tested in the current study. Differences in sound quality between the devices were minimal. 
However, because all PSAPs tested in the study significantly improved participants' speech recognition performance and reduced listening effort compared to unaided listening, PSAPs could serve as a budget-friendly option for those who cannot afford traditional amplification.

