Spatial Release from Masking Using Clinical Corpora: Sentence Recognition in a Colocated or Spatially Separated Speech Masker

2020, Vol 31(04), pp. 271-276
Author(s): Grant King, Nicole E. Corbin, Lori J. Leibold, Emily Buss

Background: Speech recognition in complex multisource environments is challenging, particularly for listeners with hearing loss. One source of difficulty is the reduced ability of listeners with hearing loss to benefit from spatial separation of the target and masker, an effect called spatial release from masking (SRM). Despite the prevalence of complex multisource environments in everyday life, SRM is not routinely evaluated in the audiology clinic. Purpose: The purpose of this study was to demonstrate the feasibility of assessing SRM in adults using widely available tests of speech-in-speech recognition that can be conducted using standard clinical equipment. Research Design: Participants were 22 young adults with normal hearing. The task was masked sentence recognition, using each of five clinically available corpora with speech maskers. The target always sounded like it originated from directly in front of the listener, and the masker either sounded like it originated from the front (colocated with the target) or from the side (separated from the target). In the real spatial manipulation conditions, source location was manipulated by routing the target and masker to either a single speaker or to two speakers: one directly in front of the participant, and one mounted in an adjacent corner, 90° to the right. In the perceived spatial separation conditions, the target and masker were presented from both speakers with delays that made them sound as if they were either colocated or separated. Results: With real spatial manipulations, the mean SRM ranged from 7.1 to 11.4 dB, depending on the speech corpus. With perceived spatial manipulations, the mean SRM ranged from 1.8 to 3.1 dB. Whereas real separation improves the signal-to-noise ratio in the ear contralateral to the masker, SRM in the perceived spatial separation conditions is based solely on interaural timing cues.
Conclusions: The finding of robust SRM with widely available speech corpora supports the feasibility of measuring this important aspect of hearing in the audiology clinic. The finding of a small but significant SRM in the perceived spatial separation conditions suggests that modified materials could be used to evaluate the use of interaural timing cues specifically.
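As the abstract describes, SRM is simply the difference between the thresholds measured in the colocated and separated conditions. A minimal sketch of that calculation (the threshold values below are hypothetical illustrations, not data from the study):

```python
def spatial_release(colocated_srt_db, separated_srt_db):
    """SRM in dB: colocated threshold minus separated threshold, both
    expressed as target-to-masker ratios. Positive means separation helped."""
    return colocated_srt_db - separated_srt_db

# Hypothetical listener: needs -2 dB target-to-masker ratio when the masker
# is colocated, but -10 dB when it is moved 90 degrees to the side.
print(spatial_release(-2.0, -10.0))  # 8.0 dB of release
```

An 8 dB release would fall within the 7.1-11.4 dB range reported here for real spatial separation.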


2018, Vol 61(2), pp. 428-435
Author(s): Navin Viswanathan, Kostas Kokkinakis, Brittany T. Williams

Purpose The purpose of this study was to evaluate whether listeners with normal hearing perceiving noise-vocoded speech-in-speech demonstrate better intelligibility of target speech when the background speech was mismatched in language (linguistic release from masking [LRM]) and/or location (spatial release from masking [SRM]) relative to the target. We also assessed whether the spectral resolution of the noise-vocoded stimuli affected the presence of LRM and SRM under these conditions. Method In Experiment 1, a mixed factorial design was used to simultaneously manipulate the masker language (within-subject, English vs. Dutch), the simulated masker location (within-subject, right, center, left), and the spectral resolution (between-subjects, 6 vs. 12 channels) of noise-vocoded target–masker combinations presented at +25 dB signal-to-noise ratio (SNR). In Experiment 2, the study was repeated using a spectral resolution of 12 channels at +15 dB SNR. Results In both experiments, listeners' intelligibility of noise-vocoded targets was better when the background masker was Dutch, demonstrating reliable LRM in all conditions. The pattern of results in Experiment 1 was not reliably different across the 6- and 12-channel noise-vocoded speech. Finally, a reliable spatial benefit (SRM) was detected only in the more challenging SNR condition (Experiment 2). Conclusion The current study is the first to report a clear LRM benefit in noise-vocoded speech-in-speech recognition. Our results indicate that this benefit is available even under spectrally degraded conditions and that it may augment the benefit due to spatial separation of target speech and competing backgrounds.
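Noise vocoding, as used in these experiments, divides speech into frequency channels, extracts each channel's temporal envelope, and uses the envelope to modulate band-limited noise; the channel count sets the spectral resolution. A rough, self-contained sketch of the idea, using FFT band-masking instead of the analog-style filterbanks (e.g., Butterworth) typically used in published work; all parameter values are assumptions:

```python
import numpy as np

def noise_vocode(signal, fs, n_channels=6, f_lo=100.0, f_hi=8000.0):
    """Crude n-channel noise vocoder (illustrative sketch only).

    Splits the input into log-spaced bands via FFT masking, extracts each
    band's envelope (rectify and smooth), and uses it to modulate
    band-limited noise.
    """
    rng = np.random.default_rng(0)
    edges = np.geomspace(f_lo, f_hi, n_channels + 1)
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), 1.0 / fs)
    win = max(1, int(0.01 * fs))          # ~10 ms envelope smoother
    kernel = np.ones(win) / win
    out = np.zeros(len(signal))
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (freqs >= lo) & (freqs < hi)
        band = np.fft.irfft(spec * mask, n=len(signal))
        envelope = np.convolve(np.abs(band), kernel, mode="same")
        noise_spec = np.fft.rfft(rng.standard_normal(len(signal)))
        carrier = np.fft.irfft(noise_spec * mask, n=len(signal))
        out += envelope * carrier
    return out

# Vocode one second of a 440 Hz tone at 16 kHz with 6 channels.
fs = 16000
t = np.arange(fs) / fs
vocoded = noise_vocode(np.sin(2 * np.pi * 440.0 * t), fs, n_channels=6)
```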


2019
Author(s): Ysabel Domingo, Emma Holmes, Ewan Macpherson, Ingrid Johnsrude

The ability to segregate simultaneous speech streams is crucial for successful communication. Recent studies have demonstrated that participants can report 10–20% more words spoken by naturally familiar (e.g., friends or spouses) than unfamiliar talkers in two-voice mixtures. This benefit is commensurate with one of the largest benefits to speech intelligibility currently known—that gained by spatially separating two talkers. However, because of differences in the methods of these previous studies, the relative benefits of spatial separation and voice familiarity are unclear. Here, we directly compared the familiar-voice benefit and spatial release from masking, and examined if and how these two cues interact with one another. We recorded talkers speaking sentences from a published closed-set “matrix” task and then presented listeners with three different sentences played simultaneously. Each target sentence was played at 0° azimuth, and two masker sentences were symmetrically separated about the target. On average, participants reported 10–30% more words correctly when the target sentence was spoken in a familiar than unfamiliar voice (collapsed over spatial separation conditions); we found that participants gain a similar benefit from a familiar target as when an unfamiliar voice is separated from two symmetrical maskers by approximately 15° azimuth.


2019, Vol 30(02), pp. 131-144
Author(s): Erin M. Picou, Todd A. Ricketts

People with hearing loss experience difficulty understanding speech in noisy environments. Beamforming microphone arrays in hearing aids can improve the signal-to-noise ratio (SNR) and thus also speech recognition and subjective ratings. Unilateral beamformer arrays, also known as directional microphones, accomplish this improvement using two microphones in one hearing aid. Bilateral beamformer arrays, which combine information across four microphones in a bilateral fitting, further improve the SNR. Early bilateral beamformers were static with fixed attenuation patterns. Recently, adaptive bilateral beamformers have been introduced in commercial hearing aids.
The purpose of this article was to evaluate the potential benefits of adaptive unilateral and bilateral beamformers for improving sentence recognition and subjective ratings in a laboratory setting. A secondary purpose was to identify potential participant factors that explain some of the variability in beamformer benefit.
Participants were fitted with study hearing aids equipped with commercially available adaptive unilateral and bilateral beamformers. Participants completed sentence recognition testing in background noise using three hearing aid settings (omnidirectional, unilateral beamformer, bilateral beamformer) and two noise source configurations (surround, side). After each condition, participants made subjective ratings of their perceived work, desire to control the situation, willingness to give up, and tiredness.
Eighteen adults (50–80 yr, M = 66.2, σ = 8.6) with symmetrical mild sloping to severe hearing loss participated.
Sentence recognition scores and subjective ratings were analyzed separately using generalized linear models with two within-subject factors (hearing aid microphone and noise configuration). Two benefit scores were calculated: (1) unilateral beamformer benefit (relative to performance with omnidirectional) and (2) additional bilateral beamformer benefit (relative to performance with unilateral beamformer). Hierarchical multiple linear regression was used to determine if beamformer benefit was associated with participant factors (age, degree of hearing loss, unaided speech-in-noise ability, spatial release from masking, and performance in omnidirectional).
Sentence recognition and subjective ratings of work, control, and tiredness were better with both types of beamformers relative to the omnidirectional conditions. In addition, the bilateral beamformer offered small additional improvements relative to the unilateral beamformer in terms of sentence recognition and subjective ratings of tiredness. Speech recognition performance and subjective ratings were generally independent of noise configuration. Performance in the omnidirectional setting and pure-tone average were independently related to unilateral beamformer benefits. Those with the lowest performance or the largest degree of hearing loss benefited the most. No factors were significantly related to additional bilateral beamformer benefit.
Adaptive bilateral beamformers offer additional advantages over adaptive unilateral beamformers in hearing aids. The small additional advantages with the adaptive beamformer are comparable to those reported in the literature with static beamformers. Although the additional benefits are small, they positively affected subjective ratings of tiredness. These data suggest that adaptive bilateral beamformers have the potential to improve listening in difficult situations for hearing aid users. In addition, patients who struggle the most without beamforming microphones may also benefit the most from the technology.
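The unilateral (directional-microphone) beamformers discussed above combine two microphone ports with a short internal delay so that sound from behind is cancelled. A sketch of that first-order differential principle; the port spacing, frequency, and delay are illustrative assumptions, not any product's design:

```python
import numpy as np

def directional_response(theta_deg, d=0.01, f=1000.0, c=343.0):
    """Far-field magnitude response of a two-port delay-and-subtract
    (first-order differential) microphone at arrival angle theta_deg.
    d = port spacing (m), f = frequency (Hz), c = speed of sound (m/s)."""
    theta = np.deg2rad(theta_deg)
    tau_ext = d * np.cos(theta) / c   # acoustic travel time between ports
    tau_int = d / c                   # internal delay chosen for a cardioid
    # Front port minus internally delayed rear port, evaluated at f:
    return abs(1.0 - np.exp(-1j * 2.0 * np.pi * f * (tau_ext + tau_int)))

front = directional_response(0.0)     # sound arriving from the front
rear = directional_response(180.0)    # sound arriving from behind
print(front > rear)  # True: the cardioid null suppresses sound from the rear
```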


2020, pp. 1-9
Author(s): Kaylene King, Margaret T. Dillon, Brendan P. O'Connell, Kevin D. Brown, Lisa R. Park

Purpose Traditional clinical measures of cochlear implant (CI) recipient performance may not fully evaluate the benefit of bimodal listening (hearing aid contralateral to a CI). The clinical assessment of spatial release from masking (SRM) may be a sensitive measure of the benefit of listening with bimodal stimulation. This study compared the SRM of pediatric bimodal and bilateral CI listeners using a clinically feasible method, and investigated variables that may contribute to speech recognition performance with spatially separated maskers. Method Forty pediatric bimodal ( N = 20) and bilateral CI ( N = 20) participants were assessed in their best aided listening condition on sentence recognition in a four-talker masker. Testing was completed with target and masker colocated at 0° azimuth, and with the masker directed at 90° to either ear. SRM was calculated as the difference in performance between the colocated and each 90° condition. A two-way mixed-methods analysis of variance was used to compare performance between groups in the three masker conditions. Multiple regression analyses were conducted to investigate potential predictors for SRM asymmetry including hearing history, unaided thresholds, word recognition, duration of device use, and acoustic bandwidth. Results Both groups demonstrated SRM, with significantly better recognition in each 90° condition as compared to the colocated condition. The groups did not differ significantly in SRM. The multiple regression analyses did not reveal any significant predictors of SRM asymmetry. Conclusions Bimodal and bilateral CI listeners demonstrated similar amounts of SRM. While no specific variables predicted SRM asymmetry in bimodal listeners, pediatric bimodal and bilateral CI recipients should expect similar amounts of SRM regardless of the side of the masker. SRM asymmetry in pediatric bimodal listeners may signal a need for consideration of a second CI.
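In this design, SRM for each masker side and its left-right asymmetry reduce to simple differences between percent-correct scores. A sketch with hypothetical scores (not data from the study):

```python
def srm_asymmetry(colocated, masker_right, masker_left):
    """SRM for each masker side (percent-correct gain relative to the
    colocated condition) and the left-right asymmetry between them."""
    srm_right = masker_right - colocated
    srm_left = masker_left - colocated
    return srm_right, srm_left, abs(srm_right - srm_left)

# A listener scoring 55% colocated, 75% with the masker at 90 degrees
# right, and 68% with it at 90 degrees left:
r, l, asym = srm_asymmetry(colocated=55.0, masker_right=75.0, masker_left=68.0)
print(r, l, asym)  # 20.0 13.0 7.0
```

Per the study's conclusion, a large asymmetry in a bimodal listener may signal a need to consider a second CI.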


2018, Vol 27(4), pp. 529-538
Author(s): Kasey M. Jakien, Frederick J. Gallun

Purpose The purpose of this study is to report normative data and predict thresholds for a rapid test of spatial release from masking for speech perception. The test is easily administered and has good repeatability, with the potential to be used in clinics and laboratories. Normative functions were generated for adults varying in age and amounts of hearing loss. Method The test of spatial release presents a virtual auditory scene over headphones with 2 conditions: colocated (with target and maskers at 0°) and spatially separated (with target at 0° and maskers at ± 45°). Listener thresholds are determined as target-to-masker ratios, and spatial release from masking (SRM) is determined as the difference between the colocated condition and spatially separated condition. Multiple linear regression was used to fit the data from 82 adults 18–80 years of age with normal to moderate hearing loss (0–40 dB HL pure-tone average [PTA]). The regression equations were then used to generate normative functions that relate age (in years) and hearing thresholds (as PTA) to target-to-masker ratios and SRM. Results Normative functions were able to predict thresholds with an error of less than 3.5 dB in all conditions. In the colocated condition, the function included only age as a predictive parameter, whereas in the spatially separated condition, both age and PTA were included as parameters. For SRM, PTA was the only significant predictor. Different functions were generated for the 1st run, the 2nd run, and the average of the 2 runs. All 3 functions were largely similar in form, with the smallest error being associated with the function on the basis of the average of 2 runs. Conclusion With the normative functions generated from this data set, it would be possible for a researcher or clinician to interpret data from a small number of participants or even a single patient without having to first collect data from a control group, substantially reducing the time and resources needed. 
Supplemental Material https://doi.org/10.23641/asha.7080878
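The normative functions described above are linear regressions on age and pure-tone average (PTA). A sketch of how a clinician might apply such a function; the coefficients below are illustrative placeholders, not the published fit:

```python
def predicted_tmr(age_years, pta_db_hl, coefs):
    """Predicted target-to-masker ratio (dB) from the regression form
    threshold = b0 + b_age * age + b_pta * PTA."""
    b0, b_age, b_pta = coefs
    return b0 + b_age * age_years + b_pta * pta_db_hl

# Hypothetical coefficients mirroring the predictors the study retained:
# the colocated function uses age only (b_pta = 0), while the separated
# function uses both age and PTA.
colocated = predicted_tmr(60, 20, coefs=(-4.0, 0.05, 0.0))
separated = predicted_tmr(60, 20, coefs=(-12.0, 0.05, 0.15))
srm = colocated - separated  # roughly 5 dB for this hypothetical listener
```

A single patient's measured thresholds could then be compared against these predictions without first collecting control-group data.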


2005, Vol 16(09), pp. 726-739
Author(s): Rachel A. McArdle, Richard H. Wilson, Christopher A. Burks

The purpose of this mixed-model design was to examine recognition performance differences when measuring speech recognition in multitalker babble for listeners with normal hearing (n = 36) and listeners with hearing loss (n = 72), utilizing stimuli of varying linguistic complexity (digits, words, and sentence materials). All listeners were administered two trials of two lists of each material in a descending speech-to-babble ratio. For each of the materials, recognition performances by the listeners with normal hearing were significantly better than the performances by the listeners with hearing loss. The mean separation between groups at the 50% point in signal-to-babble ratio on each of the three materials was ~8 dB. The 50% points for digits were obtained at a significantly lower signal-to-babble ratio than for sentences or words, which were equivalent to one another. There were no interlist differences between the two lists for the digits and words, but there was a significant disparity between QuickSIN™ lists for the listeners with hearing loss. A two-item questionnaire was used to obtain a subjective measurement of speech recognition, which showed moderate correlations with objective measures of speech recognition in noise using digits (r = .641), sentences (r = .572), and words (r = .673).
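The 50% points in signal-to-babble ratio reported above can be estimated by interpolating the percent-correct scores measured at descending presentation levels. A simple sketch of that interpolation with made-up scores (not the study's data):

```python
import numpy as np

def snr_50(snrs_db, pct_correct):
    """Linearly interpolate the signal-to-babble ratio yielding 50% correct
    from scores measured at descending presentation levels."""
    snrs = np.asarray(snrs_db, dtype=float)
    pct = np.asarray(pct_correct, dtype=float)
    order = np.argsort(snrs)  # np.interp requires ascending x values
    return float(np.interp(50.0, pct[order], snrs[order]))

# Hypothetical descending track: scores fall as the babble level rises.
srt = snr_50([20, 15, 10, 5, 0, -5], [100, 95, 80, 50, 20, 5])
print(srt)  # 5.0
```

With such a function, the ~8 dB group separation reported above is just the difference between the two groups' interpolated 50% points.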


2021, Vol 4(2), pp. 45-50
Author(s): Ecem KARTAL ÖZCAN, Merve ÖZBAL BATUK, Şule KAYA, Gonca SENNAROĞLU

Assessment of speech perception in noise in children with hearing aids: Preliminary results
Objective: Noisy environments are a part of the daily life of children, just like adults. Children with hearing loss who wear hearing aids are more susceptible to the negative effects of noise than their normal-hearing peers. This study aims to evaluate the speech recognition in noise performance of hearing aid users and compare them with their normal-hearing peers. Material and Method: Five children aged 6-12 years with bilateral moderate to severe symmetrical sensorineural hearing loss and using bilateral behind-the-ear hearing aids were included in the study. Four different conditions of the Turkish HINT-C were applied, and a speech recognition threshold (SRT) was determined for each condition. Results: Regardless of their age, the SRT needed by children with hearing aids to achieve performance equal to that of their normal-hearing peers was higher for all test conditions. As seen in children with normal hearing in general, the mean noise-front score of the children with hearing loss was higher than the mean noise-right and noise-left scores. Conclusion: The results of this study revealed that children with bilaterally symmetrical moderate to severe hearing loss achieved poorer speech recognition scores in environments similar to the classroom, compared to their normal-hearing peers. These results guide appropriate rehabilitation and follow-up.
Keywords: noise, speech recognition in noise, hearing loss, hearing aid, pediatric audiology, HINT, HINT-C


Author(s): Verena Müller, Ruth Lang-Roth

Purpose The aim of the study was to assess the susceptibility to energetic and informational masking in patients with single-sided deafness (SSD) with one normal-hearing (NH) ear and a cochlear implant (CI) in the contralateral ear, understand the effect on speech recognition when spatially separating noise and speech maskers, and investigate the influence of the CI in situations with energetic and informational masking. Method Speech recognition was measured in the presence of either a modulated speech-shaped noise or one of two competing speech maskers in 11 SSD-CI listeners. The speech maskers were manipulated with respect to fundamental frequency to consider the effect of different voices. Measurements were conducted in the unaided (NH) and aided (NHCI) conditions. Spatial release from masking (SRM) was calculated for each masker type and both listening conditions (NH and NHCI) by subtracting scores of the colocated target and masker condition (S0N0) from the spatially separated target and masker conditions (S0N≠0). Results Speech recognition was highly variable depending on the type of masker. SRM occurred in the unaided (NH) and aided (NHCI) conditions when the speech masker had the same gender as the target talker. Adding the CI improved speech recognition when this speech masker was ipsilateral to the NH ear. Conclusions The amount of informational masking is substantial in SSD-CI listeners with both colocated and spatially separated target and masker signals. The contribution of SRM to better speech recognition largely depends on the masker and is considerable when no differences in voices between the target and the competing talker occur. There is only a slight improvement in speech recognition by adding the CI.

