Linking audiovisual integration to audiovisual speech recognition in noise

2020 ◽  
Author(s):  
Anja Gieseler ◽  
Stephanie Rosemann ◽  
Maike Tahden ◽  
Kirsten C Wagener ◽  
Christiane Thiel ◽  
...  

Especially in challenging listening conditions, listeners can benefit from the audiovisual nature of speech by using visual information. Yet there exists great inter-individual variability, not only in understanding speech in noise, but also in the benefit obtained from additional visual cues. First empirical evidence suggests that the ability to integrate auditory and visual input, i.e. audiovisual integration, is altered in hearing impairment and is, at the same time, relevant for audiovisual speech intelligibility. The distinct role of mild hearing loss on audiovisual integration and the significance of these changes for speech intelligibility, however, need further scrutiny. Thus, here we investigated differences in audiovisual integration capacities between elderly, normal-hearing and hearing-impaired individuals using two tests of audiovisual integration (sound-induced flash illusion, McGurk task). To explore whether potential differences in audiovisual integration are meaningful for natural speech intelligibility, we then linked audiovisual integration capacities to speech-in-noise recognition using an audiovisual speech-reception threshold test, expecting this to reflect a more realistic listening scenario. Our results indicate that audiovisual integration abilities are already altered in mild hearing impairment, while the magnitude and direction of the effect depend on the specific test used. At the same time, audiovisual integration capacities seem relevant for predicting audiovisual speech intelligibility in noise, especially in those individuals with a hearing loss. We conclude that audiovisual integration abilities should therefore be considered for future predictions of speech recognition outcomes, which – in turn – should be assessed audiovisually, to account for the multisensory nature of speech and communication.

2020 ◽  
Vol 16 (2) ◽  
pp. 115-123
Author(s):  
KyooSang Kim ◽  
Subong Kim ◽  
Jae Hee Lee

Purpose: This study aimed to compare objective speech recognition and subjective hearing handicap outcomes as a function of the degree of hearing loss. Methods: 120 elderly listeners participated, ranging in age from 60 to 83 years. Listeners' degrees of hearing loss were classified according to a newly proposed World Health Organization hearing impairment grading system. As objective outcomes, word and sentence recognition scores (WRS, SRS) in quiet were measured at an individually determined most comfortable level. The SRS in noise was obtained at a 0 dB signal-to-noise ratio. The Korean Evaluation Scale for Hearing Handicap questionnaire for non-hearing aid users was used to evaluate the effects of hearing status on social and psychological aspects. Results: Within the same grade of hearing impairment, listeners showed large individual variability in speech-in-noise recognition and subjective hearing handicap. Even listeners with mild impairment showed greater reductions in SRS in noise and more handicap in interpersonal relationships compared with normal-hearing listeners. Among listeners with no impairment or mild hearing impairment, those with poorer sentence-in-noise scores showed greater hearing handicaps. The sentence-in-noise scores together with the WRS explained about 40% of the variance in subjective hearing handicap. Conclusion: Elderly listeners with normal hearing or mild hearing loss can have reduced communication abilities in background noise, with negative effects on their social and psychological well-being. It is recommended that the sentence-in-noise intelligibility test and the subjective hearing handicap survey be conducted as standard audiometric measures to identify functional communication problems in the elderly.


2019 ◽  
Vol 62 (5) ◽  
pp. 1517-1531 ◽  
Author(s):  
Sungmin Lee ◽  
Lisa Lucks Mendel ◽  
Gavin M. Bidelman

Purpose Although the speech intelligibility index (SII) has been widely applied in the field of audiology and other related areas, application of this metric to cochlear implants (CIs) has yet to be investigated. In this study, SIIs for CI users were calculated to investigate whether the SII could be an effective tool for predicting speech perception performance in a population with CI. Method Fifteen pre- and postlingually deafened adults with CI participated. Speech recognition scores were measured using the AzBio sentence lists. CI users also completed questionnaires and performed psychoacoustic (spectral and temporal resolution) and cognitive function (digit span) tests. Obtained SIIs were compared with predicted SIIs using a transfer function curve. Correlation and regression analyses were conducted on perceptual and demographic predictor variables to investigate the association between these factors and speech perception performance. Result Because of the considerably poor hearing and large individual variability in performance, the SII did not predict speech performance for this CI group using the traditional calculation. However, new SII models were developed incorporating predictive factors, which improved the accuracy of SII predictions in listeners with CI. Conclusion Conventional SII models are not appropriate for predicting speech perception scores for CI users. Demographic variables (aided audibility and duration of deafness) and perceptual–cognitive skills (gap detection and auditory digit span outcomes) are needed to improve the use of the SII for listeners with CI. Future studies are needed to improve our CI-corrected SII model by considering additional predictive factors. Supplemental Material https://doi.org/10.23641/asha.8057003
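The conventional SII that the study attempts to apply is, at its core, an importance-weighted sum of per-band audibilities. The sketch below illustrates that idea only; the band-importance weights, levels, and the simplified audibility rule are illustrative stand-ins, not the values published in the standard.

```python
# Sketch of a conventional SII-style computation:
# SII = sum over frequency bands of (importance weight * band audibility),
# where audibility lies in [0, 1]. All numbers below are illustrative.

def band_audibility(speech_level_db, noise_level_db, threshold_db):
    """Audibility of one band: fraction of a 30-dB speech dynamic range
    lying above both the noise floor and the hearing threshold."""
    effective_floor = max(noise_level_db, threshold_db)
    audible_db = (speech_level_db + 15) - effective_floor  # peaks ~15 dB above RMS
    return min(max(audible_db / 30.0, 0.0), 1.0)

def sii(bands):
    """bands: list of (importance, speech_db, noise_db, threshold_db);
    importance weights are assumed to sum to 1."""
    return sum(i * band_audibility(s, n, t) for i, s, n, t in bands)

# Fully audible speech in quiet -> SII near 1; heavy noise -> SII near 0.
quiet = [(0.25, 60, 20, 10)] * 4
noisy = [(0.25, 60, 70, 10)] * 4
print(round(sii(quiet), 2), round(sii(noisy), 2))  # → 1.0 0.17
```

The study's point is that this audibility-based prediction breaks down for CI listeners, so demographic and perceptual-cognitive predictors must be folded in on top of a computation like this one.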


2021 ◽  
Vol 32 (08) ◽  
pp. 478-486
Author(s):  
Lisa G. Potts ◽  
Soo Jang ◽  
Cory L. Hillis

Abstract Background For cochlear implant (CI) recipients, speech recognition in noise is consistently poorer compared with recognition in quiet. Directional processing improves performance in noise and can be activated automatically based on acoustic scene analysis. The use of adaptive directionality with CI recipients is new and has not been investigated thoroughly, especially with the recipients' preferred everyday signal processing, dynamic range, and/or noise reduction active. Purpose This study utilized CI recipients' preferred everyday signal processing to evaluate four directional microphone options in a noisy environment to determine which option provides the best speech recognition in noise. A greater understanding of automatic directionality could ultimately improve CI recipients' speech-in-noise performance and better guide clinicians in programming. Study Sample Twenty-six unilateral and seven bilateral CI recipients with a mean age of 66 years and approximately 4 years of CI experience were included. Data Collection and Analysis Speech-in-noise performance was measured using eight loudspeakers in a 360-degree array, with HINT sentences presented in restaurant noise. Four directional options were evaluated (automatic [SCAN], adaptive [Beam], fixed [Zoom], and omnidirectional) with participants' everyday signal processing options active. A mixed-model analysis of variance (ANOVA) and pairwise comparisons were performed. Results Automatic directionality (SCAN) resulted in the best speech-in-noise performance, although it was not significantly better than Beam. Omnidirectional performance was significantly poorer than the other three directional options. The option yielding the best individual performance varied across participants, with 16 performing best with automatic directionality. The majority of participants did not perform best with their everyday directional option.
Conclusion The individual variability seen in this study suggests that CI recipients should try different directional options to find their ideal program. For recipients not motivated to try different programs, however, automatic directionality is an appropriate everyday processing option.


2020 ◽  
Vol 24 ◽  
pp. 233121652097563
Author(s):  
Christopher F. Hauth ◽  
Simon C. Berning ◽  
Birger Kollmeier ◽  
Thomas Brand

The equalization cancellation model is often used to predict the binaural masking level difference. Previously, its application to speech in noise required separate access to the speech and noise signals in order to maximize the signal-to-noise ratio (SNR). Here, a novel, blind equalization cancellation model is introduced that operates on the mixed signals. This approach does not require any assumptions about particular sound source directions. It uses different strategies for positive and negative SNRs, with the switching between the two steered by a blind decision stage utilizing modulation cues. The output of the model is a single-channel signal with enhanced SNR, which was analyzed using the speech intelligibility index to derive speech intelligibility predictions. In a first experiment, the model was tested on experimental data obtained in a scenario with spatially separated target and masker signals. Predicted speech recognition thresholds were in good agreement with measured speech recognition thresholds, with a root mean square error of less than 1 dB. A second experiment investigated signals at positive SNRs, achieved by using time-compressed and low-pass filtered speech. The results demonstrated that binaural unmasking of speech occurs at positive SNRs and that the modulation-based switching strategy can predict the experimental results.
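The equalization-cancellation principle underlying this model can be illustrated with a toy antiphasic (N0S-pi) configuration: equalize the masker across the two ears, then subtract one ear from the other so the masker cancels while the target survives. The small per-ear "internal noise" below is an illustrative stand-in for the processing errors that limit cancellation in the actual model; it is not part of the authors' implementation.

```python
import numpy as np

# Toy equalization-cancellation demo: a diotic masker (identical at both
# ears) with an antiphasic target. Subtracting the ears cancels the masker
# and leaves the target plus residual internal noise.

rng = np.random.default_rng(1)
n = 8000
target = 0.05 * np.sin(2 * np.pi * 500 * np.arange(n) / n)  # stand-in "speech"
masker = rng.standard_normal(n)                             # diotic masker (N0)
internal_l = 0.01 * rng.standard_normal(n)  # illustrative internal noise, left ear
internal_r = 0.01 * rng.standard_normal(n)  # illustrative internal noise, right ear

left = masker + target + internal_l    # target in antiphase across ears (S-pi)
right = masker - target + internal_r

cancelled = (left - right) / 2         # masker cancels -> target + residual noise

def snr_db(signal, noise):
    return 20 * np.log10(np.std(signal) / np.std(noise))

before = snr_db(target, masker)             # SNR at a single ear
after = snr_db(target, cancelled - target)  # SNR after cancellation
print(round(before, 1), round(after, 1))
```

The large SNR gain from the subtraction step is the binaural unmasking the model predicts; the blind version described above must additionally decide, from modulation cues alone, which strategy to apply at a given input SNR.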


2016 ◽  
Vol 31 (4) ◽  
pp. 380-389 ◽  
Author(s):  
Nancy Tye-Murray ◽  
Brent Spehar ◽  
Joel Myerson ◽  
Sandra Hale ◽  
Mitchell Sommers

2020 ◽  
Vol 31 (01) ◽  
pp. 017-029
Author(s):  
Paul Reinhart ◽  
Pavel Zahorik ◽  
Pamela Souza

Abstract Digital noise reduction (DNR) processing is used in hearing aids to enhance perception in noise by classifying and suppressing the noise acoustics. However, the efficacy of DNR processing is not known under reverberant conditions, where the speech-in-noise acoustics are further degraded by reverberation. The purpose of this study was to investigate acoustic and perceptual effects of DNR processing across a range of reverberant conditions for individuals with hearing impairment. This study used an experimental design to investigate the effects of varying reverberation on speech-in-noise processed with DNR. Twenty-six listeners with mild-to-moderate sensorineural hearing impairment participated in the study. Speech stimuli were combined with unmodulated broadband noise at several signal-to-noise ratios (SNRs). A range of reverberant conditions with realistic parameters was simulated, as well as an anechoic control condition without reverberation. Reverberant speech-in-noise signals were processed using a spectral subtraction DNR simulation. Signals were acoustically analyzed using a phase inversion technique to quantify the improvement in SNR as a result of DNR processing. Sentence intelligibility and subjective ratings of listening effort, speech naturalness, and background noise comfort were examined with and without DNR processing across the conditions. Improvement in SNR was greatest in the anechoic control condition and decreased as the direct-to-reverberant energy ratio decreased. There was no significant effect of DNR processing on speech intelligibility in the anechoic control condition, but there was a significant decrease in speech intelligibility with DNR processing in all of the reverberant conditions. Subjectively, listeners reported greater listening effort and lower speech naturalness with DNR processing in some of the reverberant conditions.
Listeners reported higher background noise comfort with DNR processing only in the anechoic control condition. Results suggest that reverberation degrades spectral subtraction DNR processing in a way that reduces its ability to suppress noise without distorting the speech acoustics. Overall, DNR processing may be most beneficial in environments with little reverberation, whereas its use in highly reverberant environments may actually produce adverse perceptual effects. Further research is warranted using commercial hearing aids in realistic reverberant environments.
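The phase-inversion technique used for the acoustic analysis estimates the SNR at a device's output by processing two mixtures whose noise components differ only in sign: summing the two outputs recovers the processed speech, and differencing them recovers the processed noise. A minimal sketch, with a pass-through function standing in for the DNR processing under test:

```python
import numpy as np

# Phase-inversion SNR estimation: run speech+noise and speech-noise through
# the same processing, then sum/difference the outputs to separate the
# processed speech and noise components.

rng = np.random.default_rng(0)
fs = 16000
speech = np.sin(2 * np.pi * 200 * np.arange(fs) / fs)  # stand-in for speech
noise = 0.5 * rng.standard_normal(fs)                  # broadband noise

def process(x):
    return x  # replace with the DNR / hearing-aid processing under test

out_plus = process(speech + noise)    # mixture with noise as-is
out_minus = process(speech - noise)   # mixture with noise phase-inverted

speech_est = (out_plus + out_minus) / 2  # noise cancels
noise_est = (out_plus - out_minus) / 2   # speech cancels

def rms(x):
    return np.sqrt(np.mean(x ** 2))

snr_db = 20 * np.log10(rms(speech_est) / rms(noise_est))
print(round(snr_db, 1))
```

With a pass-through "device" the estimated output SNR simply equals the input SNR; with a real DNR algorithm, the difference between output and input SNR quantifies the improvement the study reports. The method assumes the processing behaves near-identically on the two mixtures, which is why it is run on matched signal pairs.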


2020 ◽  
Vol 24 (4) ◽  
pp. 167-173
Author(s):  
Srikar Vijayasarathy ◽  
Animesh Barman

Background and Objectives: Top-down restoration of distorted speech, tapped as phonemic restoration of speech in noise, may be a useful tool for understanding the robustness of perception in adverse listening situations. However, the relationship between phonemic restoration and speech perception in noise is not empirically clear. Subjects and Methods: 20 adults (40-55 years) with normal audiometric findings participated in the study. Sentence perception in noise was studied at various signal-to-noise ratios (SNRs) to estimate the SNR yielding a 50% score. Performance was also measured for sentences interrupted with silence and for sentences interrupted by speech noise at -10, -5, 0, and 5 dB SNR. The score in the silence-interruption condition was subtracted from the score in the noise-interruption condition to obtain the phonemic restoration magnitude. Results: Fairly robust improvements in speech intelligibility were found when the sentences were interrupted with speech noise instead of silence. Improvement with increasing noise level was non-monotonic, reaching a maximum at -10 dB SNR. A significant correlation was found between speech perception in noise and phonemic restoration of sentences interrupted with -10 dB SNR speech noise. Conclusions: It is possible that perception of speech in noise is associated with top-down processing of speech, tapped as phonemic restoration of interrupted speech. More research with a larger sample size is indicated, since restoration is affected by the type of speech material and noise used, age, working memory, and linguistic proficiency, and shows large individual variability.
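The phonemic restoration magnitude described above is a simple difference score computed per noise level. The sketch below uses made-up illustrative scores, not the study's data:

```python
# Phonemic restoration magnitude: intelligibility with noise-filled
# interruptions minus intelligibility with silent interruptions, per SNR.
# All scores below are hypothetical illustrative values.

silence_score = 45.0  # % correct, sentences interrupted by silence

# % correct when interruptions are filled with speech noise, keyed by dB SNR
noise_scores = {-10: 68.0, -5: 62.0, 0: 58.0, 5: 52.0}

restoration = {snr: score - silence_score for snr, score in noise_scores.items()}
best_snr = max(restoration, key=restoration.get)
print(restoration, best_snr)
```

A positive value means the noise filler improved intelligibility relative to silence; in the study, this benefit peaked at the -10 dB SNR filler, mirroring the `best_snr` lookup here.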


2005 ◽  
Vol 14 (1) ◽  
pp. 80-85 ◽  
Author(s):  
Thomas G. Dolan ◽  
Dennis O’Loughlin

Purpose: To determine how amplified earmuffs affect the intelligibility of speech in noise for people with hearing loss, and to determine how various brands of amplified earmuffs compare in terms of speech intelligibility and electroacoustic response. Method: The Hearing in Noise Test (HINT) was used to measure the intelligibility of speech for 10 participants with hearing loss when they listened in a background of recorded industrial noise at 85 dBA. Participants listened with 3 different sets of amplified earmuffs (Peltor Tactical 7-S, Elvex COM 55, and Bilsom 707 Impact II), with a set of passive earmuffs (E-A-R Ultra 9000), and with ears unoccluded. Two measurements of sentence threshold were obtained under each of the 5 listening conditions. Gain was measured electroacoustically across a range of input levels and frequencies for each amplified earmuff. Results: Electroacoustic measurements indicated that each electronic earmuff amplified at low input levels and attenuated at high input levels. However, gain characteristics varied greatly across devices. HINT sentence thresholds were not significantly different across the 5 listening conditions or across the 2 trials. Conclusion: Results suggest that each type of earmuff can be used to reduce the noise exposure of people with hearing loss without compromising their ability to understand speech.


2014 ◽  
Vol 57 (5) ◽  
pp. 1908-1918 ◽  
Author(s):  
Kristin J. Van Engen ◽  
Jasmine E. B. Phelps ◽  
Rajka Smiljanic ◽  
Bharath Chandrasekaran

Purpose The authors sought to investigate interactions among intelligibility-enhancing speech cues (i.e., semantic context, clearly produced speech, and visual information) across a range of masking conditions. Method Sentence recognition in noise was assessed for 29 normal-hearing listeners. Testing included semantically normal and anomalous sentences, conversational and clear speaking styles, auditory-only (AO) and audiovisual (AV) presentation modalities, and 4 different maskers (2-talker babble, 4-talker babble, 8-talker babble, and speech-shaped noise). Results Semantic context, clear speech, and visual input all improved intelligibility but also interacted with one another and with masking condition. Semantic context was beneficial across all maskers in AV conditions but only in speech-shaped noise in AO conditions. Clear speech provided the most benefit for AV speech with semantically anomalous targets. Finally, listeners were better able to take advantage of visual information for meaningful versus anomalous sentences and for clear versus conversational speech. Conclusion Because intelligibility-enhancing cues influence each other and depend on masking condition, multiple maskers and enhancement cues should be used to accurately assess individuals' speech-in-noise perception.


2018 ◽  
Vol 29 (07) ◽  
pp. 648-655 ◽  
Author(s):  
Gabrielle H. Saunders ◽  
Ian Odgear ◽  
Anna Cosgrove ◽  
Melissa T. Frederick

Abstract There have been numerous recent reports of an association between hearing impairment and cognitive function, such that the cognition of adults with hearing loss is poorer relative to that of adults with normal hearing (NH), even when amplification is used. However, it is not clear to what extent this is a testing artifact arising because individuals with hearing loss cannot accurately hear the test stimuli. The primary purpose of this study was to examine whether use of amplification during cognitive screening with the Montreal Cognitive Assessment (MoCA) improves performance on the MoCA. Secondarily, we investigated the effects of hearing ability on MoCA performance by comparing the performance of individuals with and without hearing impairment. Participants were 42 individuals with hearing impairment and 19 individuals with NH. Of the individuals with hearing impairment, 22 routinely used hearing aids and 20 did not. Following a written informed consent process, all participants completed pure tone audiometry, speech testing in quiet (Maryland consonant-nucleus-consonant [CNC] words) and in noise (Quick Speech in Noise [QuickSIN] test), and the MoCA. The speech testing and MoCA were completed twice. Individuals with hearing impairment completed testing once unaided and once with amplification, whereas individuals with NH completed unaided testing twice. The individuals with hearing impairment performed significantly less well on the MoCA than those without hearing impairment for unaided testing, and the use of amplification did not significantly change performance. This is despite the finding that amplification significantly improved the performance of the hearing aid users on the measures of speech in quiet and speech in noise.
Furthermore, there were strong correlations between MoCA score and the four-frequency pure tone average, the Maryland CNC score, and the QuickSIN score, which remained moderate to strong when the analyses were adjusted for age. It is concluded that the individuals with hearing loss here performed less well on the MoCA than individuals with NH and that the use of amplification did not compensate for this performance deficit. Nonetheless, this should not be taken to suggest that the use of amplification during testing is unnecessary, because other unmeasured factors, such as the effort required to perform or fatigue, may have decreased with the use of amplification.

