List Equivalency of PRESTO for the Evaluation of Speech Recognition

Kathleen F. Faulkner; Terrin N. Tamati; Jaimie L. Gilbert; David B. Pisoni

doi:10.3766/jaaa.14082

Journal of the American Academy of Audiology ◽

10.3766/jaaa.14082 ◽

2015 ◽

Vol 26 (06) ◽

pp. 582-594 ◽

Cited By ~ 4

Author(s):

Kathleen F. Faulkner ◽

Terrin N. Tamati ◽

Jaimie L. Gilbert ◽

David B. Pisoni

Keyword(s):

Speech Recognition ◽

Undergraduate Students ◽

Signal To Noise Ratio ◽

Spoken Word Recognition ◽

Recognition Task ◽

Clinical Test ◽

English Sentence ◽

Sentence Recognition ◽

Open Set ◽

Regional Dialect

Background: There is a pressing clinical need for the development of ecologically valid and robust assessment measures of speech recognition. Perceptually Robust English Sentence Test Open-set (PRESTO) is a new high-variability sentence recognition test that is sensitive to individual differences and was designed for use with several different clinical populations. PRESTO differs from other sentence recognition tests because the target sentences differ in talker, gender, and regional dialect. Increasing interest in using PRESTO as a clinical test of spoken word recognition dictates the need to establish equivalence across test lists. Purpose: The purpose of this study was to establish list equivalency of PRESTO for clinical use. Research Design: PRESTO sentence lists were presented to three groups of normal-hearing listeners in noise (multitalker babble [MTB] at 0 dB signal-to-noise ratio) or under eight-channel cochlear implant simulation (CI-Sim). Study Sample: Ninety-one young native speakers of English who were undergraduate students from the Indiana University community participated in this study. Data Collection and Analysis: Participants completed a sentence recognition task using different PRESTO sentence lists. They listened to sentences presented over headphones and typed in the words they heard on a computer. Keyword scoring was completed offline. Equivalency for sentence lists was determined based on the list intelligibility (mean keyword accuracy for each list compared with all other lists) and listener consistency (the relation between mean keyword accuracy on each list for each listener). Results: Based on measures of list equivalency and listener consistency, ten PRESTO lists were found to be equivalent in the MTB condition, nine lists were equivalent in the CI-Sim condition, and six PRESTO lists were equivalent in both conditions. 
Conclusions: PRESTO is a valuable addition to the clinical toolbox for assessing sentence recognition across different populations. Because the test condition influenced the overall intelligibility of lists, researchers and clinicians should take the presentation conditions into consideration when selecting the best PRESTO lists for their research or clinical protocols.
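The list intelligibility measure described above (mean keyword accuracy per list) can be sketched roughly as follows. This is a minimal illustration, not the authors' scoring procedure: the function names, the exact-match rule for keywords, and the example sentences are all assumptions made here for demonstration, and the sentences are not PRESTO items.

```python
import re
from statistics import mean


def keyword_accuracy(keywords, typed_response):
    """Fraction of target keywords found in the typed response.

    Assumption: a keyword counts as correct only on an exact,
    case-insensitive word match (no credit for morphological variants).
    """
    response_words = set(re.findall(r"[a-z']+", typed_response.lower()))
    hits = sum(1 for kw in keywords if kw.lower() in response_words)
    return hits / len(keywords)


def list_intelligibility(trials):
    """Mean keyword accuracy per list.

    trials: iterable of (list_id, keywords, typed_response) tuples.
    """
    per_list = {}
    for list_id, keywords, response in trials:
        per_list.setdefault(list_id, []).append(
            keyword_accuracy(keywords, response)
        )
    return {list_id: mean(scores) for list_id, scores in per_list.items()}


# Made-up trials for illustration only (hypothetical sentences).
trials = [
    (1, ["dog", "chased", "ball"], "the dog chased a bell"),
    (1, ["rain", "fell", "night"], "rain fell all night"),
    (2, ["boat", "left", "harbor"], "the boat left"),
]
scores = list_intelligibility(trials)
for list_id, score in sorted(scores.items()):
    print(f"list {list_id}: mean keyword accuracy = {score:.2f}")
```

Comparing these per-list means against one another is only the first half of the equivalency check described in the abstract; listener consistency across lists would need a separate per-listener analysis.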

A Sequential Sentence Paradigm Using Revised PRESTO Sentence Lists

Journal of the American Academy of Audiology ◽

10.3766/jaaa.15074 ◽

2016 ◽

Vol 27 (08) ◽

pp. 647-660 ◽

Cited By ~ 1

Author(s):

Andrea R. Plotkowski ◽

Joshua M. Alexander

Keyword(s):

Individual Differences ◽

Speech Recognition ◽

Full Range ◽

Recognition Task ◽

Total N ◽

English Sentence ◽

Delayed Recall ◽

Single Sentence ◽

Immediate Recall ◽

Open Set

Background: Listening in challenging situations requires explicit cognitive resources to decode and process speech. Traditional speech recognition tests are limited in documenting this cognitive effort, which may differ greatly between individuals or listening conditions despite similar scores. A sequential sentence paradigm was designed to be more sensitive to individual differences in demands on verbal processing during speech recognition. Purpose: The purpose of this study was to establish the feasibility, validity, and equivalency of test materials in the sequential sentence paradigm as well as to evaluate the effects of masker type, signal-to-noise ratio (SNR), and working memory (WM) capacity on performance in the task. Research Design: Listeners heard a pair of sentences and repeated aloud the second sentence (immediate recall) and then wrote down the first sentence (delayed recall). Sentence lists were from the Perceptually Robust English Sentence Test Open-set (PRESTO) test. In experiment I, listeners completed a traditional speech recognition task. In experiment II, listeners completed the sequential sentence task at one SNR. In experiment III, the masker type (steady noise versus multitalker babble) and SNR were varied to demonstrate the effects of WM as the speech material increased in difficulty. Study Sample: Young, normal-hearing adults (total n = 53) from the Purdue University community completed one of the three experiments. Data Collection and Analysis: Keyword scoring of the PRESTO lists was completed for both the immediate- and delayed-recall sentences. The Verbal Letter Monitoring task, a test of WM, was used to separate listeners into a low-WM or high-WM group. Results: Experiment I indicated that mean recognition on the single-sentence task was highly variable between the original PRESTO lists. 
Modest rearrangement of the sentences yielded 18 statistically equivalent lists (mean recognition = 65.0%, range = 64.4–65.7%), which were used in the sequential sentence task in experiment II. In the new test paradigm, recognition of the immediate-recall sentences was not statistically different from the single-sentence task, indicating that there were no cognitive load effects from the delayed-recall sentences. Finally, experiment III indicated that multitalker babble was equally detrimental compared to steady-state noise for immediate recall of sentences for both low- and high-WM groups. On the other hand, delayed recall of sentences in multitalker babble was disproportionately more difficult for the low-WM group compared with the high-WM group. Conclusions: The sequential sentence paradigm is a feasible test format with mostly equivalent lists. Future studies using this paradigm may need to consider individual differences in WM to see the full range of effects across different conditions. Possible applications include testing the efficacy of various signal-processing techniques in clinical populations.

Effect of Competition, Signal-to-Noise Ratio, Race, and Sex on Southern American English Dialect Talkers’ Sentence Recognition

Journal of the American Academy of Audiology ◽

10.3766/jaaa19029 ◽

2020 ◽

Author(s):

Andrew Stuart ◽

Yolanda F. Holt ◽

Alyssa N. Kerls ◽

Madeline R. Smith

Keyword(s):

African American ◽

Speech Perception ◽

Signal To Noise Ratio ◽

American English ◽

Broadband Noise ◽

Signal To Noise ◽

English Sentence ◽

Sentence Recognition ◽

Open Set ◽

Noise Ratio

Background: Although numerous studies have examined regional and racial–ethnic labeling of talker identity, few have evaluated speech perception skills of listeners from the southern United States. Purpose: The objective of the study was to examine the effect of competition, signal-to-noise ratio (SNR), race, and sex on sentence recognition performance in talkers from the Southern American English dialect region. Research Design: A four-factor mixed-measures design was used. Study Sample: Forty-eight normal-hearing young African American and White adults participated. Data Collection and Analyses: The Perceptually Robust English Sentence Test Open-set was used in quiet and in continuous and interrupted noise and multitalker babble at SNRs of -10, -5, 0, and 5 dB. Results: Significant main effects of competition (p < 0.001) and SNR (p < 0.001) and a competition by SNR interaction were found (p < 0.001). Performance improved with increasing SNRs. Performance was also greater in the interrupted broadband noise at poorer SNRs, relative to the other competitors. Multitalker babble performance was significantly poorer than the continuous noise at the poorer SNRs. There was no effect of race or sex on performance in quiet or competition. Conclusions: Although African American English and White American English talkers living in the same geographic region demonstrate differences in speech production, their speech perception in noise does not appear to differ under the conditions examined in this study.
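For readers unfamiliar with how a fixed SNR such as -10, -5, 0, or 5 dB is typically set, the sketch below scales a masker so that the ratio of speech RMS to masker RMS matches a target value in dB. It assumes the common definition SNR = 20·log10(rms_speech / rms_noise); the abstract does not say how levels were calibrated in this study, and the function names and the toy signals are hypothetical.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def noise_gain_for_snr(speech, noise, target_snr_db):
    """Gain for the noise so that speech-to-noise RMS ratio equals the target (dB)."""
    return rms(speech) / (rms(noise) * 10 ** (target_snr_db / 20))


def measured_snr_db(speech, noise):
    """SNR in dB, assuming SNR = 20*log10(rms_speech / rms_noise)."""
    return 20 * math.log10(rms(speech) / rms(noise))


def mix_at_snr(speech, noise, target_snr_db):
    """Add scaled noise to speech, sample by sample, at the target SNR."""
    gain = noise_gain_for_snr(speech, noise, target_snr_db)
    return [s + gain * n for s, n in zip(speech, noise)]


# Toy signals standing in for real recordings (illustration only).
speech = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]
noise = [0.3 * (-1) ** t for t in range(100)]

for target in (-10, -5, 0, 5):
    gain = noise_gain_for_snr(speech, noise, target)
    scaled = [gain * n for n in noise]
    print(target, round(measured_snr_db(speech, scaled), 6))
```

At 0 dB the speech and masker have equal RMS level; negative values mean the masker is more intense than the speech.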

Multitasking with typical use of hearing aid noise reduction in older listeners

10.31234/osf.io/bhq2j ◽

2018 ◽

Author(s):

Tim Schoof ◽

Pamela Souza

Keyword(s):

Speech Recognition ◽

Noise Reduction ◽

Hearing Aids ◽

Recognition Task ◽

Hearing Impaired ◽

Improve Performance ◽

Sentence Recognition ◽

Monitoring Task ◽

Speech In Noise ◽

Dual Task Paradigm

Objective: Older hearing-impaired adults typically experience difficulties understanding speech in noise. Most hearing aids address this issue using digital noise reduction. While noise reduction does not necessarily improve speech recognition, it may reduce the resources required to process the speech signal. Those available resources may, in turn, aid the ability to perform another task while listening to speech (i.e., multitasking). This study examined to what extent changing the strength of digital noise reduction in hearing aids affects the ability to multitask. Design: Multitasking was measured using a dual-task paradigm, combining a speech recognition task and a visual monitoring task. The speech recognition task involved sentence recognition in the presence of six-talker babble at signal-to-noise ratios (SNRs) of 2 and 7 dB. Participants were fit with commercially-available hearing aids programmed under three noise reduction settings: off, mild, strong. Study sample: 18 hearing-impaired older adults. Results: There were no effects of noise reduction on the ability to multitask, or on the ability to recognize speech in noise. Conclusions: Adjustment of noise reduction settings in the clinic may not invariably improve performance for some tasks.

Implanting outside the Guidelines: A Case Study

Journal of the American Academy of Audiology ◽

10.3766/jaaa.19.3.2 ◽

2008 ◽

Vol 19 (03) ◽

pp. 197-203

Author(s):

Millicent K. Seymour ◽

Larry Lundy

Keyword(s):

Cochlear Implantation ◽

Signal To Noise Ratio ◽

Percent Correct ◽

Sentence Recognition ◽

Speech In Noise ◽

The Past ◽

Open Set ◽

Hearing Aid Use ◽

Communication Difficulties

An 81-year-old female was referred for cochlear implantation due to difficulty communicating in her daily activities despite the use of appropriate amplification. The poorer ear had been unable to tolerate amplification for the past 15 years. The open-set sentence-recognition test score in quiet in her "good" ear was 85 percent correct, indicating that the patient was not a traditional cochlear implant candidate. However, the sentence-recognition score in noise at +10 dB signal-to-noise ratio was 0 percent, demonstrating a significant breakdown in the patient's speech understanding in more difficult listening situations. This speech-in-noise score appeared to correlate with the patient's reported communication difficulties as well as with the communicative breakdowns that were observed clinically. The patient underwent cochlear implantation in the better ear. Cochlear implantation in this nontraditional patient provided objective and subjective benefit over hearing aid use.

Effects of Adaptive Hearing Aid Directionality and Noise Reduction on Masked Speech Recognition for Children Who Are Hard of Hearing

American Journal of Audiology ◽

10.1044/2018_aja-18-0045 ◽

2019 ◽

Vol 28 (1) ◽

pp. 101-113 ◽

Cited By ~ 3

Author(s):

Jenna M. Browning ◽

Emily Buss ◽

Mary Flaherty ◽

Tim Vallier ◽

Lori J. Leibold

Keyword(s):

Speech Recognition ◽

Hearing Aids ◽

Hard Of Hearing ◽

Signal To Noise Ratio ◽

Hearing Aid ◽

Normal Hearing ◽

Signal To Noise ◽

Open Set ◽

Fully Adaptive ◽

Noise Ratio

Purpose The purpose of this study was to evaluate speech-in-noise and speech-in-speech recognition associated with activation of a fully adaptive directional hearing aid algorithm in children with mild to severe bilateral sensory/neural hearing loss. Method Fourteen children (5–14 years old) who are hard of hearing participated in this study. Participants wore laboratory hearing aids. Open-set word recognition thresholds were measured adaptively for 2 hearing aid settings: (a) omnidirectional (OMNI) and (b) fully adaptive directionality. Each hearing aid setting was evaluated in 3 listening conditions. Fourteen children with normal hearing served as age-matched controls. Results Children who are hard of hearing required a more advantageous signal-to-noise ratio than children with normal hearing to achieve comparable performance in all 3 conditions. For children who are hard of hearing, the average improvement in signal-to-noise ratio when comparing fully adaptive directionality to OMNI was 4.0 dB in noise, regardless of target location. Children performed similarly with fully adaptive directionality and OMNI settings in the presence of the speech maskers. Conclusions Compared to OMNI, fully adaptive directionality improved speech recognition in steady noise for children who are hard of hearing, even when they were not facing the target source. This algorithm did not affect speech recognition when the background noise was speech. Although the use of hearing aids with fully adaptive directionality is not proposed as a substitute for remote microphone systems, it appears to offer several advantages over fixed directionality, because it does not depend on children facing the target talker and provides access to multiple talkers within the environment. Additional experiments are required to further evaluate children's performance under a variety of spatial configurations in the presence of both noise and speech maskers.

Comparison of Speech Recognition in Cochlear Implant Users with Different Speech Processors

Journal of the American Academy of Audiology ◽

10.1055/s-0041-1735252 ◽

2021 ◽

Vol 32 (07) ◽

pp. 469-476

Author(s):

Maria Madalena Canina Pinheiro ◽

Patricia Cotta Mancini ◽

Alexandra Dezani Soares ◽

Ângela Ribas ◽

Danielle Penna Lima ◽

...

Keyword(s):

Speech Recognition ◽

Cochlear Implant ◽

Paired Comparison ◽

Signal To Noise Ratio ◽

Cross Sectional Study ◽

Signal Design ◽

Cross Sectional ◽

Sentence Recognition ◽

Significant Difference ◽

Sound Processor

Background Speech recognition in noisy environments is a challenge for both cochlear implant (CI) users and device manufacturers. CI manufacturers have been investing in technological innovations for processors and researching strategies to improve signal processing and signal design for better aesthetic acceptance and everyday use. Purpose This study aimed to compare speech recognition in CI users using off-the-ear (OTE) and behind-the-ear (BTE) processors. Design A cross-sectional study was conducted with 51 CI recipients, all users of the BTE Nucleus 5 (CP810) sound processor. Speech perception performances were compared in quiet and noisy conditions using the BTE sound processor Nucleus 5 (N5) and OTE sound processor Kanso. Each participant was tested with the Brazilian-Portuguese version of the hearing in noise test using each sound processor in a randomized order. Three test conditions were analyzed with both sound processors: (i) speech level fixed at 65 dB sound pressure level in quiet, (ii) speech and noise at fixed levels, and (iii) adaptive speech levels with a fixed noise level. To determine the relative performance of OTE with respect to BTE, paired comparison analyses were performed. Results The paired t-tests showed no significant difference between the N5 and Kanso in quiet conditions. In all noise conditions, the performance of the OTE (Kanso) sound processor was superior to that of the BTE (N5), regardless of the order in which they were used. With the speech and noise at fixed levels, a significant mean 8.1 percentage point difference was seen between Kanso (78.10%) and N5 (70.7%) in the sentence scores. Conclusion CI users had a lower signal-to-noise ratio and a higher percentage of sentence recognition with the OTE processor than with the BTE processor.

Optimizing the Reliability of Speech Recognition Scores

Journal of Speech Language and Hearing Research ◽

10.1044/jslhr.4105.1088 ◽

1998 ◽

Vol 41 (5) ◽

pp. 1088-1102 ◽

Cited By ~ 34

Author(s):

Stanley A. Gelfand

Keyword(s):

Speech Recognition ◽

Signal To Noise Ratio ◽

Binomial Model ◽

Verbal Responses ◽

Interactive Computer Program ◽

Open Set ◽

Monosyllabic Words ◽

Short Test ◽

Test Outcomes ◽

Normal Performance

Speech recognition assessment involves a dilemma because clinicians want a test that is short and reliable, but statistical principles dictate that a short test is unreliable. Curves representing the variability of test scores based on the binomial model reveal that approximately 450 scorable items are needed in order to optimize the reliability of a speech recognition test. A testing approach was developed to achieve this sample size while retaining the principal features of the most commonly accepted speech recognition tests (i.e., monosyllabic words presented in an open-set format, verbal responses, and right/wrong scoring). It involves the use of an interactive computer program to present CNC words in 50 three-word groups, which are scored phonemically, resulting in 450 scorable items. Normal performance is described as a function of both presentation level and signal-to-noise ratio. Comparisons of test and retest scores for 100 individuals with normal hearing and 100 persons with sensorineural losses revealed that the approach achieves the degree of reliability predicted by the binomial model for both groups. Phoneme scores accounted for 99% of the variance of word scores for most of the performance range encountered in clinical practice, making it possible for test outcomes based on phonemic scoring to be expressed in terms of equivalent word recognition scores.
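The trade-off between test length and reliability under the binomial model can be illustrated with a short calculation. The sketch below uses the textbook binomial standard deviation of a proportion, sqrt(p(1 − p)/n), expressed in percentage points. It is not a reproduction of the curves in the article; it assumes independent, equally difficult items, and the function name is hypothetical. The item count follows from the description above, taking each CNC (consonant-nucleus-consonant) word to contribute three scorable phonemes.

```python
import math


def binomial_sd_points(p, n_items):
    """Standard deviation of a percent-correct score, in percentage points,
    under a binomial model with true proportion p and n_items scored items."""
    return 100 * math.sqrt(p * (1 - p) / n_items)


# 50 groups x 3 words per group x 3 phonemes per CNC word
groups, words_per_group, phonemes_per_word = 50, 3, 3
n_phonemes = groups * words_per_group * phonemes_per_word

# Variability is largest at p = 0.5, so compare test lengths there.
for n in (25, 50, n_phonemes):
    print(f"{n:>3} items: SD = {binomial_sd_points(0.5, n):.2f} points")
```

Because the standard deviation shrinks with the square root of the number of items, moving from a few dozen scored words to several hundred scored phonemes narrows the expected spread of scores considerably without lengthening the word list in proportion.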

Second Language Experience Facilitates Sentence Recognition in Temporally-Modulated Noise for Non-native Listeners

Frontiers in Psychology ◽

10.3389/fpsyg.2021.631060 ◽

2021 ◽

Vol 12 ◽

Author(s):

Jingjing Guan ◽

Xuetong Cao ◽

Chang Liu

Keyword(s):

Native American ◽

Auditory Processing ◽

Signal To Noise Ratio ◽

The United States ◽

Language Experience ◽

Vowel Identification ◽

English Sentence ◽

Temporal Fluctuations ◽

Sentence Recognition ◽

High Level

Non-native listeners have much more difficulty than native listeners with adverse listening conditions in daily life. However, previous work in our laboratories found that native Chinese listeners with native English exposure may improve their use of temporal fluctuations of noise for English vowel identification. The purpose of this study was to investigate whether Chinese listeners can generalize the use of temporal cues to English sentence recognition in noise. Institute of Electrical and Electronics Engineers (IEEE) sentence recognition in quiet, stationary noise, and temporally-modulated noise was measured for native American English listeners (EN), native Chinese listeners in the United States (CNU), and native Chinese listeners in China (CNC). Results showed that, in general, EN listeners outperformed the two groups of CN listeners in quiet and noise, while CNU listeners had better sentence recognition scores than CNC listeners. Moreover, native English exposure helped CNU listeners use high-level linguistic cues more effectively and take more advantage of temporal fluctuations of noise to process English sentences in severely degraded listening conditions [i.e., a signal-to-noise ratio (SNR) of −12 dB] than CNC listeners. These results suggest a significant effect of language experience on the auditory processing of both speech and noise.

Masking Release Due to Linguistic and Phonetic Dissimilarity Between the Target and Masker Speech

American Journal of Audiology ◽

10.1044/1059-0889(2013/12-0072) ◽

2013 ◽

Vol 22 (1) ◽

pp. 157-164 ◽

Cited By ~ 25

Author(s):

Lauren Calandruccio ◽

Susanne Brouwer ◽

Kristin J. Van Engen ◽

Sumitrajit Dhar ◽

Ann R. Bradlow

Keyword(s):

Significant Role ◽

Recognition Task ◽

Study Data ◽

Linguistic Distance ◽

English Sentence ◽

Sentence Recognition ◽

Masking Release ◽

Mandarin Language ◽

Competing Speech

Purpose To investigate masking release for speech maskers for linguistically and phonetically close (English and Dutch) and distant (English and Mandarin) language pairs. Method Thirty-two monolingual speakers of English with normal audiometric thresholds participated in the study. Data are reported for an English sentence recognition task in English and for Dutch and Mandarin competing speech maskers (Experiment 1) and noise maskers (Experiment 2) that were matched either to the long-term average speech spectra or to the temporal modulations of the speech maskers from Experiment 1. Results Listener performance increased as the target-to-masker linguistic distance increased (English-in-English < English-in-Dutch < English-in-Mandarin). Conclusion Spectral differences between maskers can account for some, but not all, of the variation in performance between maskers; however, temporal differences did not seem to play a significant role.

Children's Perception of Speech Produced in a Two-Talker Background

Journal of Speech Language and Hearing Research ◽

10.1044/1092-4388(2013/12-0287) ◽

2014 ◽

Vol 57 (1) ◽

pp. 327-337 ◽

Cited By ~ 8

Author(s):

Mallory Baker ◽

Emily Buss ◽

Adam Jacks ◽

Crystal Taylor ◽

Lori J. Leibold

Keyword(s):

Speech Perception ◽

Word Recognition ◽

Repeated Measures ◽

Signal To Noise Ratio ◽

Age Groups ◽

Recognition Task ◽

Monosyllabic Word ◽

Repeated Measures Design ◽

Speech In Noise ◽

Open Set

Purpose This study evaluated the degree to which children benefit from the acoustic modifications made by talkers when they produce speech in noise. Method A repeated measures design compared the speech perception performance of children (5–11 years) and adults in a 2-talker masker. Target speech was produced in a 2-talker background or in quiet. In Experiment 1, recognition with the 2 target sets was assessed using an adaptive spondee identification procedure. In Experiment 2, the benefit of speech produced in a 2-talker background was assessed using an open-set, monosyllabic word recognition task at a fixed signal-to-noise ratio (SNR). Results Children performed more poorly than adults, regardless of whether the target speech was produced in quiet or in a 2-talker background. A small improvement in the SNR required to identify spondees was observed for both children and adults using speech produced in a 2-talker background (Experiment 1). Similarly, average open-set word recognition scores were 11 percentage points higher for both age groups using speech produced in a 2-talker background compared with quiet (Experiment 2). Conclusion The results indicate that children can use the acoustic modifications of speech produced in a 2-talker background to improve masked speech perception, as previously demonstrated for adults.
