Some Factors Underlying Individual Differences in Speech Recognition on PRESTO: A First Report

2013 ◽  
Vol 24 (07) ◽  
pp. 616-634 ◽  
Author(s):  
Terrin N. Tamati ◽  
Jaimie L. Gilbert ◽  
David B. Pisoni

Background: Previous studies investigating speech recognition in adverse listening conditions have found extensive variability among individual listeners. However, little is currently known about the core underlying factors that influence speech recognition abilities. Purpose: To investigate sensory, perceptual, and neurocognitive differences between good and poor listeners on the Perceptually Robust English Sentence Test Open-set (PRESTO), a new high-variability sentence recognition test under adverse listening conditions. Research Design: Participants who fell in the upper quartile (HiPRESTO listeners) or lower quartile (LoPRESTO listeners) on key word recognition on sentences from PRESTO in multitalker babble completed a battery of behavioral tasks and self-report questionnaires designed to investigate real-world hearing difficulties, indexical processing skills, and neurocognitive abilities. Study Sample: Young, normal-hearing adults (N = 40) from the Indiana University community participated in the current study. Data Collection and Analysis: Participants' assessment of their own real-world hearing difficulties was measured with a self-report questionnaire on situational hearing and hearing health history. Indexical processing skills were assessed using a talker discrimination task, a gender discrimination task, and a forced-choice regional dialect categorization task. Neurocognitive abilities were measured with the Auditory Digit Span Forward (verbal short-term memory) and Digit Span Backward (verbal working memory) tests, the Stroop Color and Word Test (attention/inhibition), the WordFam word familiarity test (vocabulary size), the Behavioral Rating Inventory of Executive Function–Adult Version (BRIEF-A) self-report questionnaire on executive function, and two performance subtests of the Wechsler Abbreviated Scale of Intelligence (WASI) Performance Intelligence Quotient (IQ; nonverbal intelligence). 
Scores on self-report questionnaires and behavioral tasks were tallied and analyzed by listener group (HiPRESTO and LoPRESTO). Results: The extreme groups did not differ overall on self-reported hearing difficulties in real-world listening environments. However, an item-by-item analysis of questions revealed that LoPRESTO listeners reported significantly greater difficulty understanding speakers in a public place. HiPRESTO listeners were significantly more accurate than LoPRESTO listeners at gender discrimination and regional dialect categorization, but they did not differ on talker discrimination accuracy or response time, or gender discrimination response time. HiPRESTO listeners also had longer forward and backward digit spans, higher word familiarity ratings on the WordFam test, and lower (better) scores for three individual items on the BRIEF-A questionnaire related to cognitive load. The two groups did not differ on the Stroop Color and Word Test or either of the WASI performance IQ subtests. Conclusions: HiPRESTO listeners and LoPRESTO listeners differed in indexical processing abilities, short-term and working memory capacity, vocabulary size, and some domains of executive functioning. These findings suggest that individual differences in the ability to encode and maintain highly detailed episodic information in speech may underlie the variability observed in speech recognition performance in adverse listening conditions using high-variability PRESTO sentences in multitalker babble.
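The extreme-groups design described above can be sketched as a quartile split on keyword recognition scores. This is only an illustration: the abstract does not say how the quartile boundaries were computed, so the inclusive quantile method, the function name `extreme_groups`, and the sample scores are all assumptions.

```python
from statistics import quantiles


def extreme_groups(scores):
    """Split participants into lower- and upper-quartile groups.

    `scores` maps a participant ID to a keyword recognition score.
    The quartile method ("inclusive") is an assumption, not taken
    from the study.
    """
    q1, _, q3 = quantiles(scores.values(), n=4, method="inclusive")
    lo_group = sorted(p for p, s in scores.items() if s <= q1)
    hi_group = sorted(p for p, s in scores.items() if s >= q3)
    return lo_group, hi_group


# Hypothetical percent-correct scores, for illustration only.
example_scores = {
    "p1": 50, "p2": 60, "p3": 70, "p4": 80,
    "p5": 90, "p6": 55, "p7": 65, "p8": 85,
}
lo_presto, hi_presto = extreme_groups(example_scores)
```

Participants between the two cut points are left out of both groups, which is what makes this an extreme-groups comparison rather than a median split.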

2014 ◽  
Vol 25 (09) ◽  
pp. 869-892 ◽  
Author(s):  
Terrin N. Tamati ◽  
David B. Pisoni

Background: Natural variability in speech is a significant challenge to robust, successful spoken word recognition. In everyday listening environments, listeners must quickly adapt and adjust to multiple sources of variability in both the signal and listening environments. High-variability speech may be particularly difficult to understand for non-native listeners, who have less experience with the second language (L2) phonological system and less detailed knowledge of sociolinguistic variation of the L2. Purpose: The purpose of this study was to investigate the effects of high-variability sentences on non-native speech recognition and to explore the underlying sources of individual differences in speech recognition abilities of non-native listeners. Research Design: Participants completed two sentence recognition tasks involving high-variability and low-variability sentences. They also completed a battery of behavioral tasks and self-report questionnaires designed to assess their indexical processing skills, vocabulary knowledge, and several core neurocognitive abilities. Study Sample: Native speakers of Mandarin (n = 25) living in the United States recruited from the Indiana University community participated in the current study. A native comparison group consisted of scores obtained from native speakers of English (n = 21) in the Indiana University community taken from an earlier study. Data Collection and Analysis: Speech recognition in high-variability listening conditions was assessed with a sentence recognition task using sentences from PRESTO (Perceptually Robust English Sentence Test Open-Set) mixed in 6-talker multitalker babble. Speech recognition in low-variability listening conditions was assessed using sentences from HINT (Hearing In Noise Test) mixed in 6-talker multitalker babble. Indexical processing skills were measured using a talker discrimination task, a gender discrimination task, and a forced-choice regional dialect categorization task. 
Vocabulary knowledge was assessed with the WordFam word familiarity test, and executive functioning was assessed with the BRIEF-A (Behavioral Rating Inventory of Executive Function – Adult Version) self-report questionnaire. Scores from the non-native listeners on behavioral tasks and self-report questionnaires were compared with scores obtained from native listeners tested in a previous study and were examined for individual differences. Results: Non-native keyword recognition scores were significantly lower on PRESTO sentences than on HINT sentences. Non-native listeners’ keyword recognition scores were also lower than native listeners’ scores on both sentence recognition tasks. Differences in performance on the sentence recognition tasks between non-native and native listeners were larger on PRESTO than on HINT, although group differences varied by signal-to-noise ratio. The non-native and native groups also differed in the ability to categorize talkers by region of origin and in vocabulary knowledge. Individual non-native word recognition accuracy on PRESTO sentences in multitalker babble at more favorable signal-to-noise ratios was found to be related to several BRIEF-A subscales and composite scores. However, non-native performance on PRESTO was not related to regional dialect categorization, talker and gender discrimination, or vocabulary knowledge. Conclusions: High-variability sentences in multitalker babble were particularly challenging for non-native listeners. Difficulty under high-variability testing conditions was related to lack of experience with the L2, especially L2 sociolinguistic information, compared with native listeners. Individual differences among the non-native listeners were related to weaknesses in core neurocognitive abilities affecting behavioral control in everyday life.


Author(s):  
Adam K. Bosen ◽  
Victoria A. Sevich ◽  
Shauntelle A. Cannon

Purpose In individuals with cochlear implants, speech recognition is not associated with tests of working memory that primarily reflect storage, such as forward digit span. In contrast, our previous work found that vocoded speech recognition in individuals with normal hearing was correlated with performance on a forward digit span task. A possible explanation for this difference across groups is that variability in auditory resolution across individuals with cochlear implants could conceal the true relationship between speech and memory tasks. Here, our goal was to determine if performance on forward digit span and speech recognition tasks are correlated in individuals with cochlear implants after controlling for individual differences in auditory resolution. Method We measured sentence recognition ability in 20 individuals with cochlear implants with Perceptually Robust English Sentence Test Open-set sentences. Spectral and temporal modulation detection tasks were used to assess individual differences in auditory resolution, auditory forward digit span was used to assess working memory storage, and self-reported word familiarity was used to assess vocabulary. Results Individual differences in speech recognition were predicted by spectral and temporal resolution. A correlation was found between forward digit span and speech recognition, but this correlation was not significant after controlling for spectral and temporal resolution. No relationship was found between word familiarity and speech recognition. Forward digit span performance was not associated with individual differences in auditory resolution. Conclusions Our findings support the idea that sentence recognition in individuals with cochlear implants is primarily limited by individual differences in working memory processing, not storage. Studies examining the relationship between speech and memory should control for individual differences in auditory resolution.
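The phrase "after controlling for spectral and temporal resolution" in the abstract above refers to the general idea of a partial correlation. A minimal sketch follows, assuming Pearson correlations and the standard recursive formula for partialling out one control variable at a time; the abstract does not state which statistical procedure the authors used, and the function names and data here are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, controls):
    """Correlation of x and y with each variable in `controls` partialled out.

    Removes one control per recursion level using
    r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    """
    if not controls:
        return pearson(x, y)
    z, rest = controls[0], controls[1:]
    r_xy = partial_corr(x, y, rest)
    r_xz = partial_corr(x, z, rest)
    r_yz = partial_corr(y, z, rest)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data: x and y are related only through the shared variable z.
z = [1, 2, 3, 4]
x = [2, 1, 2, 5]
y = [0, 5, 0, 5]
raw = pearson(x, y)
controlled = partial_corr(x, y, [z])
```

In this toy data the raw correlation is positive but vanishes once `z` is controlled, which mirrors the pattern the abstract reports for digit span and speech recognition. Passing two control lists, for example `[spectral, temporal]`, partials out both.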


2017 ◽  
Vol 7 (3-4) ◽  
pp. 301-330 ◽  
Author(s):  
Natalia Meir

Abstract The current study assessed independent and combined effects of SLI and bilingualism on tasks tapping into verbal short-term memory (vSTM) with varying linguistic load in two languages (Russian and Hebrew). The study explored the extent to which the presence of SLI is related to limited vSTM storage and bilingualism is associated with reduced vocabulary size. A total of 190 monolingual and bilingual children aged 5;5–6;8 participated in the current study: 108 sequential Russian-Hebrew bilinguals (18 with SLI), 48 Hebrew monolinguals (13 with SLI) and 34 Russian monolinguals (14 with SLI). Children performed three repetition tasks: forward-digit span (FWD), non-word repetition (NWR) and sentence repetition (SRep); bilingual children were tested in both of their languages. Results indicated a negative effect of SLI on all experimental tasks tapping into vSTM. The effect of SLI rose as a function of increased linguistic load. Regarding bilingualism, no effect was found on the measure of vSTM with the lowest linguistic load (FWD), while its effect was robust once the linguistic load was increased (SRep). The results reported in this study provide evidence that lower performance on measures of vSTM in children with SLI and in bilingual children stems from different sources. Although children with SLI have limitations of vSTM, deficient vSTM cannot fully account for the linguistic difficulties observed in children with SLI. As for bilingualism, it does not affect verbal storage when the linguistic load is minimal, while poor performance in bilingual children on tasks with greater linguistic load is attributed to smaller vocabulary sizes.


2020 ◽  
Vol 5 (5) ◽  
pp. 1231-1242
Author(s):  
Celeste Domsch ◽  
Lori Stiritz ◽  
Jay Huff

Purpose This study used a mixed-methods design to assess changes in students' cultural awareness during and following a short-term study abroad. Method Thirty-six undergraduate and graduate students participated in a 2-week study abroad to England during the summers of 2016 and 2017. Quantitative data were collected using standardized self-report measures administered prior to departure and after returning to the United States and were analyzed using paired-samples t tests. Qualitative data were collected in the form of daily journal reflections during the trip and interviews after returning to the United States and analyzed using phenomenological methods. Results No statistically significant changes were evident on any standardized self-report measures once corrections for multiple t tests were applied. In addition, a ceiling effect was found on one measure. On the qualitative measures, themes from student transcripts included increased global awareness and a sense of personal growth. Conclusions Measuring cultural awareness poses many challenges. One is that social desirability bias may influence responses. A second is that current measures of cultural competence may exhibit ceiling or floor effects. Analysis of qualitative data may be more useful in examining effects of participation in a short-term study abroad, which appears to result in decreased ethnocentrism and increased global awareness in communication sciences and disorders students. Future work may wish to consider the long-term effects of participation in a study abroad for emerging professionals in the field.
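The quantitative analysis described above, paired-samples t tests followed by a correction for multiple tests, can be sketched as follows. The abstract does not name the correction, so Bonferroni is assumed here purely for illustration, and the scores and p values are made up.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(before, after):
    """Paired-samples t statistic computed on the after-minus-before differences."""
    diffs = [a - b for b, a in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def bonferroni(p_values, alpha=0.05):
    """Flag which tests stay significant at the Bonferroni-adjusted threshold alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


# Hypothetical pre/post scores on one measure.
t_stat = paired_t([1, 2, 3, 4], [2, 4, 5, 7])

# Hypothetical uncorrected p values from three measures: the first two fall
# below 0.05 but not below the adjusted threshold of 0.05 / 3.
flags = bonferroni([0.03, 0.04, 0.20])
```

Turning the t statistic into a p value needs the t distribution, which the Python standard library does not provide, so that step is left out of this sketch.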


2000 ◽  
Vol 16 (1) ◽  
pp. 53-58 ◽  
Author(s):  
Hans Ottosson ◽  
Martin Grann ◽  
Gunnar Kullgren

Summary: Short-term stability or test-retest reliability of self-reported personality traits is likely to be biased if the respondent is affected by a depressive or anxiety state. However, in some studies, DSM-oriented self-reported instruments have proved to be reasonably stable in the short term, regardless of co-occurring depressive or anxiety disorders. In the present study, we examined the short-term test-retest reliability of a new self-report questionnaire for personality disorder diagnosis (DIP-Q) on a clinical sample of 30 individuals, having either a depressive, an anxiety, or no axis-I disorder. Test-retest scorings from subjects with depressive disorders were mostly unstable, with a significant change in fulfilled criteria between entry and retest for three out of ten personality disorders: borderline, avoidant and obsessive-compulsive personality disorder. Scorings from subjects with anxiety disorders were unstable only for cluster C and dependent personality disorder items. In the absence of co-morbid depressive or anxiety disorders, mean dimensional scores of DIP-Q showed no significant differences between entry and retest. Overall, the effect from state on trait scorings was moderate, and it is concluded that test-retest reliability for DIP-Q is acceptable.


Author(s):  
Claire Marcus Bernstein ◽  
Diane Majerus Brewer ◽  
Matthew H. Bakke ◽  
Anne D. Olson ◽  
Elizabeth Jackson Machmer ◽  
...  

Abstract Background Increasing numbers of adults are receiving cochlear implants (CIs) and many achieve high levels of speech perception and improved quality of life. However, a proportion of implant recipients still struggle due to limited speech recognition and/or greater communication demands in their daily lives. For these individuals a program of aural rehabilitation (AR) has the potential to improve outcomes. Purpose The study investigated the effects of a short-term AR intervention on speech recognition, functional communication, and psychosocial outcomes in postlingually deafened adult CI users. Research Design The experimental design was a multisite clinical study with participants randomized to either an AR treatment or active control group. Each group completed 6 weekly 90-minute individual treatment sessions. Assessments were completed pretreatment, 1 week and 2 months post-treatment. Study Sample Twenty-five postlingually deafened adult CI recipients participated. AR group: mean age 66.2 (48–80); nine females, four males; months postactivation 7.7 (3–16); mean years severe to profound deafness 18.4 (2–40). Active control group: mean age 62.8 (47–85); eight females, four males; months postactivation 7.0 (3–13); mean years severe to profound deafness 18.8 (1–55). Intervention The AR protocol consisted of auditory training (words, sentences, speech tracking), and psychosocial counseling (informational and communication strategies). Active control group participants engaged in cognitive stimulation activities (e.g., crosswords, sudoku). Data Collection and Analysis Repeated-measures analysis of variance (ANOVA), multivariate analysis of variance (MANOVA), and planned contrasts were used to compare group performance on the following measures: CasperSent; Hearing Handicap Inventory; Nijmegen Cochlear Implant Questionnaire; Client Oriented Scale of Improvement; Glasgow Benefit Inventory. 
Results The AR group showed statistically significant improvements on speech recognition performance, psychosocial function, and communication goals with no significant improvement seen in the control group. The two groups were statistically equivalent on all outcome measures at preassessment. The robust improvements for the AR group were maintained at 2 months post-treatment. Conclusion Results of this clinical study provide evidence that a short-term AR intervention protocol can maximize outcomes for adult postlingually deafened CI users. The impact of this brief multidimensional AR intervention to extend CI benefit is compelling, and may serve as a template for best practices with adult CI users.


Sensors ◽  
2021 ◽  
Vol 21 (9) ◽  
pp. 3063
Author(s):  
Aleksandr Laptev ◽  
Andrei Andrusenko ◽  
Ivan Podluzhny ◽  
Anton Mitrofanov ◽  
Ivan Medennikov ◽  
...  

With the rapid development of speech assistants, adapting server-intended automatic speech recognition (ASR) solutions to a direct device has become crucial. For on-device speech recognition tasks, researchers and industry prefer end-to-end ASR systems as they can be made resource-efficient while maintaining a higher quality compared to hybrid systems. However, building end-to-end models requires a significant amount of speech data. Personalization, which is mainly handling out-of-vocabulary (OOV) words, is another challenging task associated with speech assistants. In this work, we consider building an effective end-to-end ASR system in low-resource setups with a high OOV rate, embodied in Babel Turkish and Babel Georgian tasks. We propose a method of dynamic acoustic unit augmentation based on the Byte Pair Encoding with dropout (BPE-dropout) technique. The method non-deterministically tokenizes utterances to extend the token’s contexts and to regularize their distribution for the model’s recognition of unseen words. It also reduces the need for optimal subword vocabulary size search. The technique provides a steady improvement in regular and personalized (OOV-oriented) speech recognition tasks (at least 6% relative word error rate (WER) and 25% relative F-score) at no additional computational cost. Owing to the BPE-dropout use, our monolingual Turkish Conformer has achieved a competitive result with 22.2% character error rate (CER) and 38.9% WER, which is close to the best published multilingual system.
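The non-deterministic tokenization mentioned above can be illustrated with a simplified sketch of BPE-dropout. This is not the authors' implementation: the toy merge table, the function name `bpe_dropout_encode`, and the choice to apply one merge at a time are assumptions made for illustration. The idea is that each applicable merge is skipped with probability `p` at every step, so the same word can be split into different subword sequences on different passes.

```python
import random


def bpe_dropout_encode(word, merges, p=0.1, rng=random):
    """Tokenize `word` with BPE, randomly dropping merges with probability `p`.

    `merges` is a list of (left, right) pairs ordered from highest to
    lowest priority. With p=0 this reduces to ordinary greedy BPE;
    with p=1 the word stays split into single characters.
    """
    rank = {pair: i for i, pair in enumerate(merges)}
    tokens = list(word)
    while True:
        # Collect applicable merges, skipping each with probability p.
        candidates = [
            (rank[(a, b)], i)
            for i, (a, b) in enumerate(zip(tokens, tokens[1:]))
            if (a, b) in rank and rng.random() >= p
        ]
        if not candidates:
            return tokens
        # Apply the highest-priority surviving merge (leftmost on ties).
        _, i = min(candidates)
        tokens[i:i + 2] = [tokens[i] + tokens[i + 1]]


# Toy merge table, for illustration only.
toy_merges = [("l", "o"), ("lo", "w"), ("e", "r")]
deterministic = bpe_dropout_encode("lower", toy_merges, p=0.0)
characters = bpe_dropout_encode("lower", toy_merges, p=1.0)
sampled = bpe_dropout_encode("lower", toy_merges, p=0.5, rng=random.Random(0))
```

Whatever the value of `p`, the tokens always concatenate back to the original word, so dropout changes only the segmentation seen during training and not the underlying text.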


Languages ◽  
2021 ◽  
Vol 6 (1) ◽  
pp. 56
Author(s):  
Elma Blom ◽  
Evelyn Bosma ◽  
Wilbert Heeringa

Bilingual children often experience difficulties with inflectional morphology. The aim of this longitudinal study was to investigate how regularity of inflection in combination with verbal short-term and working memory (VSTM, VWM) influences bilingual children’s performance. Data from 231 typically developing five- to eight-year-old children were analyzed: Dutch monolingual children (N = 45), Frisian-Dutch bilingual children (N = 106), Turkish-Dutch bilingual children (N = 31), Tarifit-Dutch bilingual children (N = 38) and Arabic-Dutch bilingual children (N = 11). Inflection was measured with an expressive morphology task. VSTM and VWM were measured with a Forward and Backward Digit Span task, respectively. The results showed that, overall, children performed more accurately at regular than irregular forms, with the smallest gap between regulars and irregulars for monolinguals. Furthermore, this gap was smaller for older children and children who scored better on a non-verbal intelligence measure. In bilingual children, higher accuracy at using (irregular) inflection was predicted by a smaller cross-linguistic distance, a larger amount of Dutch at home, and a higher level of parental education. Finally, children with better VSTM, but not VWM, were more accurate at using regular and irregular inflection.

