The role of subglottal resonances in speech processing algorithms

2015 ◽  
Vol 137 (4) ◽  
pp. 2327-2327 ◽  
Author(s):  
Abeer Alwan ◽  
Steven Lulich ◽  
Harish Ariskere

2018 ◽  
Author(s):  
Jeesun Kim ◽  
Sonya Karisma ◽  
Vincent Aubanel ◽  
Chris Davis

2021 ◽  
Vol 22 (5) ◽  
pp. 481-508
Author(s):  
Robert P. Carlyon ◽  
Tobias Goehring

Abstract: Cochlear implants (CIs) are the world’s most successful sensory prosthesis and have been the subject of intense research and development in recent decades. We critically review the progress in CI research, and its success in improving patient outcomes, from the turn of the century to the present day. The review focuses on the processing, stimulation, and audiological methods that have been used to try to improve speech perception by human CI listeners, and on fundamental new insights into the response of the auditory system to electrical stimulation. The introduction of directional microphones and of new noise reduction and pre-processing algorithms has produced robust and sometimes substantial improvements. Novel speech-processing algorithms, the use of current-focusing methods, and individualised (patient-by-patient) deactivation of subsets of electrodes have produced more modest improvements. We argue that incremental advances have been, and will continue to be, made; that collectively these may substantially improve patient outcomes; but that the modest size of each individual advance will require greater attention to experimental design and power. We also briefly discuss the potential and limitations of promising technologies that are currently being developed in animal models, and suggest strategies for researchers to collectively maximise the potential of CIs to improve hearing in a wide range of listening situations.


2017 ◽  
Vol 61 (1) ◽  
pp. 84-96 ◽  
Author(s):  
David M. Gómez ◽  
Peggy Mok ◽  
Mikhail Ordin ◽  
Jacques Mehler ◽  
Marina Nespor

Research has demonstrated distinct roles for consonants and vowels in speech processing. For example, consonants have been shown to support lexical processes, such as the segmentation of speech based on transitional probabilities (TPs), more effectively than vowels. Theory and data so far, however, have considered only non-tone languages, that is to say, languages that lack contrastive lexical tones. In the present work, we provide a first investigation of the role of consonants and vowels in statistical speech segmentation by native speakers of Cantonese, and assess how tones modulate the processing of vowels. Results show that Cantonese speakers are unable to use statistical cues carried by consonants for segmentation, but they can use cues carried by vowels. This difference becomes more evident when considering tone-bearing vowels. Additional data from speakers of Russian and Mandarin suggest that the ability of Cantonese speakers to segment streams with statistical cues carried by tone-bearing vowels extends to other tone languages, but is much reduced in speakers of non-tone languages.
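The TP-based segmentation mechanism probed in this line of work can be sketched in a few lines: estimate forward transitional probabilities between adjacent syllables, then posit a word boundary wherever the TP dips to a local minimum. The syllable inventory, stream, and local-minimum rule below are illustrative assumptions, not the materials of the study:

```python
from collections import Counter

def transitional_probabilities(stream):
    """Estimate forward TPs P(next | current) from a syllable stream."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}

def segment(stream, tps):
    """Posit a word boundary wherever the TP is a strict local minimum."""
    tp_seq = [tps[(a, b)] for a, b in zip(stream, stream[1:])]
    words, current = [], [stream[0]]
    for i in range(1, len(stream)):
        left = tp_seq[i - 1]
        prev_tp = tp_seq[i - 2] if i >= 2 else float("inf")
        next_tp = tp_seq[i] if i < len(tp_seq) else float("inf")
        if left < prev_tp and left < next_tp:  # TP dip -> boundary
            words.append(current)
            current = []
        current.append(stream[i])
    words.append(current)
    return words

# Hypothetical familiarization stream built from three two-syllable "words":
# within-word TPs are 1.0, between-word TPs are lower, so dips mark boundaries.
stream = [s for word in ["ba bi", "du da", "ko ga", "ba bi", "ko ga",
                         "du da", "ba bi", "du da", "ko ga"]
          for s in word.split()]
tps = transitional_probabilities(stream)
words = segment(stream, tps)
```

Running this recovers the nine two-syllable words from the concatenated stream, mirroring the logic by which listeners are assumed to exploit TP dips at word boundaries.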


Author(s):  
Anton Batliner ◽  
Bernd Möbius

Automatic speech processing (ASP) is understood as covering word recognition, the processing of higher linguistic components (syntax, semantics, and pragmatics), and the processing of computational paralinguistics (CP), which deals with speaker states and traits. This chapter attempts to track the role of prosody in ASP from the word level up to CP. A short history of the field from 1980 to 2020 distinguishes the early years (until 2000)—when the prosodic contribution to the modelling of linguistic phenomena, such as accents, boundaries, syntax, semantics, and dialogue acts, was the focus—from the later years, when the focus shifted to paralinguistics; prosody ceased to be visible. Different types of predictor variables are addressed, among them high-performance power features as well as leverage features, which can also be employed in teaching and therapy.


2021 ◽  
Vol 1998 (1) ◽  
pp. 012024
Author(s):  
Prathibha Sudhakaran ◽  
Ashwani Kumar Yadav ◽  
Sunil Karamchandani

2019 ◽  
Vol 31 (8) ◽  
pp. 1205-1215 ◽  
Author(s):  
Victor J. Boucher ◽  
Annie C. Gilbert ◽  
Boutheina Jemel

Studies that use measures of cerebro-acoustic coherence have shown that theta oscillations (3–10 Hz) entrain to syllable-size modulations in the energy envelope of speech. This entrainment creates sensory windows in processing acoustic cues. Recent reports submit that delta oscillations (<3 Hz) can be entrained by nonsensory content units like phrases and serve to process meaning—though such views face fundamental problems. Other studies suggest that delta underlies a sensory chunking linked to the processing of sequential attributes of speech sounds. This chunking associated with the “focus of attention” is commonly manifested by the temporal grouping of items in sequence recall. Similar grouping in speech may entrain delta. We investigate this view by examining how low-frequency oscillations entrain to three types of stimuli (tones, nonsense syllables, and utterances) having similar timing, pitch, and energy contours. Entrainment was indexed by “intertrial phase coherence” in the EEGs of 18 listeners. The results show that theta oscillations at central sites entrain to syllable-size elements in speech and tones. However, delta oscillations at frontotemporal sites specifically entrain to temporal groups in both meaningful utterances and meaningless syllables, which indicates that delta may support but does not directly bear on a processing of content. The findings overall suggest that, although theta entrainment relates to a processing of acoustic attributes, delta entrainment links to a sensory chunking that relates to a processing of properties of articulated sounds. The results also show that measures of intertrial phase coherence can be better suited than cerebro-acoustic coherence for revealing delta entrainment.
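Intertrial phase coherence, the entrainment index used here, has a compact definition: the modulus of the mean of unit phase vectors across trials at a given frequency and time point. A minimal numerical sketch follows; the simulated phase distributions are illustrative assumptions (in practice the phases come from a time–frequency decomposition of the EEG), not the study's data:

```python
import numpy as np

def itpc(phases):
    """Intertrial phase coherence: modulus of the mean of unit phase
    vectors across trials (0 = random phase, 1 = perfect alignment)."""
    return float(np.abs(np.mean(np.exp(1j * np.asarray(phases)))))

rng = np.random.default_rng(0)
# Phases clustered around a fixed angle mimic an entrained condition;
# uniformly distributed phases mimic the absence of entrainment.
entrained = rng.normal(0.0, 0.3, size=100)
unentrained = rng.uniform(-np.pi, np.pi, size=100)
```

With tightly clustered phases the measure approaches 1; with uniform phases it falls toward 1/sqrt(n_trials), which is why ITPC is typically compared against a surrogate or baseline distribution.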


2020 ◽  
Vol 32 (2) ◽  
pp. 226-240 ◽  
Author(s):  
Benedikt Zoefel ◽  
Isobella Allard ◽  
Megha Anil ◽  
Matthew H. Davis

Several recent studies have used transcranial alternating current stimulation (tACS) to demonstrate a causal role of neural oscillatory activity in speech processing. In particular, it has been shown that the ability to understand speech in a multi-speaker scenario or background noise depends on the timing of speech presentation relative to simultaneously applied tACS. However, it is possible that tACS did not change actual speech perception but rather auditory stream segregation. In this study, we tested whether the phase relation between tACS and the rhythm of degraded words, presented in silence, modulates word report accuracy. We found strong evidence for a tACS-induced modulation of speech perception, but only if the stimulation was applied bilaterally using ring electrodes (not for unilateral left hemisphere stimulation with square electrodes). These results were only obtained when data were analyzed using a statistical approach that was identified as optimal in a previous simulation study. The effect was driven by a phasic disruption of word report scores. Our results suggest a causal role of neural entrainment for speech perception and emphasize the importance of optimizing stimulation protocols and statistical approaches for brain stimulation research.
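The abstract emphasizes that detecting a phasic modulation hinges on the statistical approach. One common way to test for such modulation (shown here as a generic, hypothetical illustration, not necessarily the specific procedure the authors identified as optimal) is to fit a one-cycle cosine to word report accuracy across tACS phase bins and evaluate the fitted amplitude against a permutation null:

```python
import numpy as np

def cosine_amplitude(phases, accuracy):
    """Least-squares amplitude of a one-cycle cosine fitted to
    accuracy as a function of stimulation phase."""
    X = np.column_stack([np.cos(phases), np.sin(phases)])
    coef, *_ = np.linalg.lstsq(X, accuracy - np.mean(accuracy), rcond=None)
    return float(np.hypot(coef[0], coef[1]))

def permutation_p(phases, accuracy, n_perm=2000, seed=0):
    """P-value for phasic modulation: shuffle accuracies across phase
    bins to build a null distribution of cosine amplitudes."""
    rng = np.random.default_rng(seed)
    observed = cosine_amplitude(phases, accuracy)
    null = np.array([cosine_amplitude(phases, rng.permutation(accuracy))
                     for _ in range(n_perm)])
    return (np.sum(null >= observed) + 1) / (n_perm + 1)

# Hypothetical data: 8 equally spaced phase bins with a 0.1 modulation
# of report accuracy around a 60% baseline.
bins = np.linspace(0, 2 * np.pi, 8, endpoint=False)
acc = 0.6 + 0.1 * np.cos(bins - 1.0)
```

The cosine fit captures exactly the kind of phasic (rather than net) change in word report scores described above; the permutation step guards against spurious amplitude estimates when the number of phase bins is small.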


Infancy ◽  
2004 ◽  
Vol 5 (3) ◽  
pp. 341-353 ◽  
Author(s):  
Barbara Höhle ◽  
Jürgen Weissenborn ◽  
Dorothea Kiefer ◽  
Antje Schulz ◽  
Michaela Schmitz

2007 ◽  
Vol 18 (07) ◽  
pp. 539-547 ◽  
Author(s):  
Fergus I.M. Craik

The article presents a commentary on the accompanying six papers from the perspective of a cognitive psychologist. Treisman's (1964, 1969) levels of analysis model of selective attention is suggested as a framework within which the interactions between 'bottom-up' auditory factors and 'top-down' cognitive factors may be understood. The complementary roles of auditory and cognitive aspects of hearing are explored, and their mutually compensatory properties discussed. The findings and ideas reported in the six accompanying papers fit well into such a 'levels of processing' framework, which may therefore be proposed as a model for understanding the effects of aging on speech processing and comprehension.

