The role of subglottal resonances in speech processing algorithms

2015 ◽  
Vol 137 (4) ◽  
pp. 2327-2327 ◽  
Author(s):  
Abeer Alwan ◽  
Steven Lulich ◽  
Harish Ariskere

2018 ◽  
Author(s):  
Jeesun Kim ◽  
Sonya Karisma ◽  
Vincent Aubanel ◽  
Chris Davis

2021 ◽  
Vol 22 (5) ◽  
pp. 481-508
Author(s):  
Robert P. Carlyon ◽  
Tobias Goehring

Abstract: Cochlear implants (CIs) are the world’s most successful sensory prosthesis and have been the subject of intense research and development in recent decades. We critically review the progress in CI research, and its success in improving patient outcomes, from the turn of the century to the present day. The review focuses on the processing, stimulation, and audiological methods that have been used to try to improve speech perception by human CI listeners, and on fundamental new insights into the response of the auditory system to electrical stimulation. The introduction of directional microphones and of new noise reduction and pre-processing algorithms has produced robust and sometimes substantial improvements. Novel speech-processing algorithms, the use of current-focusing methods, and individualised (patient-by-patient) deactivation of subsets of electrodes have produced more modest improvements. We argue that incremental advances have been, and will continue to be, made; that collectively these may substantially improve patient outcomes; but that the modest size of each individual advance will require greater attention to experimental design and power. We also briefly discuss the potential and limitations of promising technologies that are currently being developed in animal models, and suggest strategies for researchers to collectively maximise the potential of CIs to improve hearing in a wide range of listening situations.


2017 ◽  
Vol 61 (1) ◽  
pp. 84-96 ◽  
Author(s):  
David M. Gómez ◽  
Peggy Mok ◽  
Mikhail Ordin ◽  
Jacques Mehler ◽  
Marina Nespor

Research has demonstrated distinct roles for consonants and vowels in speech processing. For example, consonants have been shown to support lexical processes, such as the segmentation of speech based on transitional probabilities (TPs), more effectively than vowels. Theory and data so far, however, have considered only non-tone languages, that is to say, languages that lack contrastive lexical tones. In the present work, we provide a first investigation of the role of consonants and vowels in statistical speech segmentation by native speakers of Cantonese, and assess how tones modulate the processing of vowels. Results show that Cantonese speakers are unable to use statistical cues carried by consonants for segmentation, but they can use cues carried by vowels. This difference becomes more evident when considering tone-bearing vowels. Additional data from speakers of Russian and Mandarin suggest that the ability of Cantonese speakers to segment streams with statistical cues carried by tone-bearing vowels extends to other tone languages, but is much reduced in speakers of non-tone languages.
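The TP-based segmentation mechanism probed in this line of work can be sketched in a few lines: estimate forward transitional probabilities between adjacent syllables, then posit a word boundary wherever the TP dips to a local minimum. The syllable inventory, stream, and local-minimum rule below are illustrative assumptions, not the materials of the study:

```python
from collections import Counter

def transitional_probabilities(stream):
    """Estimate forward TPs P(next | current) from a syllable stream."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    return {(a, b): c / first_counts[a] for (a, b), c in pair_counts.items()}

def segment(stream, tps):
    """Posit a word boundary wherever the TP is a strict local minimum."""
    tp_seq = [tps[(a, b)] for a, b in zip(stream, stream[1:])]
    words, current = [], [stream[0]]
    for i in range(1, len(stream)):
        left = tp_seq[i - 1]
        prev_tp = tp_seq[i - 2] if i >= 2 else float("inf")
        next_tp = tp_seq[i] if i < len(tp_seq) else float("inf")
        if left < prev_tp and left < next_tp:  # TP dip -> boundary
            words.append(current)
            current = []
        current.append(stream[i])
    words.append(current)
    return words

# Hypothetical familiarization stream built from three two-syllable "words":
# within-word TPs are 1.0, between-word TPs are lower, so dips mark boundaries.
stream = [s for word in ["ba bi", "du da", "ko ga", "ba bi", "ko ga",
                         "du da", "ba bi", "du da", "ko ga"]
          for s in word.split()]
tps = transitional_probabilities(stream)
words = segment(stream, tps)
```

Running this recovers the nine two-syllable words from the concatenated stream, mirroring the logic by which listeners are assumed to exploit TP dips at word boundaries.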


Author(s):  
Anton Batliner ◽  
Bernd Möbius

Automatic speech processing (ASP) is understood as covering word recognition, the processing of higher linguistic components (syntax, semantics, and pragmatics), and the processing of computational paralinguistics (CP), which deals with speaker states and traits. This chapter attempts to track the role of prosody in ASP from the word level up to CP. A short history of the field from 1980 to 2020 distinguishes the early years (until 2000)—when the prosodic contribution to the modelling of linguistic phenomena, such as accents, boundaries, syntax, semantics, and dialogue acts, was the focus—from the later years, when the focus shifted to paralinguistics; prosody ceased to be visible. Different types of predictor variables are addressed, among them high-performance power features as well as leverage features, which can also be employed in teaching and therapy.


2021 ◽  
Vol 1998 (1) ◽  
pp. 012024
Author(s):  
Prathibha Sudhakaran ◽  
Ashwani Kumar Yadav ◽  
Sunil Karamchandani

2019 ◽  
Vol 31 (8) ◽  
pp. 1205-1215 ◽  
Author(s):  
Victor J. Boucher ◽  
Annie C. Gilbert ◽  
Boutheina Jemel

Studies that use measures of cerebro-acoustic coherence have shown that theta oscillations (3–10 Hz) entrain to syllable-size modulations in the energy envelope of speech. This entrainment creates sensory windows in processing acoustic cues. Recent reports submit that delta oscillations (<3 Hz) can be entrained by nonsensory content units like phrases and serve to process meaning—though such views face fundamental problems. Other studies suggest that delta underlies a sensory chunking linked to the processing of sequential attributes of speech sounds. This chunking associated with the “focus of attention” is commonly manifested by the temporal grouping of items in sequence recall. Similar grouping in speech may entrain delta. We investigate this view by examining how low-frequency oscillations entrain to three types of stimuli (tones, nonsense syllables, and utterances) having similar timing, pitch, and energy contours. Entrainment was indexed by “intertrial phase coherence” in the EEGs of 18 listeners. The results show that theta oscillations at central sites entrain to syllable-size elements in speech and tones. However, delta oscillations at frontotemporal sites specifically entrain to temporal groups in both meaningful utterances and meaningless syllables, which indicates that delta may support but does not directly bear on a processing of content. The findings overall suggest that, although theta entrainment relates to a processing of acoustic attributes, delta entrainment links to a sensory chunking that relates to a processing of properties of articulated sounds. The results also show that measures of intertrial phase coherence can be better suited than cerebro-acoustic coherence for revealing delta entrainment.
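Intertrial phase coherence, the entrainment index used here, has a compact definition: the modulus of the mean of unit phase vectors across trials at a given frequency and time point. A minimal numerical sketch follows; the simulated phase distributions are illustrative assumptions (in practice the phases come from a time–frequency decomposition of the EEG), not the study's data:

```python
import numpy as np

def itpc(phases):
    """Intertrial phase coherence: modulus of the mean of unit phase
    vectors across trials (0 = random phase, 1 = perfect alignment)."""
    return float(np.abs(np.mean(np.exp(1j * np.asarray(phases)))))

rng = np.random.default_rng(0)
# Phases clustered around a fixed angle mimic an entrained condition;
# uniformly distributed phases mimic the absence of entrainment.
entrained = rng.normal(0.0, 0.3, size=100)
unentrained = rng.uniform(-np.pi, np.pi, size=100)
```

With tightly clustered phases the measure approaches 1; with uniform phases it falls toward 1/sqrt(n_trials), which is why ITPC is typically compared against a surrogate or baseline distribution.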


2020 ◽  
Vol 32 (2) ◽  
pp. 226-240 ◽  
Author(s):  
Benedikt Zoefel ◽  
Isobella Allard ◽  
Megha Anil ◽  
Matthew H. Davis

Several recent studies have used transcranial alternating current stimulation (tACS) to demonstrate a causal role of neural oscillatory activity in speech processing. In particular, it has been shown that the ability to understand speech in a multi-speaker scenario or background noise depends on the timing of speech presentation relative to simultaneously applied tACS. However, it is possible that tACS did not change actual speech perception but rather auditory stream segregation. In this study, we tested whether the phase relation between tACS and the rhythm of degraded words, presented in silence, modulates word report accuracy. We found strong evidence for a tACS-induced modulation of speech perception, but only if the stimulation was applied bilaterally using ring electrodes (not for unilateral left hemisphere stimulation with square electrodes). These results were only obtained when data were analyzed using a statistical approach that was identified as optimal in a previous simulation study. The effect was driven by a phasic disruption of word report scores. Our results suggest a causal role of neural entrainment for speech perception and emphasize the importance of optimizing stimulation protocols and statistical approaches for brain stimulation research.
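The abstract emphasizes that detecting a phasic modulation hinges on the statistical approach. One common way to test for such modulation (shown here as a generic, hypothetical illustration, not necessarily the specific procedure the authors identified as optimal) is to fit a one-cycle cosine to word report accuracy across tACS phase bins and evaluate the fitted amplitude against a permutation null:

```python
import numpy as np

def cosine_amplitude(phases, accuracy):
    """Least-squares amplitude of a one-cycle cosine fitted to
    accuracy as a function of stimulation phase."""
    X = np.column_stack([np.cos(phases), np.sin(phases)])
    coef, *_ = np.linalg.lstsq(X, accuracy - np.mean(accuracy), rcond=None)
    return float(np.hypot(coef[0], coef[1]))

def permutation_p(phases, accuracy, n_perm=2000, seed=0):
    """P-value for phasic modulation: shuffle accuracies across phase
    bins to build a null distribution of cosine amplitudes."""
    rng = np.random.default_rng(seed)
    observed = cosine_amplitude(phases, accuracy)
    null = np.array([cosine_amplitude(phases, rng.permutation(accuracy))
                     for _ in range(n_perm)])
    return (np.sum(null >= observed) + 1) / (n_perm + 1)

# Hypothetical data: 8 equally spaced phase bins with a 0.1 modulation
# of report accuracy around a 60% baseline.
bins = np.linspace(0, 2 * np.pi, 8, endpoint=False)
acc = 0.6 + 0.1 * np.cos(bins - 1.0)
```

The cosine fit captures exactly the kind of phasic (rather than net) change in word report scores described above; the permutation step guards against spurious amplitude estimates when the number of phase bins is small.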


Infancy ◽  
2004 ◽  
Vol 5 (3) ◽  
pp. 341-353 ◽  
Author(s):  
Barbara Höhle ◽  
Jürgen Weissenborn ◽  
Dorothea Kiefer ◽  
Antje Schulz ◽  
Michaela Schmitz

2007 ◽  
Vol 18 (07) ◽  
pp. 539-547 ◽  
Author(s):  
Fergus I.M. Craik

The article presents a commentary on the accompanying six papers from the perspective of a cognitive psychologist. Treisman's (1964, 1969) levels of analysis model of selective attention is suggested as a framework within which the interactions between 'bottom-up' auditory factors and 'top-down' cognitive factors may be understood. The complementary roles of auditory and cognitive aspects of hearing are explored, and their mutually compensatory properties discussed. The findings and ideas reported in the six accompanying papers fit well into such a 'levels of processing' framework, which may therefore be proposed as a model for understanding the effects of aging on speech processing and comprehension.

