Neurophysiological Feature Detectors and Speech Perception: A Discussion of Theoretical Implications

1971 ◽  
Vol 14 (1) ◽  
pp. 23-36 ◽  
Author(s):  
James H. Abbs ◽  
Harvey M. Sussman

The purpose of this paper is to promote consideration of a neurophysiologically oriented theory of speech perception. This theory holds that the phonological attributes of human speech are decoded by neurosensory receptive fields operating as "feature detectors." These fields are held to be innately structured to detect, and respond to, the various distinguishing physical parameters of the acoustic sound stream. Neurophysiological, psychophysical, and developmental evidence is cited to support such a position. A feature detector theory appears to provide an explanation for many phenomena revealed by speech perception research.

2014 ◽  
Vol 281 (1787) ◽  
pp. 20140480 ◽  
Author(s):  
Michelle J. Spierings ◽  
Carel ten Cate

Variation in pitch, amplitude and rhythm adds crucial paralinguistic information to human speech. Such prosodic cues can reveal information about the meaning or emphasis of a sentence or the emotional state of the speaker. To examine the hypothesis that sensitivity to prosodic cues is language independent and not human specific, we tested prosody perception in a controlled experiment with zebra finches. Using a go/no-go procedure, subjects were trained to discriminate between speech syllables arranged in XYXY patterns with prosodic stress on the first syllable and XXYY patterns with prosodic stress on the final syllable. To systematically determine the salience of the various prosodic cues (pitch, duration and amplitude) to the zebra finches, they were subjected to five tests with different combinations of these cues. The zebra finches generalized the prosodic pattern to sequences that consisted of new syllables and used prosodic features over structural ones to discriminate between stimuli. This strong sensitivity to the prosodic pattern was maintained when only a single prosodic cue was available. The change in pitch was treated as more salient than changes in the other prosodic features. These results show that zebra finches are sensitive to the same prosodic cues known to affect human speech perception.


2020 ◽  
Vol 14 ◽  
Author(s):  
Stephanie Haro ◽  
Christopher J. Smalt ◽  
Gregory A. Ciccarelli ◽  
Thomas F. Quatieri

Many individuals struggle to understand speech in listening scenarios that include reverberation and background noise. An individual's ability to understand speech arises from a combination of peripheral auditory function, central auditory function, and general cognitive abilities. The interaction of these factors complicates the prescription of treatment or therapy to improve hearing function. Damage to the auditory periphery can be studied in animals; however, this method alone is not enough to understand the impact of hearing loss on speech perception. Computational auditory models bridge the gap between animal studies and human speech perception. Perturbations to the modeled auditory systems can permit mechanism-based investigations into observed human behavior. In this study, we propose a computational model that accounts for the complex interactions between different hearing damage mechanisms and simulates human speech-in-noise perception. The model performs a digit classification task as a human would, with only acoustic sound pressure as input. Thus, we can use the model's performance as a proxy for human performance. This two-stage model consists of a biophysical cochlear-nerve spike generator followed by a deep neural network (DNN) classifier. We hypothesize that sudden damage to the periphery affects speech perception and that central nervous system adaptation over time may compensate for peripheral hearing damage. Our model achieved human-like performance across signal-to-noise ratios (SNRs) under normal-hearing (NH) cochlear settings, achieving 50% digit recognition accuracy at −20.7 dB SNR. Results were comparable to eight NH participants on the same task who achieved 50% behavioral performance at −22 dB SNR. We also simulated medial olivocochlear reflex (MOCR) and auditory nerve fiber (ANF) loss, which worsened digit-recognition accuracy at lower SNRs compared to higher SNRs. Our simulated performance following ANF loss is consistent with the hypothesis that cochlear synaptopathy impacts communication in background noise more so than in quiet. Following the insult of various cochlear degradations, we implemented extreme and conservative adaptation through the DNN. At the lowest SNRs (<0 dB), both adapted models were unable to fully recover NH performance, even with hundreds of thousands of training samples. This implies a limit on performance recovery following peripheral damage in our human-inspired DNN architecture.
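The two-stage architecture described above (peripheral spike generation feeding a central classifier) can be sketched schematically. This is a purely illustrative toy, not the paper's model: the crude FFT "filter bank" and single linear layer stand in for the biophysical cochlear-nerve spike generator and the trained DNN, and all parameter values here are invented.

```python
import numpy as np

def cochlear_stage(pressure, n_fibers=8, rng=None):
    """Toy stand-in for a biophysical cochlear-nerve spike generator:
    log-spaced spectral bands drive Poisson-like spike counts per fiber."""
    rng = rng or np.random.default_rng(0)
    spectrum = np.abs(np.fft.rfft(pressure))
    edges = np.logspace(np.log10(2), np.log10(len(spectrum)),
                        n_fibers + 1).astype(int)
    rates = np.array([spectrum[lo:hi].mean()
                      for lo, hi in zip(edges[:-1], edges[1:])])
    # Normalize band energies to firing rates, then draw spike counts.
    return rng.poisson(rates / (rates.max() + 1e-9) * 50)

def classifier_stage(spikes, weights, bias):
    """Toy stand-in for the DNN digit classifier: one linear layer."""
    logits = weights @ spikes + bias
    return int(np.argmax(logits))

# Usage: run a noisy tone through the two stages (illustrative only;
# the weights are random, so the "digit" label is meaningless).
fs = 16000
t = np.arange(0, 0.2, 1 / fs)
signal = np.sin(2 * np.pi * 440 * t) \
    + 0.5 * np.random.default_rng(1).normal(size=t.size)
spikes = cochlear_stage(signal)
weights = np.random.default_rng(2).normal(size=(10, spikes.size))
pred = classifier_stage(spikes, weights, np.zeros(10))
```

In this framing, "peripheral damage" corresponds to perturbing `cochlear_stage` (e.g., removing fibers) and "central adaptation" to retraining the classifier weights, mirroring the manipulations the abstract describes.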


2012 ◽  
pp. 203-220
Author(s):  
Keith R. Kluender ◽  
Andrew J. Lotto ◽  
Lori L. Holt

2018 ◽  
Author(s):  
Dave F Kleinschmidt

One of the persistent puzzles in understanding human speech perception is how listeners cope with talker variability. One thing that might help listeners is structure in talker variability: rather than varying randomly, talkers of the same gender, dialect, age, etc. tend to produce language in similar ways. Sociolinguistic research has shown that listeners are sensitive to this covariation between linguistic variation and socio-indexical variables. In this paper I present new techniques based on ideal observer models to quantify (1) the amount and type of structure in talker variation, and (2) how useful such structure can be for robust speech recognition in the face of talker variability. I demonstrate these techniques in two phonetic domains (word-initial stop voicing and vowel identity) and show that these domains have different amounts and types of talker variability, consistent with previous, impressionistic findings. An `R` package accompanies this paper, enabling researchers to apply these techniques to their own data.
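The ideal observer framing can be illustrated with a minimal sketch, here in Python rather than the paper's accompanying `R` package. The Gaussian category distributions over voice onset time (VOT) and their parameters are invented for illustration; a real analysis would estimate them from talker-specific production data.

```python
import math

# Hypothetical category distributions for word-initial stop voicing,
# parameterized by voice onset time (VOT, ms): (mean, sd) per category.
# These numbers are illustrative, not fit to any corpus.
categories = {
    "b": (0.0, 10.0),   # voiced stops: short VOT
    "p": (50.0, 15.0),  # voiceless stops: long VOT
}

def log_likelihood(vot, mean, sd):
    """Gaussian log-density of an observed VOT under one category."""
    return -0.5 * ((vot - mean) / sd) ** 2 \
        - math.log(sd * math.sqrt(2 * math.pi))

def classify(vot):
    """Ideal observer with a flat prior: pick the category whose
    distribution assigns the observed VOT the higher likelihood."""
    return max(categories, key=lambda c: log_likelihood(vot, *categories[c]))

print(classify(5.0))   # prints "b"
print(classify(60.0))  # prints "p"
```

Structure in talker variation then amounts to systematic shifts in these category parameters across talker groups, and its usefulness can be measured by how much classification improves when the observer adapts its parameters to the talker.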


2020 ◽  
Vol 32 (3) ◽  
pp. 435-445
Author(s):  
Ansgar D. Endress

Language has a complex grammatical system whose computational and biological basis we have yet to understand. However, to the extent that some evolutionarily ancient mechanisms have been repurposed for grammar, insights from other taxa can inform possible circuit-level mechanisms of grammar. Drawing upon recent evidence for the importance of disinhibitory circuits across taxa and brain regions, I suggest a simple circuit that explains the acquisition of core grammatical rules used in 85% of the world's languages: grammatical rules based on sameness/difference relations. This circuit acts as a sameness detector. "Different" items are suppressed through inhibition, but presenting two "identical" items leads to inhibition of inhibition, so that the items are propagated for further processing. The sameness detector thus acts as a feature detector for a grammatical rule. I suggest that having a set of feature detectors for elementary grammatical rules might make language acquisition feasible based on relatively simple computational mechanisms.
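The inhibition-of-inhibition logic can be rendered as a toy rate model. This is my own schematic sketch, not the author's circuit; the binary feature vectors, weights, and threshold are arbitrary choices made only to show the disinhibitory mechanism.

```python
def relu(x):
    """Rectification: firing rates cannot go negative."""
    return max(0.0, x)

def sameness_detector(a, b):
    """Toy disinhibitory circuit over two items (binary feature vectors).
    - An inhibitory interneuron tonically suppresses the output unit.
    - Feature-by-feature coincidence between the items drives a second
      inhibitory unit that inhibits the interneuron (inhibition of
      inhibition), releasing the output so the items propagate."""
    match = sum(1 for x, y in zip(a, b) if x == y) / len(a)
    interneuron = relu(1.0 - match)            # silenced when items match
    excitation = 1.0                           # items always excite output
    output = relu(excitation - 2.0 * interneuron)
    return output > 0.5  # True: "same" detected, items propagate

same = sameness_detector([1, 0, 1], [1, 0, 1])  # True: identical items
diff = sameness_detector([1, 0, 1], [0, 1, 1])  # False: suppressed
```

With identical inputs the coincidence drive silences the interneuron and the output fires; with different inputs the interneuron stays active and suppresses the output, matching the abstract's description of a feature detector for a sameness-based rule.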

