A theoretical model of cochlear processing improves spectrally degraded speech perception

2006 ◽  
Vol 119 (5) ◽  
pp. 3238-3238
Author(s):  
Evan C. Smith ◽  
Lori L. Holt


Author(s):  
Nele Salveste

Categorical perception, or the hypothesis of how we perceive linguistic units. The acoustic signal of everyday speech is highly variable, yet this variability seldom disrupts normal speech communication. This motivates the hypothesis that speech perception has developed a special mechanism for extracting phonemes from a highly variable speech signal. This mechanism extracts phonemes so efficiently and quickly that we are usually unaware of it. We might call this mechanism "categorical perception of speech", but since perceptual processes are only indirectly accessible to investigation, the term refers instead to a theoretical model, or an experimental method, for investigating our perceptual ability to distinguish phonemes in the speech signal (Schouten et al. 2003). This paper discusses categorical perception as an experimental method, relating its theoretical assumptions to the design and conclusions of perception experiments conducted both in Estonian and in other languages.
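The identification-versus-discrimination logic of the categorical perception paradigm can be illustrated with a small sketch. Nothing below comes from the article itself: the logistic identification function, its boundary and slope parameters, and the use of a Haskins-style prediction (discrimination accuracy derived solely from labeling probabilities) are illustrative assumptions.

```python
import math

def identification(step, boundary=5.0, slope=1.5):
    """Probability of a category-A label at a continuum step
    (logistic identification function; parameters are illustrative)."""
    return 1.0 / (1.0 + math.exp(slope * (step - boundary)))

def predicted_abx(p1, p2):
    """Haskins-style prediction: discrimination accuracy follows only
    from how differently the two stimuli are labeled."""
    return 0.5 * (1.0 + (p1 - p2) ** 2)

probs = {s: identification(s) for s in range(1, 10)}

# Within-category pair (steps 1 vs. 2): labels barely differ, so
# predicted discrimination stays near chance (0.5).
within = predicted_abx(probs[1], probs[2])

# Cross-boundary pair (steps 4 vs. 6): labels flip across the category
# boundary, so predicted discrimination rises well above chance.
across = predicted_abx(probs[4], probs[6])
```

Comparing observed discrimination against this labeling-based prediction is the core test of the paradigm: categorical perception is inferred when discrimination peaks only where identification changes.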


2019 ◽  
Vol 62 (9) ◽  
pp. 3290-3301
Author(s):  
Jingjing Guan ◽  
Chang Liu

Purpose: Degraded speech intelligibility in background noise is a common complaint of listeners with hearing loss. The purpose of the current study is to explore whether second formant (F2) enhancement improves speech perception in noise for older listeners with hearing impairment (HI) and normal hearing (NH). Method: Target words (e.g., color and digit) were selected and presented based on the paradigm of the coordinate response measure corpus. Speech recognition thresholds with original and F2-enhanced speech in 2- and 6-talker babble were examined for older listeners with NH and HI. Results: The thresholds for both the NH and HI groups improved for enhanced speech signals primarily in 2-talker babble, but not in 6-talker babble. The F2 enhancement benefits did not correlate significantly with listeners' age or their average hearing thresholds in most listening conditions. However, speech intelligibility index values increased significantly with F2 enhancement in babble for listeners with HI, but not for NH listeners. Conclusions: Speech sounds with F2 enhancement may improve listeners' speech perception in 2-talker babble, possibly due to a greater amount of speech information available in temporally modulated noise or a better capacity to separate speech signals from background babble.
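The abstract does not describe the enhancement algorithm itself, so the following is only a hedged sketch of the general idea: applying a fixed gain to an assumed F2 region in the frequency domain. The band edges (1000–3000 Hz), the 6 dB gain, and the two-tone test signal are all invented for illustration and may differ from the study's actual procedure.

```python
import numpy as np

def enhance_f2(signal, fs, band=(1000.0, 3000.0), gain_db=6.0):
    """Boost an assumed F2 region by a fixed gain (illustrative only;
    the study's actual enhancement procedure may differ)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    spectrum[in_band] *= 10.0 ** (gain_db / 20.0)  # dB -> amplitude ratio
    return np.fft.irfft(spectrum, n=len(signal))

# 500 Hz + 2 kHz tones; only the 2 kHz component sits in the assumed F2 band.
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 500 * t) + np.sin(2 * np.pi * 2000 * t)
y = enhance_f2(x, fs)
```

After processing, the in-band component is amplified by about a factor of 2 in amplitude while the out-of-band component is untouched, which is the intended selective-boost behavior.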


Author(s):  
Jiaqiang Zhu ◽  
Xiaoxiang Chen ◽  
Fei Chen ◽  
Seth Wiener

Purpose: Individuals with congenital amusia exhibit degraded speech perception. This study examined whether adult Chinese Mandarin listeners with amusia were still able to extract the statistical regularities of Mandarin speech sounds, despite their degraded speech perception. Method: Using the gating paradigm with monosyllabic syllable–tone words, we tested 19 Mandarin-speaking amusics and 19 musically intact controls. Listeners heard increasingly longer fragments of the acoustic signal across eight duration-blocked gates. The stimuli varied in syllable token frequency and syllable–tone co-occurrence probability. The correct syllable–tone word, correct syllable-only, correct tone-only, and correct syllable–incorrect tone responses were compared respectively between the two groups using mixed-effects models. Results: Amusics were less accurate than controls in terms of the correct word, correct syllable-only, and correct tone-only responses. Amusics, however, showed consistent patterns of top-down processing, as indicated by more accurate responses to high-frequency syllables, high-probability tones, and tone errors all in manners similar to those of the control listeners. Conclusions: Amusics are able to learn syllable and tone statistical regularities from the language input. This extends previous work by showing that amusics can track phonological segment and pitch cues despite their degraded speech perception. The observed speech deficits in amusics are therefore not due to an abnormal statistical learning mechanism. These results support rehabilitation programs aimed at improving amusics' sensitivity to pitch.
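The token-frequency and co-occurrence statistics that the stimuli manipulated can be illustrated with a toy count-based sketch. The corpus, counts, and helper function below are invented for illustration; they are not the study's Mandarin corpus materials.

```python
from collections import Counter

# Toy corpus of (syllable, tone) word tokens; the study's stimuli were
# built from real Mandarin corpus counts, which are not reproduced here.
corpus = [("ma", 1), ("ma", 3), ("ma", 3), ("ma", 3),
          ("shi", 4), ("shi", 4), ("yi", 1)]

# Syllable token frequency: how often each syllable occurs at all.
syllable_freq = Counter(syllable for syllable, _ in corpus)

# Joint counts of each syllable-tone pairing.
pair_freq = Counter(corpus)

def tone_given_syllable(syllable, tone):
    """Syllable-tone co-occurrence probability P(tone | syllable)."""
    return pair_freq[(syllable, tone)] / syllable_freq[syllable]
```

Tracking such conditional probabilities is the statistical-learning ability the study probes: listeners who have internalized them should, for example, guess a high-probability tone for a syllable fragment before the tone itself is fully audible.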


2020 ◽  
Author(s):  
Cris Lanting ◽  
Ad Snik ◽  
Joop Leijendeckers ◽  
Arjan Bosman ◽  
Ronald Pennings

The relation between speech recognition and hereditary hearing loss is not straightforward. Impaired cochlear processing of sound might be determined by underlying genetic defects. Data obtained in nine groups of patients, each with a specific type of genetic hearing loss, were evaluated. For each group, the affected cochlear structure, or site of lesion, was determined based on previously published animal studies. Retrospectively obtained speech recognition scores in noise were related to several aspects of supra-threshold cochlear processing, as assessed by psychophysical measurements. The differences in speech perception in noise between these patient groups could be explained by these psychophysical factors and, in part, by the hypothesized affected cochlear structure, suggesting that speech recognition in noise is associated with genetics-related malfunctioning of the cochlea.


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Ediz Sohoglu ◽  
Matthew H Davis

Human speech perception can be described as Bayesian perceptual inference, but how are these Bayesian computations instantiated neurally? We used magnetoencephalographic recordings of brain responses to degraded spoken words and experimentally manipulated signal quality and prior knowledge. We first demonstrate that spectrotemporal modulations in speech are more strongly represented in neural responses than alternative speech representations (e.g., spectrogram or articulatory features). Critically, we found an interaction between speech signal quality and expectations from prior written text on the quality of neural representations; increased signal quality enhanced neural representations of speech that mismatched with prior expectations, but led to greater suppression of speech that matched prior expectations. This interaction is a unique neural signature of prediction error computations and is apparent in neural responses within 100 ms of speech input. Our findings contribute to the detailed specification of a computational model of speech perception based on predictive coding frameworks.
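The reported interaction can be caricatured with a toy linear prediction-error sketch. This is emphatically not the authors' model: the one-hot word patterns, the prediction strength, and the two quality levels are arbitrary illustrative choices made here only to show why prediction error, unlike a pure sensory representation, moves in opposite directions for matching and mismatching priors as signal quality rises.

```python
import numpy as np

def prediction_error(sensory, prediction, signal_quality):
    """Toy linear sketch: sensory evidence is scaled by signal quality
    before the top-down prediction is subtracted."""
    return signal_quality * sensory - prediction

heard = np.array([1.0, 0.0, 0.0])              # pattern of the word actually heard
matching_prior = np.array([0.9, 0.0, 0.0])     # prior text predicted this word
mismatching_prior = np.array([0.0, 0.9, 0.0])  # prior text predicted another word

# Raising signal quality SHRINKS prediction error when the prior matches...
pe_match_low = np.abs(prediction_error(heard, matching_prior, 0.4)).sum()
pe_match_high = np.abs(prediction_error(heard, matching_prior, 0.8)).sum()

# ...but GROWS it when the prior mismatches: the interaction pattern.
pe_mismatch_low = np.abs(prediction_error(heard, mismatching_prior, 0.4)).sum()
pe_mismatch_high = np.abs(prediction_error(heard, mismatching_prior, 0.8)).sum()
```

A representation coding the sensory signal itself would grow with quality in both conditions, so the crossover is what makes the MEG interaction diagnostic of prediction error.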


2019 ◽  
Author(s):  
Matthew H. Davis ◽  
Ediz Sohoglu

Spoken language is one of the most important sounds that humans hear, yet it is also one of the most difficult for non-human listeners or machines to identify. In this chapter we explore different neuro-computational implementations of Bayesian Inference for speech perception. We propose, in line with Predictive Coding (PC) principles, that Bayesian Inference is based on neural computations of the difference between heard and expected speech segments (Prediction Error). We review three functions of these Prediction Error representations: (1) combining prior knowledge and degraded speech for optimal word identification; (2) supporting rapid learning processes so that perception remains optimal despite perceptual degradation or variation; (3) ensuring that listeners detect instances of lexical novelty (previously unfamiliar words) so as to learn new words over the life span. Evidence from MEG and multivariate fMRI studies suggests computations of Prediction Error in the Superior Temporal Gyrus (STG) during these three processes.
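The chapter's first function of Prediction Error, combining prior knowledge with degraded input for optimal word identification, reduces to Bayes' rule over candidate words. The toy lexicon, prior, and likelihood values below are invented for illustration and stand in for contextual expectations and degraded acoustic evidence respectively.

```python
import numpy as np

# Toy lexicon; the prior encodes expectations from previously read text.
words = ["clay", "play", "gray"]
prior = np.array([0.1, 0.8, 0.1])

# Degraded acoustics: the likelihood only weakly favors the first word.
likelihood = np.array([0.40, 0.35, 0.25])

# Bayes' rule: posterior is proportional to prior times likelihood.
posterior = prior * likelihood
posterior /= posterior.sum()

# On acoustics alone the best guess is "clay"; with the contextual
# prior folded in, the identified word flips to "play".
best_word = words[int(np.argmax(posterior))]
```

The more degraded the input (the flatter the likelihood), the more the posterior is dominated by the prior, which is why prior knowledge matters most precisely when speech is hardest to hear.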


2005 ◽  
Vol 382 (3) ◽  
pp. 254-258 ◽  
Author(s):  
Tetsuaki Kawase ◽  
Keiichiro Yamaguchi ◽  
Takenori Ogawa ◽  
Ken-ichi Suzuki ◽  
Maki Suzuki ◽  
...  


Author(s):  
Derek M. Houston ◽  
Chi-hsin Chen ◽  
Claire Monroy ◽  
Irina Castellanos

It is generally assumed that deaf and hard-of-hearing children’s difficulties in learning novel words stem entirely from impaired speech perception. Degraded speech perception makes words more confusable, and correctly recognizing words clearly plays an important role in word learning. However, recent findings suggest that early auditory experience may affect other factors involved in linking the sound patterns of words to their referents. This chapter reviews those findings and discusses possible factors that may be affected by early auditory experience and, in turn, also affect the ability to learn word-referent associations. These factors include forming representations for the sound patterns of words, encoding phonological information into memory, sensory integration, and quality of language input. Overall, we learn that in order to understand and to help mitigate the difficulties deaf and hard-of-hearing children face in learning spoken words after cochlear implantation, we must look well beyond speech perception.

