Importance of temporal-envelope cues in consonant recognition

René van der Horst; A. Rens Leeuw; Wouter A. Dreschler

doi:10.1121/1.426718

Consonant recognition in noise with temporal cues. III. Effects of temporal envelope enhancement on identification thresholds

The Journal of the Acoustical Society of America ◽

10.1121/1.4744762 ◽

2001 ◽

Vol 109 (5) ◽

pp. 2468-2468

Author(s):

Frederic Apoux ◽

Stephane Garnier ◽

Christian Lorenzi

Keyword(s):

Temporal Envelope ◽

Temporal Cues ◽

Consonant Recognition

Relative Weights of Temporal Envelope Cues in Different Frequency Regions for Mandarin Vowel, Consonant, and Lexical Tone Recognition

Frontiers in Neuroscience ◽

10.3389/fnins.2021.744959 ◽

2021 ◽

Vol 15 ◽

Author(s):

Zhong Zheng ◽

Keyi Li ◽

Gang Feng ◽

Yang Guo ◽

Yinan Li ◽

...

Keyword(s):

Speech Coding ◽

Frequency Region ◽

Lexical Tone ◽

Temporal Envelope ◽

Percent Correct ◽

Coding Schemes ◽

Consonant Recognition ◽

Tone Recognition ◽

Vowel Recognition ◽

Lexical Tone Recognition

Objectives: Mandarin-speaking users of cochlear implants (CI) perform more poorly than their English-speaking counterparts. This may be because present CI speech coding schemes are largely based on English. This study aims to evaluate the relative contributions of temporal envelope (E) cues to Mandarin phoneme (vowel and consonant) and lexical tone recognition, in order to inform speech coding schemes specific to Mandarin.

Design: Eleven normal-hearing subjects were studied using acoustic temporal E cues that were extracted from 30 continuous frequency bands between 80 and 7,562 Hz using the Hilbert transform and divided into five frequency regions. Percent-correct recognition scores were obtained with acoustic E cues presented in three, four, and five frequency regions, and their relative weights were calculated using the least-squares approach.

Results: For stimuli with three, four, and five frequency regions, percent-correct scores for vowel recognition using E cues were 50.43–84.82%, 76.27–95.24%, and 96.58%, respectively; for consonant recognition, 35.49–63.77%, 67.75–78.87%, and 87.87%; and for lexical tone recognition, 60.80–97.15%, 73.16–96.87%, and 96.73%. For frequency regions 1 to 5, the mean weights in vowel recognition were 0.17, 0.31, 0.22, 0.18, and 0.12, respectively; in consonant recognition, 0.10, 0.16, 0.18, 0.23, and 0.33; and in lexical tone recognition, 0.38, 0.18, 0.14, 0.16, and 0.14.

Conclusion: The region that contributed most to vowel recognition was Region 2 (502–1,022 Hz), which contains first formant (F1) information; Region 5 (3,856–7,562 Hz) contributed most to consonant recognition; and Region 1 (80–502 Hz), which contains fundamental frequency (F0) information, contributed most to lexical tone recognition.
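The abstract above mentions extracting temporal envelope cues with the Hilbert transform. The following is a minimal sketch of that general technique for a single band, not the authors' processing chain: it builds the analytic signal with a naive DFT from the standard library and takes its magnitude. The function names (`dft`, `hilbert_envelope`) and the toy amplitude-modulated test tone are assumptions made for illustration; the band-pass filtering into 30 bands is not shown.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform; adequate for short demo signals."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t, v in enumerate(values):
            # Reduce k*t modulo n to keep the phase argument small and accurate.
            acc += v * cmath.exp(sign * 2j * math.pi * ((k * t) % n) / n)
        out.append(acc / n if inverse else acc)
    return out


def hilbert_envelope(signal):
    """Return the magnitude of the analytic signal (the Hilbert envelope)."""
    n = len(signal)
    spectrum = dft(signal)
    # Keep DC (and Nyquist when n is even), double positive frequencies,
    # and zero the negative frequencies.
    gain = [0.0] * n
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        positive = range(1, n // 2)
    else:
        positive = range(1, (n + 1) // 2)
    for k in positive:
        gain[k] = 2.0
    analytic = dft([s * g for s, g in zip(spectrum, gain)], inverse=True)
    return [abs(z) for z in analytic]


# Hypothetical test tone: a carrier at DFT bin 32, modulated at bin 4 with depth 0.5.
n = 256
target = [1.0 + 0.5 * math.cos(2 * math.pi * 4 * i / n) for i in range(n)]
signal = [target[i] * math.cos(2 * math.pi * 32 * i / n) for i in range(n)]
envelope = hilbert_envelope(signal)
```

For this toy tone the recovered `envelope` should follow the slow modulator in `target` while discarding the carrier. For real recordings, a library routine such as `scipy.signal.hilbert` would normally replace the naive DFT used here.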

Consonant recognition in noise with temporal cues: I. Effects of temporal envelope enhancement on identification performance

The Journal of the Acoustical Society of America ◽

10.1121/1.428848 ◽

2000 ◽

Vol 107 (5) ◽

pp. 2913-2913

Author(s):

Frederic Apoux ◽

Christian Lorenzi ◽

Frederic Berthommier

Keyword(s):

Identification Performance ◽

Temporal Envelope ◽

Temporal Cues ◽

Consonant Recognition

Consonant recognition in noise with temporal cues: II. Effects of temporal envelope enhancement on response times

The Journal of the Acoustical Society of America ◽

10.1121/1.428849 ◽

2000 ◽

Vol 107 (5) ◽

pp. 2913-2913

Author(s):

Frederic Apoux ◽

Olivier Crouzet ◽

Christian Lorenzi

Keyword(s):

Response Times ◽

Temporal Envelope ◽

Temporal Cues ◽

Consonant Recognition

Effect of Consonant-Vowel Ratio Modification on Amplitude Envelope Cues for Consonant Recognition

Journal of Speech Language and Hearing Research ◽

10.1044/jshr.3402.415 ◽

1991 ◽

Vol 34 (2) ◽

pp. 415-426 ◽

Cited By ~ 16

Author(s):

Richard L. Freyman ◽

G. Patrick Nerbonne ◽

Heather A. Cote

Keyword(s):

White Noise ◽

Intensity Ratio ◽

Recognition Performance ◽

Amplitude Envelope ◽

Signal To Noise ◽

Noise Masking ◽

Spectral Cues ◽

Consonant Recognition ◽

Nonlinear Amplification ◽

Speech Envelope

This investigation examined the degree to which modification of the consonant-vowel (C-V) intensity ratio affected consonant recognition under conditions in which listeners were forced to rely more heavily on waveform envelope cues than on spectral cues. The stimuli were 22 vowel-consonant-vowel utterances, which had been mixed at six different signal-to-noise ratios with white noise that had been modulated by the speech waveform envelope. The resulting waveforms preserved the gross speech envelope shape, but spectral cues were limited by the white-noise masking. In a second stimulus set, the consonant portion of each utterance was amplified by 10 dB. Sixteen subjects with normal hearing listened to the unmodified stimuli, and 16 listened to the amplified-consonant stimuli. Recognition performance was reduced in the amplified-consonant condition for some consonants, presumably because waveform envelope cues had been distorted. However, for other consonants, especially the voiced stops, consonant amplification improved recognition. Patterns of errors were altered for several consonant groups, including some that showed only small changes in recognition scores. The results indicate that when spectral cues are compromised, nonlinear amplification can alter waveform envelope cues for consonant recognition.
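To make the 10 dB consonant amplification in the abstract above concrete, here is a small sketch of the decibel arithmetic, assuming amplitude (not power) scaling of the samples. The helper names, the toy waveform `vcv`, and the segment boundaries (samples 100 to 199) are hypothetical and are not taken from the study.

```python
import math


def db_to_amplitude(gain_db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)


def amplify_segment(samples, start, stop, gain_db):
    """Scale samples[start:stop] by gain_db; leave all other samples unchanged."""
    factor = db_to_amplitude(gain_db)
    return [s * factor if start <= i < stop else s for i, s in enumerate(samples)]


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# Hypothetical stand-in for a vowel-consonant-vowel waveform.
vcv = [math.sin(2 * math.pi * 5 * i / 300) for i in range(300)]
# Assume, for illustration only, that the consonant occupies samples 100..199.
boosted = amplify_segment(vcv, 100, 200, 10.0)
boost_db = 20.0 * math.log10(rms(boosted[100:200]) / rms(vcv[100:200]))
```

A 10 dB gain corresponds to multiplying the amplitude by roughly 3.16, so the consonant-vowel intensity ratio rises by 10 dB while the surrounding vowels keep their original level.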

Temporal envelope perception in Cochlear Implant users

Klinische Neurophysiologie ◽

10.1055/s-0032-1301645 ◽

2012 ◽

Vol 43 (01) ◽

Author(s):

L Timm ◽

D Agrawal ◽

M Wittfoth ◽

R Dengler

Keyword(s):

Cochlear Implant ◽

Temporal Envelope

Identification of temporal envelope cues in Chinese tone recognition

Asia Pacific Journal of Speech Language and Hearing ◽

10.1179/136132800807547582 ◽

2000 ◽

Vol 5 (1) ◽

pp. 45-57 ◽

Cited By ~ 92

Author(s):

Qian-Jie Fu ◽

Fan-Gang Zeng

Keyword(s):

Temporal Envelope ◽

Tone Recognition

Representation of Spectral and Temporal Envelope of Twitter Vocalizations in Common Marmoset Primary Auditory Cortex

Journal of Neurophysiology ◽

10.1152/jn.00632.2001 ◽

2002 ◽

Vol 87 (4) ◽

pp. 1723-1737 ◽

Cited By ~ 71

Author(s):

Srikantan S. Nagarajan ◽

Steven W. Cheung ◽

Purvis Bedenbaugh ◽

Ralph E. Beitel ◽

Christoph E. Schreiner ◽

...

Keyword(s):

Cortical Neurons ◽

Social Contact ◽

Primary Auditory Cortex ◽

Common Marmoset ◽

Spectral Envelope ◽

Response Patterns ◽

Temporal Envelope ◽

Degraded Speech ◽

Neurophysiological Data ◽

Narrowband Channels

Cortical sensitivity in representations of behaviorally relevant complex input signals was examined in recordings from primary auditory cortical neurons (AI) in adult, barbiturate-anesthetized common marmoset monkeys (Callithrix jacchus). We studied the robustness of distributed responses to natural and degraded forms of twitter calls, social contact vocalizations comprising several quasi-periodic phrases of frequency and AM. We recorded neuronal responses to a monkey's own twitter call (MOC), degraded forms of their twitter call, and sinusoidal amplitude modulated (SAM) tones with modulation rates similar to those of twitter calls. In spectral envelope degradation, calls with narrowband channels of varying bandwidths had the same temporal envelope as a natural call. However, the carrier phase was randomized within each narrowband channel. In temporal envelope degradation, the temporal envelope within narrowband channels was filtered while the carrier frequencies and phases remained unchanged. In a third form of degradation, noise was added to the natural calls. Spatiotemporal discharge patterns in AI both within and across frequency bands encoded spectrotemporal acoustic features in the call although the encoded response is an abstract version of the call. The average temporal response pattern in AI, however, was significantly correlated with the average temporal envelope for each phrase of a call. Response entrainment to MOC was significantly correlated with entrainment to SAM stimuli at comparable modulation frequencies. Sensitivity of the response patterns to MOC was substantially greater for temporal envelope than for spectral envelope degradations. The distributed responses in AI were robust to additive continuous noise at signal-to-noise ratios ≥10 dB. Neurophysiological data reflecting response sensitivity in AI to these forms of degradation closely parallel human psychophysical results on the intelligibility of degraded speech in quiet and noisy conditions.

Representations of the temporal envelope of sounds in human auditory cortex: Can the results from invasive intracortical “depth” electrode recordings be replicated using non-invasive MEG “virtual electrodes”?

NeuroImage ◽

10.1016/j.neuroimage.2012.09.017 ◽

2013 ◽

Vol 64 ◽

pp. 185-196 ◽

Cited By ~ 15

Author(s):

Rebecca E. Millman ◽

Garreth Prendergast ◽

Mark Hymers ◽

Gary G.R. Green

Keyword(s):

Auditory Cortex ◽

Temporal Envelope ◽

Non Invasive ◽

Depth Electrode ◽

Human Auditory Cortex ◽

Virtual Electrodes

Introduction to the Consonant‐Recognition Test

The Journal of the Acoustical Society of America ◽

10.1121/1.1974704 ◽

1970 ◽

Vol 47 (1A) ◽

pp. 74-74

Author(s):

John W. Preusse

Keyword(s):

Recognition Test ◽

Consonant Recognition
