Auditory-Motor Formant Tracking: A Study of Speech Imitation

1974 ◽  
Vol 17 (2) ◽  
pp. 203-222 ◽  
Author(s):  
R. D. Kent

Auditory-motor formant tracking, or the vocal reproduction of formant patterns, is one aspect of speech imitation skill. The study reported here assessed the ability of four adult speakers to imitate synthesized vocalic stimuli. These stimuli took the form of two steady-state segments joined by a transitional segment. The first steady-state segment corresponded to one of eight American English vowels, and the second, to one of 14 vowels that were not expected to have a prominent phonemic identity in the language. Spectrographic analyses of the imitative responses allowed comparisons of the formant structure for the synthesized stimuli and the corresponding human reproductions. Analyses of the spectrograms revealed that the directions of movement for the first two formants were almost always reproduced accurately, but the extent of movement frequently was overshot. These responses were judged to be consistent with a contrast effect in speech perception, a phenomenon previously discovered in experiments on vowel identification. The variability of formant reproduction for a given vowel was predicted at least roughly by the ambiguity of the stimulus in a preliminary identification experiment. These results suggest that the responses in an imitation task are intermediate in dimensionality to the responses in discrimination and identification tasks.

2005 ◽  
Vol 26 (2) ◽  
pp. 227-247 ◽  
Author(s):  
STEPHEN G. LAMBACHER ◽  
WILLIAM L. MARTENS ◽  
KAZUHIKO KAKEHI ◽  
CHANDRAJITH A. MARASINGHE ◽  
GARRY MOLHOLT

The effectiveness of a high variability identification training procedure to improve native Japanese identification and production of the American English (AE) mid and low vowels /æ/, //, //, //, // was investigated. Vowel identification and production performance for two groups of Japanese participants was measured before and after a 6-week identification training period. Recordings were made of both groups' pre-/posttraining productions of the five vowels, which were evaluated by a group of native AE listeners using a five-alternative, forced-choice identification task and by an acoustic analysis of the vowel productions. The overall results confirmed that the identification performance of the experimental (trained) participants improved after identification training with feedback and that the training also had a positive effect on their production of the target AE vowels.


2013 ◽  
Vol 43 (1) ◽  
pp. 23-35 ◽  
Author(s):  
Austin L. Oder ◽  
Cynthia G. Clopper ◽  
Sarah Hargus Ferguson

A great deal of recent research has focused on phonetic variation among American English vowels from different dialects. This body of research continues to grow as vowels continuously undergo diachronic formant changes that become characteristic of certain dialects. Two experiments using the Nationwide Speech Project corpus (Clopper & Pisoni 2006a) explored whether the Midland dialect is more closely related acoustically and perceptually to the Mid-Atlantic or to the Southern dialect. The goal of this study was to further our understanding of acoustic and perceptual differences between two of the most marked dialects (Mid-Atlantic and Southern) and one of the least marked dialects (Midland) of American English. Ten vowels in /hVd/ context produced by one male talker from each of these three dialects were acoustically analyzed and presented to Midland listeners for identification. The listeners showed the greatest vowel identification accuracy for the Mid-Atlantic talker (95.2%), followed by the Midland talker (92.5%), and finally the Southern talker (79.7%). Vowel error patterns were consistent with vowel acoustic differences between the talkers. The results suggest that, acoustically and perceptually, the Midland and Mid-Atlantic dialects are more similar than are the Midland and Southern dialects.


2019 ◽  
Vol 62 (12) ◽  
pp. 4534-4543 ◽  
Author(s):  
Wei Hu ◽  
Sha Tao ◽  
Mingshuang Li ◽  
Chang Liu

Purpose: The purpose of this study was to investigate how the distinctive establishment of 2nd language (L2) vowel categories (e.g., how distinctively an L2 vowel is established from nearby L2 vowels and from the native language counterpart in the 1st formant [F1] × 2nd formant [F2] vowel space) affected L2 vowel perception. Method: Identification of 12 natural English monophthongs, and categorization and rating of synthetic English vowels /i/ and /ɪ/ in the F1 × F2 space were measured for Chinese-native (CN) and English-native (EN) listeners. CN listeners were also examined with categorization and rating of Chinese vowels in the F1 × F2 space. Results: As expected, EN listeners significantly outperformed CN listeners in English vowel identification. Whereas EN listeners showed distinctive establishment of 2 English vowels, CN listeners had multiple patterns of L2 vowel establishment: both, 1, or neither established. Moreover, CN listeners' English vowel perception was significantly related to the perceptual distance between the English vowel and its Chinese counterpart, and the perceptual distance between the adjacent English vowels. Conclusions: L2 vowel perception relied on listeners' capacity to distinctively establish L2 vowel categories that were distant from the nearby L2 vowels.


1992 ◽  
Vol 35 (1) ◽  
pp. 192-200 ◽  
Author(s):  
Michele L. Steffens ◽  
Rebecca E. Eilers ◽  
Karen Gross-Glenn ◽  
Bonnie Jallad

Speech perception was investigated in a carefully selected group of adult subjects with familial dyslexia. Perception of three synthetic speech continua was studied: /a/-//, in which steady-state spectral cues distinguished the vowel stimuli; /ba/-/da/, in which rapidly changing spectral cues were varied; and /sta/-/sa/, in which a temporal cue, silence duration, was systematically varied. These three continua, which differed with respect to the nature of the acoustic cues discriminating between pairs, were used to assess subjects’ abilities to use steady state, dynamic, and temporal cues. Dyslexic and normal readers participated in one identification and two discrimination tasks for each continuum. Results suggest that dyslexic readers required greater silence duration than normal readers to shift their perception from /sa/ to /sta/. In addition, although the dyslexic subjects were able to label and discriminate the synthetic speech continua, they did not necessarily use the acoustic cues in the same manner as normal readers, and their overall performance was generally less accurate.


1995 ◽  
Vol 97 (5) ◽  
pp. 3099-3111 ◽  
Author(s):  
James Hillenbrand ◽  
Laura A. Getty ◽  
Michael J. Clark ◽  
Kimberlee Wheeler

Linguistica ◽  
2017 ◽  
Vol 57 (1) ◽  
pp. 59-72 ◽  
Author(s):  
Biljana Čubrović

This study discusses vowel quality in English as produced by native speakers of General American English (AE) and by non-native speakers of Serbian language background, all residents of the United States. Ten Serbian male speakers and four native male speakers of AE were recorded in separate experiments, and their speech was analyzed acoustically for significant phonetic differences in a set of monosyllabic English words representing nine vowels of AE. The general aim of the experiments is to evaluate the phonetic characteristics of AE vowels, with particular attention to F1 and F2 values, to investigate which vowels differ most between the two groups of participants, and to provide some explanations for these variations. The single most important observation from this vowel study is an evident merger of three pairs of vowels in the non-native speech: /i ɪ/, /u ʊ/, and /ɛ æ/.


2016 ◽  
Vol 1 (1) ◽  
Author(s):  
Robert Daland ◽  
Mira Oh

Loanword corpora have been an important tool in studying the relationship between speech perception and native-language phonotactics. Recent work has challenged this use of loanword corpora on methodological grounds, based on the fact that source and possibly loan orthography conditions the adaptation. The present study replicates and extends this finding by using information theory to quantify the relative strength of orthographic effects, in the adaptation of English vowels into Korean. It is found that the orthographic effect is strong for unstressed vowels, but almost unnoticeable for stressed vowels. It is proposed that orthography plays a large role in adaptation only when the source form is perceptually compatible with multiple phonological parses in the borrowing language.


2012 ◽  
Vol 56 (2) ◽  
pp. 145-161 ◽  
Author(s):  
Valeriy Shafiro ◽  
Erika S. Levy ◽  
Reem Khamis-Dakwar ◽  
Anatoliy Kharkhurin
