Estimation of the vocal tract spectrum from articulatory movements using phoneme-dependent neural networks

Author(s):  
Takuya Tsuji ◽  
Tokihiko Kaburagi ◽  
Kohei Wakamiya ◽  
Jiji Kim
Author(s):  
Boris A. Kleber ◽  
Jean Mary Zarate

To produce vocalizations, including speech and song, the control of all muscles along the vocal tract (e.g. for respiration, vocal fold motion, resonance changes, and articulation) requires the concerted effort of a vast network of brain regions. However, singers are usually unaware of the neural networks that govern and coordinate all of these muscle groups, or of what happens in these networks when auditory or somatosensory feedback notifies the singer of vocal errors, or when feedback is compromised, even temporarily. In this chapter, the authors attempt to define the basic neural networks involved in singing, discuss how these networks may change due to extensive vocal training and practice, and present recent findings that illustrate how the networks respond to alterations in auditory and kinesthetic feedback.


2014 ◽  
Vol 36 (1) ◽  
pp. 3-11 ◽  
Author(s):  
Betty McMicken ◽  
Margaret Vento-Wilson ◽  
Shelley Von Berg ◽  
Kelly Rogers

This research examined cineradiographic films (CRF) of articulatory movements in a person with congenital aglossia (PWCA) during speech production of four phrases. Pearson correlations and a multiple regression model were used to investigate co-variation between the independent variables (positions of the mandible and hyoid) and the pseudo-tongue dependent variables (positions of the mylohyoid and tongue base). Results suggest that backing/fronting of the mandible assisted the mylohyoid/tongue base in making mid-antero-posterior constrictions. Collinearity findings suggest that, for back sounds, the best predictor of tongue base movement was the mandible. Hyoid movement was highly correlated with mandibular movement horizontally, but the hyoid acted independently vertically and possibly with greater phonemic specialty in the PWCA. Findings suggest the hyoid was a strong determinant of vertical dependent-variable movement in all phrases. The extent of hyoid activity was a unique finding and one that may begin to explain relative intelligibility in this PWCA. Observed changes in vocal tract length may have influenced F2 transitional/vowel midpoint values.


2018 ◽  
Vol 37 (11) ◽  
pp. 5087-5100
Author(s):  
Vasantha Sama Sai ◽  
Suryakanth V. Gangashetty ◽  
Ashraf Alkhairy ◽  
Afshan Jafri

1978 ◽  
Vol 43 (3) ◽  
pp. 353-373 ◽  
Author(s):  
Raymond Kent ◽  
Ronald Netsell

This report presents cinefluorographic data on the articulation of isolated vowels, VCV nonsense utterances, and short sentences by five subjects with athetoid cerebral palsy. Articulatory abnormalities were identified from tracings of vocal tract shapes and from displacement-by-time plots of articulatory events. The most frequent abnormalities were large ranges of jaw movement, inappropriate positioning of the tongue for various phonetic segments (especially because of a reduced range of tongue movement in the anteroposterior dimension), intermittency of velopharyngeal closure caused by an instability of velar elevation, prolonged transition times for articulatory movements, and retrusion of the lower lip. The speech disorder associated with athetosis is considered with respect to a model of motor learning.


Author(s):  
Katarzyna Pisanski ◽  
Andrey Anikin ◽  
David Reby

Vocal tract elongation, which uniformly lowers vocal tract resonances (formant frequencies) in animal vocalizations, has evolved independently in several vertebrate groups as a means for vocalizers to exaggerate their apparent body size. Here, we propose that smaller speech-like articulatory movements that alter only individual formants can serve a similar yet less energetically costly size-exaggerating function. To test this, we examine whether uneven formant spacing alters the perceived body size of vocalizers in synthesized human vowels and animal calls. Among six synthetic vowel patterns, those characterized by the lowest first and second formant (the vowel /u/ as in ‘boot’) are consistently perceived as produced by the largest vocalizer. Crucially, lowering only one or two formants in animal-like calls also conveys the impression of a larger body size, and lowering the second and third formants simultaneously exaggerates perceived size to a similar extent as rescaling all formants. As the articulatory movements required for individual formant shifts are minor compared to full vocal tract extension, they represent a rapid and energetically efficient mechanism for acoustic size exaggeration. We suggest that, by favouring the evolution of uneven formant patterns in vocal communication, this deceptive strategy may have contributed to the origins of the phonemic diversification required for articulated speech. This article is part of the theme issue ‘Voice modulation: from origin and mechanism to social impact (Part II)’.
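As a rough illustration of why elongation lowers all formants uniformly, the sketch below uses the textbook idealization of the vocal tract as a uniform tube closed at one end, whose resonances are F_n = (2n - 1) c / (4 L). This model, the speed of sound (350 m/s), and the tube lengths are assumptions made for illustration only; none of them come from the abstract above.

```python
def tube_formants(length_m, n_formants=3, speed_of_sound=350.0):
    """Resonances (Hz) of an idealized uniform tube closed at one end.

    Assumption: quarter-wavelength resonator, F_n = (2n - 1) * c / (4 * L).
    Real vocal tracts are not uniform tubes, so these are only rough values.
    """
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, n_formants + 1)]


# Hypothetical lengths: a 17 cm tract and the same tract elongated to 20 cm.
baseline = tube_formants(0.17)
elongated = tube_formants(0.20)

# Every formant is scaled by the same factor, L_baseline / L_elongated.
ratios = [e / b for e, b in zip(elongated, baseline)]
```

In this idealized model every entry of `ratios` equals 0.17 / 0.20, so the whole formant pattern shifts down together. Shifting a single formant, as the abstract describes, requires changing the tube's shape rather than its length, which this sketch does not model.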


2008 ◽  
Vol 100 (3) ◽  
pp. 1171-1183 ◽  
Author(s):  
Pascal Perrier ◽  
Susanne Fuchs

Relations between tangential velocity and trajectory curvature are analyzed for tongue movements during speech production in the framework of the 1/3 power law, discovered by Viviani and colleagues for arm movements. In 2004, Tasko and Westbury found for American English that the power function provides a good account of speech kinematics, but with an exponent that varies across articulators. The present work aims at broadening Tasko and Westbury's study 1) by analyzing speed–curvature relations for various languages (French, German, Mandarin) and for a biomechanical tongue model simulating speech gestures at various speaking rates and 2) by providing for each speaker or each simulated speaking rate a comparison of results found for the complete set of movements with those found for each movement separately. It is found that the 1/3 power law offers a fair description of the global speed–curvature relations for all speakers and all languages, when articulatory speech data are considered as a whole. This is also observed in the simulations, where the motor control model does not specify any kinematic property of the articulatory paths. However, the refined analysis for individual movements reveals numerous exceptions to this law: the velocity always decreases when curvature increases, but the slope in the log–log representation is variable. It is concluded that the speed–curvature relation is not controlled in speech movements and that it accounts only for general properties of the articulatory movements, which could arise from vocal tract dynamics and/or from stochastic characteristics of the measured signals.
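The log–log slope mentioned in the abstract can be estimated with a simple linear fit. The sketch below is not the authors' procedure: it assumes the law is written as v = k · κ^β with β near -1/3, uses finite differences for the derivatives, and tests the fit on a synthetic ellipse traced with harmonic motion (a trajectory known to satisfy the law exactly). The function name and all parameter values are hypothetical.

```python
import numpy as np


def speed_curvature_exponent(x, y, dt):
    """Estimate beta in v = k * curvature**beta from a sampled 2-D path."""
    # First and second time derivatives by finite differences.
    dx, dy = np.gradient(x, dt), np.gradient(y, dt)
    ddx, ddy = np.gradient(dx, dt), np.gradient(dy, dt)

    speed = np.hypot(dx, dy)
    # Guard against division by zero where the path is momentarily at rest.
    safe_speed = np.where(speed > 0.0, speed, np.nan)
    curvature = np.abs(dx * ddy - dy * ddx) / safe_speed**3

    # Drop edge samples, where one-sided differences are less accurate,
    # and any samples with undefined or vanishing speed or curvature.
    speed, curvature = speed[2:-2], curvature[2:-2]
    keep = np.isfinite(curvature) & (speed > 1e-9) & (curvature > 1e-9)

    # Slope of log(speed) against log(curvature) is the exponent beta.
    slope, _intercept = np.polyfit(np.log(curvature[keep]),
                                   np.log(speed[keep]), 1)
    return slope


# Synthetic test path (an assumption, not speech data): an ellipse
# traced with harmonic motion.
t = np.linspace(0.0, 2.0 * np.pi, 2000)
x, y = 2.0 * np.cos(t), 1.0 * np.sin(t)
beta = speed_curvature_exponent(x, y, t[1] - t[0])
```

On this synthetic path `beta` comes out close to -1/3. Applying the same function separately to each movement of a real recording is one way to see the variability in slope that the abstract reports, although measurement noise in real data would normally call for smoothing before differentiation.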


Phonology ◽  
1994 ◽  
Vol 11 (2) ◽  
pp. 277-316 ◽  
Author(s):  
April McMahon ◽  
Paul Foulkes ◽  
Laura Tollfree

Recent work on Articulatory Phonology (Browman & Goldstein 1986, 1989, 1991, 1992a, b) raises a number of questions, specifically involving the phonetics–phonology ‘interface’. One advantage of using Articulatory Phonology (henceforth ArtP), with its basic units of abstract gestures based on articulatory movements, is its ability to link phenomena previously seen as phonological to those which are conventionally described as allophonic, or even lower-level phonetic effects, since ‘gestures are... useful primitives for characterising phonological patterns as well as for analysing the activity of the vocal tract articulators’ (Browman & Goldstein 1991: 313). If both phonetics and phonology could ultimately be cast entirely in gestural terms, the phonetics–phonology interface might effectively cease to exist, at least in terms of units of analysis.

