How physical models of the human vocal tract contribute to the field of speech communication

Speech is a perceptuo-motor system. A natural computational modeling framework is provided by cognitive robotics, or more precisely speech robotics, which is also based on embodiment, multimodality, development, and interaction. This paper describes the bases of a virtual baby robot which consists in an articulatory model that integrates the non-uniform growth of the vocal tract, a set of sensors, and a learning model. The articulatory model delivers sagittal contour, lip shape and acoustic formants from seven input parameters that characterize the configurations of the jaw, the tongue, the lips and the larynx. To simulate the growth of the vocal tract from birth to adulthood, a process modifies the longitudinal dimension of the vocal tract shape as a function of age. The auditory system of the robot comprises a “phasic” system for event detection over time, and a “tonic” system to track formants. The model of visual perception specifies the basic lips characteristics: height, width, area and protrusion. The orosensorial channel, which provides the tactile sensation on the lips, the tongue and the palate, is elaborated as a model for the prediction of tongue-palatal contacts from articulatory commands. Learning involves Bayesian programming, in which there are two phases: (i) specification of the variables, decomposition of the joint distribution and identification of the free parameters through exploration of a learning set, and (ii) utilization which relies on questions about the joint distribution. Two studies were performed with this system. Each of them focused on one of the two basic mechanisms, which ought to be at work in the initial periods of speech acquisition, namely vocal exploration and vocal imitation. The first study attempted to assess infants’ motor skills before and at the beginning of canonical babbling. It used the model to infer the acoustic regions, the articulatory degrees of freedom and the vocal tract shapes that are the likeliest explored by actual infants according to their vocalizations. Subsequently, the aim was to simulate data reported in the literature on early vocal imitation, in order to test whether and how the robot was able to reproduce them and to gain some insights into the actual cognitive representations that might be involved in this behavior. Speech modeling in a robotics framework should contribute to a computational approach of sensori-motor interactions in speech communication, which seems crucial for future progress in the study of speech and language ontogeny and phylogeny.

Download Full-text

An acoustic glottal source for vocal tract physical models

Measurement Science and Technology ◽

10.1088/1361-6501/aa85a6 ◽

2017 ◽

Vol 28 (11) ◽

pp. 115902 ◽

Cited By ~ 2

Author(s):

Antti Hannukainen ◽

Juha Kuortti ◽

Jarmo Malinen ◽

Antti Ojalammi

Keyword(s):

Vocal Tract ◽

Physical Models ◽

Glottal Source

Download Full-text

A Study on the Principle of Sound Production in Human Whistling using Physical Models of the Human Vocal Tract

IEEJ Transactions on Fundamentals and Materials ◽

10.1541/ieejfms.132.206 ◽

2012 ◽

Vol 132 (2) ◽

pp. 206-207 ◽

Cited By ~ 2

Author(s):

Mikio Mori ◽

Yoko Satomi ◽

Mitsuhiro Ogihara

Keyword(s):

Sound Production ◽

Vocal Tract ◽

Physical Models

Download Full-text

Embouchure of Human Whistling using Physical Models of Vocal Tract

IEEJ Transactions on Fundamentals and Materials ◽

10.1541/ieejfms.139.242 ◽

2019 ◽

Vol 139 (4) ◽

pp. 242-243

Author(s):

Mikio Mori ◽

Saki Fukuda ◽

Kotaro Ishiguro ◽

Saran Sukplang

Keyword(s):

Vocal Tract ◽

Physical Models

Download Full-text

Physical models of the human vocal tract as tools for education in acoustics

The Journal of the Acoustical Society of America ◽

10.1121/1.4779488 ◽

2002 ◽

Vol 112 (5) ◽

pp. 2345-2345 ◽

Cited By ~ 1

Author(s):

Takayuki Arai ◽

Eri Maeda ◽

Noriko Saika ◽

Yuji Murahara

Keyword(s):

Vocal Tract ◽

Physical Models

Download Full-text

Five Decades of Research in Speech Motor Control: What Have We Learned, and Where Should We Go From Here?

Journal of Speech Language and Hearing Research ◽

10.1044/1092-4388(2013/12-0382) ◽

2013 ◽

Vol 56 (6) ◽

pp. 1857-1874 ◽

Cited By ~ 7

Author(s):

Joseph S. Perkell

Keyword(s):

Motor Control ◽

Neural Control ◽

Vocal Tract ◽

Brain Activity ◽

Computer Hardware ◽

Speech Motor Control ◽

Control Group ◽

Speech Communication ◽

Control Mechanisms ◽

Speech Motor

Purpose The author presents a view of research in speech motor control over the past 5 decades, as observed from within Ken Stevens's Speech Communication Group (SCG) in the Research Laboratory of Electronics at MIT. Method The author presents a limited overview of some important developments and discoveries. The perspective is based largely on the research interests of the Speech Motor Control Group (SMCG) within the SCG; thus, it is selective, focusing on normal motor control of the vocal tract in the production of sound segments and syllables. It also covers the particular theories and models that drove the research. Following a brief introduction, there are sections on methodological advances, scientific advances, and conclusions. Results Scientific and methodological advances have been closely interrelated. Advances in instrumentation and computer hardware and software have made it possible to record and process increasingly large, multifaceted data sets; introduce new paradigms for feedback perturbation; image brain activity; and develop more sophisticated, computational physiological and neural models. Such approaches have led to increased understanding of the widespread variability in speech, motor-equivalent trading relations, sensory goals, and the nature of feedback and feedforward neural control mechanisms. Conclusions Some ideas about important future directions for speech research are presented.

Download Full-text