A new method for measuring the vocal tract area function: the case of vowels

H Teffahi; B Guerin; A Djeradi

doi:10.1139/p05-026

Canadian Journal of Physics ◽

10.1139/p05-026 ◽

2005 ◽

Vol 83 (7) ◽

pp. 721-737

Author(s):

H Teffahi ◽

B Guerin ◽

A Djeradi

Keyword(s):

Measurement Method ◽

Cross Correlation ◽

Sound Production ◽

Linear Prediction ◽

Vocal Tract ◽

Random Sequence ◽

Speech Sound ◽

Acoustic Properties ◽

External Excitation ◽

White Noise Excitation

Knowledge of vocal tract area functions is important for understanding the phenomena that occur during speech production. We present here a new measurement method based on external excitation of the vocal tract with a known pseudo-random sequence; the area function is obtained by applying linear prediction analysis to the cross-correlation between the sequence and the signal measured at the lips. The advantages of this method over methods based on sweep tones or white-noise excitation are (1) a much shorter measurement time (about 100 ms) and (2) the possibility of speech sound production during the measurement. The method has been checked against classical methods through systematic comparisons on a small corpus of vowels. Moreover, it has been verified that simultaneous speech sound production does not significantly perturb the measurements. The method should thus be a very helpful tool for investigating the acoustic properties of the vocal tract for vowels in a variety of cases.
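The measurement chain described in this abstract (known pseudo-random excitation, cross-correlation, linear prediction) can be illustrated with a minimal sketch. It is not the authors' implementation: the vocal tract is replaced by a hypothetical two-section all-pole filter, the excitation is a seeded random ±1 sequence standing in for the pseudo-random sequence, and the mapping from reflection coefficients to area ratios uses one common sign convention, which is an assumption here. All function names are illustrative.

```python
import random

def step_up(refl):
    """Reflection coefficients -> predictor polynomial [1, a1, ..., ap]."""
    a = [1.0]
    for k in refl:
        prev = a + [0.0]
        a = [prev[j] + k * prev[len(prev) - 1 - j] for j in range(len(prev))]
    return a

def all_pole_filter(a, x):
    """y[n] = x[n] - sum_k a[k] * y[n-k]; stands in for the vocal tract."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y

def cross_correlation(x, y, n_lags):
    """For a white, unit-variance x, r_xy[lag] estimates the impulse response."""
    n = len(x)
    return [sum(x[i] * y[i + lag] for i in range(n - lag)) / n
            for lag in range(n_lags)]

def autocorrelation(h, order):
    return [sum(h[i] * h[i + lag] for i in range(len(h) - lag))
            for lag in range(order + 1)]

def levinson_durbin(r, order):
    """Linear prediction via Levinson-Durbin; returns reflection coefficients."""
    a, err, refl = [1.0], r[0], []
    for i in range(1, order + 1):
        acc = sum(a[j] * r[i - j] for j in range(i))
        k = -acc / err
        prev = a + [0.0]
        a = [prev[j] + k * prev[i - j] for j in range(i + 1)]
        refl.append(k)
        err *= 1.0 - k * k
    return refl

def area_function(refl, first_area=1.0):
    """Relative section areas; the (1 - k) / (1 + k) convention is assumed."""
    areas = [first_area]
    for k in refl:
        areas.append(areas[-1] * (1.0 - k) / (1.0 + k))
    return areas

rng = random.Random(0)                      # seeded stand-in for the known sequence
excitation = [rng.choice((-1.0, 1.0)) for _ in range(20000)]
true_refl = [-0.4, 0.5]                     # hypothetical tract, for illustration only
response = all_pole_filter(step_up(true_refl), excitation)
h_est = cross_correlation(excitation, response, 32)
est_refl = levinson_durbin(autocorrelation(h_est, 2), 2)
areas = area_function(est_refl)
```

In this toy setting the estimated reflection coefficients should land close to `true_refl`, because cross-correlating with a white excitation recovers the impulse response up to estimation noise. A real measurement would also have to deal with lip radiation, losses, and the transducer response, none of which are modeled here.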

Vocal tract configuration for breathing and speech sound production

Global Imaging Insights ◽

10.15761/gii.1000142 ◽

2017 ◽

Author(s):

E. Fiona Bailey

Keyword(s):

Sound Production ◽

Vocal Tract ◽

Speech Sound

Atlas-Based Tongue Muscle Correlation Analysis From Tagged and High-Resolution Magnetic Resonance Imaging

Journal of Speech Language and Hearing Research ◽

10.1044/2019_jslhr-s-18-0495 ◽

2019 ◽

Vol 62 (7) ◽

pp. 2258-2269

Author(s):

Fangxu Xing ◽

Maureen Stone ◽

Tessa Goldsmith ◽

Jerry L. Prince ◽

Georges El Fakhri ◽

...

Keyword(s):

Magnetic Resonance Imaging ◽

Magnetic Resonance ◽

High Resolution ◽

Sound Production ◽

Vocal Tract ◽

Speech Sound ◽

Resonance Imaging ◽

En Bloc ◽

Tongue Muscle ◽

Tongue Muscles

Purpose: Intrinsic and extrinsic tongue muscles in healthy and diseased populations vary in both their intra- and intersubject behaviors during speech. Identifying coordination patterns among various tongue muscles can provide insights into speech motor control and help in developing new therapeutic and rehabilitative strategies. Method: We present a method to analyze multisubject tongue muscle correlation using motion patterns in speech sound production. Motion of muscles is captured using tagged magnetic resonance imaging and computed using a phase-based deformation extraction algorithm. After being assembled in a common atlas space, motions from multiple subjects are extracted at each individual muscle location based on a manually labeled mask using high-resolution magnetic resonance imaging and a vocal tract atlas. Motion correlation between each muscle pair is computed within each labeled region. The analysis is performed on a population of 16 control subjects and 3 post–partial glossectomy patients. Results: The floor-of-mouth (FOM) muscles show reduced correlation compared to the internal tongue muscles. Patients present a higher amount of overall correlation between all muscles and exercise en bloc movements. Conclusions: Correlation matrices in the atlas space show the coordination of tongue muscles in speech sound production. The FOM muscles are weakly correlated with the internal tongue muscles. Patients tend to use FOM muscles more than controls to compensate for their postsurgery function loss.

Reading Risk in Children With Speech Sound Disorder: Prevalence, Persistence, and Predictors

Journal of Speech Language and Hearing Research ◽

10.1044/2020_jslhr-20-00108 ◽

2020 ◽

Vol 63 (11) ◽

pp. 3714-3726

Author(s):

Sherine R. Tambyraja ◽

Kelly Farquharson ◽

Laura Justice

Keyword(s):

At Risk ◽

Sound Production ◽

Speech Therapy ◽

Speech Sound ◽

Reading Difficulties ◽

School Age Children ◽

Speech Sound Disorder ◽

Word Decoding ◽

School Year ◽

School Based

Purpose: The purpose of this study was to determine the extent to which school-age children with speech sound disorder (SSD) exhibit concomitant reading difficulties and examine the extent to which phonological processing and speech production abilities are associated with increased likelihood of reading risks. Method: Data were obtained from 120 kindergarten, first-grade, and second-grade children who were in receipt of school-based speech therapy services. Children were categorized as being “at risk” for reading difficulties if standardized scores on a word decoding measure were 1 SD or more from the mean. The selected predictors of reading risk included children's rapid automatized naming ability, phonological awareness (PA), and accuracy of speech sound production. Results: Descriptive results indicated that just over 25% of children receiving school-based speech therapy for an SSD exhibited concomitant deficits in word decoding and that those exhibiting risk at the beginning of the school year were likely to continue to be at risk at the end of the school year. Results from a hierarchical logistic regression suggested that, after accounting for children's age, general language abilities, and socioeconomic status, both PA and speech sound production abilities were significantly associated with the likelihood of being classified as at risk. Conclusions: School-age children with SSD are at increased risk for reading difficulties that are likely to persist throughout an academic year. The severity of phonological deficits, reflected by PA and speech output, may be important indicators of subsequent reading problems.

Speech Sound-Production Deficits in Children With Visual Impairment: A Preliminary Investigation of the Nature and Prevalence of Coexisting Conditions

Contemporary Issues in Communication Science and Disorders ◽

10.1044/cicsd_42_s_33 ◽

2015 ◽

Vol 42 (Spring) ◽

pp. 33-46 ◽

Cited By ~ 1

Author(s):

Monica Gordon-Pershey ◽

Diane Hoffman ◽

Emily Gunderson

Keyword(s):

Visual Impairment ◽

Sound Production ◽

Preliminary Investigation ◽

Speech Sound

EQUALIZATION OF THE IMAGE TO IMPROVE CROSS-CORRELATION MEASUREMENT METHOD

Measuring Monitoring Management Control ◽

10.21685/2307-5538-2019-2-4 ◽

2019 ◽

Author(s):

M. S. Revunov

Keyword(s):

Measurement Method ◽

Cross Correlation ◽

Correlation Measurement

On the Speech Properties and Feature Extraction Methods in Speech Emotion Recognition

Sensors ◽

10.3390/s21051888 ◽

2021 ◽

Vol 21 (5) ◽

pp. 1888

Author(s):

Juraj Kacur ◽

Boris Puterka ◽

Jarmila Pavlovicova ◽

Milos Oravec

Keyword(s):

Emotion Recognition ◽

Linear Prediction ◽

Filter Banks ◽

Vocal Tract ◽

Statistical Tests ◽

Extraction Methods ◽

Speech Emotion Recognition ◽

Speech Characteristics ◽

Evaluation Phase ◽

Cepstral Features

Many speech emotion recognition systems have been designed using different features and classification methods. Still, there is a lack of knowledge and reasoning regarding the underlying speech characteristics and processing, i.e., how basic characteristics, methods, and settings affect the accuracy, and to what extent. This study aims to extend the physical perspective on speech emotion recognition by analyzing basic speech characteristics and modeling methods, e.g., time characteristics (segmentation, window types, and the lengths and overlaps of classification regions), frequency ranges, frequency scales, processing of whole speech (spectrograms), vocal tract (filter banks, linear prediction coefficient (LPC) modeling), and excitation (inverse LPC filtering) signals, magnitude and phase manipulations, cepstral features, etc. In the evaluation phase, a state-of-the-art classification method and rigorous statistical tests were applied, namely N-fold cross validation, paired t-tests, and rank and Pearson correlations. The results revealed several settings in a 75% accuracy range (seven emotions). The most successful methods were based on vocal tract features using psychoacoustic filter banks covering the 0–8 kHz frequency range. Spectrograms carrying vocal tract and excitation information also scored well. It was found that even basic processing such as pre-emphasis, segmentation, and magnitude modifications can dramatically affect the results. Most findings are robust, exhibiting strong correlations across the tested databases.
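Two of the basic processing steps named in this abstract, pre-emphasis and segmentation into overlapping frames, can be sketched as follows. This is a generic illustration, not the study's pipeline: the pre-emphasis coefficient 0.97 is a commonly used value assumed here, and the frame length and hop in the example are arbitrary.

```python
def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass: y[0] = x[0], y[n] = x[n] - alpha * x[n-1]."""
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out

def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames; a trailing partial frame is dropped."""
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]

# Toy input: ten samples, frames of four samples with 50% overlap.
samples = [float(n) for n in range(10)]
emphasized = pre_emphasis([1.0, 1.0, 1.0])
frames = frame_signal(samples, frame_len=4, hop=2)
```

A constant input is almost cancelled after the first sample, which is the high-pass behavior pre-emphasis is meant to provide. The frame length and overlap correspond to the kind of "time characteristics" the abstract lists among the settings that affect accuracy.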

Phonological potentials and the lower vocal tract

Journal of the International Phonetic Association ◽

10.1017/s0025100318000403 ◽

2019 ◽

pp. 1-35 ◽

Cited By ~ 1

Author(s):

Scott R. Moisik ◽

Ewa Czaykowska-Higgins ◽

John H. Esling

Keyword(s):

Theoretical Approach ◽

Vocal Tract ◽

Speech Sound ◽

Physical Nature ◽

Discrete Structure ◽

Complex Interactions

This paper outlines a theoretical approach to speech sound systems based on the notion of phonological potentials: physical ‘pressures’ or biases that give rise to discrete structure and the tendencies associated with this structure that arise from the physical nature of speech sound systems. We apply this approach to a poorly understood area of phonology – phenomena of the lower vocal tract (LVT) – through a schematic that encapsulates the complex interactions among the vocal tract structures responsible for producing LVT sounds. With the framework, we provide an account of a range of LVT phenomena from several languages, illustrating how tonal, phonatory, and vowel qualities interact. Finally, we consider how the idea of phonological potentials extends across various physical domains and might exhibit patterns of alignment across these domains, thereby serving to guide the formation of patterns found in speech sound systems.

Speech Sound Production in 2-Year-Olds Who Are Hard of Hearing

American Journal of Speech-Language Pathology ◽

10.1044/2014_ajslp-13-0039 ◽

2014 ◽

Vol 23 (2) ◽

pp. 91-104 ◽

Cited By ~ 5

Author(s):

Sophie E. Ambrose ◽

Lauren M. Unflat Berry ◽

Elizabeth A. Walker ◽

Melody Harrison ◽

Jacob Oleson ◽

...

Keyword(s):

Socioeconomic Status ◽

Hearing Loss ◽

Hearing Aids ◽

Hard Of Hearing ◽

Sound Production ◽

Speech Sound ◽

Linguistic Variables ◽

Vowel Production ◽

Severe Hearing Loss ◽

Children's Speech

Purpose: The purpose of the study was to (a) compare the speech sound production abilities of 2-year-old children who are hard of hearing (HH) to children with normal hearing (NH), (b) identify sources of risk for individual children who are HH, and (c) determine whether speech sound production skills at age 2 were predictive of speech sound production skills at age 3. Method: Seventy children with bilateral, mild-to-severe hearing loss who use hearing aids and 37 age- and socioeconomic status–matched children with NH participated. Children's speech sound production abilities were assessed at 2 and 3 years of age. Results: At age 2, the HH group demonstrated vowel production abilities on par with their NH peers but weaker consonant production abilities. Within the HH group, better outcomes were associated with hearing aid fittings by 6 months of age, hearing loss of less than 45 dB HL, stronger vocabulary scores, and being female. Positive relationships existed between children's speech sound production abilities at 2 and 3 years of age. Conclusion: Assessment of early speech sound production abilities in combination with demographic, audiologic, and linguistic variables may be useful in identifying HH children who are at risk for delays in speech sound production.

Kinematics of birdsong: functional correlation of cranial movements and acoustic features in sparrows

Journal of Experimental Biology ◽

10.1242/jeb.182.1.147 ◽

1993 ◽

Vol 182 (1) ◽

pp. 147-171 ◽

Cited By ~ 10

Author(s):

M. W. Westneat ◽

J. H. Long ◽

W. Hoese ◽

S. Nowicki

Keyword(s):

Vocal Tract ◽

Active Role ◽

Acoustic Properties ◽

Zonotrichia Albicollis ◽

Low Frequencies ◽

Acoustic Frequency ◽

Mean Frequency ◽

Singing Behavior ◽

Resonance Properties ◽

Melospiza Georgiana

The movements of the head and beak of songbirds may play a functional role in vocal production by influencing the acoustic properties of songs. We investigated this possibility by synchronously measuring the acoustic frequency and amplitude and the kinematics (beak gape and head angle) of singing behavior in the white-throated sparrow (Zonotrichia albicollis) and the swamp sparrow (Melospiza georgiana). These birds are closely related emberizine sparrows, but their songs differ radically in frequency and amplitude structure. We found that the acoustic frequencies of notes in a song have a consistent, positive correlation with beak gape in both species. Beak gape increased significantly with increasing frequency during the first two notes in Z. albicollis song, with a mean frequency for note 1 of 3 kHz corresponding to a gape of 0.4 cm (a 15° gape angle) and a mean frequency for note 2 of 4 kHz corresponding to a gape of 0.7 cm (a 30° gape angle). The relationship between gape and frequency for the upswept third note in Z. albicollis was also significant. In M. georgiana, low frequencies of 3 kHz corresponded to beak gapes of 0.2–0.3 cm (a 10–15° beak angle), whereas frequencies of 7–8 kHz were associated with flaring of the beak to over 1 cm (a beak angle greater than 50°). Beak gape and song amplitude are poorly correlated in both species. We conclude that cranial kinematics, particularly beak movements, influence the resonance properties of the vocal tract by varying its physical dimensions and thus play an active role in the production of birdsong.

i-Vector-Based Speaker Verification on Limited Data Using Fusion Techniques

Journal of Intelligent Systems ◽

10.1515/jisys-2017-0047 ◽

2018 ◽

Vol 29 (1) ◽

pp. 565-582

Author(s):

T.R. Jayanthi Kumari ◽

H.S. Jayanna

Keyword(s):

Linear Prediction ◽

Vocal Tract ◽

Speaker Verification ◽

Excitation Source ◽

Limited Data ◽

Score Level Fusion ◽

Verification System ◽

Modeling Techniques ◽

Prediction Residual ◽

Level Fusion

In many biometric applications, limited data speaker verification plays a significant role in practical systems. The performance of a speaker verification system under limited data conditions needs to be improved by applying suitable techniques. Here, limited data means that both the training and test data last only a few seconds. This article shows the importance of feature- and score-level fusion techniques for speaker verification under limited data conditions. The baseline speaker verification system uses vocal tract features (mel-frequency cepstral coefficients and linear predictive cepstral coefficients) and excitation source features (linear prediction residual and linear prediction residual phase), along with i-vector modeling techniques, on the NIST 2003 data set. In feature-level fusion, the vocal tract features are fused with the excitation source features. As a result, the equal error rate (EER) is reduced by approximately 4% on average compared to individual feature performance. Two different types of score-level fusion are then demonstrated. In the first case, the scores of vocal tract features and excitation source features are fused at the score level while the modeling technique remains the same, which provides an average EER reduction of approximately 2% compared to feature-level fusion. In the second case, the scores of different modeling techniques are combined, which results in an EER reduction of approximately 4.5% compared with score-level fusion of different features.
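To make the notions of equal error rate and score-level fusion concrete, here is a minimal sketch. It does not reproduce the article's systems or numbers: the scores are hypothetical toy values, the fusion rule is a simple weighted sum (an assumption, since the abstract does not state the rule), and the EER is taken at the threshold where the false acceptance and false rejection rates are closest.

```python
def equal_error_rate(genuine, impostor):
    """Approximate EER: sweep thresholds over all observed scores and return
    the mean of FAR and FRR where the two rates are closest."""
    best_gap, best_eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # false acceptances
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejections
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer

def fuse_scores(scores_a, scores_b, weight_a=0.5):
    """Weighted-sum score-level fusion of two systems scored on the same trials."""
    return [weight_a * a + (1 - weight_a) * b for a, b in zip(scores_a, scores_b)]

# Hypothetical scores for the same four genuine and four impostor trials,
# e.g. one system on vocal tract features and one on excitation source features.
genuine_a, impostor_a = [0.9, 0.8, 0.4, 0.7], [0.1, 0.5, 0.3, 0.2]
genuine_b, impostor_b = [0.6, 0.3, 0.9, 0.8], [0.2, 0.1, 0.4, 0.5]

eer_a = equal_error_rate(genuine_a, impostor_a)
eer_b = equal_error_rate(genuine_b, impostor_b)
eer_fused = equal_error_rate(fuse_scores(genuine_a, genuine_b),
                             fuse_scores(impostor_a, impostor_b))
```

In this toy case each system alone has an EER of 0.25, while the fused scores separate the two classes completely, because the two systems make their errors on different trials. That complementarity is the general motivation for fusing vocal tract and excitation source information, although real gains are far smaller than in this constructed example.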