A Noninvasive Acoustic Method Using Frequency Perturbations and Computer-Generated Vocal-Tract Shapes

1983 ◽  
Vol 26 (2) ◽  
pp. 304-314 ◽  
Author(s):  
Debra A. Beckman ◽  
Donald C. Wold ◽  
James C. Montague

This study investigated improved processing of acoustic data with two adult Down's syndrome subjects. Sustained vowel samples were processed through a fast-Fourier-transform spectrum analyzer, and digital waveform data were used to obtain period-by-period measurements of the fundamental frequencies. Unusual frequency perturbation (jitter), later identified as diplophonia, was found for one of the Down's subjects. In addition, the first three formant frequencies of the vowels were determined and, utilizing an algorithm described by Ladefoged and his colleagues, computer-generated vocal-tract shapes were plotted. Differences in vocal-tract shapes, especially for the back vowels, were observed between the Down's female and the normal shape. Correlations between vocal-tract shapes of the Down's subjects and those for a normal man or woman were computed. A partial three-way factor analysis was carried out to determine those lead factors or coefficients for each subject that were due to individual differences. These procedures, offering synthesized techniques portraying the intrapharyngeal/oral functioning of the speech structures, may eventually have direct noninvasive diagnostic and therapeutic benefit for voice/resonance-disordered clients.
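The period-by-period measurement described above can be illustrated with a small sketch. The abstract does not state which perturbation formula was used, so the definition below (mean absolute difference between consecutive periods, as a percentage of the mean period, often called local jitter) is an assumption, and the period values are hypothetical rather than data from the study.

```python
def local_jitter_percent(periods):
    """Mean absolute difference between consecutive periods,
    expressed as a percentage of the mean period.

    Assumed definition (local jitter); not taken from the abstract.
    """
    if len(periods) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(b - a) for a, b in zip(periods, periods[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_period = sum(periods) / len(periods)
    return 100.0 * mean_diff / mean_period


# Hypothetical glottal period lengths in milliseconds.
steady = [5.00, 5.01, 4.99, 5.00, 5.01, 5.00]
# A regular long/short alternation, the kind of pattern that
# may accompany diplophonia, gives a much larger value.
alternating = [5.0, 5.6, 5.0, 5.6, 5.0, 5.6]

print(round(local_jitter_percent(steady), 2))
print(round(local_jitter_percent(alternating), 2))
```

A single percentage cannot by itself distinguish random perturbation from a regular alternation, so the period sequence would still need to be inspected directly.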

Author(s):  
Johan Sundberg

The function of the voice organ is basically the same in classical singing as in speech. However, loud orchestral accompaniment has necessitated the use of the voice in an economical way. As a consequence, the vowel sounds tend to deviate considerably from those in speech. Male voices cluster formants three, four, and five, so that a marked peak is produced in the spectrum envelope near 3,000 Hz. This helps them to be heard through a loud orchestral accompaniment. They seem to achieve this effect by widening the lower pharynx, which makes the vowels more centralized than in speech. Singers often sing at fundamental frequencies higher than the normal first formant frequency of the vowel in the lyrics. In such cases they raise the first formant frequency so that it gets somewhat higher than the fundamental frequency. This is achieved by reducing the degree of vocal tract constriction or by widening the lip and jaw openings, constricting the vocal tract at the pharyngeal end and widening it in the mouth. These deviations from speech cause difficulties in vowel identification, particularly at high fundamental frequencies. Actually, vowel identification is almost impossible above 700 Hz (pitch F5). Another great difference between vocal sound produced in speech and in the classical singing tradition concerns female voices, which need to reduce the timbral differences between voice registers. Females normally speak in modal or chest register, and the transition to falsetto tends to happen somewhere above 350 Hz. The great timbral differences between these registers are avoided by establishing control over the register function, that is, over the vocal fold vibration characteristics, so that seamless transitions are achieved. In many other respects, there are more or less close similarities between speech and singing. Thus, marking phrase structure, emphasizing important events, and emotional coloring are common principles, which may make vocal artists deviate considerably from the score’s nominal description of fundamental frequency and syllable duration.
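The frequencies quoted in this abstract can be related to musical pitch with a short worked conversion. The sketch below assumes twelve-tone equal temperament with A4 = 440 Hz and scientific pitch notation. Neither assumption is stated in the abstract, and the function name is illustrative.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F",
              "F#", "G", "G#", "A", "A#", "B"]


def nearest_note(freq_hz, a4_hz=440.0):
    """Return the nearest equal-tempered note name for a frequency.

    Assumes A4 = 440 Hz (MIDI note 69) and scientific pitch notation.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    midi = round(69 + 12 * math.log2(freq_hz / a4_hz))
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"


print(nearest_note(700))  # limit quoted for vowel identification
print(nearest_note(350))  # approximate register transition
```

Under these assumptions 700 Hz falls nearest to F5, in agreement with the pitch given in the abstract, and 350 Hz falls nearest to F4, one octave lower.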


1983 ◽  
Vol 26 (2) ◽  
pp. 297-304 ◽  
Author(s):  
Bruce R. Gerratt

Involuntary movement of the articulatory structures can interfere with the accurate placement of the articulators during consonant production and may also result in distortion of vowel quality. An acoustic method was used to assess motor steadiness in the vocal tract musculature superior to the glottis during vowel production by five subjects with abnormal involuntary orofacial movements associated with tardive dyskinesia and 10 normal subjects. A linear predictive coding technique of spectral analysis yielded formant frequencies from the sustained productions of//. Based on the premise that changes in vocal tract configuration can be measured as changes in formant frequency, the sequential segment-to-segment fluctuations of the second formant frequency of these vowel samples were computed and used as an index of motor steadiness. Results showed that formant frequency fluctuation measures for four of the five tardive dyskinetic patients were substantially larger than those of the normal subjects, indicating a reduction of motor steadiness in these four subjects. Factors influencing the validity of this procedure and implications for its use are discussed.


2020 ◽  
Vol 29 (9) ◽  
pp. 090704
Author(s):  
Yu Tong ◽  
Lin Wang ◽  
Wen-Zhe Zhang ◽  
Ming-Dong Zhu ◽  
Xi Qin ◽  
...  

2015 ◽  
Vol 137 (5) ◽  
pp. 2586-2595 ◽  
Author(s):  
Matthias Echternach ◽  
Peter Birkholz ◽  
Louisa Traser ◽  
Tabea V. Flügge ◽  
Robert Kamberger ◽  
...  

1996 ◽  
Vol 26 (1) ◽  
pp. 23-40 ◽  
Author(s):  
Sandra P. Whiteside

The perception of speaker sex depends on the listener's integration of a complex range of factors. These may relate, for example, to the style of delivery, the use of particular language, pronunciation (Trudgill, 1983; Smith, 1979), the use of particular intonation patterns (McConnell-Ginet, 1983) and the perceived pitch of the speaker (Aronovitch, 1976; Elyan, 1978; Lass et al., 1976). Some acoustic-phonetic investigations have explored through instrumental analysis how speaker sex differences are perceived. These have shown that acoustic-phonetic differences exist between the read speech of men and women speakers. It has been demonstrated that fundamental frequency differences exist between men and women, with men having, on average, lower fundamental frequencies (Aronovitch, 1976; Coleman, 1973a). This can be explained in part by their larger larynges. However, it is also acknowledged that it is not a low overall average fundamental frequency alone that contributes to the perception of an adult male voice. Some evidence shows, for example, that use of a wider pitch range will contribute to the perception of femininity, even where the overall pitch is low (Terrango, 1966). In addition, women have been found to have, on average, higher formant frequencies (Coleman, 1976; Henton, 1986; Peterson & Barney, 1952; Childers & Wu, 1991; Wu & Childers, 1991) as a result of the smaller vocal tract. Women have different glottal source characteristics (Karlsson, 1989) which are reflected in the filter characteristics of the speech signal (Klatt & Klatt, 1990). There is also some evidence to suggest that other speaker sex differences exist in the temporal domain. Byrd (1992) found differences between men and women speakers in speaking rate in read speech in American English in the TIMIT database. Byrd states that under the recording conditions used for the TIMIT database, women spoke appreciably more slowly than the men and that men tended to reduce vowels to schwa ([ə]) more often than the women. Byrd also found that female speakers in the TIMIT database released stops in sentence-final position more frequently and produced more glottal stops than male speakers. All these findings were statistically significant.

