Acoustic Correlates of Breathy Vocal Quality: Dysphonic Voices and Continuous Speech

1996 ◽  
Vol 39 (2) ◽  
pp. 311-321 ◽  
Author(s):  
James Hillenbrand ◽  
Robert A. Houde

In an earlier study, we evaluated the effectiveness of several acoustic measures in predicting breathiness ratings for sustained vowels spoken by nonpathological talkers who were asked to produce nonbreathy, moderately breathy, and very breathy phonation (Hillenbrand, Cleveland, & Erickson, 1994). The purpose of the present study was to extend these results to speakers with laryngeal pathologies and to conduct tests using connected speech in addition to sustained vowels. Breathiness ratings were obtained from a sustained vowel and a 12-word sentence spoken by 20 pathological and 5 nonpathological talkers. Acoustic measures were made of (a) signal periodicity, (b) first harmonic amplitude, and (c) spectral tilt. For the sustained vowels, a frequency domain measure of periodicity provided the most accurate predictions of perceived breathiness, accounting for 92% of the variance in breathiness ratings. The relative amplitude of the first harmonic and two measures of spectral tilt correlated moderately with breathiness ratings. For the sentences, both signal periodicity and spectral tilt provided accurate predictions of breathiness ratings, accounting for 70%-85% of the variance.

1994 ◽  
Vol 37 (4) ◽  
pp. 769-778 ◽  
Author(s):  
James Hillenbrand ◽  
Ronald A. Cleveland ◽  
Robert L. Erickson

The purpose of this study was to evaluate the effectiveness of several acoustic measures in predicting breathiness ratings. Recordings were made of eight normal men and seven normal women producing normally phonated, moderately breathy, and very breathy sustained vowels. Twenty listeners rated the degree of breathiness using a direct magnitude estimation procedure. Acoustic measures were made of: (a) signal periodicity, (b) first harmonic amplitude, and (c) spectral tilt. Periodicity measures provided the most accurate predictions of perceived breathiness, accounting for approximately 80% of the variance in breathiness ratings. The relative amplitude of the first harmonic correlated moderately with breathiness ratings, and two measures of spectral tilt correlated weakly with perceived breathiness.
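One simple way to put a number on signal periodicity is the height of the normalized autocorrelation peak within a plausible range of pitch lags. The sketch below is only an illustration under stated assumptions: it is not claimed to be the specific periodicity measure used in the study, and the function name, the lag range, and the synthetic test signals are all hypothetical.

```python
import math
import random

def autocorr_peak(signal, min_lag, max_lag):
    """Largest normalized autocorrelation value for lags in [min_lag, max_lag]."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]           # remove DC offset
    energy = sum(v * v for v in x)           # lag-0 autocorrelation
    if energy == 0.0:
        return 0.0
    best = -1.0
    for lag in range(min_lag, max_lag + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        best = max(best, r)
    return best

# Synthetic stand-ins (assumed values): a clean 100 Hz tone and the same tone
# buried in Gaussian noise, both sampled at 8 kHz for 0.1 s.
sr, f0, n = 8000, 100, 800
periodic = [math.sin(2 * math.pi * f0 * i / sr) for i in range(n)]
rng = random.Random(0)
noisy = [p + rng.gauss(0.0, 1.0) for p in periodic]

# Lags of 40 to 160 samples correspond to 200 Hz down to 50 Hz at 8 kHz.
peak_periodic = autocorr_peak(periodic, 40, 160)
peak_noisy = autocorr_peak(noisy, 40, 160)
```

In this toy setup the noisier signal yields a lower peak, which is the general direction one would expect for breathier phonation.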


1990 ◽  
Vol 33 (2) ◽  
pp. 298-306 ◽  
Author(s):  
L. Eskenazi ◽  
D. G. Childers ◽  
D. M. Hicks

We have investigated the relationship between various voice qualities and several acoustic measures made from the vowel /i/ phonated by subjects with normal voices and patients with vocal disorders. Among the patients (pathological voices), five qualities were investigated: overall severity, hoarseness, breathiness, roughness, and vocal fry. Six acoustic measures were examined. With one exception, all measures were extracted from the residue signal obtained by inverse filtering the speech signal using the linear predictive coding (LPC) technique. A formal listening test was implemented to rate each pathological voice for each vocal quality. A formal listening test also rated overall excellence of the normal voices. A scale of 1–7 was used. Multiple linear regression analysis between the results of the listening test and the various acoustic measures was used with the prediction sum of squares (PRESS) as the selection criterion. Useful prediction equations of order two or less were obtained relating certain acoustic measures and the ratings of pathological voices for each of the five qualities. The two most useful parameters for predicting vocal quality were the Pitch Amplitude (PA) and the Harmonics-to-Noise Ratio (HNR). No acoustic measure could rank the normal voices.
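The PRESS statistic mentioned above can be illustrated with a leave-one-out loop: each observation is predicted from a model fitted to all the other observations, and the squared prediction errors are summed. The sketch below assumes a single predictor for brevity, whereas the study used multiple regression; the data values and function names are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def press(xs, ys):
    """Prediction sum of squares: leave-one-out squared prediction errors."""
    total = 0.0
    for i in range(len(xs)):
        x_train = xs[:i] + xs[i + 1:]        # drop observation i
        y_train = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(x_train, y_train)
        total += (ys[i] - (intercept + slope * xs[i])) ** 2
    return total

def residual_ss(xs, ys):
    """Ordinary residual sum of squares from a fit on all observations."""
    intercept, slope = fit_line(xs, ys)
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data: an acoustic measure and a 1-7 listener rating.
measure = [5.0, 8.0, 10.0, 12.0, 15.0, 18.0]
rating = [6.5, 5.0, 4.5, 3.0, 2.5, 1.0]
press_value = press(measure, rating)
rss_value = residual_ss(measure, rating)
```

Because every point is predicted by a model that never saw it, PRESS is never smaller than the ordinary residual sum of squares, which makes it a stricter criterion for choosing among candidate models.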


2016 ◽  
Vol 59 (5) ◽  
pp. 994-1001 ◽  
Author(s):  
Bruce R. Gerratt ◽  
Jody Kreiman ◽  
Marc Garellek

Purpose: The question of what type of utterance, a sustained vowel or continuous speech, is best for voice quality analysis has been extensively studied but with equivocal results. This study examines whether previously reported differences derive from the articulatory and prosodic factors occurring in continuous speech versus sustained phonation. Method: Speakers with voice disorders sustained vowels and read sentences. Vowel samples were excerpted from the steadiest portion of each vowel in the sentences. In addition to sustained and excerpted vowels, a 3rd set of stimuli was created by shortening sustained vowel productions to match the duration of vowels excerpted from continuous speech. Acoustic measures were made on the stimuli, and listeners judged the severity of vocal quality deviation. Results: Sustained vowels and those extracted from continuous speech contain essentially the same acoustic and perceptual information about vocal quality deviation. Conclusions: Perceived and/or measured differences between continuous speech and sustained vowels derive largely from voice source variability across segmental and prosodic contexts and not from variations in vocal fold vibration in the quasisteady portion of the vowels. Approaches to voice quality assessment by using continuous speech samples average across utterances and may not adequately quantify the variability they are intended to assess.


2009 ◽  
Vol 39 (2) ◽  
pp. 162-188 ◽  
Author(s):  
Christian T. DiCanio

The Chong language uses a combination of different acoustic correlates to distinguish among its four contrastive registers (phonation types). Electroglottographic (EGG) and acoustic data were examined from original fieldwork on the Takhian Thong dialect. EGG data shows high open quotient (OQ) values for the breathy register, low OQ values for the tense register, intermediate OQ values for the modal register, and rapidly changing high to low OQ values for the breathy-tense register. Acoustic correlates indicate that H1-A3 best distinguishes between breathy and non-breathy phonation, but measures like H1-H2 and pitch are necessary to discriminate between tense and non-tense phonation. A comparison of spectral tilt and OQ measures shows the greatest correlation between OQ and H1-H2, suggesting that changes in the relative amplitude of frequencies in the upper spectrum are not directly related to changes in the open period of the glottal cycle. OQ is best correlated with changes in the degree of glottal tension.
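H1-H2 is the level difference, in decibels, between the first and second harmonics of the voice spectrum. The sketch below shows the arithmetic on a synthetic signal. It assumes the fundamental frequency is already known and that the analysis window spans a whole number of periods; practical measurement would also need f0 estimation and windowing. The function names and signal parameters are hypothetical, and this is not the procedure used in the study.

```python
import math

def harmonic_amplitude(signal, sr, freq):
    """Amplitude of the sinusoidal component at `freq` via a single-bin DFT."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq * i / sr) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * i / sr) for i, s in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

def h1_minus_h2(signal, sr, f0):
    """Level difference in dB between the first and second harmonics."""
    h1 = harmonic_amplitude(signal, sr, f0)
    h2 = harmonic_amplitude(signal, sr, 2 * f0)
    return 20.0 * math.log10(h1 / h2)

# Synthetic example (assumed values): second harmonic at half the amplitude
# of the first, 8 kHz sampling, exactly 10 periods of a 100 Hz fundamental.
sr, f0, n = 8000, 100, 800
signal = [
    1.0 * math.sin(2 * math.pi * f0 * i / sr)
    + 0.5 * math.sin(2 * math.pi * 2 * f0 * i / sr)
    for i in range(n)
]
tilt_db = h1_minus_h2(signal, sr, f0)  # about 6 dB for a 2:1 amplitude ratio
```

A larger H1-H2 value indicates a first harmonic that dominates the second more strongly.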


2019 ◽  
Vol 62 (1) ◽  
pp. 60-69
Author(s):  
Areen Badwal ◽  
JoHanna Poertner ◽  
Robin A. Samlan ◽  
Julie E. Miller

Purpose: The zebra finch is used as a model to study the neural circuitry of auditory-guided human vocal production. The terminology of birdsong production and acoustic analysis, however, differs from human voice production, making it difficult for voice researchers of either species to navigate the literature from the other. The purpose of this research note is to identify common terminology and measures to better compare information across species. Method: Terminology used in the birdsong literature will be mapped onto terminology used in the human voice production literature. Measures typically used to quantify the percepts of pitch, loudness, and quality will be described. Measures common to the literature in both species will be made from the songs of 3 middle-age birds using Praat and Song Analysis Pro. Two measures, cepstral peak prominence (CPP) and Wiener entropy (WE), will be compared to determine if they provide similar information. Results: Similarities and differences in terminology and acoustic analyses are presented. A core set of measures including frequency, frequency variability within a syllable, intensity, CPP, and WE are proposed for future studies. CPP and WE are related yet provide unique information about the syllable structure. Conclusions: Using a core set of measures familiar to both human voice and birdsong researchers, along with both CPP and WE, will allow characterization of similarities and differences among birds. Standard terminology and measures will improve accessibility of the birdsong literature to human voice researchers and vice versa. Supplemental Material: https://doi.org/10.23641/asha.7438964
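Wiener entropy is commonly defined as the ratio of the geometric mean to the arithmetic mean of the power spectrum, and it is often reported on a log scale. The sketch below shows that definition on two toy spectra. The function name and spectra are hypothetical, and the exact scaling used by any particular analysis package may differ from this assumption.

```python
import math

def wiener_entropy(power_spectrum):
    """Log of (geometric mean / arithmetic mean) of strictly positive power values."""
    n = len(power_spectrum)
    log_geometric_mean = sum(math.log(p) for p in power_spectrum) / n
    arithmetic_mean = sum(power_spectrum) / n
    return log_geometric_mean - math.log(arithmetic_mean)

# Toy spectra (assumed values): a flat, noise-like spectrum and a spectrum
# dominated by a single strong component, as in a tonal syllable.
flat_spectrum = [1.0] * 64
peaked_spectrum = [1e-6] * 63 + [1.0]

we_flat = wiener_entropy(flat_spectrum)      # equal means, so the log ratio is 0
we_peaked = wiener_entropy(peaked_spectrum)  # strongly negative
```

On this scale, values near zero indicate a flat, noise-like spectrum, and increasingly negative values indicate energy concentrated in a few components.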


2001 ◽  
Vol 44 (2) ◽  
pp. 327-339 ◽  
Author(s):  
Vijay Parsa ◽  
Donald G. Jamieson

We investigated the ability of acoustic measures to discriminate between normal and pathological talkers. Two groups of measures were compared: (a) those extracted from sustained vowels and (b) those based on continuous speech samples. Nine acoustic measures, which include fundamental frequency and amplitude perturbation measures, long term average spectral measures, and glottal noise measures were extracted from both sustained vowel and continuous speech samples. Our experiments were performed on a published database of 53 normal talkers and 175 talkers with a pathological voice. The classification performance of the nine acoustic measures was quantified using linear discriminant analysis and receiver operating characteristic (ROC) curve analysis. When individual measures were considered in isolation, classification was more accurate for measures extracted from sustained vowels than for those based on continuous speech samples. Classification accuracy improved when combinations of acoustic parameters were considered. For such combinations of measures, classification results were comparable for measures extracted from continuous speech samples and for those based on sustained vowels.
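The area under an ROC curve can be read as the probability that a randomly chosen pathological talker receives a higher score on a measure than a randomly chosen normal talker. The sketch below computes it directly from that pairwise definition. The scores are invented for illustration and the function name is hypothetical; it does not reproduce the study's analysis.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Invented scores on a single acoustic measure, where higher is assumed
# to indicate pathology.
pathological = [0.9, 0.8, 0.4]
normal = [0.5, 0.3, 0.2]
auc = roc_auc(pathological, normal)  # 8 of 9 pairs are ordered correctly
```

An area of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.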


2009 ◽  
Vol 33 (4) ◽  
pp. 366-375 ◽  
Author(s):  
Caroline Floccia ◽  
Joseph Butler ◽  
Frédérique Girard ◽  
Jeremy Goslin

This study examines children's ability to detect accent-related information in connected speech. British English children aged 5 and 7 years old were asked to discriminate their home accent from an Irish accent or a French accent in a sentence categorization task. Using a preliminary accent rating task with adult listeners, it was first verified that the level of accentedness was similar across the two unfamiliar accents. Results showed that whereas the younger group performed just above chance level in this task, the 7-year-old group could reliably distinguish between these variations of their own language, but were significantly better at detecting the foreign accent than the regional accent. These results extend and replicate a previous study (Girard, Floccia, & Goslin, 2008) in which it was found that 5-year-old French children could detect a foreign accent better than a regional accent. The factors underlying the relative lack of awareness for a regional accent as opposed to a foreign accent in childhood are discussed, especially the amount of exposure, the learnability of both types of accents, and a possible difference in the amount of vowel versus consonant variability, for which acoustic measures of vowel formants and plosive voice onset time are provided.


1974 ◽  
Vol 55 (2) ◽  
pp. 412-412
Author(s):  
Janet M. Baker ◽  
Robert Ramsey ◽  
Mark Miller ◽  
James K. Baker ◽  
Christopher Cooper

2000 ◽  
Vol 9 (2) ◽  
pp. 141-150 ◽  
Author(s):  
Mary Gordon-Brannan ◽  
Barbara Williams Hodson

Intelligibility/severity measurements were obtained for 48 prekindergarten children with varying levels of phonological proficiency/deficiency. The measure used as the “standard” was percentage of words understood (i.e., orthographically transcribed correctly) in continuous speech in a known context by unfamiliar trained listeners. The children were divided into four groups based on the percentage of words understood from their continuous speech samples. The ranges of intelligibility for each group were: (a) 91–100% for children with “adult-like” speech; (b) 83–90% for children in the “mild” category; (c) 68–81% for children with moderate intelligibility/speech involvement; and (d) 16–63% for the 12 children in the “severe” (i.e., least intelligible) category. When the percentages of the children in the severe group were excluded, the range of the top three groups combined was 68–100% and the mean was 85%. For a child 4 years of age or older, any percentage of words understood in connected speech that falls below 66% (2 standard deviations below the mean) may be a potential indicator of speech difficulty. In addition, data were obtained from the 48 children to determine the correlations between the standard measure and the following intelligibility/severity measures: (a) imitated sentences, (b) imitated words, (c) listener ratings of intelligibility, and (d) phonological deviation averages. All five measures, including the standard measure, investigated in this study were strongly intercorrelated. Multiple regression analysis results yielded a prediction model that included listener ratings and imitated sentences measures. Results of multivariate analysis of variance (MANOVA), univariate analysis, and post-hoc Bonferroni tests indicated that differences between all pairs of groups were significant for the listener rating measure based on the continuous speech sample. For the percentage of words understood in continuous speech samples, the differences between all pairs of groups, except between the adult-like and mild groups, were also significant. The only group that differed significantly from the other three groups for all five measures was the severe group.
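The 66% cutoff can be checked against the stated mean of 85% with a short calculation. The standard deviation below is inferred from those two figures, which are presumably rounded; it is not a value reported in the abstract.

```latex
\[
\bar{x} - 2s = 66 \quad\text{with}\quad \bar{x} = 85
\;\Longrightarrow\;
s = \frac{85 - 66}{2} = 9.5
\]
```

Under that assumption, the top three groups would have a standard deviation of roughly 9.5 percentage points.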

