Acoustic Correlates of Breathy Vocal Quality: Dysphonic Voices and Continuous Speech

James Hillenbrand; Robert A. Houde

doi:10.1044/jshr.3902.311

Journal of Speech Language and Hearing Research, 1996, Vol 39 (2), pp. 311-321

doi:10.1044/jshr.3902.311

Cited By ~ 289

Author(s): James Hillenbrand, Robert A. Houde

Keyword(s): Frequency Domain, Relative Amplitude, Harmonic Amplitude, Continuous Speech, Connected Speech, Acoustic Measures, Acoustic Correlates, Two Measures, Vocal Quality, Spectral Tilt

In an earlier study, we evaluated the effectiveness of several acoustic measures in predicting breathiness ratings for sustained vowels spoken by nonpathological talkers who were asked to produce nonbreathy, moderately breathy, and very breathy phonation (Hillenbrand, Cleveland, & Erickson, 1994). The purpose of the present study was to extend these results to speakers with laryngeal pathologies and to conduct tests using connected speech in addition to sustained vowels. Breathiness ratings were obtained from a sustained vowel and a 12-word sentence spoken by 20 pathological and 5 nonpathological talkers. Acoustic measures were made of (a) signal periodicity, (b) first harmonic amplitude, and (c) spectral tilt. For the sustained vowels, a frequency domain measure of periodicity provided the most accurate predictions of perceived breathiness, accounting for 92% of the variance in breathiness ratings. The relative amplitude of the first harmonic and two measures of spectral tilt correlated moderately with breathiness ratings. For the sentences, both signal periodicity and spectral tilt provided accurate predictions of breathiness ratings, accounting for 70%-85% of the variance.
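
The phrase "accounting for 92% of the variance" can be read, for a single linear predictor, as a squared Pearson correlation. The sketch below illustrates that reading only; the abstract does not say how the figure was computed, and the function names and toy values are hypothetical, not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def variance_accounted_for(x, y):
    """Proportion of variance in y explained by a linear fit on x (r squared)."""
    return pearson_r(x, y) ** 2


# Hypothetical toy values (NOT taken from the study).
periodicity = [0.9, 0.8, 0.6, 0.5, 0.3, 0.2]
breathiness = [1.0, 2.0, 3.5, 4.0, 6.5, 7.0]
r2 = variance_accounted_for(periodicity, breathiness)

# Under the single-predictor assumption, 92% of variance implies |r| = sqrt(0.92).
implied_r = math.sqrt(0.92)
```

Under that assumption, 92% of the variance corresponds to a correlation magnitude of about 0.96.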

Acoustic Correlates of Breathy Vocal Quality

Journal of Speech Language and Hearing Research, 1994, Vol 37 (4), pp. 769-778

doi:10.1044/jshr.3704.769

Cited By ~ 307

Author(s): James Hillenbrand, Ronald A. Cleveland, Robert L. Erickson

Keyword(s): Magnitude Estimation, Estimation Procedure, Harmonic Amplitude, Acoustic Measures, Acoustic Correlates, Two Measures, Vocal Quality, Normal Women, Normal Men, Spectral Tilt

The purpose of this study was to evaluate the effectiveness of several acoustic measures in predicting breathiness ratings. Recordings were made of eight normal men and seven normal women producing normally phonated, moderately breathy, and very breathy sustained vowels. Twenty listeners rated the degree of breathiness using a direct magnitude estimation procedure. Acoustic measures were made of: (a) signal periodicity, (b) first harmonic amplitude, and (c) spectral tilt. Periodicity measures provided the most accurate predictions of perceived breathiness, accounting for approximately 80% of the variance in breathiness ratings. The relative amplitude of the first harmonic correlated moderately with breathiness ratings, and two measures of spectral tilt correlated weakly with perceived breathiness.

Acoustic Correlates of Vocal Quality

Journal of Speech Language and Hearing Research, 1990, Vol 33 (2), pp. 298-306

doi:10.1044/jshr.3302.298

Cited By ~ 160

Author(s): L. Eskenazi, D. G. Childers, D. M. Hicks

Keyword(s): Linear Regression Analysis, Predictive Coding, Multiple Linear Regression Analysis, Inverse Filtering, Linear Predictive Coding, Listening Test, Acoustic Measures, Acoustic Correlates, Vocal Quality, The Relationship

We have investigated the relationship between various voice qualities and several acoustic measures made from the vowel /i/ phonated by subjects with normal voices and patients with vocal disorders. Among the patients (pathological voices), five qualities were investigated: overall severity, hoarseness, breathiness, roughness, and vocal fry. Six acoustic measures were examined. With one exception, all measures were extracted from the residue signal obtained by inverse filtering the speech signal using the linear predictive coding (LPC) technique. A formal listening test was implemented to rate each pathological voice for each vocal quality. A formal listening test also rated overall excellence of the normal voices. A scale of 1–7 was used. Multiple linear regression analysis between the results of the listening test and the various acoustic measures was used with the prediction sums of squares (PRESS) as the selection criterion. Useful prediction equations of order two or less were obtained relating certain acoustic measures and the ratings of pathological voices for each of the five qualities. The two most useful parameters for predicting vocal quality were the Pitch Amplitude (PA) and the Harmonics-to-Noise Ratio (HNR). No acoustic measure could rank the normal voices.
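
PRESS is usually computed by leaving each observation out in turn, refitting the model, and summing the squared prediction errors on the held-out points. The sketch below shows this for a one-predictor least-squares line. It is a simplification of the multiple regression described in the abstract, and the predictor names and toy values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has zero variance")
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def press(x, y):
    """Leave-one-out prediction sum of squares for a one-predictor model."""
    total = 0.0
    for i in range(len(x)):
        # Refit without observation i, then predict it.
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - (a + b * x[i])) ** 2
    return total


# Hypothetical toy values (NOT from the study): ratings on a 1-7 scale.
rating = [6.5, 5.5, 4.0, 3.0, 1.5]
hnr_like = [5.0, 8.0, 12.0, 15.0, 20.0]   # tracks the ratings closely
unrelated = [3.0, 1.0, 4.0, 1.0, 5.0]     # does not

candidates = {"hnr_like": hnr_like, "unrelated": unrelated}
best = min(candidates, key=lambda name: press(candidates[name], rating))
```

Selecting the candidate with the lowest PRESS favors models that predict unseen cases well rather than models that merely fit the training data.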

Comparing Measures of Voice Quality From Sustained Phonation and Continuous Speech

Journal of Speech Language and Hearing Research, 2016, Vol 59 (5), pp. 994-1001

doi:10.1044/2016_jslhr-s-15-0307

Cited By ~ 20

Author(s): Bruce R. Gerratt, Jody Kreiman, Marc Garellek

Keyword(s): Vocal Fold, Voice Quality, Quality Analysis, Perceptual Information, Continuous Speech, Voice Source, Acoustic Measures, Vocal Fold Vibration, Vocal Quality, Quality Deviation

Purpose: The question of what type of utterance, a sustained vowel or continuous speech, is best for voice quality analysis has been extensively studied but with equivocal results. This study examines whether previously reported differences derive from the articulatory and prosodic factors occurring in continuous speech versus sustained phonation. Method: Speakers with voice disorders sustained vowels and read sentences. Vowel samples were excerpted from the steadiest portion of each vowel in the sentences. In addition to sustained and excerpted vowels, a 3rd set of stimuli was created by shortening sustained vowel productions to match the duration of vowels excerpted from continuous speech. Acoustic measures were made on the stimuli, and listeners judged the severity of vocal quality deviation. Results: Sustained vowels and those extracted from continuous speech contain essentially the same acoustic and perceptual information about vocal quality deviation. Conclusions: Perceived and/or measured differences between continuous speech and sustained vowels derive largely from voice source variability across segmental and prosodic contexts and not from variations in vocal fold vibration in the quasisteady portion of the vowels. Approaches to voice quality assessment by using continuous speech samples average across utterances and may not adequately quantify the variability they are intended to assess.

The phonetics of register in Takhian Thong Chong

Journal of the International Phonetic Association, 2009, Vol 39 (2), pp. 162-188

doi:10.1017/s0025100309003879

Cited By ~ 41

Author(s): Christian T. DiCanio

Keyword(s): Relative Amplitude, Acoustic Data, Acoustic Correlates, Spectral Tilt

The Chong language uses a combination of different acoustic correlates to distinguish among its four contrastive registers (phonation types). Electroglottographic (EGG) and acoustic data were examined from original fieldwork on the Takhian Thong dialect. EGG data show high open quotient (OQ) values for the breathy register, low OQ values for the tense register, intermediate OQ values for the modal register, and rapidly changing high to low OQ values for the breathy-tense register. Acoustic correlates indicate that H1-A3 best distinguishes between breathy and non-breathy phonation, but measures like H1-H2 and pitch are necessary to discriminate between tense and non-tense phonation. A comparison of spectral tilt and OQ measures shows the greatest correlation between OQ and H1-H2, suggesting that changes in the relative amplitude of frequencies in the upper spectrum are not directly related to changes in the open period of the glottal cycle. OQ is best correlated with changes in the degree of glottal tension.

Common Terminology and Acoustic Measures for Human Voice and Birdsong

Journal of Speech Language and Hearing Research, 2019, Vol 62 (1), pp. 60-69

doi:10.1044/2018_jslhr-s-18-0218

Author(s): Areen Badwal, JoHanna Poertner, Robin A. Samlan, Julie E. Miller

Keyword(s): Acoustic Analysis, Voice Production, Acoustic Measures, Human Voice, Song Analysis, Core Set, Standard Terminology, Cepstral Peak Prominence, Two Measures, Similarities And Differences

Purpose: The zebra finch is used as a model to study the neural circuitry of auditory-guided human vocal production. The terminology of birdsong production and acoustic analysis, however, differs from human voice production, making it difficult for voice researchers of either species to navigate the literature from the other. The purpose of this research note is to identify common terminology and measures to better compare information across species. Method: Terminology used in the birdsong literature will be mapped onto terminology used in the human voice production literature. Measures typically used to quantify the percepts of pitch, loudness, and quality will be described. Measures common to the literature in both species will be made from the songs of 3 middle-age birds using Praat and Song Analysis Pro. Two measures, cepstral peak prominence (CPP) and Wiener entropy (WE), will be compared to determine if they provide similar information. Results: Similarities and differences in terminology and acoustic analyses are presented. A core set of measures including frequency, frequency variability within a syllable, intensity, CPP, and WE are proposed for future studies. CPP and WE are related yet provide unique information about the syllable structure. Conclusions: Using a core set of measures familiar to both human voice and birdsong researchers, along with both CPP and WE, will allow characterization of similarities and differences among birds. Standard terminology and measures will improve accessibility of the birdsong literature to human voice researchers and vice versa. Supplemental Material: https://doi.org/10.23641/asha.7438964
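
Wiener entropy is commonly defined as the log of the ratio between the geometric and arithmetic means of a frame's power spectrum. It approaches 0 for noise-like spectra and becomes strongly negative for tonal ones. The sketch below assumes that common definition and uses a naive DFT for self-containment. It is not the procedure of any particular analysis package, and the frame length, floor value, and test signals are illustrative choices.

```python
import cmath
import math
import random


def power_spectrum(frame):
    """Power at DFT bins 1 .. n/2 - 1 (DC and Nyquist are skipped)."""
    n = len(frame)
    spectrum = []
    for k in range(1, n // 2):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def wiener_entropy(frame, floor=1e-12):
    """log(geometric mean / arithmetic mean) of the power spectrum.

    `floor` is an assumed small constant that keeps log() defined for empty bins.
    """
    p = [max(v, floor) for v in power_spectrum(frame)]
    log_geometric_mean = sum(math.log(v) for v in p) / len(p)
    arithmetic_mean = sum(p) / len(p)
    return log_geometric_mean - math.log(arithmetic_mean)


n = 128
tone = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]  # energy in one bin
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(n)]            # energy spread out

we_tone = wiener_entropy(tone)
we_noise = wiener_entropy(noise)
```

In this sketch the pure tone yields a much more negative value than the noise frame, which is the contrast the measure is meant to capture.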

Analysing vocal quality of connected speech using Kay's computerized speech lab: a preliminary finding

Clinical Linguistics & Phonetics, 2000, Vol 14 (4), pp. 295-305

doi:10.1080/02699200050023994

Cited By ~ 37

Author(s): Edwin Yiu, Linda Worrall, Jennifer

Keyword(s): Preliminary Finding, Connected Speech, Vocal Quality

Acoustic Discrimination of Pathological Voice

Journal of Speech Language and Hearing Research, 2001, Vol 44 (2), pp. 327-339

doi:10.1044/1092-4388(2001/027)

Cited By ~ 175

Author(s): Vijay Parsa, Donald G. Jamieson

Keyword(s): Operating Characteristic, Classification Performance, Spectral Measures, Continuous Speech, Roc Curve Analysis, Linear Discriminant, Acoustic Measures, Acoustic Discrimination, Pathological Voice

We investigated the ability of acoustic measures to discriminate between normal and pathological talkers. Two groups of measures were compared: (a) those extracted from sustained vowels and (b) those based on continuous speech samples. Nine acoustic measures, which include fundamental frequency and amplitude perturbation measures, long term average spectral measures, and glottal noise measures were extracted from both sustained vowel and continuous speech samples. Our experiments were performed on a published database of 53 normal talkers and 175 talkers with a pathological voice. The classification performance of the nine acoustic measures was quantified using linear discriminant analysis and receiver operating characteristic (ROC) curve analysis. When individual measures were considered in isolation, classification was more accurate for measures extracted from sustained vowels than for those based on continuous speech samples. Classification accuracy improved when combinations of acoustic parameters were considered. For such combinations of measures, classification results were comparable for measures extracted from continuous speech samples and for those based on sustained vowels.
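
As an illustration of ROC curve analysis for a single measure, the sketch below computes the area under the ROC curve as the probability that a randomly chosen pathological talker scores higher than a randomly chosen normal talker, and lists the (false positive rate, true positive rate) points from a threshold sweep. It omits the linear discriminant step mentioned in the abstract, and all scores are hypothetical toy values.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def roc_points(scores_pos, scores_neg):
    """(fpr, tpr) pairs obtained by lowering the decision threshold step by step."""
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores_pos) | set(scores_neg), reverse=True):
        tpr = sum(s >= thr for s in scores_pos) / len(scores_pos)
        fpr = sum(s >= thr for s in scores_neg) / len(scores_neg)
        points.append((fpr, tpr))
    return points


# Hypothetical scores (NOT from the study): higher means "more pathological".
pathological = [0.9, 0.8, 0.6, 0.4]
normal = [0.7, 0.3, 0.2, 0.1]

auc = roc_auc(pathological, normal)
curve = roc_points(pathological, normal)
```

An area of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.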

Categorization of regional and foreign accent in 5- to 7-year-old British children

International Journal of Behavioral Development, 2009, Vol 33 (4), pp. 366-375

doi:10.1177/0165025409103871

Cited By ~ 46

Author(s): Caroline Floccia, Joseph Butler, Frédérique Girard, Jeremy Goslin

Keyword(s): Voice Onset Time, Onset Time, Old French, Foreign Accent, Connected Speech, British English, Acoustic Measures, Related Information, Vowel Formants, Better Than

This study examines children's ability to detect accent-related information in connected speech. British English children aged 5 and 7 years were asked to discriminate their home accent from an Irish accent or a French accent in a sentence categorization task. A preliminary accent rating task with adult listeners first verified that the level of accentedness was similar across the two unfamiliar accents. Results showed that whereas the younger group performed just above chance level in this task, the 7-year-old group could reliably distinguish between these variations of their own language, but were significantly better at detecting the foreign accent than the regional accent. These results replicate and extend a previous study (Girard, Floccia, & Goslin, 2008) in which it was found that 5-year-old French children could detect a foreign accent better than a regional accent. The factors underlying the relative lack of awareness of a regional accent as opposed to a foreign accent in childhood are discussed, especially the amount of exposure, the learnability of both types of accents, and a possible difference in the amount of vowel versus consonant variability, for which acoustic measures of vowel formants and plosive voice onset time are provided.

Comparative Visual Displays of Time and Frequency Domain Information in Connected Speech

The Journal of the Acoustical Society of America, 1974, Vol 55 (2), pp. 412-412

doi:10.1121/1.3437284

Author(s): Janet M. Baker, Robert Ramsey, Mark Miller, James K. Baker, Christopher Cooper

Keyword(s): Frequency Domain, Visual Displays, Connected Speech, Domain Information

Intelligibility/Severity Measurements of Prekindergarten Children’s Speech

American Journal of Speech-Language Pathology, 2000, Vol 9 (2), pp. 141-150

doi:10.1044/1058-0360.0902.141

Cited By ~ 60

Author(s): Mary Gordon-Brannan, Barbara Williams Hodson

Keyword(s): Univariate Analysis, Standard Measure, Continuous Speech, Connected Speech, Potential Indicator, Severe Group, The Mean, Post Hoc, Children's Speech, Listener Ratings

Intelligibility/severity measurements were obtained for 48 prekindergarten children with varying levels of phonological proficiency/deficiency. The measure used as the “standard” was percentage of words understood (i.e., orthographically transcribed correctly) in continuous speech in a known context by unfamiliar trained listeners. The children were divided into four groups based on the percentage of words understood from their continuous speech samples. The ranges of intelligibility for each group were: (a) 91–100% for children with “adult-like” speech; (b) 83–90% for children in the “mild” category; (c) 68–81% for children with moderate intelligibility/speech involvement; and (d) 16–63% for the 12 children in the “severe” (i.e., least intelligible) category. When the percentages of the children in the severe group were excluded, the range of the top three groups combined was 68–100% and the mean was 85%. For a child 4 years of age or older, any percentage of words understood in connected speech that falls below 66% (2 standard deviations below the mean) may be a potential indicator of speech difficulty. In addition, data were obtained from the 48 children to determine the correlations between the standard measure and the following intelligibility/severity measures: (a) imitated sentences, (b) imitated words, (c) listener ratings of intelligibility, and (d) phonological deviation averages. All five measures, including the standard measure, investigated in this study were strongly intercorrelated. Multiple regression analysis results yielded a prediction model that included listener ratings and imitated sentences measures. Results of multivariate analysis of variance (MANOVA), univariate analysis, and post-hoc Bonferroni tests indicated that differences between all pairs of groups were significant for the listener rating measure based on the continuous speech sample. For the percentage of words understood in continuous speech samples, the differences between all pairs of groups, except between the adult-like and mild groups, were also significant. The only group that differed significantly from the other three groups for all five measures was the severe group.
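
The reported ranges and the 66% cutoff can be written out as a small lookup. The sketch below treats the ranges as the observed spread of each group in this sample, not as diagnostic criteria, so values falling between the reported ranges are left unclassified. The standard deviation of 9.5 points is an inference from the stated mean (85%) and cutoff (66%, two standard deviations below), not a figure given in the abstract. All names are hypothetical.

```python
# Observed percentage-of-words-understood ranges reported for each group.
GROUP_RANGES = [
    ("adult-like", 91, 100),
    ("mild", 83, 90),
    ("moderate", 68, 81),
    ("severe", 16, 63),
]

MEAN_TOP_THREE = 85.0  # mean of the three most intelligible groups (reported)
CUTOFF = 66.0          # reported as 2 standard deviations below that mean

# Inferred, not reported: (85 - 66) / 2 = 9.5 percentage points.
implied_sd = (MEAN_TOP_THREE - CUTOFF) / 2


def intelligibility_group(percent_understood):
    """Return the group whose observed range contains the value, else None."""
    for name, low, high in GROUP_RANGES:
        if low <= percent_understood <= high:
            return name
    return None  # value lies in a gap between the observed ranges


def below_cutoff(percent_understood):
    """True if the value falls below the abstract's 66% indicator threshold."""
    return percent_understood < CUTOFF
```

For example, `intelligibility_group(82)` returns `None` because 82% falls between the reported moderate and mild ranges.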
