Using voice-quality measurements with prosodic and spectral features for speaker diarization

The effects of audio compression on voice quality measurements

The Journal of the Acoustical Society of America ◽

10.1121/10.0008576 ◽

2021 ◽

Vol 150 (4) ◽

pp. A356-A356

Author(s):

Jailyn M. Pena ◽

Alicia Mason ◽

Lisa Davidson

Keyword(s):

Voice Quality ◽

Audio Compression ◽

Quality Measurements

Influence of Data Acquisition Environment on Accuracy of Acoustic Voice Quality Measurements

Journal of Voice ◽

10.1016/j.jvoice.2004.07.012 ◽

2005 ◽

Vol 19 (2) ◽

pp. 176-186 ◽

Cited By ~ 37

Author(s):

Dimitar D. Deliyski ◽

Maegan K. Evans ◽

Heather S. Shaw

Keyword(s):

Data Acquisition ◽

Voice Quality ◽

Quality Measurements

A Longitudinal Ageing Analysis of Vocal Parameters of Singing Voice of Female Playback Singer

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.c5012.029320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 2050-2057

Keyword(s):

Rate Increase ◽

Voice Quality ◽

Rapid Change ◽

Spectral Features ◽

Singing Voice ◽

Acoustical Parameters ◽

Age Related ◽

Statistical Variations ◽

Singing Ability ◽

Singer Identification

Age-related changes to the vocal structure affect a singer's singing ability. We present a longitudinal study of vocal ageing in a female professional playback singer with a singing span of more than six decades (covering ages 19 to 80 years). The ageing analysis is performed on six vocal parameters, including fundamental frequency (F0), vibrato, formants, and spectral features such as spectral roll-off and centroid. Statistical variations in these vocal parameters over the singer's entire singing span are discussed in the paper. Significant effects noted with the ageing voice were a decrease in F0, decreased vocal range, a reduction in vibrato rate, an increase in vibrato extent, a decrease in the F2 and F4 formants, and rapid change in the spectral features. This investigation also studied the effect of ageing on singing voice quality through the measurement of the singing power ratio (SPR). An increase in SPR measures was observed with the ageing voice. Studies of the impact of vocal ageing on singer identification (SID) using longitudinal data are scarce. The SID experiments, performed with 350 a cappella songs covering the singer's entire singing span, showed that changes in acoustical parameters with ageing clearly affected the performance of singer identification systems.

Comparing Two Methods for Reducing Variability in Voice Quality Measurements

Journal of Speech Language and Hearing Research ◽

10.1044/1092-4388(2010/10-0083) ◽

2011 ◽

Vol 54 (3) ◽

pp. 803-812 ◽

Cited By ~ 12

Author(s):

Jody Kreiman ◽

Bruce R. Gerratt

Keyword(s):

Voice Quality ◽

Quality Measurements

Tense-Lax Vowel Classification with Energy Trajectory and Voice Quality Measurements

IEICE Transactions on Information and Systems ◽

10.1587/transinf.e95.d.884 ◽

2012 ◽

Vol E95-D (3) ◽

pp. 884-887

Author(s):

Suk-Myung LEE ◽

Jeung-Yoon CHOI

Keyword(s):

Voice Quality ◽

Quality Measurements

Perceptual Voice Quality Measurements for Wireless Networks

Handbook of Research on Wireless Multimedia ◽

10.4018/978-1-59904-820-8.ch011 ◽

2010 ◽

pp. 274-295

Author(s):

Dorel Picovici ◽

John Nelson

Keyword(s):

Voice Quality ◽

Quality Measurement ◽

Predictive Modelling ◽

Test Case ◽

Case Scenario ◽

Perceptual Model ◽

Voice Signal ◽

Subjective Testing ◽

Objective Quantification ◽

Quality Measurements

Perceptual voice quality measurement can be defined as an objective quantification of an overall impression of the perceived stimulus. An alternative to laborious subjective testing is objective predictive modelling, which employs a perceptual model of the human auditory and cognitive system to predict the human response to a voice signal in terms of its quality. This chapter describes subjective and automated objective testing methods, and provides a test case scenario for measuring voice quality.

Modulation spectral features for objective voice quality assessment

2010 4th International Symposium on Communications, Control and Signal Processing (ISCCSP) ◽

10.1109/isccsp.2010.5463313 ◽

2010 ◽

Cited By ~ 4

Author(s):

Maria Markaki ◽

Yannis Stylianou

Keyword(s):

Quality Assessment ◽

Voice Quality ◽

Spectral Features

Adverse Effects of Environmental Noise on Acoustic Voice Quality Measurements

Journal of Voice ◽

10.1016/j.jvoice.2004.07.003 ◽

2005 ◽

Vol 19 (1) ◽

pp. 15-28 ◽

Cited By ~ 76

Author(s):

Dimitar D. Deliyski ◽

Heather S. Shaw ◽

Maegan K. Evans

Keyword(s):

Adverse Effects ◽

Voice Quality ◽

Environmental Noise ◽

Quality Measurements

Age Norms for Auditory-Perceptual Neurophonetic Parameters: A Prerequisite for the Assessment of Childhood Dysarthria

Journal of Speech Language and Hearing Research ◽

10.1044/2020_jslhr-19-00114 ◽

2020 ◽

Vol 63 (4) ◽

pp. 1071-1082

Author(s):

Theresa Schölderle ◽

Elisabet Haas ◽

Wolfram Ziegler

Keyword(s):

Assessment Tool ◽

Developmental Trajectories ◽

Voice Quality ◽

Typically Developing ◽

Typically Developing Children ◽

Age Norms ◽

Elementary School Age ◽

Speech Characteristics ◽

Substantial Progress ◽

Computer Based

Purpose: The aim of this study was to collect auditory-perceptual data on established symptom categories of dysarthria from typically developing children between 3 and 9 years of age, for the purpose of creating age norms for dysarthria assessment. Method: One hundred forty-four typically developing children (3;0–9;11 [years;months], 72 girls and 72 boys) participated. We used a computer-based game specifically designed for this study to elicit sentence repetitions and spontaneous speech samples. Speech recordings were analyzed using the auditory-perceptual criteria of the Bogenhausen Dysarthria Scales, a standardized German assessment tool for dysarthria in adults. The Bogenhausen Dysarthria Scales (scales and features) cover clinically relevant dimensions of speech and allow for an evaluation of well-established symptom categories of dysarthria. Results: The typically developing children exhibited a number of speech characteristics overlapping with established symptom categories of dysarthria (e.g., breathy voice, frequent inspirations, reduced articulatory precision, decreased articulation rate). Substantial progress was observed between 3 and 9 years of age, but with different developmental trajectories across different dimensions. In several areas (e.g., respiration, voice quality), 9-year-olds still presented with salient developmental speech characteristics, while in other dimensions (e.g., prosodic modulation), features typically associated with dysarthria occurred only exceptionally, even in the 3-year-olds. Conclusions: The acquisition of speech motor functions is a prolonged process that is not yet complete at 9 years of age. Various developmental influences (e.g., anatomic–physiological changes) shape children's speech in specific ways. Our findings are a first step toward establishing auditory-perceptual norms for dysarthria in children of kindergarten and elementary school age. Supplemental Material: https://doi.org/10.23641/asha.12133380

Evaluation of Acoustic Analyses of Voice in Nonoptimized Conditions

Journal of Speech Language and Hearing Research ◽

10.1044/2020_jslhr-20-00212 ◽

2020 ◽

Vol 63 (12) ◽

pp. 3991-3999

Author(s):

Benjamin van der Woerd ◽

Min Wu ◽

Vijay Parsa ◽

Philip C. Doyle ◽

Kevin Fung

Keyword(s):

Repeated Measures ◽

Voice Quality ◽

Data Sets ◽

Acoustic Measurements ◽

Sample Collection ◽

Experimental Conditions ◽

Environment Analysis ◽

Acoustic Measures ◽

Recording Conditions ◽

Cepstral Peak Prominence

Objectives: This study aimed to evaluate the influence of a smartphone microphone and recording environment on the fidelity and accuracy of acoustic measurements of voice. Method: A prospective cohort proof-of-concept study. Two sets of prerecorded samples, (a) sustained vowels (/a/) and (b) a Rainbow Passage sentence, were played for recording via the internal iPhone microphone and the Blue Yeti USB microphone in two recording environments: a sound-treated booth and a quiet office setting. Recordings were presented using a calibrated mannequin speaker with a fixed signal intensity (69 dBA) at a fixed distance (15 in.). Each set of recordings (iPhone/audio booth, Blue Yeti/audio booth, iPhone/office, and Blue Yeti/office) was time-windowed to ensure the same signal was evaluated for each condition. Acoustic measures of voice, including fundamental frequency (fo), jitter, shimmer, harmonic-to-noise ratio (HNR), and cepstral peak prominence (CPP), were generated using a widely used analysis program (Praat Version 6.0.50). The data gathered were compared using a repeated measures analysis of variance. Two separate data sets were used. The set of vowel samples included both pathologic (n = 10) and normal (n = 10), male (n = 5) and female (n = 15) speakers. The set of sentence stimuli ranged in perceived voice quality from normal to severely disordered, with an equal number of male (n = 12) and female (n = 12) speakers evaluated. Results: The vowel analyses indicated that jitter, shimmer, HNR, and CPP were significantly different based on microphone choice, and that shimmer, HNR, and CPP were significantly different based on the recording environment. Analysis of sentences revealed a statistically significant impact of recording environment and microphone type on HNR and CPP. While statistically significant, the differences across the experimental conditions for a subset of the acoustic measures (viz., jitter and CPP) fell within their respective normative ranges. Conclusions: Both microphone and recording setting resulted in significant differences across several acoustic measurements. However, a subset of the acoustic measures that were statistically significant across the recording conditions showed small overall differences that are unlikely to have clinical significance in interpretation. For these acoustic measures, the present data suggest that, although a sound-treated setting is ideal for voice sample collection, a smartphone microphone can capture acceptable recordings for acoustic signal analysis.
