Consistency and Reliability of Voice Quality Ratings for Different Types of Speech Fragments

1994 ◽  
Vol 37 (5) ◽  
pp. 985-1000 ◽  
Author(s):  
Guus de Krom

This study describes a perception experiment in which listeners were asked to rate voice fragments obtained from a variety of speakers on grade, breathiness, and roughness. Four different types of stimuli were presented to each listener. One type of stimulus was based on connected speech fragments; the other three were based on different segments of a sustained vowel, yielding a 200 msec vowel onset stimulus, a 200 msec post-onset stimulus, and a 1000 msec whole vowel stimulus. Analyses focused on the consistency and reliability of grade, roughness, and breathiness ratings. Results indicated that stimulus type had virtually no effect on either within- or between-listener consistency of the grade, breathiness, or roughness ratings. Rating reliability too was hardly influenced by stimulus type. When determined as a function of the overall degree of deviance of a voice, the reliability of breathiness and roughness ratings was slightly higher for whole vowel and vowel onset stimuli than for connected speech and post-onset stimuli. It is concluded that connected speech stimuli are not necessarily to be preferred over vowel-type stimuli for a perceptual evaluation of grade, roughness, or breathiness. The somewhat higher reliability of ratings on vowel onset and whole vowel stimuli as compared to the post-onset stimuli is taken as an indication that the onset part of a vowel may contain voice quality cues that are less salient in the most stable part of a vowel.

2016 ◽  
Vol 25 (4) ◽  
pp. 561-575 ◽  
Author(s):  
Paul M. Evitts ◽  
Heather Starmer ◽  
Kristine Teets ◽  
Christen Montgomery ◽  
Lauren Calhoun ◽  
...  

Purpose There is currently minimal information on the impact of dysphonia secondary to phonotrauma on listeners. Considering the high incidence of voice disorders among professional voice users, it is important to understand the impact of a dysphonic voice on their audiences. Methods Ninety-one healthy listeners (39 men, 52 women; mean age = 23.62 years) were presented with speech stimuli from 5 healthy speakers and 5 speakers diagnosed with dysphonia secondary to phonotrauma. Dependent variables included processing speed (reaction time [RT] ratio), speech intelligibility, and listener comprehension. Voice quality ratings were also obtained for all speakers by 3 expert listeners. Results Statistical results showed significant differences in RT ratio and number of speech intelligibility errors between healthy and dysphonic voices. There was not a significant difference in listener comprehension errors. Multiple regression analyses showed that voice quality ratings from the Consensus Auditory-Perceptual Evaluation of Voice (Kempster, Gerratt, Verdolini Abbott, Barkmeier-Kraemer, & Hillman, 2009) were able to predict RT ratio and speech intelligibility but not listener comprehension. Conclusions Results of the study suggest that although listeners require more time to process and have more intelligibility errors when presented with speech stimuli from speakers with dysphonia secondary to phonotrauma, listener comprehension may not be affected.


2015 ◽  
Vol 29 (6) ◽  
pp. 776.e7-776.e14 ◽  
Author(s):  
Lilia Brinca ◽  
Ana Paula Batista ◽  
Ana Inês Tavares ◽  
Patrícia N. Pinto ◽  
Lara Araújo

1995 ◽  
Vol 38 (4) ◽  
pp. 794-811 ◽  
Author(s):  
Guus de Krom

This study deals with the relation between listeners' ratings of pathological breathiness and roughness and certain characteristics of the voice spectrum. Two general research questions were addressed: First, which spectral parameters may serve as useful predictors of breathiness and roughness? Second, does the type of speech fragment used for analysis have an effect on the obtained regression model? Listener ratings of breathiness and roughness were obtained for three types of vowel fragments: a vowel onset segment, a mid-vowel (post-onset) segment, and a vowel segment covering the onset and the acoustically more stable post-onset parts. Results indicated that the harmonics-to-noise ratio was the best single predictor of both rated breathiness and roughness, explaining up to 54% of the true rating variance. By combining different predictors, between 75% and 80% of the breathiness variance could be explained for all three types of fragments. For roughness, a strong effect of fragment type was observed, with most variance explained in vowel onset fragments (71%), and least in post-onset fragments (52%). The effect of fragment type was also observed when regression analyses were performed with six predictors based on a factor analysis of the acoustic data.


Foods ◽  
2021 ◽  
Vol 10 (7) ◽  
pp. 1546
Author(s):  
Tomoko Hasegawa ◽  
Nobuyuki Sakai

In Japan, as in other countries, the externalization of food preparation is increasing. Japanese people are interested in the combination of food and tableware and they are concerned about transferring ready-made meals from plastic containers to natural tableware. This study aimed to examine the varying evaluations of meals due to differences in tableware. In this study, we investigated the effect of tableware on meal satisfaction, which is emphasized in Japanese culture. We studied the difference in the evaluation of ready-made meals (a rice ball, salad, croquette, and corn soup) before, during, and after a meal under two conditions: plastic tableware and natural wooden tableware. The results showed that there was no difference in the perceptual evaluation of taste and texture during the meal, except for the color of the salad and the temperature of the soup. On the other hand, meals served on natural wooden tableware were rated more positively than those served on plastic tableware before and after meals. These results suggest that, in Japan, the use of tableware, even for ready-made meals, increases the level of meal satisfaction. These findings have implications for both the providers and consumers of ready-made meals as well as the food industry.


2015 ◽  
Vol 2015 ◽  
pp. 1-11 ◽  
Author(s):  
Tino Haderlein ◽  
Cornelia Schwemmle ◽  
Michael Döllinger ◽  
Václav Matoušek ◽  
Martin Ptok ◽  
...  

Due to low intra- and interrater reliability, perceptual voice evaluation should be supported by objective, automatic methods. In this study, text-based, computer-aided prosodic analysis and measurements of connected speech were combined in order to model perceptual evaluation of the German Roughness-Breathiness-Hoarseness (RBH) scheme. 58 connected speech samples (43 women and 15 men; 48.7 ± 17.8 years) containing the German version of the text “The North Wind and the Sun” were evaluated perceptually by 19 speech and voice therapy students according to the RBH scale. For the human-machine correlation, Support Vector Regression with measurements of the vocal fold cycle irregularities (CFx) and the closed phases of vocal fold vibration (CQx) of the Laryngograph and 33 features from a prosodic analysis module were used to model the listeners’ ratings. The best human-machine results for roughness were obtained from a combination of six prosodic features and CFx (r = 0.71, ρ = 0.57). These correlations were approximately the same as the interrater agreement among human raters (r = 0.65, ρ = 0.61). CQx was one of the substantial features of the hoarseness model. For hoarseness and breathiness, the human-machine agreement was substantially lower. Nevertheless, the automatic analysis method can serve as the basis for a meaningful objective support for perceptual analysis.


Author(s):  
Alice Crochiquia ◽  
Anders Eriksson ◽  
Mario A. S. Fontes ◽  
Sandra Madureira

ABSTRACT This work comprises an experimental investigation approach of expressive speech that integrates methodological procedures of perceptual and acoustic analyses. As the object of this work, we have focused on voice quality and vocal dynamics. Speech samples from the four main personality-distinct characters in the animated feature film “Zootopia” dubbed by Brazilian voice actors have been analysed. Due to the expressive function of voice quality, we have posed the following question: what types of voice quality and vocal dynamics settings were used by the voice actors in the Brazilian dubbing of “Zootopia” to compose the vocal profiles of the characters? Perceptual evaluation of the 54 speech stimuli was performed using the Vocal Profile Analysis protocol (Laver & Mackenzie Beck, 2007). Acoustic measures were automatically extracted using the ExpressionEvaluator script (Barbosa, 2008) for PRAAT. The profiles for each of the four characters were composed based on the psychological traits described in the film script. The results of the acoustic analysis, the perceptual analysis of voice quality and vocal dynamics settings were correlated using the MFA (Multiple Factor Analysis) method in the R environment based on 40 variables (quantitative and qualitative) and it turned out that the speech stimuli were distributed in 6 clusters according to the variables analysed. The quantitative variables that presented the highest correlation percentage were: Standard Deviation of f0 Derivative, Standard Deviation of Spectral Tilt, f0 Median. The qualitative variables that presented the highest correlation percentage were: Lowered Larynx, Lip Rounding, Breathy Voice and Minimised Pitch Range. The research has presented evidence in favor of the symbolic use of phonic matter and contributions to the understanding of how vocal stereotypes are established.


2020 ◽  
Vol 63 (12) ◽  
pp. 3974-3981
Author(s):  
Ashwini Joshi ◽  
Isha Baheti ◽  
Vrushali Angadi

Aim The purpose of this study was to develop and assess the reliability of a Hindi version of the Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V). Reliability was assessed by comparing Hindi CAPE-V ratings with English CAPE-V ratings and by the Grade, Roughness, Breathiness, Asthenia and Strain (GRBAS) scale. Method Hindi sentences were created to match the phonemic load of the corresponding English CAPE-V sentences. The Hindi sentences were adapted for linguistic content. The original English and adapted Hindi CAPE-V and GRBAS were completed for 33 bilingual individuals with normal voice quality. Additionally, the Hindi CAPE-V and GRBAS were completed for 13 Hindi speakers with disordered voice quality. The agreement of CAPE-V ratings was assessed between language versions, GRBAS ratings, and two rater pairs (three raters in total). Pearson product–moment correlation was completed for all comparisons. Results A strong correlation (r > .8, p < .01) was found between the Hindi CAPE-V scores and the English CAPE-V scores for most variables in normal voice participants. A weak correlation was found for the variable of strain (r < .2, p = .400) in the normative group. A strong correlation (r > .6, p < .01) was found between the overall severity/grade, roughness, and breathiness scores in the GRBAS scale and the CAPE-V scale in normal and disordered voice samples. Significant interrater reliability (r > .75) was present in overall severity and breathiness. Conclusions The Hindi version of the CAPE-V demonstrates good interrater reliability and concurrent validity with the English CAPE-V and the GRBAS. The Hindi CAPE-V can be used for the auditory-perceptual voice assessment of Hindi speakers.


2017 ◽  
Vol 23 (1) ◽  
pp. 1-20
Author(s):  
Kathy Connaughton ◽  
Irena Yanushevskaya

Objective: This study explores the immediate impact of prolonged voice use by professional sports coaches. Method: Speech samples including sustained phonation of vowel /a/ and a short read passage were collected from two professional sports coaches. The audio recordings were made within an hour before and after a coaching session, over three sessions. Perceptual evaluation of voice quality was done using the GRBAS scale. The speech samples were subsequently analyzed using Praat. The acoustic measures included fundamental frequency (f0), jitter, shimmer, Harmonics-to-Noise ratio and Cepstral Peak Prominence. Main results: The results of perceptual and acoustic analysis suggest a slight shift towards a tenser phonation post-coaching session, which is a likely consequence of laryngeal muscle adaptation to prolonged voice use. This tendency was similar in sustained vowels and connected speech. Conclusion: Acoustic measures used in this study can be useful to capture the voice change post-coaching session. It is desirable, however, that more sophisticated and robust and at the same time intuitive and easy-to-use tools for voice assessment and monitoring be made available to clinicians and professional voice users.


Author(s):  
Hyeck Soo Son ◽  
Jung Min Lee ◽  
Ramin Khoramnia ◽  
Chul Young Choi

Abstract Purpose To analyse and compare the surface topography and roughness of three different types of diffractive multifocal IOLs. Methods Using scanning electron microscope (SEM, Inspect F, 5.0 KV, maximum magnification up to 20,000) and atomic force microscope (AFM, Park Systems, XE-100, non-contact, area profile comparison, 10 × 10 µm, 40 × 40 µm), the surface quality of the following diffractive IOLs was studied: the AcrySof IQ PanOptix (Alcon, USA), the AT LARA 829MP (Carl Zeiss Meditec, Germany), and Tecnis Symfony (Johnson&Johnson Vision, USA). The measurements were made over three representative areas (central non-diffractive optic, central diffractive optic, and diffractive step) of each IOL. Roughness profile in terms of mean arithmetic roughness (Ra) and root-mean-squared roughness (Rq) values were obtained and compared statistically. Results In SEM examination, all IOLs showed a smooth optical surface without any irregularities at low magnification. At higher magnification, Tecnis Symfony showed unique highly regular, concentric, and lineate structures in the diffractive optic area which could not be seen in the other studied diffractive IOLs. The differences in the measured Ra and Rq values of the Tecnis Symfony were statistically significant compared to the other models (p < 0.05). Conclusion Various different topographical traits were observed in three diffractive multifocal IOLs. The Ra values of all studied IOLs were within an acceptable range. Tecnis Symfony showed statistically significant higher surface Ra values at both central diffractive optic and diffractive step areas. Furthermore, compared to its counterparts, Tecnis Symfony demonstrated highly ordered, concentric pattern in its diffractive surfaces.
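
For readers unfamiliar with the two roughness parameters named in this abstract, the following is a minimal sketch of their standard definitions: Ra is the mean absolute deviation of the surface height from its mean line, and Rq is the root-mean-square deviation. The function name and the height values are hypothetical illustrations, not data or code from the study, and the sketch does not reproduce the AFM software's processing (levelling, filtering, area averaging).

```python
import math

def roughness(heights):
    """Return (Ra, Rq) for a sequence of surface heights (hypothetical helper)."""
    n = len(heights)
    mean = sum(heights) / n
    # Deviations of each sample from the mean line.
    deviations = [z - mean for z in heights]
    ra = sum(abs(d) for d in deviations) / n            # mean arithmetic roughness
    rq = math.sqrt(sum(d * d for d in deviations) / n)  # root-mean-squared roughness
    return ra, rq

# Made-up height profile in nanometres, for illustration only.
profile_nm = [1.0, -1.0, 2.0, -2.0]
ra, rq = roughness(profile_nm)
print(f"Ra = {ra:.3f} nm, Rq = {rq:.3f} nm")
```

Because squaring weights large deviations more heavily, Rq is never smaller than Ra for the same profile, which is why the two values are often reported together.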


Author(s):  
Seung Wan Hong ◽  
Tae Won Kim ◽  
Jae Hun Kim

Abstract Physicians and nurses stand with their back towards the C-arm fluoroscope when using the computer, taking things out of closets and preparing drugs for injection or instruments for intervention. This study was conducted to investigate the relationship between the type of lead apron and radiation exposure to the backs of physicians and nurses while using C-arm fluoroscopy. We compared radiation exposure to the back in the three groups: no lead apron (group C), front coverage type (group F) and wrap-around type (group W). The other wrap-around type apron was put on the bed instead of on a patient. We ran C-arm fluoroscopy 40 times for each measurement. We collected the air kerma (AK), exposure time (ET) and effective dose (ED) of the bedside table, upper part and lower part of apron. We measured these variables 30 times for each location. In group F, ED of the upper part was the highest (p < 0.001). ED of the lower part in group C and F was higher than that in group W (p = 0.012). The radiation exposure with a front coverage type apron is higher than that of the wrap-around type and even no apron at the neck or thyroid. For reducing radiation exposure to the back of physician or nurse, the wrap-around type apron is recommended. This type of apron can reduce radiation to the back when the physician turns away from the patient or C-arm fluoroscopy.

