Modification of Listener-Judged Naturalness in the Speech of Stutterers

1985 ◽  
Vol 28 (4) ◽  
pp. 495-504 ◽  
Author(s):  
Roger J. Ingham ◽  
Richard R. Martin ◽  
Sam K. Haroldson ◽  
Mark Onslow ◽  
Miriam Leney

This study investigated the effect of regular feedback of listener-judged speech naturalness ratings on the speech of stutterers. Six adult stutterers each participated in a time-series ABA experiment. During the treatment phase the stutterer was instructed to improve a clinician's rating, on a 9-point scale, of the naturalness of each 30-s interval of the stutterer's spontaneous speech. The results indicate that the naturalness ratings and stuttering for 5 of the subjects made favorable changes during the treatment phase. Analyses of the findings show that only some of the naturalness judgments were influenced by stuttering frequency and speech rate. A perceptual analysis of the speech of 2 subjects suggested that the speech naturalness ratings were also probably influenced by other less obvious variables.

2020 ◽  
Vol 63 (7) ◽  
pp. 2054-2069
Author(s):  
Brandon Merritt ◽  
Tessa Bent

Purpose: The purpose of this study was to investigate how speech naturalness relates to masculinity–femininity and gender identification (accuracy and reaction time) for cisgender male and female speakers as well as transmasculine and transfeminine speakers. Method: Stimuli included spontaneous speech samples from 20 speakers who are transgender (10 transmasculine and 10 transfeminine) and 20 speakers who are cisgender (10 male and 10 female). Fifty-two listeners completed three tasks: a two-alternative forced-choice gender identification task, a speech naturalness rating task, and a masculinity/femininity rating task. Results: Transfeminine and transmasculine speakers were rated as significantly less natural sounding than cisgender speakers. Speakers rated as less natural took longer to identify and were identified less accurately in the gender identification task; furthermore, they were rated as less prototypically masculine/feminine. Conclusions: Perceptual speech naturalness for both transfeminine and transmasculine speakers is strongly associated with gender cues in spontaneous speech. Training to align a speaker's voice with their gender identity may concurrently improve perceptual speech naturalness. Supplemental Material: https://doi.org/10.23641/asha.12543158


Target ◽  
2020 ◽  
Vol 32 (3) ◽  
pp. 482-506 ◽  
Author(s):  
Judit Bóna ◽  
Mária Bakti

Abstract: This paper investigates how variation in the complexity of speech tasks is reflected in the temporal characteristics and disfluency patterns of speech. We examined temporal characteristics (speech rate, global articulation rate, ratio of pauses, frequency of pauses, and mean duration of pauses) and disfluency markers (overall frequency of disfluencies; frequency of filled pauses, filler words, whole-word repetitions, part-word repetitions, broken words, prolonged sounds, and revisions; frequency of disfluency clusters) in four speech production tasks (consecutive interpreting, sight translation, spontaneous speech and extemporaneous speech) with twelve speakers. Our hypothesis, according to which the examined parameters will differ across the four tasks, was partly confirmed by the data; even though not all speech tasks differed significantly in all the examined parameters, our investigation revealed that there were significant differences between some tasks in four parameters, and between others in nine out of the fourteen parameters examined. Our data also suggest that in terms of the temporal characteristics and disfluency markers examined, the four tasks can be represented on a continuum based on the cognitive load associated with each task. At one end of the continuum and generating the least cognitive load is spontaneous speech, and at the other, generating the most cognitive load, is sight translation.
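The temporal characteristics listed in this abstract can be illustrated with a minimal sketch. The abstract does not give formulas, so the definitions below are common phonetic conventions assumed for illustration only: rates in syllables per second, speech rate computed over total time including pauses, and articulation rate computed over speaking time with pauses excluded. The function name `temporal_measures` and its parameters are hypothetical and do not come from the paper.

```python
def temporal_measures(n_syllables, total_duration_s, pause_durations_s):
    """Compute five temporal measures for one speech sample.

    Assumed definitions (not taken from the paper):
      speech rate       = syllables / total time (pauses included)
      articulation rate = syllables / (total time - total pause time)
      pause ratio       = total pause time / total time
      pause frequency   = number of pauses per minute of total time
      mean pause dur.   = total pause time / number of pauses
    """
    total_pause = sum(pause_durations_s)
    if total_duration_s <= 0:
        raise ValueError("total duration must be positive")
    if total_pause >= total_duration_s:
        raise ValueError("pauses cannot fill the whole sample")

    n_pauses = len(pause_durations_s)
    return {
        "speech_rate": n_syllables / total_duration_s,
        "articulation_rate": n_syllables / (total_duration_s - total_pause),
        "pause_ratio": total_pause / total_duration_s,
        "pause_frequency_per_min": 60.0 * n_pauses / total_duration_s,
        "mean_pause_duration": total_pause / n_pauses if n_pauses else 0.0,
    }


# Hypothetical sample: 120 syllables in 60 s, with three pauses.
measures = temporal_measures(120, 60.0, [2.0, 4.0, 6.0])
```

Under these assumed definitions, articulation rate is never lower than speech rate, because pause time is removed from the denominator.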


2021 ◽  
Vol 13 (8) ◽  
pp. 3995-4017
Author(s):  
Minghan Cheng ◽  
Xiyun Jiao ◽  
Binbin Li ◽  
Xun Yu ◽  
Mingchao Shao ◽  
...  

Abstract. Satellite observations of evapotranspiration (ET) have been widely used for water resources management in China. An accurate ET product with a high spatiotemporal resolution is required for research on drought stress and water resources management. However, such a product is currently lacking. Moreover, the performances of different ET estimation algorithms for China have not been clearly studied, especially under different environmental conditions. Therefore, the aims of this study were as follows: (1) to use multisource images to generate a long-time-series (2001–2018) daily ET product with a spatial resolution of 1 km × 1 km based on the Surface Energy Balance Algorithm for Land (SEBAL); (2) to comprehensively evaluate the performance of the SEBAL ET in China using flux observational data and hydrological observational data; and (3) to compare the performance of the SEBAL ET with the MOD16 ET product at the point scale and basin scale under different environmental conditions in China. At the point scale, both models performed best under forest cover, in subtropical zones, in hilly terrain, and in summer, and SEBAL performed better in most conditions. In general, the accuracy of the SEBAL ET (rRMSE = 44.91 %) was slightly higher than that of the MOD16 ET (rRMSE = 48.72 %). In the basin-scale validation, both models performed better than in the point-scale validation, with SEBAL (rRMSE = 13.57 %) outperforming MOD16 (rRMSE = 32.84 %). Additionally, both models showed a negative bias, with the bias of the MOD16 ET being higher than that of the SEBAL ET. In the daily-scale validation, the SEBAL ET product showed a root mean square error (RMSE) of 0.92 mm d⁻¹ and an r value of 0.79. In general, the SEBAL ET product can be used for qualitative analysis and most quantitative analyses of regional ET.
The SEBAL ET product is freely available at https://doi.org/10.5281/zenodo.4243988 and https://doi.org/10.5281/zenodo.4896147 (Cheng, 2020a, b). The results of this study can provide a reference for the application of remotely sensed ET products and the improvement of satellite ET observation algorithms.
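The validation statistics quoted in this abstract (bias, RMSE, rRMSE, and r) can be sketched as follows. The abstract does not state the exact formulas, so this sketch assumes common conventions: bias is the mean of estimated minus observed values, rRMSE is RMSE divided by the mean observed value and expressed as a percentage, and r is the Pearson correlation coefficient. The function name `validation_metrics` and the sample values are hypothetical.

```python
import math


def validation_metrics(observed, estimated):
    """Compare estimated ET against observed ET (same units, e.g. mm/day).

    Assumed definitions (not taken from the paper):
      bias  = mean(estimated - observed)
      RMSE  = sqrt(mean((estimated - observed)^2))
      rRMSE = 100 * RMSE / mean(observed), in percent
      r     = Pearson correlation coefficient
    """
    if len(observed) != len(estimated) or len(observed) < 2:
        raise ValueError("need two equal-length series with at least 2 values")

    n = len(observed)
    mean_obs = sum(observed) / n
    mean_est = sum(estimated) / n

    errors = [e - o for o, e in zip(observed, estimated)]
    bias = sum(errors) / n
    rmse = math.sqrt(sum(d * d for d in errors) / n)
    rrmse = 100.0 * rmse / mean_obs

    # Pearson r from centered sums of products and squares.
    cov = sum((o - mean_obs) * (e - mean_est) for o, e in zip(observed, estimated))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_est = sum((e - mean_est) ** 2 for e in estimated)
    r = cov / math.sqrt(var_obs * var_est)

    return {"bias": bias, "rmse": rmse, "rrmse_percent": rrmse, "r": r}


# Hypothetical example: the estimate is consistently 1 mm/day too low.
stats = validation_metrics([2.0, 4.0, 6.0], [1.0, 3.0, 5.0])
```

In this made-up example the estimate tracks the observations perfectly in shape (r equal to 1) while still carrying a negative bias, which shows why correlation and bias are reported as separate statistics.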


2019 ◽  
Vol 41 (1) ◽  
Author(s):  
James Tanner ◽  
Morgan Sonderegger ◽  
Jane Stuart-Smith ◽  
SPADE Data Consortium

The ‘voicing effect’ – the durational difference in vowels preceding voiced and voiceless consonants – is a well-documented phenomenon in English, where it plays a key role in the production and perception of the English final voicing contrast. Despite this supposed importance, little is known as to how robust this effect is in spontaneous connected speech, which is itself subject to a range of linguistic factors. Similarly, little attention has focused on variability in the voicing effect across dialects of English, bar analysis of specific varieties. Our findings show that the voicing of the following consonant exhibits a weaker-than-expected effect in spontaneous speech, interacting with manner, vowel height, speech rate, and word frequency. English dialects appear to demonstrate a continuum of potential voicing effect sizes, where varieties with dialect-specific phonological rules exhibit the most extreme values. The results suggest that the voicing effect in English is both substantially weaker than previously assumed in spontaneous connected speech, and subject to a wide range of dialectal variability.


2014 ◽  
Author(s):  
Miguel Oliveira Jr ◽  
Ayane Nazarela Santos De Almeida ◽  
René Alain Santana De Almeida ◽  
Ebson Wilkerson Silva

2018 ◽  
Vol 26 (4) ◽  
pp. 1647 ◽  
Author(s):  
Alessandro Panunzi ◽  
Valentina Saccone

Abstract: This work presents a pilot study for a prosodic analysis of two different structures in spoken Italian within the theoretical framework of the Language into Act Theory (L-AcT): (i) chains of two or more Bound Comments (COB) that do not form a compositional informative and prosodic unit; (ii) compositional Information Units formed by two or more Multiple Comments (CMM), linked together by a conventional prosodic model that implements specific meta-illocutive structures. This work analyzes COBs and CMMs from the DB-IPIC Italian Minicorpus. Different prosodic cues are taken into account: f0 reset, pauses, final lengthening, intensity lowering and initial rush. The distinctive feature of COBs is a flat f0 trend before the boundary, with a low number of f0 resets, while CMMs vary between different f0 shapes. Vowel lengthening and an unrushed speech rate together contribute to the perception that one COB is prolonged into the next. Initial rush is a characteristic feature of CMMs, while lengthening of the last vowel of the unit is easier to find at the end of a COB than in a CMM.
Keywords: prosody; spontaneous speech segmentation; non-terminal breaks; L-AcT.


1998 ◽  
Vol 41 (1) ◽  
pp. 5-17 ◽  
Author(s):  
Nicholas Schiavetti ◽  
Robert L. Whitehead ◽  
Brenda Whitehead ◽  
Dale Evan Metz

This study investigated the effect of fingerspelling task length on temporal characteristics and perceived naturalness of speech produced during simultaneous communication. Stimulus words at four levels of fingerspelling task length were embedded in a sentence that was spoken and produced with simultaneous communication. Five temporal measures were calculated from acoustic recordings, and perceived speech naturalness was rated by a panel of listeners using a 9-point scale. Results indicated significant differences in temporal measures and naturalness ratings between the speech and simultaneous communication conditions and among levels of fingerspelling task length. Speech produced during simultaneous communication was rated as less natural and demonstrated increased interword interval, diphthong, word, and sentence durations. Regression analysis indicated significant correlations between temporal measures and perceived speech naturalness, and analysis of variance showed significant increases in segmental and interword interval durations and perceived speech unnaturalness as fingerspelling task length increased. These results are discussed in relation to previous findings regarding production and perception characteristics of speech that is altered in temporal parameters by a variety of conditions.

