Prosodic and phonetic features for speaker clustering in speaker diarization systems

Author(s):  
Janez Žibert ◽  
France Mihelič
2019 ◽  
Vol 62 (11) ◽  
pp. 4001-4014
Author(s):  
Melanie Weirich ◽  
Adrian Simpson

Purpose The study sets out to investigate inter- and intraspeaker variation in German infant-directed speech (IDS) and considers the potential impact that the factors gender, parental involvement, and speech material (read vs. spontaneous speech) may have. In addition, we analyze data from 3 time points prior to and after the birth of the child to examine potential changes in the features of IDS and, particularly also, of adult-directed speech (ADS). Here, the gender identity of a speaker is considered as an additional factor. Method IDS and ADS data from 34 participants (15 mothers, 19 fathers) is gathered by means of a reading and a picture description task. For IDS, 2 recordings were made when the baby was approximately 6 and 9 months old, respectively. For ADS, an additional recording was made before the baby was born. Phonetic analyses comprise mean fundamental frequency (f0), variation in f0, the 1st 2 formants measured in /i: ɛ a u:/, and the vowel space size. Moreover, social and behavioral data were gathered regarding parental involvement and gender identity. Results German IDS is characterized by an increase in mean f0, a larger variation in f0, vowel- and formant-specific differences, and a larger acoustic vowel space. No effect of gender or parental involvement was found. Also, the phonetic features of IDS were found in both spontaneous and read speech. Regarding ADS, changes in vowel space size in some of the fathers and in mean f0 in mothers were found. Conclusion Phonetic features of German IDS are robust with respect to the factors gender, parental involvement, speech material (read vs. spontaneous speech), and time. Some phonetic features of ADS changed within the child's first year depending on gender and parental involvement/gender identity. Thus, further research on IDS needs to address also potential changes in ADS.


2019 ◽  
Vol 28 (4) ◽  
pp. 1411-1431 ◽  
Author(s):  
Lauren Bislick ◽  
William D. Hula

Purpose This retrospective analysis examined group differences in error rate across 4 contextual variables (clusters vs. singletons, syllable position, number of syllables, and articulatory phonetic features) in adults with apraxia of speech (AOS) and adults with aphasia only. Group differences in the distribution of error type across contextual variables were also examined. Method Ten individuals with acquired AOS and aphasia and 11 individuals with aphasia participated in this study. In the context of a 2-group experimental design, the influence of 4 contextual variables on error rate and error type distribution was examined via repetition of 29 multisyllabic words. Error rates were analyzed using Bayesian methods, whereas distribution of error type was examined via descriptive statistics. Results There were 4 findings of robust differences between the 2 groups. These differences were found for syllable position, number of syllables, manner of articulation, and voicing. Group differences were less robust for clusters versus singletons and place of articulation. Results of error type distribution show a high proportion of distortion and substitution errors in speakers with AOS and a high proportion of substitution and omission errors in speakers with aphasia. Conclusion Findings add to the continued effort to improve the understanding and assessment of AOS and aphasia. Several contextual variables more consistently influenced breakdown in participants with AOS compared to participants with aphasia and should be considered during the diagnostic process. Supplemental Material https://doi.org/10.23641/asha.9701690


1974 ◽  
Vol 17 (3) ◽  
pp. 417-425
Author(s):  
Stuart I. Ritterman ◽  
Nancy C. Freeman

Thirty-two college students were required to learn the relevant dimension in each of two randomized lists of auditorily presented stimuli. The stimuli consisted of seven pairs of CV nonsense syllables differing by two relevant dimension units and from zero to seven irrelevant dimension units. Stimulus dimensions were determined according to Saporta’s units of difference. No significant differences in performance as a function of number of the irrelevant dimensions nor characteristics of the relevant dimension were observed.


Sign in / Sign up

Export Citation Format

Share Document