Prosodic and phonetic features for speaker clustering in speaker diarization systems

Purpose: This study investigates inter- and intraspeaker variation in German infant-directed speech (IDS) and considers the potential impact of gender, parental involvement, and speech material (read vs. spontaneous speech). In addition, we analyze data from 3 time points before and after the birth of the child to examine potential changes in the features of IDS and, in particular, of adult-directed speech (ADS). Here, the gender identity of a speaker is considered as an additional factor.

Method: IDS and ADS data from 34 participants (15 mothers, 19 fathers) were gathered by means of a reading task and a picture description task. For IDS, 2 recordings were made, when the baby was approximately 6 and 9 months old, respectively. For ADS, an additional recording was made before the baby was born. Phonetic analyses comprise mean fundamental frequency (f0), variation in f0, the first 2 formants measured in /i: ɛ a u:/, and vowel space size. Moreover, social and behavioral data were gathered regarding parental involvement and gender identity.

Results: German IDS is characterized by an increase in mean f0, a larger variation in f0, vowel- and formant-specific differences, and a larger acoustic vowel space. No effect of gender or parental involvement was found. The phonetic features of IDS were found in both spontaneous and read speech. Regarding ADS, changes were found in vowel space size in some of the fathers and in mean f0 in mothers.

Conclusion: Phonetic features of German IDS are robust with respect to gender, parental involvement, speech material (read vs. spontaneous speech), and time. Some phonetic features of ADS changed within the child's first year, depending on gender and parental involvement/gender identity. Thus, further research on IDS also needs to address potential changes in ADS.
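The abstract does not say how vowel space size was computed. One common approach, shown here only as a minimal sketch under that assumption, is to take the mean (F2, F1) values of the corner vowels as polygon vertices and compute the enclosed area with the shoelace formula. The function name and all formant values below are hypothetical and are not taken from the study.

```python
from typing import Sequence, Tuple


def vowel_space_area(corners: Sequence[Tuple[float, float]]) -> float:
    """Area of the polygon spanned by vowel means (shoelace formula).

    `corners` holds (F2, F1) pairs in Hz, ordered around the perimeter
    of the vowel polygon. The result is in Hz^2.
    """
    n = len(corners)
    if n < 3:
        raise ValueError("need at least three vowels to span an area")
    twice_area = 0.0
    for i in range(n):
        x1, y1 = corners[i]
        x2, y2 = corners[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical mean formants (F2, F1) in Hz, ordered /i:/, /ɛ/, /a/, /u:/.
# These numbers are illustrative only; they are NOT data from the study.
ads_vowels = [(2200.0, 300.0), (1800.0, 500.0), (1300.0, 750.0), (800.0, 320.0)]
ids_vowels = [(2400.0, 320.0), (1900.0, 560.0), (1350.0, 850.0), (750.0, 340.0)]

ads_area = vowel_space_area(ads_vowels)
ids_area = vowel_space_area(ids_vowels)
ratio = ids_area / ads_area  # a ratio above 1 would indicate a larger IDS vowel space
```

Because the area is taken as an absolute value, the result does not depend on whether the vowels are listed clockwise or counterclockwise, but they do need to follow the perimeter of the polygon.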

Perceptual Characteristics of Consonant Production in Apraxia of Speech and Aphasia

American Journal of Speech-Language Pathology, 2019, Vol 28 (4), pp. 1411-1431

DOI: 10.1044/2019_ajslp-18-0169

Cited by: ~3

Author(s): Lauren Bislick, William D. Hula

Keyword(s): Error Rate, Error Rates, Group Differences, Diagnostic Process, Apraxia Of Speech, Contextual Variables, Error Type, Type Distribution, Phonetic Features, Syllable Position

Purpose: This retrospective analysis examined group differences in error rate across 4 contextual variables (clusters vs. singletons, syllable position, number of syllables, and articulatory phonetic features) in adults with apraxia of speech (AOS) and adults with aphasia only. Group differences in the distribution of error type across contextual variables were also examined.

Method: Ten individuals with acquired AOS and aphasia and 11 individuals with aphasia participated in this study. In the context of a 2-group experimental design, the influence of 4 contextual variables on error rate and error type distribution was examined via repetition of 29 multisyllabic words. Error rates were analyzed using Bayesian methods, whereas distribution of error type was examined via descriptive statistics.

Results: There were 4 findings of robust differences between the 2 groups. These differences were found for syllable position, number of syllables, manner of articulation, and voicing. Group differences were less robust for clusters versus singletons and place of articulation. Results of error type distribution show a high proportion of distortion and substitution errors in speakers with AOS and a high proportion of substitution and omission errors in speakers with aphasia.

Conclusion: Findings add to the continued effort to improve the understanding and assessment of AOS and aphasia. Several contextual variables more consistently influenced breakdown in participants with AOS compared to participants with aphasia and should be considered during the diagnostic process.

Supplemental Material: https://doi.org/10.23641/asha.9701690
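The abstract states only that error rates were analyzed "using Bayesian methods" and does not name the model. As a purely illustrative sketch of what a Bayesian estimate of an error rate can look like, the snippet below uses a conjugate Beta-Binomial update. The model choice, the uniform prior, the function name, and the counts in the example are all assumptions made here and are not drawn from the study.

```python
from typing import NamedTuple


class BetaPosterior(NamedTuple):
    a: float         # posterior alpha (prior alpha + number of errors)
    b: float         # posterior beta (prior beta + number of correct trials)
    mean: float      # posterior mean of the error rate
    variance: float  # posterior variance of the error rate


def error_rate_posterior(errors: int, trials: int,
                         prior_a: float = 1.0,
                         prior_b: float = 1.0) -> BetaPosterior:
    """Conjugate Beta-Binomial update for an error rate.

    The default Beta(1, 1) prior is uniform on [0, 1]; this is an
    assumption of the sketch, not the prior used in the study.
    """
    if not 0 <= errors <= trials:
        raise ValueError("errors must lie between 0 and trials")
    a = prior_a + errors
    b = prior_b + (trials - errors)
    total = a + b
    mean = a / total
    variance = (a * b) / (total ** 2 * (total + 1.0))
    return BetaPosterior(a, b, mean, variance)


# Hypothetical counts for two groups (NOT data from the study).
group_one = error_rate_posterior(errors=12, trials=29)
group_two = error_rate_posterior(errors=5, trials=29)
mean_difference = group_one.mean - group_two.mean
```

A full analysis of the kind the abstract describes would more plausibly use a hierarchical model over participants and items; the conjugate update is shown only because it makes the prior-to-posterior step easy to follow.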

Distinctive Phonetic Features as Relevant and Irrelevant Stimulus Dimensions in Speech-Sound Discrimination Learning

Journal of Speech and Hearing Research, 1974, Vol 17 (3), pp. 417-425

DOI: 10.1044/jshr.1703.417

Author(s): Stuart I. Ritterman, Nancy C. Freeman

Keyword(s): College Students, Discrimination Learning, Irrelevant Dimension, Speech Sound, Irrelevant Stimulus, Relevant Dimension, Phonetic Features, Sound Discrimination, Nonsense Syllables

Thirty-two college students were required to learn the relevant dimension in each of two randomized lists of auditorily presented stimuli. The stimuli consisted of seven pairs of CV nonsense syllables differing by two relevant dimension units and by zero to seven irrelevant dimension units. Stimulus dimensions were determined according to Saporta's units of difference. No significant differences in performance were observed as a function of either the number of irrelevant dimensions or the characteristics of the relevant dimension.
