Combining temporal and cepstral features for the automatic perceptual categorization of disordered connected speech

2010 ◽ Author(s): Ali Alpan ◽ Jean Schoentgen ◽ Francis Grenez

Author(s): Ali Alpan ◽ Jean Schoentgen ◽ Youri Maryn ◽ Francis Grenez

2019 ◽ Vol 62 (7) ◽ pp. 2099-2117 ◽ Author(s): Jason A. Whitfield ◽ Zoe Kriegel ◽ Adam M. Fullenkamp ◽ Daryush D. Mehta

Purpose: Prior investigations suggest that simultaneous performance of more than one motor-oriented task may exacerbate speech motor deficits in individuals with Parkinson disease (PD). The purpose of the current investigation was to examine the extent to which performing a low-demand manual task affected connected speech in individuals with and without PD.

Method: Individuals with PD and neurologically healthy controls performed speech tasks (reading and extemporaneous speech) and an oscillatory manual task (a counterclockwise circle-drawing task) in isolation (single-task condition) and concurrently (dual-task condition).

Results: Relative to speech-task performance, no changes in speech acoustics were observed for either group when the low-demand motor task was performed with the concurrent reading tasks. Speakers with PD exhibited a significant decrease in pause duration between the single-task (speech-only) and dual-task conditions for the extemporaneous speech task, whereas control participants exhibited no changes in any speech production variable between the single- and dual-task conditions.

Conclusions: Overall, there was little to no change in speech production when a low-demand oscillatory motor task was performed with concurrent reading. For the extemporaneous task, however, individuals with PD exhibited significant changes when the speech and manual tasks were performed concurrently, a pattern that was not observed for control speakers.

Supplemental Material: https://doi.org/10.23641/asha.8637008
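Pause duration in connected speech is commonly derived from a frame-level intensity contour. The following minimal Python sketch is illustrative only, not the authors' analysis pipeline; the frame length, silence threshold, and minimum-pause duration are assumptions chosen for the example:

```python
import numpy as np

def pause_durations(signal, sample_rate, frame_ms=10.0,
                    threshold_db=-40.0, min_pause_ms=150.0):
    """Return durations (seconds) of low-energy stretches longer than min_pause_ms.

    Illustrative energy-threshold approach; threshold_db is relative to the
    loudest frame in the recording.
    """
    frame_len = int(sample_rate * frame_ms / 1000.0)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    # Frame RMS level in dB relative to the loudest frame.
    rms = np.sqrt(np.mean(frames ** 2, axis=1)) + 1e-12
    level_db = 20.0 * np.log10(rms / rms.max())
    silent = level_db < threshold_db
    # Group consecutive silent frames into candidate pauses.
    pauses, run = [], 0
    for s in silent:
        if s:
            run += 1
        elif run:
            pauses.append(run)
            run = 0
    if run:
        pauses.append(run)
    frame_s = frame_len / sample_rate
    return [p * frame_s for p in pauses if p * frame_s >= min_pause_ms / 1000.0]
```

Summing or averaging the returned durations per condition would give a per-speaker pause measure comparable across single- and dual-task recordings.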


2020 ◽ Vol 29 (2) ◽ pp. 259-264 ◽ Author(s): Hasan K. Saleh ◽ Paula Folkeard ◽ Ewan Macpherson ◽ Susan Scollie

Purpose: The original Connected Speech Test (CST; Cox et al., 1987) is a well-regarded and often utilized speech perception test. The aim of this study was to develop a new version of the CST using a neutral North American accent and to assess this updated CST with participants with normal hearing.

Method: A female English speaker was recruited to read the original CST passages, which were recorded as the new CST stimuli. A study was designed to assess the equivalence of the newly recorded CST passages and to conduct normalization. The study included 19 Western University students (11 female, 8 male) with normal hearing and with English as a first language.

Results: Raw scores for the 48 tested passages were converted to rationalized arcsine units (RAUs), and passages whose average score fell more than 1 RAU standard deviation from the mean were excluded. The internal reliability of the 32 remaining passages was assessed; the two-way random-effects intraclass correlation was .944.

Conclusion: The aim of the study was to create new CST stimuli with a more general North American accent in order to minimize accent effects on speech perception scores. The study resulted in 32 passages of equivalent difficulty for listeners with normal hearing.
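Rationalized arcsine units refer to the standard linearized arcsine transform of proportion-correct scores (Studebaker, 1985). A minimal sketch of that transform, assuming the conventional constants rather than any study-specific variant:

```python
import math

def rationalized_arcsine(correct, total):
    """Map a raw score (items correct out of total) to rationalized arcsine units.

    Linearized arcsine transform: approximately linear in percent correct
    mid-range, stretched near 0% and 100%, which stabilizes score variance.
    """
    theta = (math.asin(math.sqrt(correct / (total + 1)))
             + math.asin(math.sqrt((correct + 1) / (total + 1))))
    return (146.0 / math.pi) * theta - 23.0
```

Mid-range scores map close to their percent-correct values (e.g., 50 correct out of 100 lands near 50 RAU), while scores near the floor or ceiling are pushed below 0 or above 100, so differences near the extremes are not compressed.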


Author(s): Sarfaraz Jelil ◽ Rohan Kumar Das ◽ S.R. Mahadeva Prasanna ◽ Rohit Sinha

Science ◽ 1968 ◽ Vol 161 (3847) ◽ pp. 1259-1260 ◽ Author(s): C. A. Kelsey ◽ S. J. Ewanowski ◽ T. J. Hixon ◽ F. D. Minifie

Sensors ◽ 2021 ◽ Vol 21 (5) ◽ pp. 1888 ◽ Author(s): Juraj Kacur ◽ Boris Puterka ◽ Jarmila Pavlovicova ◽ Milos Oravec

Many speech emotion recognition systems have been designed using different features and classification methods, yet there is little knowledge or reasoning about the underlying speech characteristics and processing, i.e., how basic characteristics, methods, and settings affect accuracy, and to what extent. This study extends the physical perspective on speech emotion recognition by analyzing basic speech characteristics and modeling methods, e.g., time characteristics (segmentation, window types, and classification regions: lengths and overlaps), frequency ranges, frequency scales, processing of the whole speech signal (spectrograms), the vocal tract (filter banks, linear prediction coefficient (LPC) modeling) and excitation (inverse LPC filtering) signals, magnitude and phase manipulations, cepstral features, etc. In the evaluation phase, a state-of-the-art classification method and rigorous statistical tests were applied, namely N-fold cross-validation, the paired t-test, and rank and Pearson correlations. The results revealed several settings in the 75% accuracy range (seven emotions). The most successful methods were based on vocal tract features using psychoacoustic filter banks covering the 0–8 kHz frequency range. Spectrograms carrying both vocal tract and excitation information also scored well. It was found that even basic processing steps such as pre-emphasis, segmentation, and magnitude modifications can dramatically affect the results. Most findings are robust, exhibiting strong correlations across the tested databases.
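As one concrete instance of a psychoacoustic filter bank covering the 0–8 kHz range, the sketch below builds triangular mel-spaced filters. This is a generic construction, not necessarily the exact banks evaluated in the study; the filter count, FFT size, and sample rate are assumptions:

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale mapping (O'Shaughnessy formula).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters=26, n_fft=512, sample_rate=16000,
                   f_min=0.0, f_max=8000.0):
    """Triangular mel-spaced filters applied to an (n_fft // 2 + 1)-bin spectrum."""
    # Edges equally spaced on the mel scale, converted back to Hz.
    mel_points = np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    # Map each edge frequency to an FFT bin index.
    bins = np.floor((n_fft + 1) * hz_points / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):      # rising slope
            fbank[i, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling slope
            fbank[i, k] = (right - k) / max(right - center, 1)
    return fbank
```

Multiplying a frame's magnitude spectrum by each row and taking the log of the summed energies yields the band energies from which cepstral features (e.g., MFCCs, via a DCT) are typically derived.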

