Monitoring Cognitive Workload Using Vocal Tract and Voice Source Features

Eydis Huld Magnusdottir; Michal Borsky; Manuela Meier; Kamilla Johannsdottir; Jon Gudnason

doi:10.3311/ppee.10414

Periodica Polytechnica Electrical Engineering and Computer Science ◽

10.3311/ppee.10414 ◽

2017 ◽

Vol 61 (4) ◽

pp. 297 ◽

Cited By ~ 3

Author(s):

Eydis Huld Magnusdottir ◽

Michal Borsky ◽

Manuela Meier ◽

Kamilla Johannsdottir ◽

Jon Gudnason

Keyword(s):

Speech Signal ◽

Vocal Tract ◽

Memory Span ◽

Svm Classifier ◽

Cognitive Workload ◽

Voice Source ◽

Human Decision ◽

Rest Time ◽

The Voice ◽

Stroop Tasks

Monitoring cognitive workload from speech signals has received much attention in the past few years because it has the potential to improve performance and fidelity in human decision making. The bulk of the research has focused on classifying speech from talkers participating in cognitive workload experiments using simple reading tasks, memory span tests and the Stroop test, typically into three levels of low, medium and high cognitive workload. This study focuses on using parameters extracted from the vocal tract and the voice source components of the speech signal for cognitive workload monitoring. The experiment used in this study involved 98 participants. The workload levels were obtained using a reading task and three Stroop tasks, which were randomly ordered for each participant, and an adequate rest time was used in between tasks to mitigate the effect of cognitive workload from one task carrying over to the next. Vocal tract features were obtained from the first three formants, and voice source features were extracted using signal analysis on the inverse-filtered speech signal. The results show that, on their own, the vocal tract features outperform the voice source features. A misclassification rate (MCR) of 33.92% ± 1.05 was achieved with an SVM classifier. A weighted combination of vocal tract and voice source features, classified with an SVM classifier and fused at the output level, achieved the lowest MCR of 32.5%.
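
The two quantities this abstract reports, the misclassification rate and a weighted output-level fusion of two classifiers, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function names, the fusion weight of 0.6 and the per-class scores are made up, and the abstract does not state how the authors weighted or combined the classifier outputs.

```python
def misclassification_rate(true_labels, predicted_labels):
    """Fraction of samples whose predicted label differs from the true label."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    errors = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return errors / len(true_labels)


def fuse_scores(tract_scores, source_scores, weight):
    """Weighted sum of the per-class scores of two classifiers for one sample.

    `weight` is a hypothetical fusion weight in [0, 1] given to the
    vocal tract classifier; the voice source classifier gets 1 - weight.
    """
    return [weight * a + (1.0 - weight) * b
            for a, b in zip(tract_scores, source_scores)]


def predict(scores):
    """Index of the highest-scoring class (0 = low, 1 = medium, 2 = high)."""
    return max(range(len(scores)), key=lambda i: scores[i])


# Made-up per-class scores for four utterances; NOT data from the study.
tract = [[0.7, 0.2, 0.1], [0.2, 0.5, 0.3], [0.3, 0.4, 0.3], [0.1, 0.3, 0.6]]
source = [[0.5, 0.3, 0.2], [0.4, 0.3, 0.3], [0.1, 0.2, 0.7], [0.2, 0.5, 0.3]]
truth = [0, 1, 2, 2]

fused_pred = [predict(fuse_scores(a, b, weight=0.6))
              for a, b in zip(tract, source)]
mcr = misclassification_rate(truth, fused_pred)
print(f"fused MCR on the toy data: {mcr:.2%}")
```

In practice the fusion weight would be tuned on held-out data rather than fixed by hand, and the scores would come from trained classifiers rather than a hand-written table.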


Acoustic Properties of the Voice Source and the Vocal Tract: Are They Perceptually Independent?

Journal of Voice ◽

10.1016/j.jvoice.2015.11.010 ◽

2016 ◽

Vol 30 (6) ◽

pp. 772.e9-772.e22 ◽

Cited By ~ 2

Author(s):

Molly L. Erickson

Keyword(s):

Vocal Tract ◽

Acoustic Properties ◽

Voice Source ◽

The Voice


The Singing Voice

The Oxford Handbook of Voice Perception ◽

10.1093/oxfordhb/9780198743187.013.6 ◽

2018 ◽

pp. 116-142

Author(s):

Johan Sundberg

Keyword(s):

Mechanical Properties ◽

Vocal Tract ◽

Voice Quality ◽

Vocal Folds ◽

Sound Level ◽

Vowel Sound ◽

Formant Frequency ◽

Voice Source ◽

Subglottal Pressure ◽

The Voice

The sound quality of singing is determined by three basic factors: the air pressure under the vocal folds (the subglottal pressure), the mechanical properties of the vocal folds, and the resonance properties of the vocal tract. Subglottal pressure is controlled by the respiratory apparatus. It regulates vocal loudness and is varied with pitch in singing. Together with the mechanical properties of the folds, which are controlled by laryngeal muscles, it has a decisive influence on vocal fold vibrations, which convert the tracheal airstream to a pulsating airflow, the voice source. The voice source determines pitch, vibrato, and register, and also the overall slope of the spectrum. The sound of the voice source is filtered by the resonances of the vocal tract, or the formants, of which the two lowest determine the vowel quality and the higher ones the personal voice quality. Timing is crucial for creating emotional expressivity; it uses an acoustic code that shows striking similarities to that used in speech. The perceived loudness of a vowel sound seems more closely related to the subglottal pressure with which it was produced than to the acoustical sound level. Some investigations of acoustical correlates of tone placement and variation of larynx height are described, as are properties that affect the perceived naturalness of synthesized singing. Finally, subglottal pressure, voice source, and formant-frequency characteristics of some non-classical styles of singing are discussed.


Vocal tract and voice source features for monitoring cognitive workload

2016 7th IEEE International Conference on Cognitive Infocommunications (CogInfoCom) ◽

10.1109/coginfocom.2016.7804532 ◽

2016 ◽

Cited By ~ 7

Author(s):

Manuela Meier ◽

Michal Borsky ◽

Eydis H. Magnusdottir ◽

Kamilla R. Johannsdottir ◽

Jon Gudnason

Keyword(s):

Vocal Tract ◽

Cognitive Workload ◽

Voice Source


Acoustic interactions of the voice source with the lower vocal tract

The Journal of the Acoustical Society of America ◽

10.1121/1.418246 ◽

1997 ◽

Vol 101 (4) ◽

pp. 2234-2243 ◽

Cited By ~ 178

Author(s):

Ingo R. Titze ◽

Brad H. Story

Keyword(s):

Vocal Tract ◽

Voice Source ◽

The Voice


Speech Emotional Features Extraction Based on Electroglottograph

Neural Computation ◽

10.1162/neco_a_00523 ◽

2013 ◽

Vol 25 (12) ◽

pp. 3294-3317 ◽

Cited By ~ 7

Author(s):

Lijiang Chen ◽

Xia Mao ◽

Pengfei Wei ◽

Angelo Compare

Keyword(s):

Emotion Recognition ◽

Speech Signal ◽

Vocal Tract ◽

Vocal Folds ◽

Distribution Coefficients ◽

Speech Emotion Recognition ◽

Support Vector ◽

Power Law Distribution ◽

Transform Coefficients ◽

Better Than

This study proposes two classes of speech emotional features extracted from the electroglottography (EGG) signal and the speech signal. The power-law distribution coefficients (PLDC) of voiced segment duration, pitch rise duration, and pitch fall duration are obtained to reflect the information of vocal fold excitation. The real discrete cosine transform coefficients of the normalized spectrum of the EGG and speech signals are calculated to reflect the information of vocal tract modulation. Two experiments are carried out. The first compares the proposed features with traditional features using sequential forward floating search and sequential backward floating search. The second is a comparative emotion recognition experiment based on a support vector machine. The results show that the proposed features are better than those commonly used for speaker-independent and content-independent speech emotion recognition.
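
As a rough illustration of the second feature class, the sketch below computes discrete cosine transform coefficients of a normalized magnitude spectrum for one frame. It is only a sketch under stated assumptions: the abstract does not say how the spectrum is normalized or which DCT variant is used, so the sum-to-one normalization, the unscaled DCT-II and all function names here are hypothetical choices, and the test frame is synthetic.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitude of the DFT of a real frame, for bins 0 .. N/2 (naive O(N^2))."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def normalize(values):
    """Scale values so they sum to 1 (assumed normalization, see lead-in)."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalize an all-zero spectrum")
    return [v / total for v in values]


def dct2(values, num_coeffs):
    """First `num_coeffs` coefficients of the unscaled DCT-II of `values`."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(num_coeffs)]


# Synthetic 32-sample frame made of two sinusoids (not real speech or EGG).
frame = [math.sin(2 * math.pi * 3 * n / 32) + 0.5 * math.sin(2 * math.pi * 7 * n / 32)
         for n in range(32)]
spectrum_norm = normalize(magnitude_spectrum(frame))
coeffs = dct2(spectrum_norm, num_coeffs=4)
print(coeffs)
```

With this normalization the zeroth coefficient is simply the sum of the normalized spectrum, so the informative part of the feature vector starts at the first coefficient.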


The Spectral Characteristics Research of the Voice-Speech Signal in Dysphonia

10.1109/apeie52976.2021.9647441 ◽

2021 ◽

Author(s):

Olga A. Loskutova ◽

Anastasia V. Nenko ◽

Yana A. Berg ◽

Daria V. Borovikova ◽

Anton V. Yupashevsky

Keyword(s):

Speech Signal ◽

Spectral Characteristics ◽

The Voice


Physiology and its Impact on the Performance of Singing

The Oxford Handbook of Singing ◽

10.1093/oxfordhb/9780199660773.013.23 ◽

2015 ◽

pp. 66-84

Author(s):

Filipa M. B. Lã ◽

Brian P. Gill

Keyword(s):

Vocal Tract ◽

Teaching Method ◽

Vocal Technique ◽

Voice Source ◽

Advantages And Disadvantages ◽

Physiological Processes ◽

Subglottal Pressure ◽

Source Signal ◽

Frequency Components ◽

Singing Performance

Singing performance is highly competitive; thus, finding strategies to accelerate the acquisition of knowledge that results in an efficient and effective vocal technique is of the utmost importance. There are many ways in which a singer may acquire an efficient and effective vocal technique, which can be based on the physiological processes of voice production. This chapter explores these processes within the context of singing performance. The authors examine three major aspects of singing: 1) efficient control of breathing, such that optimal airflow and subglottal pressure are available as needed, for a given frequency and intensity; 2) maximized laryngeal coordination, so that the voice source signal contains all the necessary frequency components for the desired tone; and 3) the modulation of the source signal by subtle shaping of the vocal tract. The advantages and disadvantages of various pedagogical methods are discussed, including breath management, known as appoggio, and different resonant strategies. The authors advocate for a scientifically-grounded teaching method, which allows for physiological differences between individuals, genders, and voice classifications.


Formant structure of the voice during the intensive acute hypoxia

Vojnosanitetski pregled ◽

10.2298/vsp0302155o ◽

2003 ◽

Vol 60 (2) ◽

pp. 155-159 ◽

Cited By ~ 2

Author(s):

Jovisa Obrenovic ◽

Milkica Nesic ◽

Vladimir Nesic ◽

Snezana Cekic

Keyword(s):

Speech Signal ◽

Signal Analysis ◽

Acute Hypoxia ◽

Initial Period ◽

Formant Frequencies ◽

Hypoxia Exposure ◽

Structure Changes ◽

The Voice ◽

Reversed Order ◽

Different Altitudes

The influence of intensive acute hypoxia on the frequency-amplitude formant characteristics of the vowel O was investigated in this study. Examinees were exposed to simulated altitudes of 5 500 m and 6 700 m in a climabaro chamber and solved Lotig's test in the conditions of normoxia, i.e. pronounced the three-digit numbers beginning from 900, but in reversed order. Frequency and intensity values of the formants of the vowel O (F1, F2, F3 and F4), extracted from the pronunciation of the word eight (osam in Serbian), were measured by spectral speech signal analysis. Changes in the frequency values and the intensity of the formants were examined. The results showed no significant changes in the formant frequencies in hypoxia compared to normoxia. However, significant changes in the formant intensities compared to normoxia were found at the cited altitudes. A rise in formant intensities was found at the altitude of 5 500 m. Hypoxia at the altitude of 6 700 m caused a significant fall in the intensities in the initial period, compared to normoxia. Prolonged hypoxia exposure caused a rise in the formant intensities compared to the altitude of 5 500 m. It may be concluded that, at different altitudes, hypoxia causes different changes in the formant structure compared to normoxia.


A Machine Learning Approach for Speech Detection in Modern Wireless Communication Environment

International Journal of Machine Learning and Networked Collaborative Engineering ◽

10.30991/ijmlnce.2018v02i04.004 ◽

2018 ◽

Vol 2 (4) ◽

Author(s):

Shibanee Dash ◽

Mihir Narayan Mohanty

Keyword(s):

Wireless Communication ◽

Speech Signal ◽

Symbol Error Rate ◽

Speech Communication ◽

Noise Levels ◽

Speech Detection ◽

Voice Signal ◽

Proposed Model ◽

Machine Learning Approach ◽

The Voice

Modern wireless communication has gained an improved position compared to earlier times. Similarly, speech communication is a major focus area of research in the respective applications, and many developments have been made in this field. In this work, we have chosen an OFDM-modulation-based communication system, as it is important in both licensed and unlicensed wireless communication platforms. The voice signal is passed through the proposed model and recovered at the receiver end. Due to various circumstances, the signal may be partially corrupted at the user end. The authors try to achieve a better signal for reception using a radial basis function network (RBFN) neural network model. The parameters chosen for the RBFN model are the energy, zero-crossing rate (ZCR), autocorrelation function (ACF), and fundamental frequency of the speech signal. On the one hand, these parameters are able to partially eliminate noise; on the other hand, the RBFN model with these parameters proves its efficacy for noisy speech signals transmitted over a noisy channel, modelled as a Gaussian channel. The efficiency of the OFDM model is verified in terms of symbol error rate, and the transmitted speech signal is evaluated in terms of SNR, which shows the reduction of noise. For visual inspection, a sample of the signal, the noisy signal and the received signal is also shown. The experiment is performed with 5 dB, 10 dB and 15 dB noise levels. The results demonstrate the performance of the RBFN model as a filter. The performance is measured as the listener's voice in each condition. The results show that, for voice in a noisy environment, the proposed technique improves speech intelligibility.
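
The four frame-level parameters named in this abstract (energy, zero-crossing rate, autocorrelation and fundamental frequency) have standard textbook definitions, sketched below in plain Python. This is not the authors' implementation: the function names, the 60 to 400 Hz pitch search range, the 8 kHz sampling rate and the synthetic test tone are all assumptions made for illustration.

```python
import math


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def autocorrelation(frame, lag):
    """Unnormalized autocorrelation of the frame at a given lag."""
    return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))


def estimate_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0):
    """Pick the lag with the largest autocorrelation inside the pitch range.

    f0_min and f0_max are assumed search limits, not values from the paper.
    """
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    best_lag = max(range(lag_min, lag_max + 1),
                   key=lambda lag: autocorrelation(frame, lag))
    return sample_rate / best_lag


# Synthetic 200 Hz tone, 50 ms at 8 kHz, standing in for a voiced frame.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]

energy = short_time_energy(frame)
zcr = zero_crossing_rate(frame)
f0 = estimate_f0(frame, sample_rate)
print(energy, zcr, f0)
```

A real front end would window the signal into overlapping frames and skip pitch estimation on unvoiced frames, where the autocorrelation peak is unreliable.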


Experiment on investigating the voice cords functioning by the speech signal

Acoustical Physics ◽

10.1134/1.1494032 ◽

2002 ◽

Vol 48 (4) ◽

pp. 497-501

Author(s):

A. V. Nikolaev

Keyword(s):

Speech Signal ◽

The Voice
