Bidirectional neural network for pathological voice detection

Experimental Evaluation of Deep Learning Methods for an Intelligent Pathological Voice Detection System Using the Saarbruecken Voice Database

Applied Sciences ◽

10.3390/app11157149 ◽

2021 ◽

Vol 11 (15) ◽

pp. 7149

Author(s):

Ji-Yeoun Lee

Keyword(s):

Neural Network ◽

Deep Learning ◽

Linear Prediction ◽

Detection System ◽

Mel Frequency Cepstral Coefficients ◽

Learning Methods ◽

Acoustic Measures ◽

Voice Detection ◽

Voice Data ◽

Pathological Voice

This work focuses on deep learning methods, such as the feedforward neural network (FNN) and the convolutional neural network (CNN), for pathological voice detection using mel-frequency cepstral coefficients (MFCCs), linear prediction cepstrum coefficients (LPCCs), and higher-order statistics (HOSs) parameters. In total, 518 voice samples were obtained from the publicly available Saarbruecken Voice Database (SVD): recordings of 259 healthy and 259 pathological speakers, both women and men, producing the vowels /a/, /i/, and /u/ at normal pitch. Significant differences between the normal and the pathological voice signals were observed for normalized skewness (p = 0.000) and normalized kurtosis (p = 0.000), with the exception of normalized kurtosis estimated in the /u/ samples from women (p = 0.051). These parameters are therefore useful for classifying pathological voice signals. The highest accuracy, 82.69%, was achieved by the CNN classifier with the LPCCs parameter for the /u/ vowel in men. The second-best performance, 80.77%, was obtained by combining the FNN classifier with MFCCs and HOSs for the /i/ vowel samples in women. Combining the acoustic measures with HOS parameters was beneficial for characterization in terms of accuracy, and combining various parameters with deep learning methods was also useful for distinguishing normal from pathological voices.
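
The abstract above names normalized skewness and kurtosis as its HOS parameters but does not give their formulas. The following is a minimal sketch that assumes the standard moment-based definitions (third and fourth central moments divided by powers of the variance, with kurtosis not reduced by 3); the function name and the toy frames are hypothetical and are not taken from the paper.

```python
import numpy as np


def normalized_skewness_kurtosis(frame):
    """Return (skewness, kurtosis) of one signal frame.

    Assumption: standard moment-based definitions, not necessarily
    the exact estimator used in the paper.
    """
    x = np.asarray(frame, dtype=float)
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)  # variance (second central moment)
    if m2 == 0.0:
        raise ValueError("frame is constant; moments are undefined")
    m3 = np.mean(centered ** 3)  # third central moment
    m4 = np.mean(centered ** 4)  # fourth central moment
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2      # equals 3 for a Gaussian signal
    return skewness, kurtosis


# Toy frames (hypothetical values, not voice data).
symmetric = [-1.0, 1.0, -1.0, 1.0]
one_sided = [0.0, 0.0, 0.0, 4.0]
for name, frame in [("symmetric", symmetric), ("one_sided", one_sided)]:
    s, k = normalized_skewness_kurtosis(frame)
    print(f"{name}: skewness={s:.4f}, kurtosis={k:.4f}")
```

In practice such values would be computed per frame of a recorded vowel and then combined with MFCC or LPCC vectors before classification; how the paper performs that combination is not stated in the abstract.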

Automatic Estimation of Pathological Voice Quality Based on Recurrent Neural Network Using Amplitude and Phase Spectrogram

10.21437/interspeech.2020-3228 ◽

2020 ◽

Author(s):

Shunsuke Hidaka ◽

Yogaku Lee ◽

Kohei Wakamiya ◽

Takashi Nakagawa ◽

Tokihiko Kaburagi

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Voice Quality ◽

Automatic Estimation ◽

Pathological Voice

An improved algorithm for the Low Band Spectral Tilt estimation for pathological voice detection

2019 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA) ◽

10.23919/spa.2019.8936765 ◽

2019 ◽

Author(s):

Hugo Cordeiro ◽

Carlos Meneses

Keyword(s):

Voice Detection ◽

Pathological Voice ◽

Spectral Tilt ◽

Improved Algorithm

Convolutional Neural Networks for Pathological Voice Detection

2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) ◽

10.1109/embc.2018.8513222 ◽

2018 ◽

Cited By ~ 4

Author(s):

Huiyi Wu ◽

John Soraghan ◽

Anja Lowit ◽

Gaetano Di Caterina

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Voice Detection ◽

Pathological Voice

Artificial Intelligence Application for Vocal Fold Disease Prediction Through Voice Recognition: Development and Usability Study (Preprint)

10.2196/preprints.25247 ◽

2020 ◽

Author(s):

Hao-Chun Hu ◽

Shyue-Yih Chang ◽

Chuen-Heng Wang ◽

Kai-Jun Li ◽

Hsiao-Yun Cho ◽

...

Keyword(s):

Neural Network ◽

Artificial Intelligence ◽

Primary Care ◽

Convolutional Neural Network ◽

Vocal Fold ◽

Voice Recognition ◽

Spasmodic Dysphonia ◽

Adductor Spasmodic Dysphonia ◽

Normal Voice ◽

Pathological Voice

BACKGROUND: Dysphonia affects quality of life by interfering with communication. However, laryngoscopic examination is expensive and not readily accessible in primary care units, and experienced laryngologists are required to achieve an accurate diagnosis. OBJECTIVE: This study sought to detect various vocal fold diseases through pathological voice recognition using artificial intelligence. METHODS: We collected 29 normal voice samples and 527 samples from individuals with voice disorders, including vocal atrophy (n=210), unilateral vocal paralysis (n=43), organic vocal fold lesions (n=244), and adductor spasmodic dysphonia (n=30). The 556 samples were divided into two sets: 440 samples as the training set and 116 samples as the testing set. A convolutional neural network approach was applied to train the model, and its findings were compared with those of human specialists. RESULTS: The convolutional neural network model achieved a sensitivity of 0.70, a specificity of 0.90, and an overall accuracy of 65.5% for distinguishing normal voice, vocal atrophy, unilateral vocal paralysis, organic vocal fold lesions, and adductor spasmodic dysphonia. By comparison, the overall accuracy of the human specialists was 58.6% and 49.1% for the two laryngologists, and 38.8% and 34.5% for the two general ear, nose, and throat doctors. CONCLUSIONS: We developed an artificial intelligence-based screening tool for common vocal fold diseases, which showed high specificity after training with our Mandarin pathological voice database. This approach has clinical potential for general vocal fold disease screening via voice, including as a quick survey during a general health examination. It can also be applied in telemedicine for areas where primary care units lack laryngoscopic capabilities.
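
The abstract above reports sensitivity, specificity, and overall accuracy without the underlying counts. As a rough sketch of how these three figures relate to a binary confusion matrix, the snippet below uses hypothetical counts chosen only to make the arithmetic easy to follow; they are not the study's data, and the function name is an assumption. The study itself distinguishes five classes, for which sensitivity and specificity would have to be computed per class or after collapsing to normal versus disordered.

```python
def screening_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, and accuracy from binary counts.

    tp/fn: disordered voices flagged / missed by the classifier.
    tn/fp: normal voices passed / wrongly flagged by the classifier.
    """
    positives = tp + fn
    negatives = tn + fp
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")
    return {
        "sensitivity": tp / positives,   # true positive rate
        "specificity": tn / negatives,   # true negative rate
        "accuracy": (tp + tn) / (positives + negatives),
    }


# Hypothetical counts for illustration only (not taken from the study).
metrics = screening_metrics(tp=70, fn=30, tn=90, fp=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the data set described above is heavily imbalanced (29 normal versus 527 disordered samples), accuracy alone can be misleading, which is one reason to report sensitivity and specificity alongside it.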

A Deep Learning Method for Pathological Voice Detection Using Convolutional Deep Belief Networks

10.21437/interspeech.2018-1351 ◽

2018 ◽

Cited By ~ 7

Author(s):

Huiyi Wu ◽

John Soraghan ◽

Anja Lowit ◽

Gaetano Di Caterina

Keyword(s):

Deep Learning ◽

Learning Method ◽

Belief Networks ◽

Deep Belief Networks ◽

Voice Detection ◽

Pathological Voice

A Survey on Signal Processing Based Pathological Voice Detection Techniques

IEEE Access ◽

10.1109/access.2020.2985280 ◽

2020 ◽

Vol 8 ◽

pp. 66749-66776

Author(s):

Rumana Islam ◽

Mohammed Tarique ◽

Esam Abdel-Raheem

Keyword(s):

Signal Processing ◽

Detection Techniques ◽

Voice Detection ◽

Pathological Voice

Pathological Voice Detection Using Transfer Learning Methods

10.1109/icsmd53520.2021.9670828 ◽

2021 ◽

Author(s):

Zhang Yihua ◽

Zhu Xincheng ◽

Wu Yuanbo ◽

Zhang Xiaojun ◽

Xu Yishen ◽

...

Keyword(s):

Transfer Learning ◽

Learning Methods ◽

Voice Detection ◽

Pathological Voice

Deep Neural Network for Automatic Classification of Pathological Voice Signals

Journal of Voice ◽

10.1016/j.jvoice.2020.05.029 ◽

2020 ◽

Author(s):

Lili Chen ◽

Junjiang Chen

Keyword(s):

Neural Network ◽

Deep Neural Network ◽

Automatic Classification ◽

Pathological Voice

Pathological voice detection and binary classification using MPEG-7 audio features

Biomedical Signal Processing and Control ◽

10.1016/j.bspc.2014.02.001 ◽

2014 ◽

Vol 11 ◽

pp. 1-9 ◽

Cited By ~ 51

Author(s):

Ghulam Muhammad ◽

Moutasem Melhem

Keyword(s):

Binary Classification ◽

Audio Features ◽

Voice Detection ◽

Pathological Voice
