Development of Machine Learning for Asthmatic and Healthy Voluntary Cough Sounds: A Proof of Concept Study

2019 ◽  
Vol 9 (14) ◽  
pp. 2833 ◽  
Author(s):  
Hwan Ing Hee ◽  
BT Balamurali ◽  
Arivazhagan Karunakaran ◽  
Dorien Herremans ◽  
Onn Hoe Teoh ◽  
...  

(1) Background: Cough is a major presentation in childhood asthma. Here, we aim to develop a machine-learning-based cough sound classifier for asthmatic and healthy children. (2) Methods: Children less than 16 years old were randomly recruited in a Children’s Hospital, from February 2017 to April 2018, and were divided into 2 cohorts—healthy children and children with acute asthma presenting with cough. Children with other concurrent respiratory conditions were excluded from the asthmatic cohort. Demographic data, duration of cough, and history of respiratory status were obtained. Children were instructed to produce voluntary cough sounds. These clinically labeled cough sounds were randomly divided into training and testing sets. Audio features such as Mel-Frequency Cepstral Coefficients and Constant-Q Cepstral Coefficients were extracted. Using the training set, a classification model was developed with a Gaussian Mixture Model–Universal Background Model (GMM-UBM). Its predictive performance was evaluated on the test set against the physicians’ labels. (3) Results: Asthmatic cough sounds from 89 children (totaling 1192 cough sounds) and healthy coughs from 89 children (totaling 1140 cough sounds) were analyzed. The sensitivity and specificity of the audio-based classification model were 82.81% and 84.76%, respectively, when differentiating coughs from asthmatic children versus coughs from ‘healthy’ children. (4) Conclusion: Audio-based classification using machine learning is a potentially useful technique in assisting the differentiation of asthmatic cough sounds from healthy voluntary cough sounds in children.
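The GMM-UBM scheme described in the abstract can be illustrated with a minimal sketch: a universal background model is trained on pooled frames, class models are derived from it, and a test clip is scored by the class model that assigns it the higher average log-likelihood. This is only an illustration using scikit-learn's `GaussianMixture` with synthetic stand-in features (the study's pipeline uses MFCC/CQCC frames extracted from real cough audio and MAP-style adaptation), not the authors' implementation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Stand-in frame-level feature matrices (real pipeline: MFCC/CQCC frames per cough)
asthma = rng.normal(0.5, 1.0, size=(300, 13))
healthy = rng.normal(-0.5, 1.0, size=(300, 13))

# UBM: a single GMM trained on pooled frames from both classes
ubm = GaussianMixture(n_components=4, random_state=0).fit(np.vstack([asthma, healthy]))

def derive_class_model(ubm, frames):
    # Initialise a class GMM from the UBM parameters (a simplified stand-in
    # for the MAP adaptation normally used in GMM-UBM systems)
    gmm = GaussianMixture(n_components=4, random_state=0,
                          weights_init=ubm.weights_, means_init=ubm.means_)
    return gmm.fit(frames)

m_asthma = derive_class_model(ubm, asthma)
m_healthy = derive_class_model(ubm, healthy)

def classify(frames):
    # score() returns the average per-frame log-likelihood; the higher model wins
    return "asthma" if m_asthma.score(frames) > m_healthy.score(frames) else "healthy"
```

A test clip's frames are passed to `classify`, which mirrors the likelihood-ratio decision of a GMM-UBM system at the whole-recording level.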

Author(s):  
Musab T. S. Al-Kaltakchi ◽  
Haithem Abd Al-Raheem Taha ◽  
Mohanad Abd Shehab ◽  
Mohamed A.M. Abdullah

In this paper, different feature extraction and feature normalization methods are investigated for speaker recognition. To obtain a good representation of acoustic speech signals, Power Normalized Cepstral Coefficients (PNCCs) and Mel-Frequency Cepstral Coefficients (MFCCs) are employed for feature extraction. Then, to mitigate linear channel effects, Cepstral Mean-Variance Normalization (CMVN) and feature warping are utilized. The current paper investigates a text-independent speaker identification system using 16 coefficients from both the MFCC and PNCC features. Eight speakers (two female and six male) are selected from the GRID audiovisual database. The speakers are modeled using the coupling between the Universal Background Model and Gaussian Mixture Models (GMM-UBM) to obtain a fast scoring technique and better performance. The system achieves 100% speaker identification accuracy. The results showed that PNCC features outperformed MFCC features, particularly for identifying female speakers. Furthermore, feature warping performed better than CMVN.
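The two normalization methods compared above are both simple per-coefficient transforms. Below is a minimal sketch of each: CMVN standardizes every cepstral coefficient to zero mean and unit variance over an utterance, while feature warping maps each coefficient's rank to the corresponding standard-normal quantile. This is an illustrative implementation on synthetic frames, not the paper's code; real systems typically apply feature warping over a sliding window rather than the whole utterance.

```python
import numpy as np
from statistics import NormalDist

def cmvn(feats):
    # Cepstral mean-variance normalization: zero mean, unit variance per coefficient
    return (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)

def feature_warp(column):
    # Map each value's rank to the standard-normal quantile at (rank - 0.5) / n
    n = len(column)
    ranks = column.argsort().argsort() + 1  # 1-based ranks
    return np.array([NormalDist().inv_cdf((r - 0.5) / n) for r in ranks])

rng = np.random.default_rng(1)
frames = rng.normal(3.0, 2.0, size=(200, 16))  # 16 MFCC/PNCC coefficients per frame
normalized = cmvn(frames)
warped = np.column_stack([feature_warp(c) for c in frames.T])
```

Both transforms remove channel-dependent shifts and scalings; feature warping additionally forces each coefficient's distribution toward a standard normal, which is why it is often more robust than CMVN.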


2022 ◽  
Vol 9 (1) ◽  
pp. 0-0

This article investigates the impact of data-complexity and team-specific characteristics on machine learning competition scores. Data from five real-world binary classification competitions hosted on Kaggle.com were analyzed. The data-complexity characteristics were measured in four aspects: standard measures, sparsity measures, class imbalance measures, and feature-based measures. The results showed that the higher the data-complexity characteristics, the lower the predictive ability of the machine learning model. Our empirical evidence revealed that the imbalance ratio of the target variable was the most important factor and exhibited a nonlinear relationship with the model’s predictive ability: the imbalance ratio adversely affected predictive performance once it reached a certain level. However, mixed results were found for the impact of team-specific characteristics measured by team size, team expertise, and the number of submissions on team performance. For high-performing teams, these factors had no impact on team score.
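The imbalance ratio the article identifies as the dominant complexity factor is straightforward to compute for a binary target: the majority class count divided by the minority class count. A minimal sketch (an assumed, conventional definition; the article may use a variant measure):

```python
from collections import Counter

def imbalance_ratio(y):
    """Majority-to-minority class ratio; 1.0 means perfectly balanced."""
    counts = Counter(y)
    return max(counts.values()) / min(counts.values())

# 90 negatives vs 10 positives -> ratio of 9.0
y = [0] * 90 + [1] * 10
print(imbalance_ratio(y))  # 9.0
```

The nonlinear effect reported above means a model's score degrades slowly as this ratio grows, then sharply once the minority class becomes too rare to learn from.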


Author(s):  
Gizachew Belayneh Gebre et al.

In this era of artificial intelligence, speaker recognition is among the most useful biometric recognition techniques. Security is a major concern that requires careful attention, because everyday activities are increasingly automated and internet-based; for security purposes, unique features of the authorized user are essential. Voice is one such remarkable biometric feature, so developing speaker recognition on a sound scientific basis is a pressing issue. Nowadays, criminal activities are growing more sophisticated by the day, so every country should strengthen forensic investigation using such technologies. This study was inspired by contextualizing this concept for our country. In this study, a text-independent Amharic-language speaker recognition model was developed, using Mel-Frequency Cepstral Coefficients (MFCCs) to extract features from preprocessed speech signals and an Artificial Neural Network to model the MFCC feature vectors and classify speakers during testing. The researcher used 20 speech samples from each of 10 speakers (200 speech samples in total) for training and testing separately. By setting the number of hidden neurons to 15, 20, and 25, three different models were developed and evaluated for accuracy. The overall implementation was conducted in MATLAB, a fourth-generation high-level programming language and interactive environment. In the end, very promising findings were obtained: the study achieved better performance than related research that used Vector Quantization and Gaussian Mixture Model modelling techniques. Implementable results could be obtained in the future by increasing the number of speakers and speech samples and by including the four Amharic accents.


2021 ◽  
Author(s):  
Nathan Chi ◽  
Peter Washington ◽  
Aaron Kline ◽  
Arman Husic ◽  
Cathy Hou ◽  
...  

BACKGROUND Autism spectrum disorder (ASD) is a neurodevelopmental disorder that results in altered behavior, social development, and communication patterns. In past years, autism prevalence has tripled, with 1 in 54 children now affected. Given that traditional diagnosis is a lengthy, labor-intensive process that requires the work of trained physicians, significant attention has been given to developing systems that automatically diagnose and screen for autism. OBJECTIVE Prosody abnormalities are among the clearest signs of autism, with affected children displaying speech idiosyncrasies (including echolalia, monotonous intonation, atypical pitch, and irregular linguistic stress patterns). In this work, we present a suite of machine learning approaches to detect autism in self-recorded speech audio captured from autistic and neurotypical (NT) children in home environments. METHODS We consider three methods to detect autism in child speech: first, Random Forests trained on extracted audio features (including Mel-frequency cepstral coefficients); second, convolutional neural networks (CNNs) trained on spectrograms; and third, fine-tuned wav2vec 2.0—a state-of-the-art Transformer-based speech recognition model. We train our classifiers on our novel dataset of cellphone-recorded child speech audio curated from Stanford’s Guess What? mobile game, an app designed to crowdsource videos of autistic and neurotypical children in a natural home environment. RESULTS The Random Forest classifier achieves 70% accuracy, the fine-tuned wav2vec 2.0 model achieves 77% accuracy, and the CNN achieves 79% accuracy when classifying children’s audio as either ASD or NT. We use five-fold cross-validation to evaluate model performance. CONCLUSIONS Our models were able to predict autism status when trained on a varied selection of home audio clips with inconsistent recording qualities, which may be more generalizable to real-world conditions. The results demonstrate that machine learning methods offer promise in detecting autism automatically from speech without specialized equipment.
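The first of the three methods above (a Random Forest on extracted audio features, evaluated with five-fold cross-validation) can be sketched as follows. This is an illustrative pipeline on synthetic stand-in feature vectors, not the authors' code; in the real study each row would be MFCC-based features extracted from a cellphone-recorded clip.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
# Stand-in 20-dimensional feature vectors for 100 ASD and 100 NT clips
# (real pipeline: MFCC statistics and other audio features per clip)
X = np.vstack([rng.normal(0.4, 1.0, size=(100, 20)),
               rng.normal(-0.4, 1.0, size=(100, 20))])
y = np.array([1] * 100 + [0] * 100)  # 1 = ASD, 0 = NT

# Five-fold cross-validation, mirroring the evaluation protocol in the abstract
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5)
print(scores.mean())
```

The same `cross_val_score` harness applies unchanged to the CNN and wav2vec 2.0 variants once their features or embeddings are substituted for `X`.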

