Listen2Cough

Author(s):  
Xuhai Xu ◽  
Ebrahim Nemati ◽  
Korosh Vatanparvar ◽  
Viswam Nathan ◽  
Tousif Ahmed ◽  
...  

The prevalence of ubiquitous computing enables new opportunities for lung health monitoring and assessment. In the past few years, there have been extensive studies on cough detection using passively sensed audio signals. However, the generalizability of a cough detection model when applied to external datasets, especially in real-world implementation, is questionable and has not been explored adequately. Beyond detecting coughs, researchers have looked into how cough sounds can be used in assessing lung health. However, due to the challenges in collecting both cough sounds and lung health condition ground truth, previous studies have been hindered by limited datasets. In this paper, we propose Listen2Cough to address these gaps. We first build an end-to-end deep learning architecture using public cough sound datasets to detect coughs within raw audio recordings. We employ a pre-trained MobileNet and integrate a number of augmentation techniques to improve the generalizability of our model. Without additional fine-tuning, our model is able to achieve an F1 score of 0.948 when tested against a new clean dataset, and 0.884 on another in-the-wild noisy dataset, leading to an advantage of 5.8% and 8.4% on average over the best baseline model, respectively. Then, to mitigate the issue of limited lung health data, we propose to transform the cough detection task into lung health assessment tasks so that the rich cough data can be leveraged. Our hypothesis is that these tasks extract and utilize similar effective representations from cough sounds. We embed the cough detection model into a multi-instance learning framework with an attention mechanism and further tune the model for lung health assessment tasks. Our final model achieves an F1 score of 0.912 on healthy vs. unhealthy, 0.870 on obstructive vs. non-obstructive, and 0.813 on COPD vs. asthma classification, outperforming the baseline by 10.7%, 6.3%, and 3.7%, respectively.
Moreover, the weight value in the attention layer can be used to identify important coughs highly correlated with lung health, which can potentially provide interpretability for expert diagnosis in the future.
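The attention-weighted multi-instance pooling described above, where per-cough attention weights both form the bag-level representation and flag important coughs, can be sketched in a few lines. The embedding dimensions and scoring function below are illustrative stand-ins, not the paper's actual network:

```python
import math

def softmax(scores):
    # numerically stable softmax over raw attention scores
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(instance_embeddings, score_fn):
    # score each per-cough embedding, normalize the scores into
    # attention weights, and return the weighted bag-level embedding
    # together with the weights (used to identify "important" coughs)
    weights = softmax([score_fn(e) for e in instance_embeddings])
    dim = len(instance_embeddings[0])
    bag = [sum(w * e[d] for w, e in zip(weights, instance_embeddings))
           for d in range(dim)]
    return bag, weights
```

In the real model the scoring function would itself be a small learned network trained jointly with the lung health classifier.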

2021 ◽  
Vol 27 (1) ◽  
Author(s):  
Paulo Drews-Jr ◽  
Isadora de Souza ◽  
Igor P. Maurell ◽  
Eglen V. Protas ◽  
Silvia S. C. Botelho

Abstract Image segmentation is an important step in many computer vision and image processing algorithms. It is often adopted in tasks such as object detection, classification, and tracking. The segmentation of underwater images is a challenging problem because the water and the particles suspended in it scatter and absorb light rays. These effects make the application of traditional segmentation methods cumbersome. Moreover, applying the state-of-the-art segmentation methods, which are based on deep learning, to this problem requires an underwater image segmentation dataset. So, in this paper, we develop a dataset of real underwater images, along with other combinations using simulated data, to allow the training of two of the best deep learning segmentation architectures, aiming to deal with segmentation of underwater images in the wild. In addition to models trained on these datasets, fine-tuning and image restoration strategies are explored as well. For a more meaningful evaluation, all the models are compared on the testing set of real underwater images. We show that the methods obtain impressive results when compared with manually segmented ground truth, mainly when trained with our real dataset, even using a relatively small number of labeled underwater training images.
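For the comparison against manually segmented ground truth, a standard metric is intersection-over-union between predicted and reference binary masks. The paper does not show its evaluation code; this is a generic sketch over flattened 0/1 pixel lists:

```python
def iou(pred_mask, gt_mask):
    # intersection-over-union between a predicted and a manually
    # segmented binary mask (flattened lists of 0/1 pixels)
    inter = sum(1 for p, g in zip(pred_mask, gt_mask) if p and g)
    union = sum(1 for p, g in zip(pred_mask, gt_mask) if p or g)
    return inter / union if union else 1.0
```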


Geomatics ◽  
2021 ◽  
Vol 1 (1) ◽  
pp. 34-49
Author(s):  
Mael Moreni ◽  
Jerome Theau ◽  
Samuel Foucher

The combination of unmanned aerial vehicles (UAV) with deep learning models has the capacity to replace manned aircraft for wildlife surveys. However, the scarcity of animals in the wild often leads to highly unbalanced, large datasets for which even a good detection method can return a large number of false detections. Our objectives in this paper were to design a training method that would reduce training time, decrease the number of false positives, and alleviate the fine-tuning effort of an image classifier in the context of animal surveys. We acquired two highly unbalanced datasets of deer images with a UAV and trained a ResNet-18 classifier using hard-negative mining and a series of recent techniques. Our method achieved sub-decimal false positive rates on two test sets (1 false positive per 19,162 and 213,312 negatives, respectively), while training on small but relevant fractions of the data. The resulting training times were therefore significantly shorter than they would have been using the whole datasets. This high level of efficiency was achieved with little tuning effort and using simple techniques. We believe this parsimonious approach to dealing with highly unbalanced, large datasets could be particularly useful to projects with either limited resources or extremely large datasets.
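A hard-negative mining loop of the kind described, which trains on a small but relevant fraction of the negatives, can be sketched as follows. `train_fn` and `score_fn` are placeholders standing in for the actual ResNet-18 training and scoring steps, and deduplication of re-mined negatives is omitted for brevity:

```python
def hard_negative_mining(train_fn, score_fn, positives, negatives,
                         rounds=5, batch=100):
    # start from a small subset of the negatives, then repeatedly
    # retrain and add only the negatives the current model gets wrong
    mined = list(negatives[:batch])
    model = train_fn(positives, mined)
    for _ in range(rounds):
        hard = [n for n in negatives if score_fn(model, n) >= 0.5]
        if not hard:
            break  # no false positives left on the negative pool
        mined.extend(hard[:batch])
        model = train_fn(positives, mined)
    return model, mined
```

The point of the loop is that `mined` stays far smaller than the full negative set, which is where the training-time savings come from.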


2010 ◽  
Vol 138 (11-12) ◽  
pp. 746-751
Author(s):  
Momcilo Mirkovic ◽  
Snezana Simic ◽  
Jelena Marinkovic ◽  
Sladjana Djuric

Introduction. For health assessment, besides the data of routine health statistics, it is necessary to also include data obtained by a health survey of the citizens. Objective. The aim of this study was to establish how northern Kosovska Mitrovica adults assess their health and which diseases are most common among the population, as well as to investigate differences in relation to demographic and socioeconomic characteristics, the characteristics of social interaction, and health behavior and habits. Methods. The research was conducted as a cross-sectional study on a representative sample of adult citizens of northern Kosovska Mitrovica in 2006. Two hundred and eighteen respondents were included in the survey. In the research we used a questionnaire identical to the Health Survey conducted in Serbia in 2006. The significance of differences in responses about self-rated health and chronic diseases in relation to the characteristics of respondents was determined by the χ²-test with a significance level of 0.05. Results. Over half of the respondents (54.7%) assessed their health condition as good or very good. There was a significant difference in self-rated health in relation to the respondents' age (χ²=202.036; p=0.000), education (χ²=72.412; p=0.000), social support (χ²=12.416; p=0.015), smoking (χ²=11.675; p=0.020) and physical activity (χ²=61.842; p=0.000). The leading health problems among the respondents were high blood pressure, rheumatologic diseases of the joints, duodenal or gastric ulcer, gall bladder disease and high blood fat. Conclusion. Adult residents of northern Kosovska Mitrovica assessed their health as better than the residents of Serbia without Kosovo and Metohia. The diseases in which stress plays the major role among etiological factors are in the leading position. The obtained data on the population level of specific areas represent the basis for planning health education and health promotion activities.


Author(s):  
Laszlo Arvai

The recent achievements in mobile technology and wearable operating systems make it possible to create comfortably wearable and very capable smartwatches. They combine many different sensors and powerful hardware with a general-purpose OS, all available at a reasonable price, which makes them ideal devices for elderly care. Monitoring an elderly person's basic health condition is very straightforward, but using a smartwatch as an indoor localization device, monitoring motion activity, and recognizing the typical motion patterns of wandering is not simple. Although these watches are very capable devices, they are not equipped with dedicated indoor localization sensors, and we would like to avoid installing special equipment, markers, or transmitters in the homes of the elderly. Using only commercially available smartwatch hardware for indoor localization is a challenging task; several filtering and data-processing algorithms need to be combined in order to provide an acceptable indoor localization function. The algorithms, their connections, and fine-tuning methods are explained in this article.
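The article does not name the specific filters it combines; one commonly used building block for fusing smartwatch inertial sensors is a complementary filter, shown here purely as an illustrative assumption rather than the author's method:

```python
def complementary_filter(prev_angle, gyro_rate, accel_angle, dt, alpha=0.98):
    # blend the drift-prone but smooth gyroscope integration with the
    # noisy but drift-free accelerometer estimate of the same angle
    gyro_angle = prev_angle + gyro_rate * dt
    return alpha * gyro_angle + (1.0 - alpha) * accel_angle
```

Run at each sensor sample, the small accelerometer weight continuously pulls the integrated gyroscope angle back toward the absolute reference, bounding the drift.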


2016 ◽  
Vol 2016 ◽  
pp. 1-18 ◽  
Author(s):  
A. Romero ◽  
Y. Lage ◽  
S. Soua ◽  
B. Wang ◽  
T.-H. Gan

Reliable monitoring for the early fault diagnosis of gearbox faults is of great concern for the wind industry. This paper presents a novel approach for health condition monitoring (CM) and fault diagnosis in wind turbine gearboxes using vibration analysis. This methodology is based on a machine learning algorithm that generates a baseline for the identification of deviations from the normal operation conditions of the turbine and the intrinsic characteristic-scale decomposition (ICD) method for fault type recognition. Outliers picked up during the baseline stage are decomposed by the ICD method to obtain the product components which reveal the fault information. The new methodology proposed for gear and bearing defect identification was validated by laboratory and field trials, comparing well with the methods reviewed in the literature.
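The paper's baseline stage uses a machine learning algorithm to model normal operating conditions; as a deliberately simplified statistical stand-in, a normal-operation baseline with a k-sigma deviation rule for picking up outliers can be sketched as:

```python
def build_baseline(samples):
    # fit a simple normal-operation baseline from healthy vibration
    # features: their mean and (population) standard deviation
    n = len(samples)
    mean = sum(samples) / n
    std = (sum((x - mean) ** 2 for x in samples) / n) ** 0.5
    return mean, std

def is_outlier(x, mean, std, k=3.0):
    # flag a new vibration feature deviating more than k standard
    # deviations from the baseline; such outliers would then go to
    # the ICD decomposition stage for fault-type recognition
    return abs(x - mean) > k * std
```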


Stroke ◽  
2021 ◽  
Vol 52 (Suppl_1) ◽  
Author(s):  
Yannan Yu ◽  
Soren Christensen ◽  
Yuan Xie ◽  
Enhao Gong ◽  
Maarten G Lansberg ◽  
...  

Objective: Ischemic core prediction from CT perfusion (CTP) remains inaccurate compared with gold standard diffusion-weighted imaging (DWI). We evaluated whether a deep learning model trained to predict the DWI lesion from MR perfusion (MRP) could facilitate ischemic core prediction on CTP. Method: Using the multi-center CRISP cohort of acute ischemic stroke patients with CTP before thrombectomy, we included patients with major reperfusion (TICI score ≥2b), adequate image quality, and follow-up MRI at 3-7 days. Perfusion parameters including Tmax, mean transit time, cerebral blood flow (CBF), and cerebral blood volume were reconstructed by RAPID software. Core lab experts outlined the stroke lesion on the follow-up MRI. A previously trained MRI model in a separate group of patients was used as a starting point; it used MRP parameters as input and the RAPID ischemic core on DWI as ground truth. We fine-tuned this model using CTP parameters as input and follow-up MRI as ground truth. Another model was also trained from scratch with only CTP data. 5-fold cross validation was used. Performance of the models was compared with the ischemic core estimate (rCBF≤30%) from RAPID software in identifying the presence of a large infarct (volume >70 or >100 ml). Results: 94 patients in the CRISP trial met the inclusion criteria (mean age 67±15 years, 52% male, median baseline NIHSS 18, median 90-day mRS 2). Without fine-tuning, the MRI model had an agreement of 73% for infarcts >70 ml and 69% for >100 ml; the MRI model fine-tuned on CT improved the agreement to 77% and 73%; the CT model trained from scratch had agreements of 73% and 71%. All of the deep learning models outperformed the rCBF segmentation from RAPID, which had agreements of 51% and 64%. See Table and Figure. Conclusions: It is feasible to apply an MRP-based deep learning model to CT. Fine-tuning with CTP data further improves the predictions.
All deep learning models predict the stroke lesion after major recanalization better than thresholding approaches based on rCBF.
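The agreement figures reported above amount to a thresholded comparison between predicted and follow-up lesion volumes; this sketch is a generic reconstruction of that metric, not the study's actual analysis code:

```python
def large_infarct_agreement(pred_volumes, true_volumes, threshold=70.0):
    # fraction of patients where the model and the follow-up MRI agree
    # on the presence of a "large infarct" (volume above threshold, ml)
    pairs = list(zip(pred_volumes, true_volumes))
    hits = sum((p > threshold) == (t > threshold) for p, t in pairs)
    return hits / len(pairs)
```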


Symmetry ◽  
2020 ◽  
Vol 12 (5) ◽  
pp. 803 ◽  
Author(s):  
Yung-Hui Li ◽  
Muhammad Saqlain Aslam ◽  
Kai-Lin Yang ◽  
Chung-An Kao ◽  
Shin-You Teng

There is a growing demand for alternative or complementary medicine in health care disciplines that uses a non-invasive instrument to evaluate the health status of various organs inside the human body. In this regard, we proposed a real-time, non-invasive, and painless technique to assess an individual’s health condition. Our approach is based on the combination of iridology and the philosophy of traditional Chinese medicine (TCM). The iridology chart presents perfect symmetry between the left and right eyes, and such a unique representation reveals the body constitution based on TCM philosophy, which classifies the aforementioned body constitution into a combination of nine categories to describe the varieties of genomic traits. In addition, we applied a deep-learning method along with the combination of iridology and TCM to predict the possible physiological or psychological strength or weakness of the subjects and give advice to them about how to take care of their health according to the body constitution assessment. We used several pre-trained convolutional neural networks (CNNs, or ConvNet), such as a residual neural network (ResNet50), InceptionV3, and dense convolutional network (DenseNet201), to classify the body constitution using iris images. In the experiments, the CASIA-Iris-Thousand database was used to perform this task. The experimental results showed that the proposed iris-based health assessment method achieved an 82.9% accuracy.


Sensors ◽  
2020 ◽  
Vol 20 (4) ◽  
pp. 1087
Author(s):  
Muhammad Naveed Riaz ◽  
Yao Shen ◽  
Muhammad Sohail ◽  
Minyi Guo

Facial expression recognition has been well studied for its great importance in the areas of human–computer interaction and social sciences. With the evolution of deep learning, there have been significant advances in this area that even surpass human-level accuracy. Although these methods have achieved good accuracy, they still suffer from two constraints (high computational power and memory requirements), which are especially critical for small hardware-constrained devices. To alleviate this issue, we propose a new Convolutional Neural Network (CNN) architecture, eXnet (Expression Net), based on parallel feature extraction, which surpasses current methods in accuracy and contains a much smaller number of parameters (eXnet: 4.57 million, VGG19: 14.72 million), making it more efficient and lightweight for real-time systems. Several modern data augmentation techniques are applied for generalization of eXnet; these techniques improve the accuracy of the network by overcoming the problem of overfitting while keeping the network size the same. We provide an extensive evaluation of our network against key methods on the Facial Expression Recognition 2013 (FER-2013), Extended Cohn-Kanade (CK+), and Real-world Affective Faces Database (RAF-DB) benchmark datasets. We also perform an ablation evaluation to show the importance of different components of our architecture. To evaluate the efficiency of eXnet on embedded systems, we deploy it on a Raspberry Pi 4B. All these evaluations show the superiority of eXnet for emotion recognition in the wild in terms of accuracy, the number of parameters, and size on disk.


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Zhengqiao Zhao ◽  
Alexandru Cristian ◽  
Gail Rosen

Abstract Background It is a computational challenge for current metagenomic classifiers to keep up with the pace of training data generated from genome sequencing projects, such as the exponentially-growing NCBI RefSeq bacterial genome database. When new reference sequences are added to the training data, statically trained classifiers must be rerun on all data, resulting in a highly inefficient process. The rich literature of “incremental learning” addresses the need to update an existing classifier to accommodate new data without sacrificing much accuracy compared to retraining the classifier with all data. Results We demonstrate how classification improves over time by incrementally training a classifier on progressive RefSeq snapshots and testing it on: (a) all known current genomes (as a ground truth set) and (b) a real experimental metagenomic gut sample. We demonstrate that as a classifier model’s knowledge of genomes grows, classification accuracy increases. The proof-of-concept naïve Bayes implementation, when updated yearly, runs in a quarter of the non-incremental time with no accuracy loss. Conclusions It is evident that classification improves by having the most current knowledge at its disposal. Therefore, it is of utmost importance to make classifiers computationally tractable enough to keep up with the data deluge. The incremental learning classifier can be efficiently updated without reprocessing or accessing the existing database, thereby saving storage as well as computation resources.
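The reason naïve Bayes lends itself to incremental updates is that its sufficient statistics are simple counts: a new reference genome is absorbed by adding to the stored counts, with no need to revisit earlier data. A minimal sketch (the k-mer features and species labels below are illustrative, not from the paper):

```python
import math
from collections import defaultdict

class IncrementalNB:
    # multinomial naive Bayes over sequence features (e.g. k-mers)
    # that absorbs new reference sequences by updating stored counts,
    # without reprocessing previously seen genomes
    def __init__(self):
        self.class_counts = defaultdict(int)
        self.feature_counts = defaultdict(lambda: defaultdict(int))
        self.vocab = set()

    def update(self, label, features):
        # incremental step: only counts change when new data arrives
        self.class_counts[label] += 1
        for f in features:
            self.feature_counts[label][f] += 1
            self.vocab.add(f)

    def predict(self, features):
        total = sum(self.class_counts.values())
        best, best_lp = None, float("-inf")
        for label, cc in self.class_counts.items():
            lp = math.log(cc / total)
            # Laplace-smoothed per-feature likelihoods
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for f in features:
                lp += math.log((self.feature_counts[label][f] + 1) / denom)
            if lp > best_lp:
                best, best_lp = label, lp
        return best
```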


Symmetry ◽  
2020 ◽  
Vol 12 (11) ◽  
pp. 1832
Author(s):  
Tomasz Hachaj ◽  
Patryk Mazurek

Deep learning-based feature extraction methods and transfer learning have become common approaches in the field of pattern recognition. Deep convolutional neural networks trained using triplet-based loss functions allow for the generation of face embeddings, which can be directly applied to face verification and clustering. Knowledge about the ground truth of face identities might improve the effectiveness of the final classification algorithm; however, it is also possible to use clusters previously discovered by an unsupervised approach in place of ground truth labels. The aim of this paper is to evaluate the potential improvement in classification results of state-of-the-art supervised classification methods trained with and without ground truth knowledge. In this study, we use two sufficiently large data sets containing more than 200,000 “taken in the wild” images with various resolutions, visual quality, and face poses which, in our opinion, guarantee the statistical significance of the results. We examine several clustering and supervised pattern recognition algorithms and find that knowledge about the ground truth has a very small influence on the Fowlkes–Mallows score (FMS) of the classification algorithm. In the case of the classification algorithm that obtained the highest accuracy in our experiment, the FMS improved by only 5.3% (from 0.749 to 0.791) on the first data set and by 6.6% (from 0.652 to 0.718) on the second data set. Our results show that, except in highly secure systems in which face verification is a key component, face identities discovered by unsupervised approaches can be safely used for training supervised classifiers. We also found that the Silhouette Coefficient (SC) of unsupervised clustering is positively correlated with the Adjusted Rand Index, V-measure score, and Fowlkes–Mallows score; we can therefore use the SC as an indicator of clustering performance when the ground truth of face identities is not known.
All of these conclusions are important findings for large-scale face verification problems. The reason for this is the fact that skipping the verification of people’s identities before supervised training saves a lot of time and resources.
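The Fowlkes–Mallows score used throughout this comparison is defined over pairs of samples: it is the geometric mean of pairwise precision and recall between two labelings. A direct (O(n²)) pure-Python implementation looks like this:

```python
import math
from itertools import combinations

def fowlkes_mallows(labels_true, labels_pred):
    # count sample pairs placed together/apart by the two labelings
    tp = fp = fn = 0
    for i, j in combinations(range(len(labels_true)), 2):
        same_true = labels_true[i] == labels_true[j]
        same_pred = labels_pred[i] == labels_pred[j]
        if same_true and same_pred:
            tp += 1        # pair together in both labelings
        elif same_pred:
            fp += 1        # together only in the predicted labeling
        elif same_true:
            fn += 1        # together only in the true labeling
    if tp == 0:
        return 0.0
    # geometric mean of pairwise precision and pairwise recall
    return tp / math.sqrt((tp + fp) * (tp + fn))
```

Production code would use the contingency-table formulation (as in scikit-learn's `fowlkes_mallows_score`) to avoid the quadratic pair loop.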

