Classification of breath and snore sounds using audio data recorded with smartphones in the home environment

Deep learning has attracted growing attention from researchers for its ability to transform input data into effective representations through various learning algorithms. It therefore requires large and varied datasets to ensure good performance and generalization. However, manually labeling a dataset is time-consuming and expensive, which limits its size. Websites such as YouTube and Freesound provide large volumes of audio data along with their metadata. General-purpose audio tagging is one of the newly proposed tasks in DCASE and can give valuable insights into the classification of various acoustic sound events. The proposed work analyzes large-scale imbalanced audio data for an audio tagging system. The baseline of the proposed system is based on a Convolutional Neural Network with Mel Frequency Cepstral Coefficients. The system is developed in Google Colaboratory on a free Tesla K80 GPU using Keras, TensorFlow, and PyTorch. The experimental results show that the proposed audio tagging system achieves an average mean precision of 0.92.
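The abstract does not define how its precision figure is computed. As a minimal sketch, assuming it refers to a mean-average-precision style score over ranked tag predictions with one true label per clip (a common setup for audio tagging), the metric could be computed as below. The function names, the cutoff k=3, and the example labels are illustrative assumptions, not details taken from the paper.

```python
from typing import Sequence


def average_precision_at_k(true_label: str, ranked: Sequence[str], k: int = 3) -> float:
    """Score one clip: 1/rank if the true label is in the top k guesses, else 0.

    Assumes exactly one true label per clip (hypothetical setup).
    """
    for rank, label in enumerate(ranked[:k], start=1):
        if label == true_label:
            return 1.0 / rank
    return 0.0


def mean_average_precision(
    true_labels: Sequence[str],
    rankings: Sequence[Sequence[str]],
    k: int = 3,
) -> float:
    """Average the per-clip scores over the whole evaluation set."""
    if len(true_labels) != len(rankings):
        raise ValueError("true_labels and rankings must have the same length")
    if not true_labels:
        raise ValueError("at least one clip is required")
    scores = [
        average_precision_at_k(label, ranked, k)
        for label, ranked in zip(true_labels, rankings)
    ]
    return sum(scores) / len(scores)


# Hypothetical example: three clips, each with a ranked list of predicted tags.
example_truth = ["Snore", "Cough", "Breath"]
example_rankings = [
    ["Snore", "Cough", "Breath"],    # correct at rank 1 -> 1.0
    ["Breath", "Cough", "Snore"],    # correct at rank 2 -> 0.5
    ["Snore", "Cough", "Laughter"],  # not in top 3     -> 0.0
]
example_score = mean_average_precision(example_truth, example_rankings)
```

With this scoring, a correct top guess earns full credit and a correct second or third guess earns partial credit, which is why such a metric is often preferred over plain accuracy on imbalanced tag sets.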

Spoken Digit Classification by In-Materio Reservoir Computing With Neuromorphic Atomic Switch Networks

Frontiers in Nanotechnology ◽ 10.3389/fnano.2021.675792 ◽ 2021 ◽ Vol 3

Author(s): Sam Lilak ◽ Walt Woods ◽ Kelsey Scharnhorst ◽ Christopher Dunham ◽ Christof Teuscher ◽ ...

Keyword(s): Silver Iodide ◽ High Accuracy ◽ Reservoir Computing ◽ New Class ◽ Audio Data ◽ Atomic Switch ◽ Adaptation And Learning ◽ Nanowire Networks ◽ Hardware Platforms

Atomic Switch Networks comprising silver iodide (AgI) junctions, a material previously unexplored as functional memristive elements within highly interconnected nanowire networks, were employed as a neuromorphic substrate for physical Reservoir Computing. This new class of ASN-based devices has been physically characterized and utilized to classify spoken digit audio data, demonstrating the utility of substrate-based device architectures where intrinsic material properties can be exploited to perform computation in-materio. This work demonstrates high accuracy in the classification of temporally analyzed Free-Spoken Digit Data. These results expand upon the class of viable memristive materials available for the production of functional nanowire networks and bolster the utility of ASN-based devices as unique hardware platforms for neuromorphic computing applications involving memory, adaptation and learning.

Music Genre Classification of MPEG AAC Audio Data

2014 IEEE International Symposium on Multimedia ◽ 10.1109/ism.2014.25 ◽ 2014 ◽ Cited By ~ 2

Author(s): Michihiro Kobayakawa ◽ Mamoru Hoshi ◽ Koichiro Yuzawa

Keyword(s): Genre Classification ◽ Audio Data ◽ Music Genre ◽ Music Genre Classification

Classification of the Excitation Location of Snore Sounds in the Upper Airway by Acoustic Multifeature Analysis

IEEE Transactions on Biomedical Engineering ◽ 10.1109/tbme.2016.2619675 ◽ 2017 ◽ Vol 64 (8) ◽ pp. 1731-1741 ◽ Cited By ~ 26

Author(s): Kun Qian ◽ Christoph Janott ◽ Vedhas Pandit ◽ Zixing Zhang ◽ Clemens Heiser ◽ ...

Keyword(s): Upper Airway ◽ Snore Sounds
