Learning time-frequency mask for noisy speech enhancement using gaussian-bernoulli pre-trained deep neural networks

Nasir Saleem; Muhammad Irfan Khattak; Mu’ath Al-Hasan; Atif Jan

doi:10.3233/jifs-201014

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-201014 ◽

2021 ◽

Vol 40 (1) ◽

pp. 849-864

Author(s):

Nasir Saleem ◽

Muhammad Irfan Khattak ◽

Mu’ath Al-Hasan ◽

Atif Jan

Keyword(s):

Neural Networks ◽

Speech Enhancement ◽

Speech Intelligibility ◽

Deep Neural Networks ◽

Training Data ◽

Learning Approaches ◽

Performance Gain ◽

Noisy Speech ◽

Time Frequency ◽

Training Scheme

Speech enhancement is an important problem in many speech processing applications. Recently, supervised speech enhancement that uses deep learning to estimate a time-frequency mask has shown remarkable performance gains. In this paper, we propose a time-frequency masking-based supervised speech enhancement method for improving the intelligibility and quality of noisy speech. We believe that a large performance gain can be achieved if deep neural networks (DNNs) are pre-trained layer by layer by stacking Gaussian-Bernoulli Restricted Boltzmann Machines (GB-RBMs). The proposed DNN is called a Gaussian-Bernoulli Deep Belief Network (GB-DBN) and is optimized by minimizing the error between the estimated and pre-defined masks. A non-linear Mel-scale weighted mean square error (LMW-MSE) loss function is used as the training criterion. We examine the performance of the proposed pre-training scheme using different DNNs, each built on one of three time-frequency masks: the ideal amplitude mask (IAM), the ideal ratio mask (IRM), and the phase sensitive mask (PSM). The results in different noisy conditions demonstrate that DNNs pre-trained with the proposed scheme provide a consistent performance gain in terms of perceived speech intelligibility and quality. The proposed pre-training scheme is also effective and robust when the training data are noisy.
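
The abstract names three training targets (IAM, IRM, PSM) without defining them. The sketch below uses the definitions commonly found in the speech enhancement literature, which are assumed here and may differ in detail from the paper: for one time-frequency unit with clean-speech coefficient S, noise coefficient N, and noisy coefficient Y = S + N, the IRM is (|S|^2 / (|S|^2 + |N|^2))^beta with beta = 0.5, the IAM is |S| / |Y|, and the PSM is the IAM scaled by the cosine of the phase difference between S and Y. The function names and the `eps` guard are illustrative choices, not taken from the paper.

```python
import cmath
import math


def ideal_ratio_mask(s: complex, n: complex, beta: float = 0.5) -> float:
    """IRM for one T-F unit: (|S|^2 / (|S|^2 + |N|^2)) ** beta (assumed definition)."""
    speech_power = abs(s) ** 2
    noise_power = abs(n) ** 2
    total = speech_power + noise_power
    # An empty unit carries no speech, so the mask is set to 0.
    return (speech_power / total) ** beta if total > 0 else 0.0


def ideal_amplitude_mask(s: complex, y: complex, eps: float = 1e-12) -> float:
    """IAM for one T-F unit: |S| / |Y| (assumed definition); eps avoids division by zero."""
    return abs(s) / max(abs(y), eps)


def phase_sensitive_mask(s: complex, y: complex, eps: float = 1e-12) -> float:
    """PSM for one T-F unit: (|S| / |Y|) * cos(phase(S) - phase(Y)) (assumed definition)."""
    phase_difference = cmath.phase(s) - cmath.phase(y)
    return ideal_amplitude_mask(s, y, eps) * math.cos(phase_difference)


# Toy example: one T-F unit where speech and noise are in quadrature.
s = 3 + 0j          # clean speech coefficient
n = 0 + 4j          # noise coefficient
y = s + n           # noisy mixture coefficient
masks = {
    "IRM": ideal_ratio_mask(s, n),
    "IAM": ideal_amplitude_mask(s, y),
    "PSM": phase_sensitive_mask(s, y),
}
```

In practice these would be computed over a whole spectrogram at once with an array library; the per-unit form is used here only to keep the arithmetic visible. The IAM is not bounded above by 1 and is often clipped before being used as a training target.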

Applying Deep Neural Networks and Ensemble Machine Learning Methods to Forecast Airborne Ambrosia Pollen

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph16111992 ◽

2019 ◽

Vol 16 (11) ◽

pp. 1992 ◽

Cited By ~ 6

Author(s):

Gebreab K. Zewdie ◽

David J. Lary ◽

Estelle Levetin ◽

Gemechu F. Garuma

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Land Surface ◽

Deep Neural Networks ◽

Airborne Pollen ◽

Training Data ◽

Gradient Boosting ◽

Learning Approaches ◽

Ambrosia Pollen ◽

Extreme Gradient Boosting

Allergies to airborne pollen are a significant issue affecting millions of Americans. Consequently, accurately predicting the daily concentration of airborne pollen is of significant public benefit in providing timely alerts. This study presents a method for the robust estimation of the concentration of airborne Ambrosia pollen using a suite of machine learning approaches, including deep learning and ensemble learners. Each of these approaches utilizes data from the European Centre for Medium-Range Weather Forecasts (ECMWF) atmospheric weather and land surface reanalysis. The approaches used to develop the suite of empirical predictive models are deep neural networks, extreme gradient boosting, random forests, and Bayesian ridge regression. The training data, comprising twenty-four years of daily pollen concentration measurements together with ECMWF weather and land surface reanalysis data from 1987 to 2011, are used to develop the machine learning predictive models. The last six years of the dataset, from 2012 to 2017, are used to independently test the performance of the models. The correlation coefficients between the estimated and actual pollen abundance on the independent validation datasets for the deep neural networks, random forest, extreme gradient boosting, and Bayesian ridge were 0.82, 0.81, 0.81, and 0.75, respectively, showing that machine learning can be used to effectively forecast the concentrations of airborne pollen.
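
The evaluation protocol described above (train on 1987 to 2011, test on 2012 to 2017, score by correlation coefficient) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the record layout with `year`, `observed`, and `predicted` keys is hypothetical, the toy values are made up for demonstration only, and the Pearson coefficient is assumed to be the correlation measure meant in the abstract.

```python
import math


def split_by_year(records, last_train_year=2011):
    """Chronological split: years up to last_train_year train, later years test."""
    train = [r for r in records if r["year"] <= last_train_year]
    test = [r for r in records if r["year"] > last_train_year]
    return train, test


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Toy records (illustrative values only, not data from the study).
records = [
    {"year": 2010, "observed": 12.0, "predicted": 10.0},
    {"year": 2011, "observed": 30.0, "predicted": 28.0},
    {"year": 2012, "observed": 5.0, "predicted": 7.0},
    {"year": 2013, "observed": 20.0, "predicted": 18.0},
    {"year": 2014, "observed": 40.0, "predicted": 35.0},
]
train, test = split_by_year(records)
score = pearson_r(
    [r["observed"] for r in test],
    [r["predicted"] for r in test],
)
```

Splitting by year rather than at random keeps every test day strictly later than the training period, which is what makes the held-out score an honest estimate of forecasting skill.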

Perceptual weighting deep neural networks for single-channel speech enhancement

2016 12th World Congress on Intelligent Control and Automation (WCICA) ◽

10.1109/wcica.2016.7578300 ◽

2016 ◽

Cited By ~ 2

Author(s):

Wei Han ◽

Xiongwei Zhang ◽

Gang Min ◽

Xingyu Zhou ◽

Wei Zhang

Keyword(s):

Neural Networks ◽

Speech Enhancement ◽

Deep Neural Networks ◽

Single Channel ◽

Perceptual Weighting

Prediction of NMF-based Wiener Filter for Speech Enhancement Using Deep Neural Networks

2020 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC) ◽

10.1109/icspcc50002.2020.9259477 ◽

2020 ◽

Author(s):

Zhigang Bai ◽

Changchun Bao ◽

Zihao Cui

Keyword(s):

Neural Networks ◽

Speech Enhancement ◽

Deep Neural Networks ◽

Wiener Filter

Fusion of Amplitude and Complex Domains based on Deep Neural Networks for Speech Enhancement

2020 28th Iranian Conference on Electrical Engineering (ICEE) ◽

10.1109/icee50131.2020.9260798 ◽

2020 ◽

Author(s):

Mohammad Saeed Deylami ◽

Sanaz Seyedin

Keyword(s):

Neural Networks ◽

Speech Enhancement ◽

Deep Neural Networks

Text-informed speech enhancement with deep neural networks

10.21437/interspeech.2015-409 ◽

2015 ◽

Author(s):

Keisuke Kinoshita ◽

Marc Delcroix ◽

Atsunori Ogawa ◽

Tomohiro Nakatani

Keyword(s):

Neural Networks ◽

Speech Enhancement ◽

Deep Neural Networks

Evaluation of Power Insulator Detection Efficiency with the Use of Limited Training Dataset

Applied Sciences ◽

10.3390/app10062104 ◽

2020 ◽

Vol 10 (6) ◽

pp. 2104

Author(s):

Michał Tomaszewski ◽

Paweł Michalski ◽

Jakub Osuchowski

Keyword(s):

Neural Network ◽

Neural Networks ◽

Object Detection ◽

Convolutional Neural Network ◽

Deep Neural Networks ◽

Detection Efficiency ◽

Training Data ◽

Training Dataset ◽

Training Set ◽

Convolutional Network

This article presents an analysis of the effectiveness of object detection in digital images when only a limited quantity of input data is available. Using a limited set of learning data was made possible by developing a detailed scenario of the task, which strictly defined the conditions of detector operation in the considered case of a convolutional neural network. The described solution utilizes known architectures of deep neural networks in the process of learning and object detection. The article compares the detection results of the most popular deep neural networks when trained on a limited training set composed of a specific number of selected images from diagnostic video. The analyzed input material was recorded during an inspection flight conducted along high-voltage lines. The object detector was built for a power insulator. The main contribution of the presented paper is the evidence that a limited training set (in our case, just 60 training frames) can be used for object detection, assuming an outdoor scenario with low variability of environmental conditions. Deciding which network will generate the best result for such a limited training set is not a trivial task. The conducted research suggests that deep neural networks achieve different levels of effectiveness depending on the amount of training data. The best results were obtained for two convolutional neural networks: the faster region-convolutional neural network (faster R-CNN) and the region-based fully convolutional network (R-FCN). Faster R-CNN reached the highest AP (average precision), at a level of 0.8 for 60 frames. The R-FCN model achieved a lower AP; however, the number of input samples influenced its results significantly less than it did for the other CNN models, which, in the authors’ assessment, is a desirable feature in the case of a limited training set.

Towards better performance with heterogeneous training data in acoustic modeling using deep neural networks

10.21437/interspeech.2014-214 ◽

2014 ◽

Author(s):

Yan Huang ◽

Malcolm Slaney ◽

Michael L. Seltzer ◽

Yifan Gong

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Training Data ◽

Acoustic Modeling

Speech Enhancement Using Neuro-Fuzzy Classifier

Advances in Data Mining and Database Management - Handbook of Research on Automated Feature Engineering and Advanced Applications in Data Science ◽

10.4018/978-1-7998-6659-6.ch009 ◽

2021 ◽

pp. 164-181

Author(s):

Judith Justin ◽

Vanithamani R.

Keyword(s):

Feature Extraction ◽

Speech Enhancement ◽

The Other ◽

Objective Measures ◽

Noise Levels ◽

Fuzzy Classifier ◽

Noisy Speech ◽

Enhancement Technique ◽

Time Frequency ◽

Neuro Fuzzy

In this chapter, a speech enhancement technique is implemented using a neuro-fuzzy classifier. Noisy speech sentences from the NOIZEUS and AURORA databases are used for the study. Feature extraction is implemented through modifications of amplitude magnitude spectrograms. A four-class neuro-fuzzy classifier splits the time-frequency units of the noisy speech samples into a noise-only part, a signal-only part, a more-noise, less-signal part, and a more-signal, less-noise part. Appropriate weights are applied in the enhancement phase. The enhanced speech sentence is evaluated using objective measures. The performance of the Neuro-Fuzzy 4 (NF4) classifier is analyzed and compared with that of other conventional techniques for various noises at different noise levels. The numerical values of the measures obtained are better than those of the other techniques. From an overall comparison, it is inferred that the NF4 classifier outperforms the other techniques in speech enhancement.

Howling Noise Cancellation in Time–Frequency Domain by Deep Neural Networks

10.1007/978-981-16-2380-6_28 ◽

2021 ◽

pp. 319-332

Author(s):

Huaguo Gan ◽

Gaoyong Luo ◽

Yaqing Luo ◽

Wenbin Luo

Keyword(s):

Neural Networks ◽

Frequency Domain ◽

Deep Neural Networks ◽

Noise Cancellation ◽

Time Frequency

MSpectraAI: a powerful platform for deciphering proteome profiling of multi-tumor mass spectrometry data by using deep neural networks

BMC Bioinformatics ◽

10.1186/s12859-020-03783-0 ◽

2020 ◽

Vol 21 (1) ◽

Author(s):

Shisheng Wang ◽

Hongwen Zhu ◽

Hu Zhou ◽

Jingqiu Cheng ◽

Hao Yang

Keyword(s):

Mass Spectrometry ◽

Neural Networks ◽

Large Scale ◽

Deep Neural Networks ◽

Spectral Feature ◽

Mass Spectrometry Data ◽

Learning Approaches ◽

Proteomics Data ◽

Proteome Profiling ◽

Analytical Technique

Background: Mass spectrometry (MS) has become a promising analytical technique for acquiring proteomics information to characterize biological samples. Nevertheless, most studies focus on the final proteins identified through a suite of algorithms that compare partial MS spectra with a sequence database, while the pattern recognition and classification of raw mass-spectrometric data remain unresolved. Results: We developed an open-source and comprehensive platform, named MSpectraAI, for analyzing large-scale MS data through deep neural networks (DNNs); this system involves spectral-feature swath extraction, classification, and visualization. Moreover, this platform allows users to create their own DNN model by using Keras. To evaluate this tool, we collected the publicly available proteomics datasets of six tumor types (a total of 7,997,805 mass spectra) from the ProteomeXchange consortium and classified the samples based on their spectra profiling. The results suggest that MSpectraAI can distinguish different types of samples based on the fingerprint spectrum and achieves better prediction accuracy at the MS1 level (average 0.967). Conclusion: This study deciphers proteome profiling of raw mass spectrometry data and broadens the promising application of deep learning methods to the classification and prediction of proteomics data from multi-tumor samples. MSpectraAI also performs better than the other classical machine learning approaches.
