Speech Enhancement Using Generative Adversarial Network by Distilling Knowledge from Statistical Method

Jianfeng Wu; Yongzhu Hua; Shengying Yang; Hongshuai Qin; Huibin Qin

doi:10.3390/app9163396

Applied Sciences ◽

10.3390/app9163396 ◽

2019 ◽

Vol 9 (16) ◽

pp. 3396 ◽

Cited By ~ 3

Author(s):

Jianfeng Wu ◽

Yongzhu Hua ◽

Shengying Yang ◽

Hongshuai Qin ◽

Huibin Qin

Keyword(s):

Neural Network ◽

Statistical Method ◽

Speech Enhancement ◽

Data Sets ◽

Generative Adversarial Network ◽

Adversarial Learning ◽

Noisy Speech ◽

Adversarial Network ◽

Knowledge Distillation ◽

Enhancement Algorithm

This paper presents a new deep neural network (DNN)-based speech enhancement algorithm that integrates knowledge distilled from a traditional statistical method. Other DNN-based methods usually either train many different models on the same data and average their predictions, or use a large number of noise types to enlarge the simulated noisy speech set. The proposed method, by contrast, neither trains a whole ensemble of models nor requires a large amount of simulated noisy speech. It first trains a discriminator network and a generator network simultaneously using adversarial learning. Then, both networks are re-trained by distilling knowledge from the statistical method, an approach inspired by knowledge distillation in neural networks. Finally, the generator network is fine-tuned using real noisy speech. Experiments on the CHiME4 data sets demonstrate that the proposed method achieves more robust performance than the compared DNN-based method in terms of perceptual speech quality.
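
The distillation step described above can be illustrated with a minimal sketch. The abstract names neither the statistical method nor the loss, so everything here is an assumption: a Wiener gain with a maximum-likelihood a priori SNR estimate stands in for the statistical "teacher", and the hypothetical `distillation_loss` mixes an error against the clean target with an error against the teacher output through an assumed weight `alpha`.

```python
import math


def wiener_gain(noisy_power, noise_power):
    """Per-bin Wiener gain; assumes every noise power is strictly positive."""
    gains = []
    for y, n in zip(noisy_power, noise_power):
        snr = max(y / n - 1.0, 0.0)  # a priori SNR estimate, floored at 0
        gains.append(snr / (1.0 + snr))
    return gains


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(student, clean, teacher, alpha=0.5):
    """Assumed loss: weighted sum of the clean-target and teacher-target errors."""
    return alpha * mse(student, clean) + (1.0 - alpha) * mse(student, teacher)


# Toy spectrum with three frequency bins (illustrative numbers only).
noisy_power = [4.0, 1.0, 9.0]
noise_power = [1.0, 1.0, 1.0]
gains = wiener_gain(noisy_power, noise_power)

# Teacher output: the gain applied to the noisy magnitude spectrum.
teacher = [g * math.sqrt(p) for g, p in zip(gains, noisy_power)]

# A hypothetical student (generator) output and clean reference.
student = [1.4, 0.2, 2.5]
clean = [1.6, 0.0, 2.8]
loss = distillation_loss(student, clean, teacher, alpha=0.5)
```

With `alpha=1.0` the loss reduces to ordinary supervised regression on the clean target; smaller values pull the student toward the statistical estimate.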

Realistic Face Image Generation System Based on GANs

Fuzzy Systems and Data Mining VI - Frontiers in Artificial Intelligence and Applications ◽

10.3233/faia200707 ◽

2020 ◽

Author(s):

Zhike Han ◽

Bin Yang ◽

Yiren Du ◽

Xingyu Du ◽

Hao Xing ◽

...

Keyword(s):

Neural Network ◽

Network Model ◽

Neural Network Model ◽

Generative Adversarial Networks ◽

Data Sets ◽

Generation System ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Adversarial Networks ◽

Face Generation

This paper studies how generative adversarial networks (GANs) can help with face generation and explores whether such a network is effective for complex face generation. An image translation neural network model based on a generative adversarial network is trained on a large number of real human face data sets. Input information is obtained with a CV2-based face tagging algorithm and an HED-based face edge extraction algorithm. Based on the translation model, a face generation system is then developed with TensorFlow, Torch and other frameworks; it generates real faces from sketches or “changes faces” starting from existing faces. Finally, the model provides its training configuration and training information.

Adversarial Gaussian Denoiser for Multiple-Level Image Denoising

Sensors ◽

10.3390/s21092998 ◽

2021 ◽

Vol 21 (9) ◽

pp. 2998

Author(s):

Aamir Khan ◽

Weidong Jin ◽

Amir Haider ◽

MuhibUr Rahman ◽

Desheng Wang

Keyword(s):

Neural Network ◽

Image Processing ◽

Computer Vision ◽

Image Denoising ◽

Theoretical Study ◽

State Of The Art ◽

Multiple Level ◽

Generative Adversarial Network ◽

Adversarial Learning ◽

Adversarial Network

Image denoising is a challenging task that is essential in numerous computer vision and image processing problems. This study proposes a generative adversarial network-based training architecture for image denoising and applies it to multiple-level Gaussian image denoising tasks. Convolutional neural network-based denoising approaches suffer from a blurriness issue: the denoised images are blurry in their texture details. To resolve this issue, we first performed a theoretical study of its cause. Subsequently, we proposed an adversarial Gaussian denoiser network, which uses the adversarial learning process of a generative adversarial network for image denoising tasks. This framework resolves the blurriness problem by encouraging the denoiser network to find the distribution of sharp noise-free images instead of blurry images. Experimental results demonstrate that the proposed framework can effectively resolve the blurriness problem and achieve significantly higher denoising efficiency than state-of-the-art denoising methods.
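
To make "multiple-level Gaussian denoising" concrete, the sketch below corrupts a synthetic image with additive Gaussian noise at several standard deviations and scores each result with PSNR. The noise levels (15, 25, 50), the random test image and the helper names are assumptions for illustration; none of them come from the abstract, and no denoiser is implemented here.

```python
import math
import random


def add_gaussian_noise(pixels, sigma, rng):
    """Add zero-mean Gaussian noise with std `sigma`, then clip to [0, 255]."""
    return [min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    return float("inf") if err == 0 else 10.0 * math.log10(peak ** 2 / err)


rng = random.Random(0)
# Synthetic flattened "image": 10,000 random 8-bit intensities.
clean = [float(rng.randrange(256)) for _ in range(10_000)]

# PSNR of the noisy input at each assumed noise level.
psnr_by_level = {}
for sigma in (15, 25, 50):
    noisy = add_gaussian_noise(clean, sigma, rng)
    psnr_by_level[sigma] = psnr(clean, noisy)
```

A denoiser trained for multiple levels would be evaluated by how much it raises PSNR above these noisy-input baselines at each level.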

Self-Attention Generative Adversarial Network for Speech Enhancement

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp39728.2021.9414265 ◽

2021 ◽

Author(s):

Huy Phan ◽

Huy Le Nguyen ◽

Oliver Y. Chen ◽

Philipp Koch ◽

Ngoc Q. K. Duong ◽

...

Keyword(s):

Speech Enhancement ◽

Generative Adversarial Network ◽

Adversarial Network

3D Convolutional Neural Network for Hyperspectral Image Classification Using Generative Adversarial Network

10.1109/icicta51737.2020.00065 ◽

2020 ◽

Author(s):

QiRui Yang ◽

Yu Liu ◽

Tong Zhou ◽

YuanXi Peng ◽

YuHua Tang

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Image Classification ◽

Hyperspectral Image ◽

Hyperspectral Image Classification ◽

Generative Adversarial Network ◽

Adversarial Network

AN AI-BASED APPROACH TO ENHANCED FRACTURE RESOLUTION IN IMAGE LOGS

10.30632/spwla-2021-0081 ◽

2021 ◽

Author(s):

James Howard ◽

Joe Tracey ◽

Mike Shen ◽

Shawn Zhang ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Nearest Neighbor ◽

Rock Fracture ◽

Short Interval ◽

Acoustic Properties ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Deep Learning Neural Network ◽

Borehole Image

Borehole image logs are used to identify the presence and orientation of fractures, both natural and induced, found in reservoir intervals. The contrast in electrical or acoustic properties between the rock matrix and fluid-filled fractures is large enough that sub-resolution features can be detected by these image logging tools. The resolution of these image logs depends on the design and operation of the tools and is generally in the millimeter-per-pixel range, so the quantitative measurement of actual fracture width remains problematic. An artificial intelligence (AI)-based workflow combines the statistical information obtained from a machine learning (ML) segmentation process with a multiple-layer neural network that defines a deep learning process for enhancing fractures in a borehole image. These new images allow for a more robust analysis of fracture widths, especially those that are sub-resolution. The images from a BHTV log were first segmented into rock and fluid-filled fractures using an ML segmentation tool that applied multiple image processing filters to capture information describing patterns in fracture-rock distribution based on nearest-neighbor behavior. The ML analysis was trained by users to identify these two components over a short interval in the well, and the regression model-based coefficients were then applied to the rest of the log. Based on the training, each pixel was assigned a probability value between 1.0 (a fracture) and 0.0 (pure rock), with most pixels assigned one of these two values. Intermediate probabilities represented pixels on the edge of a rock-fracture interface or the presence of one or more sub-resolution fractures within the rock. The resulting probability matrix produced a map of whether a given pixel in the image was a fracture or was partially filled with a fracture.
The deep learning neural network was based on a Conditional Generative Adversarial Network (cGAN) approach in which the probability map was first encoded and combined with a noise vector that acted as a seed for diverse feature generation. This combination was used to generate new images that represented the BHTV response. The second layer of the neural network, the adversarial or discriminator portion, determined whether the generated images were representative of the actual BHTV response by comparing them with actual images from the log and producing an output probability of whether each was real or fake. This probability was then used to train the generator and discriminator models, which were then applied to the entire log. Several scenarios were run with different probability maps. The enhanced BHTV images brought out fractures observed in the core photos that were less obvious in the original BHTV log through enhanced continuity and improved resolution of fracture widths.
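
The per-pixel probability map described above can be read as a three-way labeling. The sketch below is only an illustration of that reading: the thresholds `low` and `high`, the label names and the helper functions are hypothetical and do not come from the abstract.

```python
def classify_pixels(prob_map, low=0.05, high=0.95):
    """Label each pixel from its fracture probability (thresholds are assumed)."""
    labels = []
    for row in prob_map:
        labels.append([
            "fracture" if p >= high else "rock" if p <= low else "mixed"
            for p in row
        ])
    return labels


def fracture_fraction(prob_map):
    """Mean probability, a rough estimate of the fracture-filled area fraction."""
    values = [p for row in prob_map for p in row]
    return sum(values) / len(values)


# Toy 2x4 probability map: mostly 0.0 or 1.0, one intermediate pixel.
prob_map = [
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.4, 1.0, 0.0],
]
labels = classify_pixels(prob_map)
fraction = fracture_fraction(prob_map)
```

The single "mixed" pixel corresponds to the intermediate probabilities that the abstract attributes to interface edges or sub-resolution fractures.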

GAINESIS: Generative Artificial Intelligence NEtlists SynthesIS

Electronics ◽

10.3390/electronics11020245 ◽

2022 ◽

Vol 11 (2) ◽

pp. 245

Author(s):

Konstantinos G. Liakos ◽

Georgios K. Georgakilas ◽

Fotis C. Plessas ◽

Paris Kitsos

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Power Analysis ◽

Public Libraries ◽

Data Sets ◽

Hardware Trojan ◽

Generative Adversarial Network ◽

Data Set ◽

Encrypted Data ◽

Adversarial Network

Hardware trojans (HTs) are a significant problem in the field of hardware security. HTs can be inserted into a circuit at any phase of the circuit production chain. They degrade the infected circuit, destroy it or leak encrypted data. Efforts are now being made to address HTs through machine learning (ML) techniques, mainly for the gate-level netlist (GLN) phase, but there are some restrictions. Specifically, the number and variety of normal and infected circuits available through free public libraries, such as Trust-HUB, are based on a few benchmark samples that have been created from large circuits. Thus, it is difficult to develop robust ML-based models against HTs from these data. In this paper, we propose a new deep learning (DL) tool named Generative Artificial Intelligence Netlists SynthesIS (GAINESIS). GAINESIS is based on the Wasserstein Conditional Generative Adversarial Network (WCGAN) algorithm and area–power analysis features from the GLN phase, and it synthesizes new normal and infected circuit samples for this phase. Using GAINESIS, we synthesized new data sets of different sizes and developed and compared seven ML classifiers. The results demonstrate that our newly generated data sets significantly enhance the performance of ML classifiers compared with the initial Trust-HUB data set.

Convolutional Neural Network Audio Classifier for Alarm Sound Detection

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f8866.088619 ◽

2019 ◽

Vol 8 (6) ◽

pp. 4554-4557

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Short Term Memory ◽

Sound Recognition ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Differential Network ◽

Sound Detection ◽

Long Short Term Memory ◽

Lstm Network

Artificial neural networks (ANNs) have evolved through many stages in the last three decades, with many researchers contributing to this challenging field. With the power of mathematics, ANNs can also solve complex problems. ANNs such as the Convolutional Neural Network (CNN), Deep Neural Network, Generative Adversarial Network (GAN), Long Short Term Memory (LSTM) network, Recurrent Neural Network (RNN) and Ordinary Differential Network play promising roles in many MNCs and IT industries because of their predictions and accuracy. In this paper, a Convolutional Neural Network is used to predict beep sounds at high noise levels. Based on supervised learning, this research develops the best CNN architecture for beep sound recognition in noisy situations. The proposed method gives better results, with an accuracy of 96%. The prototype was tested with a few architectures on the training and test data, of which a two-layer CNN classifier gave the best predictions.

Speech Enhancement Using Deep Learning Methods: A Review

Jurnal Elektronika dan Telekomunikasi ◽

10.14203/jet.v21.19-26 ◽

2021 ◽

Vol 21 (1) ◽

pp. 19

Author(s):

Asri Rizki Yuliani ◽

M. Faizal Amri ◽

Endang Suryawati ◽

Ade Ramdan ◽

Hilman Ferdinandus Pardede

Keyword(s):

Neural Network ◽

Deep Learning ◽

Speech Enhancement ◽

Speech Signal ◽

Research Field ◽

Learning Technologies ◽

Learning Approaches ◽

Speech Signal Processing ◽

Generative Adversarial Network ◽

Advantages And Disadvantages

Speech enhancement, which aims to recover clean speech from a corrupted signal, plays an important role in digital speech signal processing. Approaches to speech enhancement vary according to the type of degradation and noise in the speech signal. The research topic therefore remains challenging in practice, specifically when dealing with highly non-stationary noise and reverberation. Recent advances in deep learning technologies have provided great support for progress in the speech enhancement research field. Deep learning has been known to outperform the statistical models used in conventional speech enhancement. Hence, it deserves a dedicated survey. In this review, we describe the advantages and disadvantages of recent deep learning approaches. We also discuss challenges and trends in this field. From the reviewed works, we conclude that the trend in deep learning architectures has shifted from the standard deep neural network (DNN) to the convolutional neural network (CNN), which can efficiently learn temporal information of the speech signal, and the generative adversarial network (GAN), which trains two networks together.

JANE: Jointly Adversarial Network Embedding

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/192 ◽

2020 ◽

Author(s):

Liang Yang ◽

Yuexue Wang ◽

Junhua Gu ◽

Chuan Wang ◽

Xiaochun Cao ◽

...

Keyword(s):

Link Prediction ◽

Real Data ◽

Semantic Space ◽

Network Embedding ◽

Generative Adversarial Network ◽

Adversarial Learning ◽

Adversarial Network ◽

Node Clustering ◽

Topology Information ◽

Embedding Methods

Motivated by the capability of generative adversarial networks to explore the latent semantic space and capture semantic variations in the data distribution, adversarial learning has been adopted in network embedding to improve robustness. However, this important ability is lost in existing adversarially regularized network embedding methods, because their embedding results are directly compared to samples drawn from a perturbation (Gaussian) distribution without any rectification from real data. To overcome this vital issue, a novel Joint Adversarial Network Embedding (JANE) framework is proposed to jointly distinguish the real and fake combinations of the embeddings, topology information and node features. JANE contains three pluggable components: an Embedding module, a Generator module and a Discriminator module. The overall objective function of JANE is defined in a min-max form, which can be optimized via alternating stochastic gradient. Extensive experiments demonstrate the remarkable superiority of the proposed JANE on link prediction (3% gains in both AUC and AP) and node clustering (5% gain in F1 score).
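
For orientation, the min-max form mentioned above follows the pattern of the standard GAN objective, shown here in its generic form. This is not JANE's exact objective, which also involves embeddings, topology information and node features; the symbols $G$, $D$, $p_{\text{data}}$ and $p_z$ are the usual generic ones and are assumptions relative to the abstract.

```latex
\min_{G}\max_{D}\; V(D,G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

Alternating optimization updates $D$ to increase $V$ and $G$ to decrease it in turn.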