scholarly journals Learning Ratio Mask with Cascaded Deep Neural Networks for Echo Cancellation in Laser Monitoring Signals

Electronics ◽  
2020 ◽  
Vol 9 (5) ◽  
pp. 856
Author(s):  
Haitao Lang ◽  
Jie Yang

Laser monitoring has received more and more attention in many application fields thanks to its essential advantages. The analysis shows that the target speech in the laser monitoring signals is often interfered by the echoes, resulting in a decline in speech intelligibility and quality, which in turn affects the identification of useful information. The cancellation of echoes in laser monitoring signals is not a trivial task. In this article, we formulate it as a simple but effective additive echo noise model and propose a cascade deep neural networks (C-DNNs) as the mapping function from the acoustic feature of noisy speech to the ratio mask of clean signal. To validate the feasibility and effectiveness of the proposed method, we investigated the effect of echo intensity, echo delay, and training target on the performance. We also compared the proposed C-DNNs to some traditional and newly emerging DNN-based supervised learning methods. Extensive experiments demonstrated the proposed method can greatly improve the speech intelligibility and speech quality of the echo-cancelled signals and outperform the comparison methods.

Author(s):  
Yun-Peng Liu ◽  
Ning Xu ◽  
Yu Zhang ◽  
Xin Geng

The performances of deep neural networks (DNNs) crucially rely on the quality of labeling. In some situations, labels are easily corrupted, and therefore some labels become noisy labels. Thus, designing algorithms that deal with noisy labels is of great importance for learning robust DNNs. However, it is difficult to distinguish between clean labels and noisy labels, which becomes the bottleneck of many methods. To address the problem, this paper proposes a novel method named Label Distribution based Confidence Estimation (LDCE). LDCE estimates the confidence of the observed labels based on label distribution. Then, the boundary between clean labels and noisy labels becomes clear according to confidence scores. To verify the effectiveness of the method, LDCE is combined with the existing learning algorithm to train robust DNNs. Experiments on both synthetic and real-world datasets substantiate the superiority of the proposed algorithm against state-of-the-art methods.


2021 ◽  
Vol 37 (2) ◽  
pp. 123-143
Author(s):  
Tuan Minh Luu ◽  
Huong Thanh Le ◽  
Tan Minh Hoang

Deep neural networks have been applied successfully to extractive text summarization tasks with the accompany of large training datasets. However, when the training dataset is not large enough, these models reveal certain limitations that affect the quality of the system’s summary. In this paper, we propose an extractive summarization system basing on a Convolutional Neural Network and a Fully Connected network for sentence selection. The pretrained BERT multilingual model is used to generate embeddings vectors from the input text. These vectors are combined with TF-IDF values to produce the input of the text summarization system. Redundant sentences from the output summary are eliminated by the Maximal Marginal Relevance method. Our system is evaluated with both English and Vietnamese languages using CNN and Baomoi datasets, respectively. Experimental results show that our system achieves better results comparing to existing works using the same dataset. It confirms that our approach can be effectively applied to summarize both English and Vietnamese languages.


2021 ◽  
Author(s):  
Viktória Burkus ◽  
Attila Kárpáti ◽  
László Szécsi

Surface reconstruction for particle-based fluid simulation is a computational challenge on par with the simula- tion itself. In real-time applications, splatting-style rendering approaches based on forward rendering of particle impostors are prevalent, but they suffer from noticeable artifacts. In this paper, we present a technique that combines forward rendering simulated features with deep-learning image manipulation to improve the rendering quality of splatting-style approaches to be perceptually similar to ray tracing solutions, circumventing the cost, complexity, and limitations of exact fluid surface rendering by replacing it with the flat cost of a neural network pass. Our solution is based on the idea of training generative deep neural networks with image pairs consisting of cheap particle impostor renders and ground truth high quality ray-traced images.


10.29007/p655 ◽  
2018 ◽  
Author(s):  
Sai Prabhakar Pandi Selvaraj ◽  
Manuela Veloso ◽  
Stephanie Rosenthal

Significant advances in the performance of deep neural networks, such as Convolutional Neural Networks (CNNs) for image classification, have created a drive for understanding how they work. Different techniques have been proposed to determine which features (e.g., image pixels) are most important for a CNN’s classification. However, the important features output by these techniques have typically been judged subjectively by a human to assess whether the important features capture the features relevant to the classification and not whether the features were actually important to classifier itself. We address the need for an objective measure to assess the quality of different feature importance measures. In particular, we propose measuring the ratio of a CNN’s accuracy on the whole image com- pared to an image containing only the important features. We also consider scaling this ratio by the relative size of the important region in order to measure the conciseness. We demonstrate that our measures correlate well with prior subjective comparisons of important features, but importantly do not require their human studies. We also demonstrate that the features on which multiple techniques agree are important have a higher impact on accuracy than those features that only one technique finds.


Processes ◽  
2021 ◽  
Vol 9 (12) ◽  
pp. 2286
Author(s):  
Ammar Amjad ◽  
Lal Khan ◽  
Hsien-Tsung Chang

Recently, identifying speech emotions in a spontaneous database has been a complex and demanding study area. This research presents an entirely new approach for recognizing semi-natural and spontaneous speech emotions with multiple feature fusion and deep neural networks (DNN). A proposed framework extracts the most discriminative features from hybrid acoustic feature sets. However, these feature sets may contain duplicate and irrelevant information, leading to inadequate emotional identification. Therefore, an support vector machine (SVM) algorithm is utilized to identify the most discriminative audio feature map after obtaining the relevant features learned by the fusion approach. We investigated our approach utilizing the eNTERFACE05 and BAUM-1s benchmark databases and observed a significant identification accuracy of 76% for a speaker-independent experiment with SVM and 59% accuracy with, respectively. Furthermore, experiments on the eNTERFACE05 and BAUM-1s dataset indicate that the suggested framework outperformed current state-of-the-art techniques on the semi-natural and spontaneous datasets.


Symmetry ◽  
2022 ◽  
Vol 14 (1) ◽  
pp. 151
Author(s):  
Xintao Duan ◽  
Lei Li ◽  
Yao Su ◽  
Wenxin Wang ◽  
En Zhang ◽  
...  

Data hiding is the technique of embedding data into video or audio media. With the development of deep neural networks (DNN), the quality of images generated by novel data hiding methods based on DNN is getting better. However, there is still room for the similarity between the original images and the images generated by the DNN models which were trained based on the existing hiding frameworks to improve, and it is hard for the receiver to distinguish whether the container image is from the real sender. We propose a framework by introducing a key_img for using the over-fitting characteristic of DNN and combined with difference image grafting symmetrically, named difference image grafting deep hiding (DIGDH). The key_img can be used to identify whether the container image is from the real sender easily. The experimental results show that without changing the structures of networks, the models trained based on the proposed framework can generate images with higher similarity to original cover and secret images. According to the analysis results of the steganalysis tool named StegExpose, the container images generated by the hiding model trained based on the proposed framework is closer to the random distribution.


2021 ◽  
Author(s):  
Murat Seckin Ayhan ◽  
Louis Benedikt Kuemmerle ◽  
Laura Kuehlewein ◽  
Werner Inhoffen ◽  
Gulnar Aliyeva ◽  
...  

Deep neural networks (DNNs) have achieved physician-level accuracy on many imaging-based medical diagnostic tasks, for example classification of retinal images in ophthalmology. However, their decision mechanisms are often considered impenetrable leading to a lack of trust by clinicians and patients. To alleviate this issue, a range of explanation methods have been proposed to expose the inner workings of DNNs leading to their decisions. For imaging-based tasks, this is often achieved via saliency maps. The quality of these maps are typically evaluated via perturbation analysis without experts involved. To facilitate the adoption and success of such automated systems, however, it is crucial to validate saliency maps against clinicians. In this study, we used two different network architectures and developed ensembles of DNNs to detect diabetic retinopathy and neovascular age-related macular degeneration from retinal fundus images and optical coherence tomography scans, respectively. We used a variety of explanation methods and obtained a comprehensive set of saliency maps for explaining the ensemble-based diagnostic decisions. Then, we systematically validated saliency maps against clinicians through two main analyses --- a direct comparison of saliency maps with the expert annotations of disease-specific pathologies and perturbation analyses using also expert annotations as saliency maps. We found the choice of DNN architecture and explanation method to significantly influence the quality of saliency maps. Guided Backprop showed consistently good performance across disease scenarios and DNN architectures, suggesting that it provides a suitable starting point for explaining the decisions of DNNs on retinal images.


Sign in / Sign up

Export Citation Format

Share Document