An Input-Perceptual Reconstruction Adversarial Network for Paired Image-to-Image Conversion

Sensors ◽  
2020 ◽  
Vol 20 (15) ◽  
pp. 4161
Author(s):  
Aamir Khan ◽  
Weidong Jin ◽  
Muqeet Ahmad ◽  
Rizwan Ali Naqvi ◽  
Desheng Wang

Image-to-image conversion based on deep learning techniques is a topic of interest in the fields of robotics and computer vision. A series of typical tasks, such as converting semantic labels to building photos, edges to photos, and rainy images to de-rained images, can be seen as paired image-to-image conversion problems. In such problems, the image generation network learns from information supplied in the form of input images. The input images and the corresponding target images must share the same basic structure for the network to generate target-oriented output images perfectly. However, the shared basic structure between paired images is not as ideal as assumed, which can significantly affect the output of the generating model. Therefore, we propose a novel Input-Perceptual and Reconstruction Adversarial Network (IP-RAN) as an all-purpose framework for imperfect paired image-to-image conversion problems. We demonstrate, through the experimental results, that our IP-RAN method significantly outperforms the current state-of-the-art techniques.

2021 ◽  
Vol 13 (19) ◽  
pp. 3836
Author(s):  
Clément Dechesne ◽  
Pierre Lassalle ◽  
Sébastien Lefèvre

In recent years, numerous deep learning techniques have been proposed to tackle the semantic segmentation of aerial and satellite images. These techniques have increased trust in the leaderboards of the main scientific contests and represent the current state of the art. Nevertheless, despite their promising results, these state-of-the-art techniques are still unable to provide results with the level of accuracy sought in real applications, i.e., in operational settings. Thus, it is mandatory to qualify these segmentation results and estimate the uncertainty brought about by a deep network. In this work, we address uncertainty estimation in semantic segmentation. To do this, we rely on a Bayesian deep learning method, based on Monte Carlo Dropout, which allows us to derive uncertainty metrics along with the semantic segmentation. Built on the widespread U-Net architecture, our model achieves semantic segmentation with high accuracy on several state-of-the-art datasets. More importantly, uncertainty maps are also derived from our model. They allow a sounder qualitative evaluation of the segmentation results, and they also include valuable information for improving the reference databases.
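The abstract does not give implementation details, so the following is only a minimal sketch of the general Monte Carlo Dropout idea, assuming a toy linear classifier for a single pixel rather than the authors' U-Net. Dropout stays active at prediction time, several stochastic forward passes are averaged, and the entropy of the averaged class probabilities serves as one possible uncertainty metric. All names, weights, and parameters (`mc_dropout_predict`, `drop_prob`, `n_samples`) are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward_with_dropout(features, weights, drop_prob, rng):
    """One stochastic forward pass of a toy linear classifier.

    Each input feature is dropped with probability drop_prob (must be < 1).
    Kept features are rescaled (inverted dropout) so the expected
    activation is unchanged.
    """
    keep = 1.0 - drop_prob
    dropped = [f / keep if rng.random() < keep else 0.0 for f in features]
    logits = [sum(w * x for w, x in zip(row, dropped)) for row in weights]
    return softmax(logits)


def mc_dropout_predict(features, weights, drop_prob=0.5, n_samples=200, seed=0):
    """Average several dropout-perturbed predictions for one pixel.

    Returns the mean class probabilities and their entropy, which is
    used here as the (assumed) per-pixel uncertainty score.
    """
    rng = random.Random(seed)
    samples = [
        forward_with_dropout(features, weights, drop_prob, rng)
        for _ in range(n_samples)
    ]
    n_classes = len(weights)
    mean_probs = [sum(s[c] for s in samples) / n_samples for c in range(n_classes)]
    entropy = -sum(p * math.log(p) for p in mean_probs if p > 0.0)
    return mean_probs, entropy


# Hypothetical 3-class weights and a 3-feature pixel descriptor.
toy_weights = [[2.0, -1.0, 0.5], [-1.0, 2.0, 0.5], [0.0, 0.0, 1.0]]
toy_pixel = [1.0, 0.2, 0.1]
mean_probs, uncertainty = mc_dropout_predict(toy_pixel, toy_weights)
```

Applying the same routine to every pixel of an image would yield an uncertainty map; a real segmentation network would replace the toy linear layer.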


2019 ◽  
Vol 11 (12) ◽  
pp. 1499
Author(s):  
David Griffiths ◽  
Jan Boehm

Over the past decade, deep learning has driven progress in 2D image understanding. Despite these advancements, techniques for the automatic understanding of 3D sensed data, such as point clouds, are comparatively immature. However, with a range of important applications from indoor robotics navigation to national-scale remote sensing, there is a high demand for algorithms that can learn to automatically understand and classify 3D sensed data. In this paper, we review the current state-of-the-art deep learning architectures for processing unstructured Euclidean data. We begin by addressing the background concepts and traditional methodologies. We then review the current main approaches, including RGB-D, multi-view, volumetric and fully end-to-end architecture designs. Datasets for each category are documented and explained. Finally, we give a detailed discussion of the future of deep learning for 3D sensed data, using the literature to justify the areas where future research would be most valuable.


Information ◽  
2020 ◽  
Vol 11 (6) ◽  
pp. 321
Author(s):  
Nicola Convertini ◽  
Vincenzo Dentamaro ◽  
Donato Impedovo ◽  
Giuseppe Pirlo ◽  
Lucia Sarcinella

This benchmarking study aims to examine and discuss the current state-of-the-art techniques for in-video violence detection, and also to provide benchmarking results as a reference accuracy baseline for future violence detection systems. In this paper, the authors review 11 techniques for in-video violence detection. They re-implement five carefully chosen state-of-the-art techniques on three different, publicly available violence datasets, using several classifiers, all under the same conditions. The main contribution of this work is a comparison of feature-based violence detection techniques and modern deep-learning techniques, such as Inception V3.


Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4486
Author(s):  
Niall O’Mahony ◽  
Sean Campbell ◽  
Lenka Krpalkova ◽  
Anderson Carvalho ◽  
Joseph Walsh ◽  
...  

Fine-grained change detection in sensor data is very challenging for artificial intelligence though it is critically important in practice. It is the process of identifying differences in the state of an object or phenomenon where the differences are class-specific and are difficult to generalise. As a result, many recent technologies that leverage big data and deep learning struggle with this task. This review focuses on the state-of-the-art methods, applications, and challenges of representation learning for fine-grained change detection. Our research focuses on methods of harnessing the latent metric space of representation learning techniques as an interim output for hybrid human-machine intelligence. We review methods for transforming and projecting embedding space such that significant changes can be communicated more effectively and a more comprehensive interpretation of underlying relationships in sensor data is facilitated. We conduct this research in our work towards developing a method for aligning the axes of latent embedding space with meaningful real-world metrics so that the reasoning behind the detection of change in relation to past observations may be revealed and adjusted. This is an important topic in many fields concerned with producing more meaningful and explainable outputs from deep learning and also for providing means for knowledge injection and model calibration in order to maintain user confidence.


Recently, DDoS attacks have become the most significant threat in network security. Both industry and academia are currently debating how to detect and protect against DDoS attacks, and many studies have proposed ways to detect these types of attacks. Deep learning techniques are the most suitable and efficient algorithms for categorizing normal and attack data. Hence, a deep neural network approach is proposed in this study to mitigate DDoS attacks effectively. We used a deep learning neural network to identify and classify traffic as benign or as one of four different DDoS attacks: Slowloris, Slowhttptest, DDoS Hulk, and GoldenEye. The rest of the paper is organized as follows: first, we introduce the work; Section 2 reviews the related work; Section 3 presents the problem statement; Section 4 describes the proposed methodology; Section 5 illustrates the results of the proposed methodology and shows how it outperforms state-of-the-art work; and, finally, Section 6 concludes the paper.


2021 ◽  
Author(s):  
Phongsathorn Kittiworapanya ◽  
Kitsuchart Pasupa ◽  
Peter Auer

We assessed several state-of-the-art deep learning algorithms and computer vision techniques for estimating the particle size of mixed commercial waste from images. In waste management, the first step is often coarse shredding, and the particle size is used to set up the shredder machine. The difficulty lies in separating the waste particles in an image, which cannot be done reliably. This work focused on estimating size by using the texture of the input image, captured at a fixed height from the camera lens to the ground. We found that EfficientNet achieved the best performance, with an F1-score of 0.72 and an accuracy of 75.89%.
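For readers unfamiliar with the two reported metrics, the sketch below shows how accuracy and an F1-score can be computed from predicted and true class labels. The abstract does not state which F1 averaging was used, so macro-averaging over classes is an assumption here, and the size-class labels and the function name `accuracy_and_macro_f1` are hypothetical.

```python
def accuracy_and_macro_f1(y_true, y_pred):
    """Return (accuracy, macro-averaged F1) for two equal-length label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))

    # Accuracy: fraction of samples whose predicted label matches the true one.
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # Per-class F1 = 2*TP / (2*TP + FP + FN); defined as 0 when undefined.
        f1_scores.append(2 * tp / denom if denom else 0.0)

    # Macro average (assumed): unweighted mean of the per-class F1 values.
    return accuracy, sum(f1_scores) / len(f1_scores)


# Hypothetical particle-size classes for four images.
true_sizes = ["small", "small", "large", "large"]
pred_sizes = ["small", "large", "large", "large"]
acc, macro_f1 = accuracy_and_macro_f1(true_sizes, pred_sizes)
```

In this toy case three of four predictions are correct, so the accuracy is 0.75; the per-class F1 values are 2/3 for "small" and 4/5 for "large".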


Author(s):  
Jwalin Bhatt ◽  
Khurram Azeem Hashmi ◽  
Muhammad Zeshan Afzal ◽  
Didier Stricker

In any document, graphical elements like tables, figures, and formulas contain essential information. The processing and interpretation of such information require specialized algorithms, because off-the-shelf OCR components cannot process it reliably. Therefore, an essential step in document analysis pipelines is to detect these graphical components. This detection leads to a high-level conceptual understanding of the documents that makes their digitization viable. Since the advent of deep learning, the performance of deep learning-based object detection has improved manyfold. In this work, we outline and summarize the deep learning approaches for detecting graphical page objects in document images. We discuss the most relevant deep learning-based approaches and the state of the art in graphical page object detection in document images. This work provides a comprehensive understanding of the current state of the art and related challenges. Furthermore, we discuss leading datasets along with their quantitative evaluation. Finally, we briefly discuss promising directions that can be pursued for further improvements.


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Chunyu Li ◽  
Lei Wang

Along with urban renewal and development, the urban living environment has given rise to various problems that need to be solved. With an eye on the future development model of residential communities, an experimental preliminary design for the construction of architectural space, public space, and landscape space based on people’s actual needs is carried out in an attempt to alleviate the more urgent symbiotic relationship between people and the urban environment. To this end, this paper proposes a planning and design generation framework for the constructed external spatial environment of building groups based on a recursive double-adversarial network model. First, we extract the features of the constructed external spatial environment of the building group in depth and generate the expression feature map, which is used as a supervisory signal to generate an expression seed image of the constructed external spatial environment of the building group. Then, we use the generated seed image together with the constructed external spatial environment of the original target building group as the input to generate a feature-holding image as the output of the current frame, and the feature-holding image is also used as the input for the next frame. Finally, the seed image generation network and the feature-holding image generation network are recursively used to generate the next frame, and the video sequence of the expressions of the constructed external spatial environment of the building group with the same feature-holding expressions as the original input is recursively obtained several times.
The experimental results on the building group database show that the proposed method can generate clear and natural video frames of the constructed external spatial environment of the building group. These frames can be gradually derived from the design of building units to the construction of the building group and penetrate into the planning and design of the external spatial environment, in order to comprehensively improve the living environment of the urban population and to provide a design method and theoretical support for the design of future urban residential communities.

