Multiple-Oriented and Small Object Detection with Convolutional Neural Networks for Aerial Image

2019 ◽  
Vol 11 (18) ◽  
pp. 2176 ◽  
Author(s):  
Chen ◽  
Zhong ◽  
Tan

Detecting objects in aerial images is a challenging task because the objects appear in multiple orientations and are relatively small. Although many traditional detection models have demonstrated acceptable performance by using an image pyramid and multiple templates in a sliding-window manner, such techniques are inefficient and costly. Recently, convolutional neural networks (CNNs) have been used successfully for object detection and have considerably outperformed traditional detection methods; however, this success has not yet been extended to aerial images. To overcome these problems, we propose a detection model based on two CNNs. One CNN is designed to propose many object-like regions, generated from feature maps of multiple scales and hierarchies together with orientation information. With this design, the positioning of small objects becomes more accurate, and the generated regions with orientation information are better suited to objects arranged in arbitrary orientations. The other CNN is designed for object recognition; it first extracts the features of each generated region and then makes the final decisions. The results of extensive experiments on the vehicle detection in aerial imagery (VEDAI) and overhead imagery research data set (OIRDS) datasets indicate that the proposed model performs well in terms of both detection accuracy and detection speed.

2020 ◽  
Vol 12 (21) ◽  
pp. 3630
Author(s):  
Jin Liu ◽  
Haokun Zheng

Object detection and recognition in aerial and remote sensing images has become a hot topic in computer vision in recent years. As these images are usually taken from a bird’s-eye view, the targets often have different shapes and are densely arranged. Therefore, using an oriented bounding box to mark the target is a mainstream choice. However, general detection methods are designed around horizontal box annotation, while the improved methods for detecting oriented bounding boxes have high computational complexity. In this paper, we propose a method called ellipse field network (EFN) to organically integrate semantic segmentation and object detection. It predicts the probability distribution of the target and obtains accurate oriented bounding boxes through a post-processing step. We tested our method on the HRSC2016 and DOTA data sets, achieving mAP values of 0.863 and 0.701, respectively. We also tested the performance of EFN on natural images and obtained a mAP of 84.7 on the VOC2012 data set. These extensive experiments demonstrate that EFN can achieve state-of-the-art results in aerial image tests and can obtain a good score on natural images.


Author(s):  
C. Chen ◽  
W. Gong ◽  
Y. Hu ◽  
Y. Chen ◽  
Y. Ding

Automated building detection in aerial images is a fundamental problem in aerial and satellite image analysis. Recently, thanks to advances in feature description, the Region-based CNN model (R-CNN) for object detection has been receiving increasing attention. Despite its excellent performance in object detection, it is problematic to directly leverage the features of the R-CNN model for building detection in a single aerial image. A single aerial image is taken in vertical view, and the buildings in it possess a significant directional feature. However, the R-CNN model ignores the direction of the building and represents the detection results as horizontal rectangles. For this reason, the detection results cannot describe the building precisely. To address this problem, in this paper we propose a novel model with a key feature related to orientation, namely, Oriented R-CNN (OR-CNN). Our contributions are mainly in the following two aspects: (1) we introduce a new oriented layer network, built on the successful VGG-net R-CNN model, for detecting the rotation angle of buildings; (2) we propose the oriented rectangle to leverage the powerful R-CNN for building detection in remote sensing. In experiments, we establish a complete and brand-new data set for training our oriented R-CNN model and comprehensively evaluate the proposed method on a publicly available building detection data set. We demonstrate state-of-the-art results compared with the previous baseline methods.
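To illustrate why a horizontal rectangle describes a rotated building poorly, here is a minimal sketch of an oriented rectangle and the horizontal box that encloses it. The parameterization (centre, width, height, angle) and the function names `oriented_box_corners` and `enclosing_horizontal_box` are assumptions chosen for illustration; they are not taken from the OR-CNN paper.

```python
import math


def oriented_box_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of a w x h rectangle centred at (cx, cy).

    The rectangle is rotated by angle_deg about its centre. The rotation is
    counter-clockwise in a y-up coordinate system (it appears clockwise in
    image coordinates, where y points down).
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Corner offsets from the centre before rotation.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [
        (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
        for dx, dy in offsets
    ]


def enclosing_horizontal_box(corners):
    """Return (x_min, y_min, x_max, y_max) of the axis-aligned box around corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    corners = oriented_box_corners(0.0, 0.0, 4.0, 2.0, 45.0)
    x0, y0, x1, y1 = enclosing_horizontal_box(corners)
    print("oriented area:", 4.0 * 2.0)
    print("horizontal box area:", (x1 - x0) * (y1 - y0))
```

For a 4 x 2 rectangle rotated by 45 degrees, the enclosing horizontal box has an area of about 18 against 8 for the oriented rectangle, so more than half of the horizontal box is background.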


Author(s):  
M. Madadikhaljan ◽  
R. Bahmanyar ◽  
S. M. Azimi ◽  
P. Reinartz ◽  
U. Sörgel

Abstract. Haze consists of particles floating in the air, which can degrade image quality and reduce visibility in airborne data. Haze removal has several applications in image enhancement and can improve the performance of automatic image analysis systems, namely object detection and segmentation. Unlike the rich haze removal literature for ground imagery, there is a lack of methods specifically designed for aerial imagery, even though the aerial imagery domain differs characteristically from the ground one. In this paper, we propose a method to dehaze aerial images using Convolutional Neural Networks (CNNs). Currently, there is no available data for dehazing methods in aerial imagery. To address this issue, we have created a synthetically hazed aerial image dataset on which to train the neural network. We train the All-in-One dehazing network (AOD-Net) as the base approach on hazy aerial images and compare the performance of our proposed approach against the classical model. We have tested our model on natural as well as synthetically hazed aerial images. Both qualitative and quantitative results of the adapted network show an improvement in dehazing results. We show that the adapted AOD-Net increases PSNR and SSIM on our aerial image test set by 2.2% and 9%, respectively.
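For readers unfamiliar with the PSNR metric reported above, the following is a minimal sketch of its standard definition, PSNR = 10 log10(MAX^2 / MSE). It operates on flat sequences of pixel intensities; the function name `psnr` and the 8-bit default of 255 for `max_value` are assumptions for illustration and do not reproduce the authors' evaluation code.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally long pixel sequences.

    reference: ground-truth (haze-free) pixel intensities.
    estimate:  dehazed pixel intensities to evaluate.
    max_value: largest possible intensity (255 for 8-bit images, an assumption here).
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    clean = [52, 60, 61, 200]
    dehazed = [50, 63, 61, 198]
    print(f"PSNR: {psnr(clean, dehazed):.2f} dB")
```

A higher PSNR means the dehazed image is closer to the haze-free reference; identical images give an infinite value.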


2021 ◽  
Vol 42 (1) ◽  
pp. e90289
Author(s):  
Carlos Eduardo Belman López

Given that it is fundamental to detect positive COVID-19 cases and treat affected patients quickly to mitigate the impact of the virus, X-ray images have been studied for COVID-19 detection together with deep learning models, avoiding disadvantages such as the scarcity of RT-PCR test kits, their elevated costs, and the long wait for results. The contribution of this paper is to present new models for detecting COVID-19 and other cases of pneumonia using chest X-ray images and convolutional neural networks, thus providing accurate diagnostics in binary and four-class classification scenarios. Classification accuracy was improved, and overfitting was prevented, by two actions: (1) increasing the data set size while keeping the classification scenarios balanced; and (2) adding regularization techniques and performing hyperparameter optimization. Additionally, the network capacity and size of the models were reduced as much as possible, making the final models well suited to local deployment on devices with limited capacities and without the need for Internet access. The impact of key hyperparameters was tested using modern deep learning packages. The final models obtained classification accuracies of 99.17% and 94.03% for the binary and categorical scenarios, respectively, achieving superior performance compared to other studies in the literature while requiring a significantly lower number of parameters. The models can also be placed on a digital platform to provide instantaneous diagnostics and overcome the shortage of experts and radiologists.


Author(s):  
Ivan Rodriguez-Conde ◽  
Celso Campos ◽  
Florentino Fdez-Riverola

Abstract. Convolutional neural networks have pushed forward image analysis research and computer vision over the last decade, constituting a state-of-the-art approach in object detection today. The design of increasingly deeper and wider architectures has made it possible to achieve unprecedented levels of detection accuracy, albeit at the cost of both a dramatic computational burden and a large memory footprint. In such a context, cloud systems have become a mainstream technological solution due to their tremendous scalability, providing researchers and practitioners with virtually unlimited resources. However, these resources are typically made available as remote services, requiring communication over the network to be accessed, thus compromising the speed of response, availability, and security of the implemented solution. In view of these limitations, the on-device paradigm has emerged as a recent yet widely explored alternative, pursuing more compact and efficient networks to ultimately enable the execution of the derived models directly on resource-constrained client devices. This study provides an up-to-date review of the more relevant scientific research carried out in this vein, circumscribed to the object detection problem. In particular, the paper contributes to the field with a comprehensive architectural overview of both the existing lightweight object detection frameworks targeted to mobile and embedded devices, and the underlying convolutional neural networks that make up their internal structure. More specifically, it addresses the main structural-level strategies used for conceiving the various components of a detection pipeline (i.e., backbone, neck, and head), as well as the most salient techniques proposed for adapting such structures and the resulting architectures to more austere deployment environments. Finally, the study concludes with a discussion of the specific challenges and next steps to be taken to move toward a more convenient accuracy–speed trade-off.


2021 ◽  
Vol 13 (1) ◽  
pp. 49-57
Author(s):  
Brahim Jabir ◽  
Noureddine Falih ◽  
Asmaa Sarih ◽  
Adil Tannouche

Researchers in precision agriculture regularly use deep learning to help growers and farmers control and monitor crops during the growing season; these tools extract meaningful information from large-scale aerial images of the field, using several techniques, to support strategic decision-making. The resulting information can be exploited for many purposes, such as sub-plot-specific weed control. Our focus in this paper is on weed identification and control in sugar beet fields, particularly on creating and optimizing a convolutional neural network model and training it on our data set to predict and identify the most common weed strains in the region of Beni Mellal, Morocco. This could help select herbicides that work on the identified weeds. We explore a transfer learning approach to design the networks, using the TensorFlow deep learning library and Keras, a high-level API built on TensorFlow.


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1677
Author(s):  
Jongwon Kim ◽  
Jeongho Cho

An essential component for the autonomous flight or air-to-ground surveillance of a UAV is an object detection device. It must possess a high detection accuracy and process data in real time to be employed for various tasks such as search and rescue, object tracking and disaster analysis. With the recent advancements in multimodal data-based object detection architectures, autonomous driving technology has significantly improved, and the latest algorithm has achieved an average precision of up to 96%. However, these remarkable advances may be unsuitable for processing UAV aerial data directly onboard for object detection because of the following major problems: (1) objects in aerial views are generally small relative to the image and are unevenly and sparsely distributed throughout it; (2) objects are exposed to various environmental changes, such as occlusion and background interference; and (3) the payload weight of a UAV is limited. Thus, we propose a new real-time onboard object detection architecture, an RGB aerial image and a point cloud data (PCD) depth map image network (RGDiNet). A faster region-based convolutional neural network was used as the baseline detection network, and an RGD, an integration of the RGB aerial image and the depth map reconstructed from the light detection and ranging PCD, was utilized as an input for computational efficiency. Performance tests and evaluation of the proposed RGDiNet were conducted under various operating conditions using hand-labeled aerial datasets. The results show that the proposed method outperforms conventional vision-based methods in the detection of vehicles and pedestrians.
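The abstract does not specify how the RGB image and the depth map are integrated into an RGD input. The sketch below shows one plausible reading suggested by the name, in which a normalized depth value takes the place of the blue channel so that the detector still receives a three-channel image. This channel substitution, the function name `fuse_rgd`, and the `max_depth_m` parameter are all assumptions for illustration, not the authors' confirmed method.

```python
def fuse_rgd(rgb_pixels, depth_m, max_depth_m):
    """Build hypothetical RGD pixels from RGB pixels and per-pixel depth.

    rgb_pixels:  list of (r, g, b) tuples with 8-bit values.
    depth_m:     list of depths in metres, one per pixel.
    max_depth_m: depth that maps to the maximum channel value 255.

    ASSUMPTION: the blue channel is replaced by the scaled depth; the source
    abstract only says the RGB image and the depth map are "integrated".
    """
    if len(rgb_pixels) != len(depth_m):
        raise ValueError("rgb_pixels and depth_m must have the same length")
    if max_depth_m <= 0:
        raise ValueError("max_depth_m must be positive")
    fused = []
    for (r, g, _b), d in zip(rgb_pixels, depth_m):
        # Clip the depth to [0, max_depth_m], then scale it to 0..255.
        d_clipped = min(max(d, 0.0), max_depth_m)
        d8 = round(255 * d_clipped / max_depth_m)
        fused.append((r, g, d8))
    return fused


if __name__ == "__main__":
    pixels = [(120, 80, 40), (10, 200, 90)]
    depths = [20.0, 100.0]
    print(fuse_rgd(pixels, depths, max_depth_m=100.0))
```

Keeping the input at three channels means a standard detector backbone can consume it without architectural changes, which is consistent with the stated goal of computational efficiency.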

