Knowledge-Driven GeoAI: Integrating Spatial Knowledge into Multi-Scale Deep Learning for Mars Crater Detection

2021 ◽  
Vol 13 (11) ◽  
pp. 2116
Author(s):  
Chia-Yu Hsu ◽  
Wenwen Li ◽  
Sizhe Wang

This paper introduces a new GeoAI solution to support automated mapping of global craters on the Mars surface. Traditional crater detection algorithms are limited to working in a semiautomated or multi-stage manner, and most were developed to handle a specific dataset in a small subarea of Mars’ surface, which hinders their transferability to global crater detection. As an alternative, we propose a GeoAI solution based on deep learning to tackle this problem effectively. Three innovative features are integrated into our object detection pipeline: (1) a feature pyramid network is leveraged to generate feature maps with rich semantics across multiple object scales; (2) prior geospatial knowledge based on the Hough transform is integrated to enable more accurate localization of potential craters; and (3) a scale-aware classifier is adopted to increase the prediction accuracy of both large and small crater instances. The results show that the proposed strategies significantly improve crater detection performance over the popular Faster R-CNN model. The integration of geospatial domain knowledge into data-driven analytics moves GeoAI research up to the next level to enable knowledge-driven GeoAI. This research can be applied to a wide variety of object detection and image analysis tasks.
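The Hough-transform prior mentioned in item (2) can be illustrated with a minimal sketch of circle voting. This is not the authors' pipeline: it assumes that edge points have already been extracted and that the crater radius is known, and the function names (`hough_circle_centers`, `make_circle`) are hypothetical. Each edge point votes for every centre that could have produced it, and the centre with the most votes is a candidate crater location.

```python
import math
from collections import Counter


def hough_circle_centers(edge_points, radius, n_angles=360):
    """Accumulate votes for circle centres at one fixed, assumed-known radius."""
    votes = Counter()
    for x, y in edge_points:
        candidates = set()
        for k in range(n_angles):
            theta = 2 * math.pi * k / n_angles
            # A centre lies one radius away from the edge point, in direction -theta.
            cx = round(x - radius * math.cos(theta))
            cy = round(y - radius * math.sin(theta))
            candidates.add((cx, cy))
        # Each edge point casts at most one vote per candidate centre.
        for centre in candidates:
            votes[centre] += 1
    return votes


def make_circle(cx, cy, radius, n=72):
    """Synthetic 'crater rim': n rounded points on a circle (toy data only)."""
    return [
        (round(cx + radius * math.cos(2 * math.pi * k / n)),
         round(cy + radius * math.sin(2 * math.pi * k / n)))
        for k in range(n)
    ]


edge_points = make_circle(50, 40, 10)
votes = hough_circle_centers(edge_points, radius=10)
best_center, best_votes = votes.most_common(1)[0]
print(best_center, best_votes)
```

In a real detector the radius would also be searched, and the vote peaks would be turned into region proposals rather than used directly as detections.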

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Maiki Higa ◽  
Shinya Tanahara ◽  
Yoshitaka Adachi ◽  
Natsumi Ishiki ◽  
Shin Nakama ◽  
...  

Abstract. In this report, we propose a deep learning technique for high-accuracy estimation of the intensity class of a typhoon from a single satellite image, by incorporating meteorological domain knowledge. By using the Visual Geometric Group’s model, VGG-16, with images preprocessed with fisheye distortion, which enhances a typhoon’s eye, eyewall, and cloud distribution, we achieved much higher classification accuracy than that of a previous study, even with sequential-split validation. Through comparison of t-distributed stochastic neighbor embedding (t-SNE) plots for the feature maps of VGG with the original satellite images, we also verified that the fisheye preprocessing facilitated cluster formation, suggesting that our model could successfully extract image features related to the typhoon intensity class. Moreover, gradient-weighted class activation mapping (Grad-CAM) was applied to highlight the eye and the cloud distributions surrounding the eye, which are important regions for intensity classification; the results suggest that our model qualitatively gained a viewpoint similar to that of domain experts. A series of analyses revealed that the data-driven approach using only deep learning has limitations, and the integration of domain knowledge could bring new breakthroughs.


2020 ◽  
Vol 2020 ◽  
pp. 1-18 ◽  
Author(s):  
Nhat-Duy Nguyen ◽  
Tien Do ◽  
Thanh Duc Ngo ◽  
Duy-Dinh Le

Small object detection is an interesting topic in computer vision. With the rapid development of deep learning, it has drawn the attention of many researchers, who have competed to propose new approaches. These innovations include region proposals, divided grid cells, multiscale feature maps, and new loss functions. As a result, the performance of object detection has recently improved significantly. However, most state-of-the-art detectors, both one-stage and two-stage, still struggle to detect small objects. In this study, we evaluate current state-of-the-art deep learning models from both approaches, such as Fast RCNN, Faster RCNN, RetinaNet, and YOLOv3. We provide an in-depth assessment of the advantages and limitations of these models. Specifically, we run the models with different backbones on different datasets with multiscale objects to find out what types of objects suit each model and backbone. Extensive empirical evaluation was conducted on 2 standard datasets, namely, a small object dataset and a filtered dataset from PASCAL VOC 2007. Finally, comparative results and analyses are presented.


Author(s):  
Seokyong Shin ◽  
Hyunho Han ◽  
Sang Hun Lee

YOLOv3 (You Only Look Once version 3) is a deep learning-based real-time object detector and is mainly used in applications such as video surveillance and autonomous vehicles. In this paper, we propose an improved YOLOv3 with a Duplex FPN, which enhances large object detection by utilizing low-level feature information. The conventional YOLOv3 improved small object detection performance by applying the FPN (Feature Pyramid Networks) structure to YOLOv2. However, YOLOv3 with an FPN structure is specialized for detecting small objects, so it has difficulty detecting large objects. Therefore, this paper proposes an improved YOLOv3 with a Duplex FPN, which can utilize low-level location information in high-level feature maps, in place of the existing FPN structure of YOLOv3. This improves the detection accuracy of large objects. In addition, an extra detection layer is added to the top-level feature map to prevent parts of large objects from going undetected. Further, the dimension clusters of each detection layer are reassigned so that the network quickly learns to detect objects accurately. The proposed method was compared and analyzed on the PASCAL VOC dataset. The experimental results showed that the bounding box accuracy of large objects improved owing to the Duplex FPN and extra detection layer, and the proposed method succeeded in detecting large objects that the existing YOLOv3 did not.


2021 ◽  
Vol 2 (4) ◽  
Author(s):  
Tiago Mota ◽  
Mohan Sridharan ◽  
Aleš Leonardis

Abstract. A robot’s ability to provide explanatory descriptions of its decisions and beliefs promotes effective collaboration with humans. Providing the desired transparency in decision making is challenging in integrated robot systems that include knowledge-based reasoning methods and data-driven learning methods. As a step towards addressing this challenge, our architecture combines the complementary strengths of non-monotonic logical reasoning with incomplete commonsense domain knowledge, deep learning, and inductive learning. During reasoning and learning, the architecture enables a robot to provide on-demand explanations of its decisions, the evolution of associated beliefs, and the outcomes of hypothetical actions, in the form of relational descriptions of relevant domain objects, attributes, and actions. The architecture’s capabilities are illustrated and evaluated in the context of scene understanding tasks and planning tasks performed using simulated images and images from a physical robot manipulating tabletop objects. Experimental results indicate the ability to reliably acquire and merge new information about the domain in the form of constraints, preconditions, and effects of actions, and to provide accurate explanations in the presence of noisy sensing and actuation.


2019 ◽  
Vol 11 (20) ◽  
pp. 2376 ◽  
Author(s):  
Li ◽  
Zhang ◽  
Wu

Object detection in remote sensing images on a satellite or aircraft has important economic and military significance and is full of challenges. This task requires not only accurate and efficient algorithms, but also high-performance and low-power hardware architecture. However, existing deep learning-based object detection algorithms require further optimization in small object detection, computational complexity, and parameter size. Meanwhile, general-purpose processors cannot achieve good power efficiency, and previous deep learning processor designs still leave parallelism unexploited. To address these issues, we propose an efficient context-based feature fusion single shot multibox detector (CBFF-SSD) framework, using lightweight MobileNet as the backbone network to reduce parameters and computational complexity, adding feature fusion units and detecting feature maps to enhance the recognition of small objects and improve detection accuracy. Based on the analysis and optimization of the calculation of each layer in the algorithm, we propose an efficient hardware architecture for a deep learning processor with multiple neural processing units (NPUs) composed of 2D processing elements (PEs), which can simultaneously calculate multiple output feature maps. The parallel architecture, hierarchical on-chip storage organization, and local registers are used to achieve parallel processing, sharing, and reuse of data, and to make the processor's calculations more efficient. Extensive experiments and comprehensive evaluations on the public NWPU VHR-10 dataset and comparisons with some state-of-the-art approaches demonstrate the effectiveness and superiority of the proposed framework. Moreover, to evaluate the performance of the proposed hardware architecture, we implement it on a Xilinx XC7Z100 field programmable gate array (FPGA) and test it on the proposed CBFF-SSD and VGG16 models. Experimental results show that our processor is more power efficient than general-purpose central processing units (CPUs) and graphics processing units (GPUs), and has better performance density than other state-of-the-art FPGA-based designs.


Author(s):  
Y. Dai ◽  
J. S. Xiao ◽  
B. S. Yi ◽  
J. F. Lei ◽  
Z. Y. Du

Abstract. Aiming at multi-class artificial object detection in remote sensing images, a detection framework based on deep learning is used to extract and localize the numerous targets existing in very high resolution remote sensing images. In order to realize rapid and efficient detection of typical artificial targets in remote sensing images, this paper proposes an end-to-end multi-category object detection method based on the convolutional neural network to solve several challenges, including dense objects and objects with arbitrary direction and large aspect ratios. Specifically, the feature extraction process is improved by utilizing a more advanced backbone network with deeper layers and combining multiple feature maps, including high-resolution feature maps with more location details and low-resolution feature maps with highly abstracted information. A Rotating Regional Proposal Network is adopted into the Faster R-CNN network to generate candidate object-like regions with different orientations and to improve the sensitivity to dense and cluttered objects. The rotation factor is added into the regional proposal network to control the angle of the generated anchor boxes and to cover enough directions of typical man-made objects. Meanwhile, the misalignment caused by the two quantization operations in the pooling process is eliminated, and a convolution layer is appended before the fully connected layer of the final classification network to reduce the feature parameters and avoid overfitting. Compared with current generic object detection methods, the proposed algorithm focuses on arbitrarily oriented and dense artificial targets in remote sensing images. After comprehensive evaluation against several state-of-the-art object detection algorithms, our method proves effective at detecting multi-class artificial objects in remote sensing images. Experiments demonstrate that the proposed method, which combines the powerful features extracted by the improved convolutional neural network with multi-scale features and the rotating region network, is more accurate on the public DOTA dataset.
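To make the rotation factor in the regional proposal network concrete, here is a minimal sketch of rotated anchor generation. It is an illustration under assumptions, not the authors' implementation: the scale, aspect ratio, and angle values are hypothetical, and an anchor is represented as a tuple (cx, cy, w, h, angle in degrees).

```python
import math


def rotated_anchors(cx, cy, scales, ratios, angles_deg):
    """Enumerate anchors (cx, cy, w, h, angle) at one feature-map location."""
    anchors = []
    for s in scales:
        for r in ratios:
            # Keep the area at s*s while the aspect ratio w/h equals r.
            w = s * math.sqrt(r)
            h = s / math.sqrt(r)
            for a in angles_deg:
                anchors.append((cx, cy, w, h, a))
    return anchors


def corners(anchor):
    """Corner coordinates of a rotated anchor (counter-clockwise rotation)."""
    cx, cy, w, h, a = anchor
    t = math.radians(a)
    c, s = math.cos(t), math.sin(t)
    half = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half]


# Hypothetical settings: 2 scales x 2 ratios x 6 angles = 24 anchors per location.
anchors = rotated_anchors(0.0, 0.0, scales=[32, 64], ratios=[1.0, 4.0],
                          angles_deg=[-60, -30, 0, 30, 60, 90])
print(len(anchors))
```

Adding angles multiplies the number of anchors per location, which is the cost paid for covering elongated objects at arbitrary orientations.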


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Hoanh Nguyen

License plate detection is a key problem in intelligent transportation systems. Recently, many deep learning-based networks, such as Faster R-CNN, SSD, and R-FCN, have been proposed and have achieved great success in general object detection. However, directly applying these general object detection networks to license plate detection without modification may not achieve good enough performance. This paper proposes a novel deep learning-based framework for license plate detection in traffic scene images based on predicted anchor region proposal and a balanced feature pyramid. In the proposed framework, the ResNet-34 architecture is first adopted to generate the base convolution feature maps. A balanced feature pyramid generation module is then used to generate a balanced feature pyramid, in which each feature level obtains equal information from the other feature levels. Furthermore, this paper designs a multiscale region proposal network with a novel predicted location anchor scheme to generate high-quality proposals. Finally, a detection network that includes a region of interest pooling layer and fully connected layers is adopted to further classify and regress the coordinates of detected license plates. Experimental results on public datasets show that the proposed approach achieves better detection performance than other state-of-the-art methods on license plate detection.
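One common way to give every pyramid level equal information from the others is to resize all levels to a shared resolution, average them, and add the result back to each level. The sketch below shows only that idea on single-channel toy maps stored as nested lists. The paper's actual module may differ, and the function names (`resize_nearest`, `balance_pyramid`) are hypothetical.

```python
def resize_nearest(fmap, out_h, out_w):
    """Nearest-neighbour resize of a 2D map stored as a list of rows."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
        for i in range(out_h)
    ]


def balance_pyramid(levels):
    """Average all levels at the middle level's size, then add back to each level."""
    ref = levels[len(levels) // 2]
    h, w = len(ref), len(ref[0])
    resized = [resize_nearest(f, h, w) for f in levels]
    n = len(levels)
    # Every level contributes with the same weight 1/n.
    fused = [[sum(r[i][j] for r in resized) / n for j in range(w)] for i in range(h)]
    balanced = []
    for f in levels:
        fh, fw = len(f), len(f[0])
        back = resize_nearest(fused, fh, fw)
        balanced.append([[f[i][j] + back[i][j] for j in range(fw)] for i in range(fh)])
    return balanced


# Toy three-level pyramid: 4x4, 2x2, and 1x1 maps with constant values.
levels = [
    [[1.0] * 4 for _ in range(4)],
    [[2.0] * 2 for _ in range(2)],
    [[3.0]],
]
balanced = balance_pyramid(levels)
print(balanced[2])
```

Real implementations operate on multi-channel tensors and typically use interpolation or pooling for the resizing step; the nested-list version here only keeps the example dependency-free.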


2019 ◽  
Vol 277 ◽  
pp. 02029
Author(s):  
Aniruddha V Patil ◽  
Pankaj Rabha

In this survey we present a complete landscape of joint object detection and pose estimation methods that use monocular vision. Descriptions of traditional approaches that involve descriptors or models and various estimation methods have been provided. These descriptors or models include chordiograms, shape-aware deformable parts model, bag of boundaries, distance transform templates, natural 3D markers and facet features whereas the estimation methods include iterative clustering estimation, probabilistic networks and iterative genetic matching. Hybrid approaches that use handcrafted feature extraction followed by estimation by deep learning methods have been outlined. We have investigated and compared, wherever possible, pure deep learning based approaches (single stage and multi stage) for this problem. Comprehensive details of the various accuracy measures and metrics have been illustrated. For the purpose of giving a clear overview, the characteristics of relevant datasets are discussed. The trends that prevailed from the infancy of this problem until now have also been highlighted.


Author(s):  
M. N. Favorskaya ◽  
L. C. Jain

Introduction: Saliency detection is a fundamental task of computer vision. Its ultimate aim is to localize the objects of interest that grab human visual attention with respect to the rest of the image. A great variety of saliency models based on different approaches have been developed since the 1990s. In recent years, saliency detection has become an actively studied topic in the theory of Convolutional Neural Networks (CNNs). Many original solutions using CNNs have been proposed for salient object detection and even event detection. Purpose: A detailed survey of saliency detection methods in the deep learning era makes it possible to understand the current capabilities of the CNN approach for visual analysis based on human eye tracking and digital image processing. Results: The survey reflects the recent advances in saliency detection using CNNs. Different models available in the literature, such as static and dynamic 2D CNNs for salient object detection and 3D CNNs for salient event detection, are discussed in chronological order. It is worth noting that automatic salient event detection in long videos became possible using recently introduced 3D CNNs combined with 2D CNNs for salient audio detection. This article also presents a short description of public image and video datasets with annotated salient objects or events, as well as the metrics often used to evaluate results. Practical relevance: This survey is a contribution to the study of rapidly developing deep learning methods for saliency detection in images and videos.

