MsRi-CCF: Multi-Scale and Rotation-Insensitive Convolutional Channel Features for Geospatial Object Detection

Xin Wu; Danfeng Hong; Pedram Ghamisi; Wei Li; Ran Tao

doi:10.3390/rs10121990

Remote Sensing ◽

10.3390/rs10121990 ◽

2018 ◽

Vol 10 (12) ◽

pp. 1990 ◽

Cited By ~ 12

Author(s):

Xin Wu ◽

Danfeng Hong ◽

Pedram Ghamisi ◽

Wei Li ◽

Ran Tao

Keyword(s):

Object Detection ◽

Power Law ◽

Computational Cost ◽

Detection Performance ◽

Fine Tuning ◽

Feature Generation ◽

Feature Maps ◽

Outlier Removal ◽

Low Level ◽

Multi Scale

Geospatial object detection is a fundamental but challenging problem in the remote sensing community. Although deep learning has shown its power in extracting discriminative features, there is still room for improvement in its detection performance, particularly for objects with large variations in scale and direction. To this end, a novel approach, entitled multi-scale and rotation-insensitive convolutional channel features (MsRi-CCF), is proposed for geospatial object detection by integrating robust low-level feature generation, classifier generation with outlier removal, and detection with a power law. The low-level feature generation step consists of rotation-insensitive and multi-scale convolutional channel features, which were obtained by learning a regularized convolutional neural network (CNN) and by integrating multi-scaled convolutional feature maps followed by the fine-tuning of high-level connections in the CNN, respectively. Then, these generated features were fed into AdaBoost (chosen due to its lower computation and storage costs) with outlier removal to construct an object detection framework that facilitates robust classifier training. In the test phase, we adopted a log-space sampling approach instead of fine-scale sampling by using the fast feature pyramid strategy based on a computable power law. Extensive experimental results demonstrate that, compared with several state-of-the-art baselines, the proposed MsRi-CCF approach yields better detection results, with 90.19% precision on the satellite dataset and 81.44% average precision on the NWPU VHR-10 dataset. Importantly, MsRi-CCF incurs no additional computational cost, requiring only 0.92 s and 0.7 s per test image on the two datasets. Furthermore, we determined that most previous methods fail to achieve acceptable detection performance, particularly when they face obstacles such as deformations in objects (e.g., rotation, illumination, and scaling). These factors are effectively addressed by MsRi-CCF, yielding a robust geospatial object detection method.
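The log-space sampling and power-law idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the choice of one "real" feature computation per octave, and the exponent `lam` are assumptions made for illustration, and in practice the exponent would have to be estimated from data.

```python
import math

def log_space_scales(n_octaves, per_octave):
    """Scales spaced evenly in log2-space: s_i = 2^(-i / per_octave)."""
    return [2.0 ** (-i / per_octave) for i in range(n_octaves * per_octave + 1)]

def nearest_real_scale(s):
    """Nearest power-of-two scale, where features are assumed to be computed exactly."""
    return 2.0 ** round(math.log2(s))

def approx_channel_mean(mean_at_real, s, s_real, lam):
    """Power-law extrapolation: f(s) ~ f(s_real) * (s / s_real)^(-lam).

    `lam` is a hypothetical, channel-specific exponent.
    """
    return mean_at_real * (s / s_real) ** (-lam)

# Two octaves, four scales per octave: nine scales from 1.0 down to 0.25.
scales = log_space_scales(n_octaves=2, per_octave=4)
```

Under these assumptions, only the power-of-two scales need a full feature computation; the intermediate scales are extrapolated from their nearest computed neighbour, which is where the speed-up over fine-scale sampling would come from.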

Multiscale Object Detection in Infrared Streetscape Images Based on Deep Learning and Instance Level Data Augmentation

Applied Sciences ◽

10.3390/app9030565 ◽

2019 ◽

Vol 9 (3) ◽

pp. 565 ◽

Cited By ~ 6

Author(s):

Hao Qu ◽

Lilian Zhang ◽

Xuesong Wu ◽

Xiaofeng He ◽

Xiaoping Hu ◽

...

Keyword(s):

Object Detection ◽

Data Augmentation ◽

Region Of Interest ◽

Complex Environments ◽

Feature Maps ◽

Multi Scale ◽

Level Data ◽

Training Stage ◽

Street Scene ◽

Layer Region

Object detection in infrared images has attracted increasing attention in recent years. However, there are few studies on multi-scale object detection in infrared street scene images. Additionally, the lack of high-quality infrared datasets hinders research into such algorithms. To address these issues, we first make a series of modifications based on Faster Region-Convolutional Neural Network (R-CNN). A double-layer region proposal network (RPN) is proposed to predict proposals of different scales on both fine and coarse feature maps. Second, a multi-scale pooling module is introduced into the backbone of the network to explore the response of objects at different scales. Furthermore, the inception4 module and the position sensitive region of interest (ROI) align (PSalign) pooling layer are utilized to explore richer features of the objects. Third, this paper proposes instance level data augmentation, which takes into account the imbalance between categories while enlarging the dataset. In the training stage, the online hard example mining method is utilized to further improve the robustness of the algorithm in complex environments. The experimental results show that, compared with the baseline, our detection method achieves state-of-the-art performance.
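The online hard example mining step mentioned in the abstract can be sketched as a simple selection rule: keep only the highest-loss candidates for the backward pass. The sketch below is a minimal illustration, not the paper's training code; the function name and the sample loss values are hypothetical, and refinements such as suppressing overlapping regions before selection are omitted.

```python
def select_hard_examples(losses, keep):
    """Return the indices of the `keep` highest-loss examples, hardest first."""
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return order[:keep]

# Hypothetical per-ROI losses from one forward pass.
roi_losses = [0.05, 2.30, 0.40, 1.10, 0.02]
hard = select_hard_examples(roi_losses, keep=2)
```

Here `hard` holds the indices of the two ROIs with the largest losses, so only those would contribute gradients in this simplified scheme.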

A Novel Multi-Scale Feature Fusion Method for Region Proposal Network in Fast Object Detection

International Journal of Data Warehousing and Mining ◽

10.4018/ijdwm.2020070107 ◽

2020 ◽

Vol 16 (3) ◽

pp. 132-145

Author(s):

Gang Liu ◽

Chuyi Wang

Keyword(s):

Object Detection ◽

Multiple Scales ◽

Feature Fusion ◽

Uniform Space ◽

Fusion Method ◽

Well Performance ◽

Feature Maps ◽

Neural Network Models ◽

Scale Feature ◽

Multi Scale

Neural network models have been widely used in the field of object detection. Region proposal methods are widely used in current object detection networks and have achieved good performance. Common region proposal methods search for objects by generating thousands of candidate boxes. Compared to other region proposal methods, the region proposal network (RPN) method improves the accuracy and detection speed with several hundred candidate boxes. However, since the feature maps contain insufficient information, the ability of RPN to detect and locate small-sized objects is poor. A novel multi-scale feature fusion method for the region proposal network is proposed in this article to solve the above problems. The proposed method is called multi-scale region proposal network (MS-RPN), which can generate suitable feature maps for the region proposal network. In MS-RPN, the selected feature maps at multiple scales are fine-tuned respectively and compressed into a uniform space. The generated fusion feature maps are called refined fusion features (RFFs). RFFs incorporate abundant detail information and context information, and are sent to RPN to generate better region proposals. The proposed approach is evaluated on PASCAL VOC 2007 and MS COCO benchmark tasks. MS-RPN obtains significant improvements over the comparable state-of-the-art detection models.
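One way to picture fusing feature maps from multiple scales into a uniform space is to resample them to a common resolution and combine them. The sketch below is an assumption-laden illustration, not MS-RPN itself: nearest-neighbour resampling and a fixed weighted sum stand in for whatever learned layers the method uses, and all names are hypothetical.

```python
def resize_nearest(fmap, out_h, out_w):
    """Nearest-neighbour resize of a 2D feature map (list of rows)."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [[fmap[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def fuse(fmaps, out_h, out_w, weights):
    """Bring all maps to one resolution, then combine them with fixed weights."""
    resized = [resize_nearest(f, out_h, out_w) for f in fmaps]
    return [[sum(w * m[r][c] for w, m in zip(weights, resized))
             for c in range(out_w)]
            for r in range(out_h)]

# A 4x4 "fine" map and a 2x2 "coarse" map, fused at 2x2 resolution.
fine = [[float(4 * r + c) for c in range(4)] for r in range(4)]
coarse = [[10.0, 20.0], [30.0, 40.0]]
fused = fuse([fine, coarse], 2, 2, [0.5, 0.5])
```

The fused map carries information from both inputs at every position, which is the property the abstract attributes to the refined fusion features.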

Glassboxing Deep Learning to Enhance Aircraft Detection from SAR Imagery

Remote Sensing ◽

10.3390/rs13183650 ◽

2021 ◽

Vol 13 (18) ◽

pp. 3650

Author(s):

Ru Luo ◽

Jin Xing ◽

Lifu Chen ◽

Zhouhao Pan ◽

Xingmin Cai ◽

...

Keyword(s):

Deep Learning ◽

Feature Fusion ◽

Detection Performance ◽

Great Success ◽

Sar Image ◽

Feature Maps ◽

Multi Scale ◽

Learning Techniques ◽

Sar Imagery ◽

Aircraft Detection

Although deep learning has achieved great success in aircraft detection from SAR imagery, its blackbox behavior has been criticized for low comprehensibility and interpretability. Such challenges have impeded the trustworthiness and wide application of deep learning techniques in SAR image analytics. In this paper, we propose an innovative eXplainable Artificial Intelligence (XAI) framework to glassbox deep neural networks (DNN) by using aircraft detection as a case study. This framework is composed of three parts: hybrid global attribution mapping (HGAM) for backbone network selection, path aggregation network (PANet), and class-specific confidence scores mapping (CCSM) for visualization of the detector. HGAM integrates the local and global XAI techniques to evaluate the effectiveness of DNN feature extraction; PANet provides advanced feature fusion to generate multi-scale prediction feature maps; while CCSM relies on visualization methods to examine the detection performance with given DNN and input SAR images. This framework can select the optimal backbone DNN for aircraft detection and map the detection performance for better understanding of the DNN. We verify its effectiveness with experiments using Gaofen-3 imagery. Our XAI framework offers an explainable approach to design, develop, and deploy DNN for SAR image analytics.

Fast Object Detection and Recognition Algorithm Based on Improved Multi-Scale Feature Maps

Laser & Optoelectronics Progress ◽

10.3788/lop56.021002 ◽

2019 ◽

Vol 56 (2) ◽

pp. 021002

Author(s):

单倩文 Shan Qianwen ◽

郑新波 Zheng Xinbo ◽

何小海 He Xiaohai ◽

滕奇志 Teng Qizhi ◽

吴晓红 Wu Xiaohong

Keyword(s):

Object Detection ◽

Recognition Algorithm ◽

Feature Maps ◽

Scale Feature ◽

Multi Scale ◽

Detection And Recognition

Multi-scale Pyramid Feature Maps for Object Detection

2017 16th International Symposium on Distributed Computing and Applications to Business, Engineering and Science (DCABES) ◽

10.1109/dcabes.2017.59 ◽

2017 ◽

Author(s):

Hao Huijun ◽

Ye Ronghua ◽

Chen Zhongyu ◽

Zheng Zhonglong

Keyword(s):

Object Detection ◽

Feature Maps ◽

Multi Scale

PSANet: Pyramid Splitting and Aggregation Network for 3D Object Detection in Point Cloud

Sensors ◽

10.3390/s21010136 ◽

2020 ◽

Vol 21 (1) ◽

pp. 136

Author(s):

Fangyu Li ◽

Weizheng Jin ◽

Cien Fan ◽

Lian Zou ◽

Qingsheng Chen ◽

...

Keyword(s):

Object Detection ◽

Point Clouds ◽

Autonomous Driving ◽

Feature Maps ◽

3D Object ◽

Multi Scale ◽

Backbone Network ◽

3D Object Detection ◽

Different Levels ◽

Fine Branch

3D object detection in LiDAR point clouds has been extensively used in autonomous driving, intelligent robotics, and augmented reality. Although the one-stage 3D detector has satisfactory training and inference speed, there are still some performance problems due to insufficient utilization of bird’s eye view (BEV) information. In this paper, a new backbone network is proposed to complete the cross-layer fusion of multi-scale BEV feature maps, which makes full use of various information for detection. Specifically, our proposed backbone network can be divided into a coarse branch and a fine branch. In the coarse branch, we use the pyramidal feature hierarchy (PFH) to generate multi-scale BEV feature maps, which retain the advantages of different levels and serve as the input of the fine branch. In the fine branch, our proposed pyramid splitting and aggregation (PSA) module deeply integrates different levels of multi-scale feature maps, thereby improving the expressive ability of the final features. Extensive experiments on the challenging KITTI-3D benchmark show that our method performs better in both 3D and BEV object detection than some previous state-of-the-art methods. Experimental results in terms of average precision (AP) demonstrate the effectiveness of our network.

Improving multi-class Boosting-based object detection

Integrated Computer-Aided Engineering ◽

10.3233/ica-200636 ◽

2020 ◽

Vol 28 (1) ◽

pp. 81-96

Author(s):

José Miguel Buenaposada ◽

Luis Baumela

Keyword(s):

Deep Learning ◽

Object Detection ◽

Data Augmentation ◽

Detection Performance ◽

Significant Progress ◽

Training Techniques ◽

Multi Scale ◽

Bounding Box ◽

Open Issue ◽

The Impact

In recent years we have witnessed significant progress in the performance of object detection in images. This advance stems from the use of rich discriminative features produced by deep models and the adoption of new training techniques. Although these techniques have been extensively used in the mainstream deep learning-based models, it is still an open issue to analyze their impact on alternative, and computationally more efficient, ensemble-based approaches. In this paper we evaluate the impact of adopting data augmentation, bounding box refinement and multi-scale processing in the context of multi-class Boosting-based object detection. In our experiments we show that the use of these training advancements significantly improves object detection performance.

Voxel-FPN: Multi-Scale Voxel Feature Aggregation for 3D Object Detection from LIDAR Point Clouds

Sensors ◽

10.3390/s20030704 ◽

2020 ◽

Vol 20 (3) ◽

pp. 704 ◽

Cited By ~ 6

Author(s):

Hongwu Kuang ◽

Bei Wang ◽

Jianping An ◽

Ming Zhang ◽

Zehan Zhang

Keyword(s):

Object Detection ◽

Point Clouds ◽

Autonomous Driving ◽

Feature Maps ◽

3D Object ◽

Cloud Data ◽

Multi Scale ◽

Feature Pyramid ◽

Point Data ◽

3D Object Detection

Object detection in point cloud data is one of the key components in computer vision systems, especially for autonomous driving applications. In this work, we present Voxel-Feature Pyramid Network, a novel one-stage 3D object detector that utilizes raw data from LIDAR sensors only. The core framework consists of an encoder network and a corresponding decoder followed by a region proposal network. The encoder extracts and fuses multi-scale voxel information in a bottom-up manner, whereas the decoder fuses multiple feature maps from various scales by Feature Pyramid Network in a top-down way. Extensive experiments show that the proposed method performs better at extracting features from point data and demonstrates its superiority over some baselines on the challenging KITTI-3D benchmark, obtaining good performance in both speed and accuracy in real-world scenarios.
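The top-down fusion attributed to the decoder can be sketched generically: each coarser map is upsampled and added to the next finer one. The example below is a simplified 2D illustration under stated assumptions, not the Voxel-FPN decoder; the lateral and smoothing convolutions of a typical feature pyramid network are omitted, every level is assumed to be exactly twice the resolution of the next, and all names are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def top_down(levels):
    """Merge pyramid levels (finest first) by upsampling and element-wise addition."""
    merged = [levels[-1]]  # start from the coarsest level
    for lateral in reversed(levels[:-1]):
        up = upsample2x(merged[0])
        merged.insert(0, [[a + b for a, b in zip(r1, r2)]
                          for r1, r2 in zip(lateral, up)])
    return merged

# Three levels: 4x4, 2x2 and 1x1, filled with constants for readability.
levels = [
    [[1.0] * 4 for _ in range(4)],
    [[2.0] * 2 for _ in range(2)],
    [[3.0]],
]
merged = top_down(levels)
```

After merging, the finest map accumulates contributions from every coarser level, which is how top-down fusion lets high-resolution features carry coarse-scale context.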

Small Object Detection in Traffic Scenes Based on Attention Feature Fusion

Sensors ◽

10.3390/s21093031 ◽

2021 ◽

Vol 21 (9) ◽

pp. 3031

Author(s):

Jing Lian ◽

Yuhang Yin ◽

Linhui Li ◽

Zhenghao Wang ◽

Yafu Zhou

Keyword(s):

Object Detection ◽

Feature Fusion ◽

Contextual Information ◽

Detection Accuracy ◽

Small Object ◽

Limited Information ◽

Feature Maps ◽

Multi Scale ◽

Validation Set ◽

Small Object Detection

There are many small objects in traffic scenes, but due to their low resolution and limited information, their detection is still a challenge. Small object detection is very important for the understanding of traffic scene environments. To improve the detection accuracy of small objects in traffic scenes, we propose a small object detection method based on attention feature fusion. First, a multi-scale channel attention block (MS-CAB) is designed, which uses local and global scales to aggregate the effective information of the feature maps. Based on this block, an attention feature fusion block (AFFB) is proposed, which can better integrate contextual information from different layers. Finally, the AFFB is used to replace the linear fusion module in the object detection network and obtain the final network structure. The experimental results show that, compared to the benchmark model YOLOv5s, this method achieves a higher mean Average Precision (mAP) while maintaining real-time performance. It increases the mAP of all objects by 0.9 percentage points on the validation set of the traffic scene dataset BDD100K, and at the same time, increases the mAP of small objects by 3.5%.
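The idea of replacing linear fusion with attention-weighted fusion can be illustrated with a toy example on a single channel. This is a sketch under strong assumptions rather than the MS-CAB or AFFB design: the learned local and global branches are replaced by the raw summed signal and its spatial mean, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def attention_fuse(x, y):
    """Fuse two same-length feature vectors with position-wise attention weights."""
    s = [a + b for a, b in zip(x, y)]
    g = sum(s) / len(s)               # global context: mean over all positions
    w = [sigmoid(v + g) for v in s]   # local value plus global context
    # Convex combination: each output lies between the two inputs.
    return [wi * a + (1.0 - wi) * b for wi, a, b in zip(w, x, y)]

fused = attention_fuse([2.0, -1.0], [0.0, 3.0])
```

Unlike a fixed sum or average, the mixing weight here depends on the content at each position, which is the general motivation for attention-based fusion of features from different layers.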

AF-EMS Detector: Improve the Multi-Scale Detection Performance of the Anchor-Free Detector

Remote Sensing ◽

10.3390/rs13020160 ◽

2021 ◽

Vol 13 (2) ◽

pp. 160

Author(s):

Jiangqiao Yan ◽

Liangjin Zhao ◽

Wenhui Diao ◽

Hongqi Wang ◽

Xian Sun

Keyword(s):

Remote Sensing ◽

Object Detection ◽

Feature Fusion ◽

Detection Algorithm ◽

Detection Performance ◽

Natural Image ◽

Detection Model ◽

Multi Scale ◽

Feature Pyramid ◽

Scale Detection

As a precursor step for computer vision algorithms, object detection plays an important role in various practical application scenarios. As the objects to be detected become more complex, the problem of multi-scale object detection has attracted more and more attention, especially in the field of remote sensing detection. Early convolutional neural network detection algorithms are mostly based on manually preset anchor boxes to divide different regions in the image and then obtain the prior position of the target. However, anchor boxes are difficult to set reasonably and cause a large amount of computational redundancy, which affects the generality of the detection model obtained under fixed parameters. In the past two years, anchor-free detection algorithms have achieved remarkable development in detection on natural images. However, there has not been sufficient research on how to deal with multi-scale detection more effectively in the anchor-free framework and how to use these detectors on remote sensing images. In this paper, we propose a specific-attention Feature Pyramid Network (FPN) module, which generates a feature pyramid based on the characteristics of objects of various sizes and is therefore better suited to multi-scale object detection. In addition, a scale-aware detection head is proposed, which contains a multi-receptive feature fusion module and a size-based feature compensation module. The new anchor-free detector can obtain a more effective multi-scale feature expression. Experiments on challenging datasets show that our approach performs favorably against other methods in terms of multi-scale object detection performance.
