Object Detection with the Addition of New Classes Based on the Method of RNOL

Mathematical Problems in Engineering ◽

10.1155/2020/9205373 ◽

2020 ◽

Vol 2020 ◽

pp. 1-6

Author(s):

Haiquan Fang ◽

Feijia Zhu

Keyword(s):

Object Detection ◽

State Of The Art ◽

Fine Tuning ◽

Detection Methods ◽

Detection Accuracy ◽

Important Research ◽

Training Time ◽

Tuning Method ◽

High Detection ◽

Computer Vision Applications

Object detection plays an important role in many computer vision applications. Deep-learning-based object detection methods such as Faster R-CNN, YOLO, and SSD have achieved state-of-the-art detection accuracy. However, few studies to date have addressed object detection with the addition of new classes, even though this problem is often encountered in industry. The issue therefore has both research significance and practical value. On the premise that the old class samples are available, this study establishes a method of reserving nodes in advance in the output layer (RNOL). Experiments show that RNOL achieves high detection accuracy on both new and old classes with a short training time and outperforms the traditional fine-tuning method.
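
The idea of reserving output nodes in advance can be illustrated with a minimal sketch. It is not the authors' implementation: the class `ReservedOutputLayer`, its methods, and the toy weights are hypothetical, and only a classification head is modelled. The sketch assumes that the output layer is built with spare nodes, that only nodes bound to real classes take part in the softmax, and that adding a class activates a spare node without resizing the layer.

```python
import math


class ReservedOutputLayer:
    """Linear classification head with spare output nodes (hypothetical helper)."""

    def __init__(self, weights, biases, num_active):
        # weights holds one row per output node, reserved nodes included.
        self.weights = weights
        self.biases = biases
        self.num_active = num_active  # nodes currently bound to real classes

    def add_class(self):
        """Bind the next reserved node to a new class; weight shapes stay unchanged."""
        if self.num_active >= len(self.weights):
            raise ValueError("no reserved nodes left")
        self.num_active += 1
        return self.num_active - 1  # index of the new class

    def predict_proba(self, features):
        """Softmax over the active nodes only; reserved nodes are ignored."""
        logits = [
            sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(self.weights[: self.num_active], self.biases)
        ]
        peak = max(logits)  # subtract the maximum for numerical stability
        exps = [math.exp(z - peak) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]


# Two old classes plus two reserved nodes, on 3-dimensional toy features.
layer = ReservedOutputLayer(
    weights=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]],
    biases=[0.0, 0.0, 0.0, 0.0],
    num_active=2,
)
before = layer.predict_proba([0.2, 0.1, 3.0])  # probabilities for the 2 old classes
new_index = layer.add_class()                  # activates reserved node 2
after = layer.predict_proba([0.2, 0.1, 3.0])   # probabilities for 3 classes
```

Because the weight matrix keeps its shape, the weights learned for the old classes can be carried over unchanged when training continues on the enlarged class set.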


Real-Time Small Drones Detection Based on Pruned YOLOv4

Sensors ◽

10.3390/s21103374 ◽

2021 ◽

Vol 21 (10) ◽

pp. 3374

Author(s):

Hansen Liu ◽

Kuangang Fan ◽

Qinghua Ouyang ◽

Na Li

Keyword(s):

Object Detection ◽

Real Time ◽

Processing Speed ◽

State Of The Art ◽

Detection Methods ◽

Detection Accuracy ◽

Small Object ◽

Art Object ◽

Real Time Detection ◽

Small Object Detection

To address the threat of drones intruding into high-security areas, real-time detection of drones is urgently required. There are two main difficulties in real-time drone detection. The first is that drones move quickly, which calls for faster detectors. The second is that small drones are difficult to detect. In this paper, we first achieve high detection accuracy by evaluating four state-of-the-art object detection methods: RetinaNet, FCOS, YOLOv3, and YOLOv4. Then, to address the first problem, we prune the convolutional channels and shortcut layers of YOLOv4 to develop thinner and shallower models. Furthermore, to improve the accuracy of small drone detection, we implement a special augmentation for small object detection by copying and pasting small drones. Experimental results verify that, compared to YOLOv4, our pruned-YOLOv4 model, with a 0.8 channel prune rate and 24 layers pruned, achieves 90.5% mAP and increases processing speed by 60.4%. Additionally, after small object augmentation, the precision and recall of the pruned-YOLOv4 increase by almost 22.8% and 12.7%, respectively. These results indicate that our pruned-YOLOv4 is an effective and accurate approach for drone detection.
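
The copy-and-paste augmentation for small objects can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the function `paste_small_object` is hypothetical, images are grayscale nested lists for brevity, the paste position is uniformly random, and overlap with existing objects is not checked.

```python
import random


def paste_small_object(image, boxes, patch, rng):
    """Paste `patch` at a random location of `image` and record its box.

    image: 2D list of pixel values (grayscale, for brevity)
    boxes: list of (x_min, y_min, x_max, y_max) annotations
    patch: 2D list cropped around one small object
    rng:   random.Random instance, passed in for reproducibility
    """
    height, width = len(image), len(image[0])
    patch_h, patch_w = len(patch), len(patch[0])
    top = rng.randint(0, height - patch_h)
    left = rng.randint(0, width - patch_w)
    augmented = [row[:] for row in image]  # copy so the input stays untouched
    for dy in range(patch_h):
        for dx in range(patch_w):
            augmented[top + dy][left + dx] = patch[dy][dx]
    new_boxes = boxes + [(left, top, left + patch_w, top + patch_h)]
    return augmented, new_boxes


rng = random.Random(0)
image = [[0] * 8 for _ in range(6)]  # empty 8x6 background
patch = [[9, 9], [9, 9]]             # a 2x2 "drone" crop
augmented, boxes = paste_small_object(image, [], patch, rng)
```

Each call adds one more small-object instance together with its bounding box, so repeated calls raise the number of small training targets per image.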


Transcription Alignment of Historical Vietnamese Manuscripts without Human-Annotated Learning Samples

Applied Sciences ◽

10.3390/app11114894 ◽

2021 ◽

Vol 11 (11) ◽

pp. 4894

Author(s):

Anna Scius-Bertrand ◽

Michael Jungo ◽

Beat Wolf ◽

Andreas Fischer ◽

Marc Bui

Keyword(s):

Object Detection ◽

State Of The Art ◽

Positive Impact ◽

Detection System ◽

Training Data ◽

Detection Accuracy ◽

Current State ◽

Alignment Task ◽

Scanned Image ◽

Automatic Transcription

The current state of the art for automatic transcription of historical manuscripts is typically limited by the requirement of human-annotated learning samples, which are necessary to train specific machine learning models for specific languages and scripts. Transcription alignment is a simpler task that aims to find a correspondence between text in the scanned image and its existing Unicode counterpart, a correspondence which can then be used as training data. The alignment task can be approached with heuristic methods dedicated to certain types of manuscripts, or with weakly trained systems that reduce the required amount of annotation. In this article, we propose a novel learning-based alignment method based on fully convolutional object detection that does not require any human annotation at all. Instead, the object detection system is initially trained on synthetic printed pages using a font and then adapted to the real manuscripts by means of self-training. On a dataset of historical Vietnamese handwriting, we demonstrate the feasibility of annotation-free alignment as well as the positive impact of self-training on the character detection accuracy, reaching a detection accuracy of 96.4% with a YOLOv5m model without using any human annotation.


Multiview deep learning based on tensor decomposition and its application in fault detection of overhead contact systems

The Visual Computer ◽

10.1007/s00371-021-02080-y ◽

2021 ◽

Author(s):

Xuewu Zhang ◽

Yansheng Gong ◽

Chen Qiao ◽

Wenfeng Jing

Keyword(s):

High Speed ◽

Tensor Decomposition ◽

Detection Methods ◽

Detection Accuracy ◽

Feature Maps ◽

Training Time ◽

Detection Model ◽

Railway Line ◽

Result Show ◽

Deep Layers

This article focuses on the most common types of malfunction in the overhead contact systems of high-speed railways, namely unstressed droppers, foreign-body invasions, and pole number-plate malfunctions, and establishes a deep-network detection model for them. By fusing the feature maps of the shallow and deep layers in the pretraining network, global and local features of the malfunction area are combined to enhance the network's ability to identify small objects. Further, in order to share the fully connected layers of the pretraining network and reduce the complexity of the model, Tucker tensor decomposition is used to extract features from the fused feature map. This operation greatly reduces training time. On images collected on the Lanxin railway line, experimental results show that the proposed multiview Faster R-CNN based on tensor decomposition has a lower miss probability and higher detection accuracy for the three types of faults. Compared with the object detection methods YOLOv3, SSD, and the original Faster R-CNN, the average miss probability of the improved Faster R-CNN model is decreased by 37.83%, 51.27%, and 43.79%, respectively, and the average detection accuracy is increased by 3.6%, 9.75%, and 5.9%, respectively.
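
For reference, the standard Tucker decomposition of a third-order tensor, such as a fused feature map with height, width, and channel modes, can be written as below. The symbols and the ranks P, Q, R are generic notation assumed here; the abstract does not state the paper's exact formulation or the ranks it uses.

```latex
% Tucker decomposition of \mathcal{X} \in \mathbb{R}^{I \times J \times K}
% core tensor \mathcal{G} \in \mathbb{R}^{P \times Q \times R},
% factor matrices A \in \mathbb{R}^{I \times P}, B \in \mathbb{R}^{J \times Q}, C \in \mathbb{R}^{K \times R}
\mathcal{X} \approx \mathcal{G} \times_1 A \times_2 B \times_3 C,
\qquad
x_{ijk} \approx \sum_{p=1}^{P} \sum_{q=1}^{Q} \sum_{r=1}^{R} g_{pqr}\, a_{ip}\, b_{jq}\, c_{kr}

% Number of stored values: full tensor versus Tucker form
IJK \quad \text{versus} \quad PQR + IP + JQ + KR
```

When P, Q, and R are much smaller than I, J, and K, the core tensor is a compact representation of the fused feature map, which is consistent with the reduction in model complexity described above.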


Malaria parasite detection in thick blood smear microscopic images using modified YOLOV3 and YOLOV4 models

BMC Bioinformatics ◽

10.1186/s12859-021-04036-4 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Fetulhak Abdurahman ◽

Kinde Anlay Fante ◽

Mohammed Aliy

Keyword(s):

Object Detection ◽

Malaria Parasite ◽

Blood Smear ◽

Clustering Algorithm ◽

State Of The Art ◽

Detection Accuracy ◽

Small Object ◽

Thick Blood Smear ◽

Malaria Parasites ◽

Microscopic Images

Background: Manual microscopic examination of Leishman/Giemsa stained thin and thick blood smears is still the “gold standard” for malaria diagnosis. One of the drawbacks of this method is that its accuracy, consistency, and diagnosis speed depend on microscopists’ diagnostic and technical skills. It is difficult to find highly skilled microscopists in remote areas of developing countries. To alleviate this problem, in this paper we investigate state-of-the-art one-stage and two-stage object detection algorithms for automated malaria parasite screening from microscopic images of thick blood smears.

Results: YOLOV3 and YOLOV4 models, which are state-of-the-art object detectors in accuracy and speed, are not optimized for detecting small objects such as malaria parasites in microscopic images. We modify these models by increasing the feature scale and adding more detection layers to enhance their capability of detecting small objects without notably decreasing detection speed. We propose one modified YOLOV4 model, called YOLOV4-MOD, and two modified YOLOV3 models, called YOLOV3-MOD1 and YOLOV3-MOD2. In addition, new anchor box sizes are generated using the K-means clustering algorithm to exploit the potential of these models in small object detection. The performance of the modified YOLOV3 and YOLOV4 models was evaluated on a publicly available malaria dataset. These models achieved state-of-the-art accuracy, exceeding the performance of their original versions, Faster R-CNN, and SSD in terms of mean average precision (mAP), recall, precision, F1 score, and average IOU. YOLOV4-MOD achieved the best detection accuracy of all the models, with a mAP of 96.32%. YOLOV3-MOD2 and YOLOV3-MOD1 achieved mAP of 96.14% and 95.46%, respectively.

Conclusions: The experimental results of this study demonstrate that the performance of the modified YOLOV3 and YOLOV4 models is highly promising for detecting malaria parasites in images captured by a smartphone camera over the microscope eyepiece. The proposed system is suitable for deployment in low-resource settings.
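
Generating anchor box sizes with K-means can be sketched as below. The sketch assumes the variant commonly used with YOLO models, where boxes are compared by IoU of their widths and heights rather than by Euclidean distance; the abstract does not say which distance the authors used. The function names, the toy box sizes, and the fixed initial anchors are all hypothetical.

```python
def iou_wh(box, anchor):
    """IoU of two (width, height) boxes that share the same top-left corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeans_anchors(sizes, initial_anchors, iterations=20):
    """Cluster (width, height) pairs, assigning each box to its highest-IoU anchor."""
    anchors = list(initial_anchors)
    for _ in range(iterations):
        clusters = [[] for _ in anchors]
        for size in sizes:
            best = max(range(len(anchors)), key=lambda i: iou_wh(size, anchors[i]))
            clusters[best].append(size)
        updated = []
        for anchor, members in zip(anchors, clusters):
            if members:
                # New anchor is the mean width and mean height of its cluster.
                updated.append((
                    sum(w for w, _ in members) / len(members),
                    sum(h for _, h in members) / len(members),
                ))
            else:
                updated.append(anchor)  # keep the anchor of an empty cluster
        if updated == anchors:
            break  # assignments no longer change
        anchors = updated
    return anchors


# Toy ground-truth box sizes in pixels: three small objects, three larger ones.
sizes = [(4, 5), (5, 4), (6, 6), (20, 22), (22, 20), (24, 24)]
anchors = kmeans_anchors(sizes, initial_anchors=[(4, 5), (24, 24)])
```

On this toy data the two anchors settle at the mean size of each group, so the smaller anchor matches the small objects closely instead of being pulled toward the larger ones.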


Augmented Reality and Machine Learning Incorporation Using YOLOv3 and ARKit

Applied Sciences ◽

10.3390/app11136006 ◽

2021 ◽

Vol 11 (13) ◽

pp. 6006

Author(s):

Huy Le ◽

Minh Nguyen ◽

Wei Qi Yan ◽

Hoa Nguyen

Keyword(s):

Machine Learning ◽

Augmented Reality ◽

Object Detection ◽

Feature Detection ◽

Detection Methods ◽

Detection Accuracy ◽

Data Annotation ◽

Machine Learning Model ◽

Potential Benefits ◽

Feature Detection And Tracking

Augmented reality is one of the fastest growing fields and has received increased funding over the last few years as people realise the potential benefits of rendering virtual information in the real world. Most of today’s marker-based augmented reality applications use local feature detection and tracking techniques. The disadvantage of these techniques is that the markers must be modified to match the unique classified algorithms or they suffer from low detection accuracy. Machine learning is an ideal solution to overcome the current drawbacks of image processing in augmented reality applications. However, traditional data annotation requires extensive time and labour, as it is usually done manually. This study incorporates machine learning to detect and track augmented reality marker targets in an application using deep neural networks. We first implement an auto-generated dataset tool, which is used to prepare the machine learning dataset. The final iOS prototype application incorporates object detection, object tracking and augmented reality. The machine learning model is trained to recognise the differences between targets using one of YOLO’s most well-known object detection methods. The final product makes use of ARKit, a valuable toolkit for developing augmented reality applications.


Deep-Learning-Based Road Crack Detection Frameworks for Dashcam-captured Images under Different Illumination Conditions

10.21203/rs.3.rs-685762/v1 ◽

2021 ◽

Author(s):

Da-Ren Chen ◽

Wei-Min Chiu

Keyword(s):

Object Detection ◽

Large Scale ◽

Crack Detection ◽

State Of The Art ◽

Gaussian Mixture Models ◽

Gaussian Mixture ◽

Machine Learning Techniques ◽

Detection Accuracy ◽

The Road ◽

Art Object

Machine learning techniques have been used to increase the detection accuracy of cracks in road surfaces. Most studies fail to consider variable illumination conditions on the target of interest (ToI) and focus only on detecting the presence or absence of road cracks. This paper proposes a new road crack detection method, IlumiCrack, which integrates Gaussian mixture models (GMM) and object detection CNN models. This work provides the following contributions: 1) For the first time, a large-scale road crack image dataset with a range of illumination conditions (e.g., day and night) is prepared using a dashcam. 2) Based on GMM, experimental evaluations on 2 to 4 levels of brightness are conducted for optimal classification. 3) The IlumiCrack framework is used to integrate state-of-the-art object detection methods with CNN to classify the road crack images into eight types with high accuracy. Experimental results show that IlumiCrack outperforms the state-of-the-art R-CNN object detection frameworks.


Feature weighting network for aircraft engine defect detection

International Journal of Wavelets Multiresolution and Information Processing ◽

10.1142/s0219691320500125 ◽

2020 ◽

Vol 18 (03) ◽

pp. 2050012

Author(s):

Liqiong Chen ◽

Lian Zou ◽

Cien Fan ◽

Yifeng Liu

Keyword(s):

Defect Detection ◽

State Of The Art ◽

Aircraft Engine ◽

Air Transportation ◽

Feature Weighting ◽

Detection Methods ◽

Detection Accuracy ◽

Practical Applications ◽

Feature Pyramid ◽

New Feature

Automatic aircraft engine defect detection is a challenging but important industrial task that helps ensure safe air transportation and flight. In this paper, we propose a fast and accurate feature weighting network (FWNet) to address defect scale variation and improve detection accuracy. The framework is built on recent popular convolutional neural networks and a feature pyramid. To further boost the representation power of the network, a new feature weighting module (FWM) is proposed to recalibrate the channel-wise attention and increase the weights of valid features. The model was trained and tested on a self-built dataset of 1916 images containing three defect types: ablation, crack and coating missing. Extensive experimental results verify the effectiveness of the proposed FWM and show that the proposed method can accurately detect engine defects of different scales and at different locations. Our method obtains 89.4% mAP and runs at 6 FPS, which surpasses other state-of-the-art detection methods and can quickly provide a diagnostic basis for aircraft maintenance inspectors in practical applications.
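
Channel-wise recalibration can be illustrated with a simplified gating sketch in the spirit of squeeze-and-excitation blocks. This is a stand-in, not the FWM itself, whose internal design the abstract does not describe. The function `reweight_channels` and its per-channel gate parameters are hypothetical, and each gate here depends only on its own channel, which is simpler than a full squeeze-and-excitation block.

```python
import math


def reweight_channels(feature_map, gate_weights, gate_biases):
    """Scale each channel by a sigmoid gate computed from its global average.

    feature_map:  list of channels, each a 2D list of activations
    gate_weights: one scalar weight per channel (hypothetical parameters)
    gate_biases:  one scalar bias per channel (hypothetical parameters)
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Gate: a sigmoid maps each descriptor to a weight in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-(w * p + b)))
        for w, p, b in zip(gate_weights, pooled, gate_biases)
    ]
    # Recalibrate: multiply every activation by its channel's gate.
    reweighted = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return reweighted, gates


feature_map = [
    [[4.0, 4.0], [4.0, 4.0]],  # strongly activated channel
    [[0.0, 0.0], [0.0, 0.0]],  # silent channel
]
reweighted, gates = reweight_channels(feature_map, [1.0, 1.0], [0.0, 0.0])
```

With positive gate weights, the strongly activated channel receives a gate close to 1 while the silent one receives 0.5, so informative channels are emphasised relative to the rest.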


Backbone Cannot Be Trained at Once: Rolling Back to Pre-Trained Network for Person Re-Identification

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33018859 ◽

2019 ◽

Vol 33 ◽

pp. 8859-8867 ◽

Cited By ~ 4

Author(s):

Youngmin Ro ◽

Jongwon Choi ◽

Dae Ung Jo ◽

Byeongho Heo ◽

Jongin Lim ◽

...

Keyword(s):

Network Architecture ◽

State Of The Art ◽

Fine Tuning ◽

Neural Network Architecture ◽

Large Dataset ◽

Low Level ◽

Tuning Method ◽

Improved Performance ◽

High Level ◽

Tuning Strategy

In the person re-identification (ReID) task, because trainable datasets are scarce, it is common to fine-tune a classification network pre-trained on a large dataset. However, it is relatively difficult to sufficiently fine-tune the low-level layers of the network because of the vanishing gradient problem. In this work, we propose a novel fine-tuning strategy that allows low-level layers to be sufficiently trained by rolling back the weights of high-level layers to their initial pre-trained values. Our strategy alleviates the vanishing gradient problem in low-level layers and robustly trains them to fit the ReID dataset, thereby increasing the performance of ReID tasks. The improved performance of the proposed strategy is validated via several experiments. Furthermore, without any add-ons such as pose estimation or segmentation, our strategy exhibits state-of-the-art performance using only a vanilla deep convolutional neural network architecture.
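
The rollback step can be sketched as follows. It is a minimal illustration, not the authors' training code: the function `roll_back_high_level`, the layer names, and the toy weight values are hypothetical, and the choice of which layers count as high-level is an assumption. After a round of fine-tuning, the high-level layers are reset to their pre-trained weights while the low-level layers keep what they have learned, and fine-tuning then resumes.

```python
import copy


def roll_back_high_level(current, pretrained, high_level_names):
    """Return weights where high-level layers are reset to pre-trained values.

    Low-level layers keep whatever fine-tuning has learned so far.
    """
    rolled = copy.deepcopy(current)
    for name in high_level_names:
        rolled[name] = copy.deepcopy(pretrained[name])
    return rolled


pretrained = {"conv1": [0.1, 0.2], "conv2": [0.3, 0.4], "fc": [0.5, 0.6]}
# Pretend one round of fine-tuning changed every layer.
fine_tuned = {"conv1": [0.15, 0.25], "conv2": [0.35, 0.45], "fc": [0.9, 1.0]}
restarted = roll_back_high_level(fine_tuned, pretrained, high_level_names=["fc"])
```

In this toy example `restarted` keeps the fine-tuned `conv1` and `conv2` weights, while `fc` is back at its pre-trained values and ready for another round of fine-tuning.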


Plug-and-Play Few-shot Object Detection with Meta Strategy and Explicit Localization Inference

10.36227/techrxiv.16864711.v1 ◽

2021 ◽

Author(s):

Junying Huang ◽

Fan Chen ◽

Liang Lin ◽

Dongyu Zhang

Keyword(s):

Object Detection ◽

Fine Tuning ◽

The Novel ◽

Plug And Play ◽

Tuning Method ◽

One Step ◽

Tuning Process ◽

General Object ◽

The One ◽

Parallel Techniques

Few-shot object detection, which aims to recognize and localize objects of novel categories from a few reference samples, is a challenging task. Previous works often depend on a fine-tuning process to transfer their model to the novel categories and rarely consider the defects of fine-tuning, which results in many drawbacks. For example, these methods are far from satisfactory in low-shot or episode-based scenarios, since fine-tuning in object detection requires much time and high-shot support data. To this end, this paper proposes a plug-and-play few-shot object detection (PnP-FSOD) framework that can accurately and directly detect objects of novel categories without fine-tuning. To accomplish this, the PnP-FSOD framework contains two parallel techniques that address the core challenges in few-shot learning, i.e., the across-category task and few-annotation support. Concretely, we first propose two simple but effective meta strategies for the box classifier and RPN module to enable across-category object detection without fine-tuning. Then, we introduce two explicit inferences into the localization process to reduce its dependence on annotated data: an explicit localization score and semi-explicit box regression. In addition to the PnP-FSOD framework, we propose a novel one-step tuning method that avoids the defects of fine-tuning. Notably, the proposed techniques and tuning method are based on a general object detector without other prior methods, so they are easily compatible with existing FSOD methods. Extensive experiments show that the PnP-FSOD framework achieves state-of-the-art few-shot object detection performance without any tuning method. After applying the one-step tuning method, it further shows a significant lead in efficiency, precision, and recall under varied few-shot evaluation protocols.


Ship Detection Based on YOLOv2 for SAR Imagery

Remote Sensing ◽

10.3390/rs11070786 ◽

2019 ◽

Vol 11 (7) ◽

pp. 786 ◽

Cited By ~ 41

Author(s):

Yang-Lang Chang ◽

Amare Anagaw ◽

Lena Chang ◽

Yi Wang ◽

Chih-Yu Hsiao ◽

...

Keyword(s):

Deep Learning ◽

Object Detection ◽

Real Time ◽

Experimental Results ◽

Detection Methods ◽

Computational Time ◽

Detection Accuracy ◽

Single Shot ◽

Ship Detection ◽

SAR Imagery

Synthetic aperture radar (SAR) imagery has been used as a promising data source for monitoring maritime activities, and its application to oil and ship detection has been the focus of many previous research studies. Many object detection methods, ranging from traditional to deep learning approaches, have been proposed. However, the majority of them are computationally intensive and have accuracy problems. The huge volume of remote sensing data also poses a challenge for real-time object detection. To mitigate this problem, a high performance computing (HPC) method has been proposed to accelerate SAR imagery analysis using GPU-based computing methods. In this paper, we propose an enhanced GPU-based deep learning method to detect ships in SAR images. The You Only Look Once version 2 (YOLOv2) deep learning framework is used to model the architecture and train the model. YOLOv2 is a state-of-the-art real-time object detection system, which outperforms the Faster Region-Based Convolutional Network (Faster R-CNN) and Single Shot Multibox Detector (SSD) methods. Additionally, in order to reduce computational time while keeping relatively competitive detection accuracy, we develop a new architecture with fewer layers, called YOLOv2-reduced. In the experiments, we use two datasets: the SAR Ship Detection Dataset (SSDD) and a Diversified SAR Ship Detection Dataset (DSSDD). Both datasets were used for training and testing. YOLOv2 test results showed an increase in ship detection accuracy as well as a noticeable reduction in computational time compared to Faster R-CNN. In the experimental results, the proposed YOLOv2 architecture achieves an accuracy of 90.05% and 89.13% on the SSDD and DSSDD datasets, respectively. The proposed YOLOv2-reduced architecture has detection performance comparable to YOLOv2 but requires less computational time on an NVIDIA TITAN X GPU. The experimental results show that deep learning can make a big leap forward in improving the performance of SAR image ship detection.
