Reinforced Neighbour Feature Fusion Object Detection with Deep Learning

Symmetry ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 1623
Author(s):  
Ningwei Wang ◽  
Yaze Li ◽  
Hongzhe Liu

Neural networks have enabled state-of-the-art approaches to achieve incredible results on computer vision tasks such as object detection. However, previous works that tried to improve performance through various object detection necks have failed to extract features efficiently. To address the problem of insufficient object features, this work introduces some of the most advanced and representative network models based on the Faster R-CNN architecture, such as Libra R-CNN, Grid R-CNN, guided anchoring, and GRoIE. We observed the performance of Neighbour Feature Pyramid Network (NFPN) fusion, ResNet Region of Interest Feature Extraction (ResRoIE), and the Recursive Feature Pyramid (RFP) architecture at different scales of precision when these components replaced the corresponding original members of various networks evaluated on the MS COCO dataset. After replacing the neck and RoIE parts of these models with our Reinforced Neighbour Feature Fusion (RNFF) model, the average precision (AP) increases by 3.2 percentage points over the baseline network.

Author(s):  
M A Isayev ◽  
D A Savelyev

The paper compares different convolutional neural networks, which form the core of the most current solutions in the computer vision area. The study benchmarks these state-of-the-art solutions against criteria such as mAP (mean average precision) and FPS (frames per second) to assess their suitability for real-time use. Conclusions are drawn on the best convolutional neural network model and on the deep learning methods used in each particular solution.


2020 ◽  
Vol 34 (07) ◽  
pp. 12573-12580
Author(s):  
Jiangqiao Yan ◽  
Yue Zhang ◽  
Zhonghan Chang ◽  
Tengfei Zhang ◽  
Menglong Yan ◽  
...  

Feature pyramids are the mainstream method for multi-scale object detection. In most detectors with a feature pyramid, each proposal is predicted from feature grids pooled from only one feature level, which is assigned heuristically. Recent studies report that the feature representation extracted this way is sub-optimal, since it ignores the valid information that exists on the other, unselected layers of the feature pyramid. To address this issue, researchers have proposed fusing valid information across all feature levels. However, these methods can be further improved: the feature fusion strategies, which rely on common operations (element-wise max or sum) in most detectors, should be replaced by a more flexible approach. In this work, a novel method called the feature adaptive selection subnetwork (FAS-Net) is proposed to construct effective features for detecting objects of different scales. Its adaptation operates at two levels: global attention and local adaptive selection. First, we model the global context of each feature map with a global attention based feature selection module (GAFSM), which adaptively strengthens the effective features on each layer. Then we extract the features of each region of interest (RoI) over the entire feature pyramid to construct an RoI feature pyramid. Finally, the RoI feature pyramid is sent to the feature adaptive selection module (FASM), which adaptively integrates the strengthened features according to the input. Our FAS-Net can be easily extended to other two-stage object detectors with a feature pyramid, and it supports quantitative analysis of the importance of different feature levels for multi-scale objects. Besides, FAS-Net can also be applied to the instance segmentation task with consistent improvements. Experiments on PASCAL07/12 and MSCOCO17 demonstrate the effectiveness and generalization of the proposed method.
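The contrast the abstract draws between fixed element-wise fusion and adaptive, attention-weighted fusion can be illustrated with a minimal NumPy sketch. This is not the paper's code; the function names and the softmax-over-pooled-responses weighting are illustrative stand-ins for the GAFSM/FASM mechanism.

```python
import numpy as np

def elementwise_max_fusion(levels):
    # Common fixed fusion in many detectors: element-wise max across levels.
    return np.maximum.reduce(levels)

def global_attention_fusion(levels):
    # Adaptive alternative (illustrative): weight each level by a softmax
    # over its globally pooled response, loosely mirroring the idea of
    # strengthening effective features per level.
    pooled = np.array([lvl.mean() for lvl in levels])
    weights = np.exp(pooled - pooled.max())
    weights /= weights.sum()
    return sum(w * lvl for w, lvl in zip(weights, levels))

# Three feature maps pooled to a common RoI size (7x7, single channel).
rng = np.random.default_rng(0)
levels = [rng.standard_normal((7, 7)) for _ in range(3)]
fused_max = elementwise_max_fusion(levels)
fused_att = global_attention_fusion(levels)
```

The fixed fusion treats every level identically, while the weighted variant lets the input itself decide how much each pyramid level contributes, which is the flexibility the paper argues for.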


2020 ◽  
Author(s):  
Nhan T. Nguyen ◽  
Dat Q. Tran ◽  
Dung B. Nguyen

We describe in this paper our deep learning-based approach for the EndoCV2020 challenge, which aims to detect and segment either artefacts or diseases in endoscopic images. For the detection task, we propose to train and optimize EfficientDet, a state-of-the-art detector, with different EfficientNet backbones using focal loss. By ensembling multiple detectors, we obtain a mean average precision (mAP) of 0.2524 on EDD2020 and 0.2202 on EAD2020. For the segmentation task, two different architectures are proposed: UNet with an EfficientNet-B3 encoder and a Feature Pyramid Network (FPN) with a dilated ResNet-50 encoder. Each is trained with an auxiliary classification branch. Our model ensemble reports an sscore of 0.5972 on EAD2020 and 0.701 on EDD2020, placing among the top submissions of both challenges.
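The focal loss the detectors are trained with (Lin et al.) can be sketched in a few lines. This is a minimal binary form for illustration, not the authors' training code; `gamma` and `alpha` take their commonly used defaults.

```python
import numpy as np

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    # Binary focal loss: the (1 - p_t)**gamma factor down-weights easy,
    # well-classified examples so training focuses on hard detections.
    p_t = np.where(y == 1, p, 1.0 - p)
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

easy = focal_loss(np.array([0.9]), np.array([1]))  # confident and correct
hard = focal_loss(np.array([0.1]), np.array([1]))  # confident but wrong
```

The easy example's loss is suppressed by orders of magnitude relative to the hard one, which is why focal loss suits detection tasks with extreme foreground/background imbalance.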


2021 ◽  
Author(s):  
Amandip Sangha ◽  
Mohammad Rizvi

Abstract
Importance: State-of-the-art performance is achieved with a deep learning object detection model for acne detection. There is little current research on object detection in dermatology, and on acne in particular; as such, this work is early in the field.
Objective: Train an object detection model on a publicly available data set of acne photos.
Design, Setting, and Participants: A deep learning model is trained with cross-validation on a data set of facial acne photos.
Main Outcomes and Measures: Object detection models for detecting acne as a single class (acne) and as multiple classes (four severity levels). We train and evaluate the models using standard metrics such as mean average precision (mAP). We then manually evaluate the model predictions on the test set and calculate accuracy in terms of precision, recall, F1, and true and false positive and negative detections.
Results: We achieve a state-of-the-art mAP@0.5 value of 37.97 for the single-class acne detection task and 26.50 for the four-class acne detection task. Moreover, our manual evaluation shows that the single-class detection model performs well on the validation set, achieving a true positive rate of 93.59%, precision of 96.45%, and recall of 94.73%.
Conclusions and Relevance: We are able to train a high-accuracy acne detection model using only a small publicly available data set of facial acne. Transfer learning on the pre-trained deep learning model yields good accuracy and a high degree of transferability to patient-submitted photographs. We also note that training standard-architecture object detection models has given significantly better accuracy than the more intricate, bespoke neural network architectures in the existing research literature.
Key Points
Question: Can deep learning-based acne detection models trained on a small data set of publicly available photos of patients with acne achieve high prediction accuracy?
Findings: We find that it is possible to train a reasonably good object detection model on a small, annotated data set of acne photos using standard deep learning architectures.
Meaning: Deep learning-based object detection models for acne detection can be a useful decision-support tool for dermatologists treating acne patients in a digital clinical practice. They can prove particularly useful for monitoring the evolution of the acne disease state over prolonged periods during follow-ups, as the model predictions give a quantifiable, comparable output for photographs over time. This is particularly helpful in teledermatological consultations, as the prediction model can be integrated into remote patient-doctor communication.
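The detection metrics used throughout this study (IoU-based matching of predicted boxes against ground truth, and precision/recall/F1 from the resulting counts) can be sketched as follows. This is an illustrative implementation, not the authors' evaluation code.

```python
def iou(box_a, box_b):
    # Intersection-over-union of two boxes given as (x1, y1, x2, y2).
    # A prediction typically counts as a true positive when IoU >= 0.5.
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def prf1(tp, fp, fn):
    # Precision, recall, and F1 from detection counts, as in the
    # manual evaluation described above.
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall, 2 * precision * recall / (precision + recall)
```

Mean average precision then averages precision over recall levels (and, for mAP@0.5, fixes the IoU match threshold at 0.5), which the per-box IoU test above feeds into.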


Author(s):  
M. Sushma Sri ◽  
B. Rajendra Naik ◽  
K. Jaya Sankar

In recent years, object detection has improved rapidly in video analysis and image processing applications. Detecting a desired object has become an important task, and numerous object detection methods have evolved. In this regard, deep learning has developed rapidly thanks to its high-level processing and deeper feature extraction, and it is more reliable and flexible than conventional techniques. In this article, the authors propose object detection with deep neural network and faster region convolutional neural network (Faster R-CNN) methods, providing a simple algorithm with better accuracy and mean average precision.


Author(s):  
M. Karthikeyan ◽  
T. S. Subashini

Mechanical fasteners are widely used in the manufacture of hardware and mechanical components in industries such as automobiles and turbine and power generation. Object detection methods play a vital role in building smart systems for society. The Internet of Things (IoT) enables automation based on sensors and actuators, but sensors alone are not enough to build such systems because of their limitations. Computer vision, using deep learning techniques, makes IoT considerably smarter. Object detection is used to detect, recognize, and localize objects in an image or a real-time video. In industry, a robot arm is used to fit fasteners to automobile components. This system helps the robot detect fasteners such as screws and nails so they can be fitted to the vehicle moving along the assembly line. The Faster R-CNN deep learning algorithm is used to train on the custom dataset, and object detection is used to detect the fasteners. Faster R-CNN, a region-based convolutional neural network, uses a region proposal network (RPN) to train the model efficiently and, with the help of regions of interest, localizes the screw and nail objects with a mean average precision of 0.72, leading to an object detection accuracy of 95 percent.


Author(s):  
Hoseong Kim ◽  
Jaeguk Hyun ◽  
Hyunjung Yoo ◽  
Chunho Kim ◽  
Hyunho Jeon

Recently, infrared object detection (IOD) has been extensively studied due to the rapid growth of deep neural networks (DNNs). Adversarial attacks using imperceptible perturbations can dramatically deteriorate the performance of DNNs. However, most adversarial attack work has focused on visible image recognition (VIR), and there are few methods for IOD. We propose deep learning-based adversarial attacks for IOD by extending several state-of-the-art adversarial attacks for VIR. We validate our claim through comprehensive experiments on two challenging IOD datasets, FLIR and MSOD.
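The imperceptible-perturbation attacks referred to here build on gradient-based methods from the VIR literature. A minimal sketch of the canonical one, the Fast Gradient Sign Method (FGSM), is shown below; the gradient would normally come from backpropagating the detector's loss, which is stubbed out here with a random array for illustration.

```python
import numpy as np

def fgsm(image, grad, eps=0.03):
    # Fast Gradient Sign Method: a single eps-bounded step in the sign
    # of the loss gradient. The change is imperceptible to a human but
    # can substantially degrade a network's predictions.
    return np.clip(image + eps * np.sign(grad), 0.0, 1.0)

rng = np.random.default_rng(1)
image = rng.uniform(0.2, 0.8, size=(4, 4))   # stand-in for an IR image in [0, 1]
grad = rng.standard_normal((4, 4))           # stand-in for dLoss/dImage
adv = fgsm(image, grad)
```

The perturbation is bounded by `eps` in the L-infinity sense, which is the usual formalization of "imperceptible" in this line of work.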


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Hoanh Nguyen

License plate detection is a key problem in intelligent transportation systems. Recently, many deep learning-based networks, such as Faster R-CNN, SSD, and R-FCN, have been proposed and have achieved incredible success in general object detection. However, directly applying these general-purpose detection networks to license plate detection without modification may not achieve good enough performance. This paper proposes a novel deep learning-based framework for license plate detection in traffic scene images based on predicted anchor region proposals and a balanced feature pyramid. In the proposed framework, the ResNet-34 architecture is first adopted to generate the base convolutional feature maps. A balanced feature pyramid generation module is then used to generate a balanced feature pyramid, in which each feature level obtains equal information from the other feature levels. Furthermore, this paper designs a multiscale region proposal network with a novel predicted-location anchor scheme to generate high-quality proposals. Finally, a detection network consisting of a region of interest pooling layer and fully connected layers is adopted to further classify and regress the coordinates of detected license plates. Experimental results on public datasets show that the proposed approach achieves better detection performance than other state-of-the-art license plate detection methods.
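The idea of a balanced feature pyramid, where every level receives equal information from the others, can be sketched with NumPy. This is an illustrative simplification (nearest-neighbour resizing, plain averaging, no refinement step), not the paper's implementation.

```python
import numpy as np

def nn_resize(a, h, w):
    # Nearest-neighbour resize of a 2D map via index selection.
    ys = np.arange(h) * a.shape[0] // h
    xs = np.arange(w) * a.shape[1] // w
    return a[np.ix_(ys, xs)]

def balanced_feature_pyramid(levels):
    # Resize every level to the middle level's resolution, average them
    # into one integrated map, then hand that map back at each level's
    # original resolution, so all levels receive equal information from
    # the others. (A refinement step would usually follow.)
    h, w = levels[len(levels) // 2].shape
    integrated = np.mean([nn_resize(lvl, h, w) for lvl in levels], axis=0)
    return [nn_resize(integrated, *lvl.shape) for lvl in levels]

# Toy pyramid: 16x16, 8x8, 4x4 maps filled with 1, 2, 3 respectively.
levels = [np.full((s, s), float(i)) for i, s in enumerate((16, 8, 4), 1)]
balanced = balanced_feature_pyramid(levels)
```

On this toy input every output level carries the same integrated value, which is the "equal information" property the module is designed around.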


2020 ◽  
Author(s):  
Dean Sumner ◽  
Jiazhen He ◽  
Amol Thakkar ◽  
Ola Engkvist ◽  
Esben Jannik Bjerrum

<p>SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call “Levenshtein augmentation”, which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: transformer and sequence-to-sequence based recurrent neural networks with attention. Levenshtein augmentation demonstrated increased performance over non-augmented and conventionally SMILES-randomization-augmented data when used to train the baseline models. Furthermore, Levenshtein augmentation seemingly results in what we define as <i>attentional gain</i>: an enhancement in the pattern recognition capabilities of the underlying network for molecular motifs.</p>
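The similarity signal the augmentation is named after is the classic Levenshtein edit distance between two strings (here, reactant and product SMILES). A minimal dynamic-programming sketch is shown below; how the paper turns this distance into training pairs is not reproduced here.

```python
def levenshtein(a, b):
    # Edit distance: minimum number of single-character insertions,
    # deletions, and substitutions turning string a into string b.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,            # deletion
                            curr[j - 1] + 1,        # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]
```

A small distance between a reactant and a product SMILES indicates large shared sub-sequences, which is exactly the local similarity the augmentation exploits when pairing strings.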


Algorithms ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 39
Author(s):  
Carlos Lassance ◽  
Vincent Gripon ◽  
Antonio Ortega

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are most of the time unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibits relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the three following problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations on inputs is achieved by enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods to solve the considered problems.
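The basic object the paper manipulates, a similarity graph over a batch's intermediate representations, is easy to sketch. The cosine-similarity choice below is one common option for such graphs and is illustrative; the paper's exact graph construction may differ.

```python
import numpy as np

def latent_geometry_graph(reps):
    # Build a similarity graph over a batch of intermediate representations:
    # nodes are the samples in the batch, edge weights are cosine
    # similarities between their latent vectors. Constraining how such
    # graphs vary across layers is the mechanism described above.
    x = reps / np.linalg.norm(reps, axis=1, keepdims=True)
    return x @ x.T

rng = np.random.default_rng(2)
batch = rng.standard_normal((5, 16))   # 5 samples, 16-dim latent features
graph = latent_geometry_graph(batch)
```

Distillation by geometry mimicry then amounts to penalizing the difference between the student's and the teacher's graphs for the same batch, rather than matching raw activations.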

