Weakly Supervised Learning for Object Localization Based on an Attention Mechanism

2021 ◽  
Vol 11 (22) ◽  
pp. 10953
Author(s):  
Nojin Park ◽  
Hanseok Ko

Recently, deep learning has been successfully applied to object detection and localization tasks in images. When setting up deep learning frameworks for supervised training with large datasets, strongly labeling the objects facilitates good performance; however, the complexity of the image scene and the large size of the dataset make this a laborious task. Hence, it is of paramount importance to reduce the expensive work associated with strong labeling, such as bounding box annotation. In this paper, we propose a method that performs object localization without bounding box annotation in the training process by employing a two-path activation-map-based classifier framework. In particular, we develop an activation-map-based framework that judiciously controls the attention map in the perception branch by adding a two-feature extractor, so that better attention weights can be distributed to improve performance. The experimental results indicate that our method surpasses existing deep learning models for weakly supervised object localization, achieving 75.21% Top-1 classification accuracy and 55.15% Top-1 localization accuracy on the CUB-200-2011 dataset.
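The abstract reports Top-1 localization accuracy without defining it. A minimal sketch of how such a score is commonly computed in weakly supervised localization follows, assuming the usual convention that a prediction counts only when the class is correct and the predicted box overlaps the ground-truth box with an intersection over union (IoU) of at least 0.5. The function names and the 0.5 threshold are illustrative assumptions, not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def top1_localization_hit(pred_label, true_label, pred_box, true_box,
                          iou_threshold=0.5):
    """True when the top-1 class is right AND the box overlaps enough.

    The 0.5 threshold is an assumed convention, not stated in the abstract.
    """
    return pred_label == true_label and iou(pred_box, true_box) >= iou_threshold
```

Under this convention, localization accuracy can never exceed classification accuracy, which is consistent with the gap between the two figures quoted above.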

2021 ◽  
Vol 7 (8) ◽  
pp. 145
Author(s):  
Antoine Mauri ◽  
Redouane Khemmar ◽  
Benoit Decoux ◽  
Madjid Haddad ◽  
Rémi Boutteau

For smart mobility, autonomous vehicles, and advanced driver-assistance systems (ADASs), perception of the environment is an important task in scene analysis and understanding. Better perception of the environment allows for enhanced decision making, which, in turn, enables very high-precision actions. To this end, we introduce in this work a new real-time deep learning approach for 3D multi-object detection for smart mobility not only on roads, but also on railways. To obtain the 3D bounding boxes of the objects, we modified a proven real-time 2D detector, YOLOv3, to predict 3D object localization, object dimensions, and object orientation. Our method has been evaluated on KITTI’s road dataset as well as on our own hybrid virtual road/rail dataset acquired from the video game Grand Theft Auto (GTA) V. The evaluation of our method on these two datasets shows good accuracy, but more importantly that it can be used in real-time conditions, in road and rail traffic environments. Through our experimental results, we also show the importance of the accuracy of prediction of the regions of interest (RoIs) used in the estimation of 3D bounding box parameters.
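To illustrate how a predicted location, set of dimensions, and orientation determine a 3D bounding box, here is a minimal sketch that expands those parameters into eight corner points. It assumes a single yaw angle about the vertical z axis and a (length, width, height) ordering; the axis convention and the name `box3d_corners` are hypothetical and may differ from the convention used by the authors or by KITTI.

```python
import math
from itertools import product


def box3d_corners(center, dims, yaw):
    """Return the 8 corners of a 3D box as (x, y, z) tuples.

    center: (x, y, z) of the box centre.
    dims:   (length, width, height), with length along the box's heading.
    yaw:    rotation in radians about the vertical z axis (assumed convention).
    """
    cx, cy, cz = center
    length, width, height = dims
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    corners = []
    # Each corner is the centre plus/minus half of every dimension.
    for sx, sy, sz in product((-0.5, 0.5), repeat=3):
        lx, ly, lz = sx * length, sy * width, sz * height
        # Rotate the local offset in the ground plane, then translate.
        corners.append((cx + cos_y * lx - sin_y * ly,
                        cy + sin_y * lx + cos_y * ly,
                        cz + lz))
    return corners
```

Projecting these corners into the image with the camera matrix would give the drawn box; that step is omitted here because it depends on calibration data the abstract does not describe.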


2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Zhijian Huang ◽  
Fangmin Li ◽  
Xidao Luan ◽  
Zuowei Cai

Automatically detecting mud in bauxite ores is important and valuable because it can improve productivity and reduce pollution. However, distinguishing mud from ores in a real scene is challenging because of their similarity in shape, color, and texture. Moreover, training a deep learning model requires a large number of exactly labeled samples, which are expensive and time-consuming to obtain. To address this problem, this paper proposes a novel weakly supervised method based on deep active learning (AL), named YOLO-AL. The method uses the YOLO-v3 model as the basic detector, initialized with weights pretrained on the MS COCO dataset. Then, an AL framework-embedded YOLO-v3 model is constructed. In the AL process, the method iteratively fine-tunes the last few layers of the YOLO-v3 model with the most valuable samples, which are selected by a Less Confident (LC) strategy. Experimental results show that the proposed method can effectively detect mud in ores. More importantly, the proposed method can clearly reduce the number of labeled samples required without decreasing the detection accuracy.
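The sample-selection step can be sketched as follows, assuming the LC strategy behaves like the common least-confidence heuristic in active learning: the unlabeled images on which the detector is least sure are sent for labeling first. Summarizing each image by a single confidence score (for example, its highest detection score) is an assumption made for illustration, and `select_least_confident` is a hypothetical helper, not part of YOLO-AL.

```python
def select_least_confident(confidences, budget):
    """Pick the `budget` sample ids with the lowest detector confidence.

    confidences: dict mapping sample id -> confidence in [0, 1]
                 (assumed here to be one summary score per image).
    budget:      number of samples to send for manual labeling.
    """
    # Sort ids from least to most confident and keep the first `budget`.
    ranked = sorted(confidences, key=lambda sample_id: confidences[sample_id])
    return ranked[:budget]


# Usage: three unlabeled images, labeling budget of two.
queue = select_least_confident({"a": 0.90, "b": 0.20, "c": 0.55}, budget=2)
```

In an active learning loop, the selected samples would be labeled, added to the training set, and used to fine-tune the detector before the next round of selection.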


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Ren-Yi Kung ◽  
Nai-Hsin Pan ◽  
Charles C.N. Wang ◽  
Pin-Chan Lee

Several natural and human factors are responsible for the defacement of the external walls and tiles of buildings, and the related deterioration can be a public safety hazard. Therefore, active building maintenance and repair processes are essential for ensuring building sustainability. However, conventional inspection methods are time-, cost-, and labor-intensive. Therefore, this study proposes a convolutional neural network (CNN) model for image-based automated detection and localization of key building defects (efflorescence, spalling, cracking, and defacement). Based on a pretrained VGG-16 CNN classifier, this model applies class activation mapping for object localization. After identifying its limitations in real-life applications, this study determined the model’s robustness and ability to accurately detect and localize defects in the external wall tiles of buildings. For real-time detection and localization, this study applied the model using mobile devices and drones. The results show that the application of deep learning with UAVs can effectively detect various kinds of external wall defects and improve the detection efficiency.
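Class activation mapping can be sketched in a few lines: the map for a class is the sum of the final convolutional feature maps, each weighted by that class's weight in the classification layer. The sketch below uses plain nested lists so it runs without any library. The box-extraction step, including the 20% relative threshold and the choice to enclose all cells above it rather than the largest connected region, is an assumption for illustration and not the authors' stated procedure.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W) for one class."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * fmap[y][x]
    return cam


def cam_to_box(cam, rel_threshold=0.2):
    """Box (x_min, y_min, x_max, y_max) around cells >= threshold * peak.

    Returns None when the map has no positive activation.
    """
    peak = max(max(row) for row in cam)
    if peak <= 0:
        return None
    cutoff = rel_threshold * peak
    hits = [(x, y) for y, row in enumerate(cam)
            for x, value in enumerate(row) if value >= cutoff]
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return (min(xs), min(ys), max(xs), max(ys))
```

The resulting box is in feature-map coordinates; in practice the map is upsampled to the input image size before thresholding, so the box refers to image pixels.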


Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3813
Author(s):  
Athanasios Anagnostis ◽  
Aristotelis C. Tagarakis ◽  
Dimitrios Kateris ◽  
Vasileios Moysiadis ◽  
Claus Grøn Sørensen ◽  
...  

This study aimed to propose an approach for orchard tree segmentation using aerial images based on a deep learning convolutional neural network variant, namely the U-net network. The purpose was the automated detection and localization of the canopy of orchard trees under various conditions (i.e., different seasons, different tree ages, different levels of weed coverage). The implemented dataset was composed of images from three different walnut orchards. The achieved variability of the dataset resulted in images that fell under seven different use cases. The best-trained model achieved 91%, 90%, and 87% accuracy for training, validation, and testing, respectively. The trained model was also tested on never-before-seen orthomosaic images of orchards based on two methods (oversampling and undersampling) in order to tackle issues with transparent pixels outside the field boundary in the image. Even though the training dataset did not contain orthomosaic images, the model achieved performance levels of up to 99%, demonstrating the robustness of the proposed approach.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Changyong Li ◽  
Yongxian Fan ◽  
Xiaodong Cai

Background: With the development of deep learning (DL), more and more DL-based methods have been proposed and achieve state-of-the-art performance in biomedical image segmentation. However, these methods are usually complex and require powerful computing resources, which are impractical to provide in clinical situations. Thus, it is important to develop accurate DL-based biomedical image segmentation methods that run under resource-constrained computing. Results: A lightweight, multiscale network called PyConvU-Net is proposed to potentially work with low-resource computing. In strictly controlled experiments, PyConvU-Net performs well on three biomedical image segmentation tasks with the fewest parameters. Conclusions: Our experimental results preliminarily demonstrate the potential of the proposed PyConvU-Net in biomedical image segmentation with resource-constrained computing.


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2595
Author(s):  
Balakrishnan Ramalingam ◽  
Abdullah Aamir Hayat ◽  
Mohan Rajesh Elara ◽  
Braulio Félix Gómez ◽  
Lim Yi ◽  
...  

The pavement inspection task, which mainly includes crack and garbage detection, is essential and carried out frequently. Inspection, whether human-based or performed by a dedicated system, can be easily carried out by integrating it with pavement sweeping machines. This work proposes a deep learning-based pavement inspection framework for a self-reconfigurable robot named Panthera. The semantic segmentation framework SegNet was adopted to segment the pavement region from other objects. Deep Convolutional Neural Network (DCNN) based object detection is used to detect and localize pavement defects and garbage. Furthermore, a Mobile Mapping System (MMS) was adopted for geotagging the defects. The proposed system was implemented and tested on the Panthera robot equipped with NVIDIA GPU cards. The experimental results on crack and garbage detection are presented and show that the proposed technique detects pavement defects and litter or garbage with high accuracy. It is found that the proposed technique is suitable for real-time deployment for garbage detection and, eventually, sweeping or cleaning tasks.

