Weakly Supervised Learning for Object Localization Based on an Attention Mechanism

Nojin Park; Hanseok Ko

doi:10.3390/app112210953

Weakly Supervised Learning for Object Localization Based on an Attention Mechanism

Applied Sciences ◽

10.3390/app112210953 ◽

2021 ◽

Vol 11 (22) ◽

pp. 10953

Author(s):

Nojin Park ◽

Hanseok Ko

Keyword(s):

Deep Learning ◽

Experimental Results ◽

Object Localization ◽

Localization Accuracy ◽

Large Size ◽

Bounding Box ◽

Improved Performance ◽

Weakly Supervised ◽

Detection And Localization ◽

Activation Map

Recently, deep learning has been successfully applied to object detection and localization tasks in images. When setting up deep learning frameworks for supervised training with large datasets, strongly labeling the objects facilitates good performance; however, the complexity of the image scene and the large size of the dataset make this a laborious task. Hence, it is important to reduce the expensive work associated with strong labeling, such as bounding box annotation. In this paper, we propose a method that performs object localization without bounding box annotation during training by employing a two-path activation-map-based classifier framework. In particular, we develop an activation-map-based framework that judiciously controls the attention map in the perception branch by adding a two-feature extractor, so that better attention weights can be distributed to improve performance. The experimental results indicate that our method surpasses existing deep learning models for weakly supervised object localization, achieving the best performance with 75.21% Top-1 classification accuracy and 55.15% Top-1 localization accuracy on the CUB-200-2011 dataset.
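The abstract reports Top-1 classification and Top-1 localization accuracy without defining them. The sketch below shows how these two figures are commonly computed in weakly supervised object localization: a sample counts as a localization hit only if the predicted class is correct and the predicted box overlaps the ground-truth box with an intersection over union (IoU) of at least 0.5. The 0.5 threshold, the function names, and the sample dictionary layout are assumptions made for illustration, not details taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes have no intersection area.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def top1_scores(samples, iou_threshold=0.5):
    """Return (classification accuracy, localization accuracy) as fractions."""
    cls_hits = 0
    loc_hits = 0
    for s in samples:
        correct_class = s["pred_label"] == s["true_label"]
        if correct_class:
            cls_hits += 1
            # Localization only counts when the class is also correct.
            if iou(s["pred_box"], s["true_box"]) >= iou_threshold:
                loc_hits += 1
    n = len(samples)
    return cls_hits / n, loc_hits / n


# Hypothetical predictions, invented purely for illustration.
samples = [
    {"pred_label": 3, "true_label": 3,
     "pred_box": (10, 10, 50, 50), "true_box": (12, 12, 52, 52)},
    {"pred_label": 3, "true_label": 3,
     "pred_box": (0, 0, 20, 20), "true_box": (15, 15, 60, 60)},
    {"pred_label": 1, "true_label": 2,
     "pred_box": (0, 0, 10, 10), "true_box": (0, 0, 10, 10)},
    {"pred_label": 7, "true_label": 7,
     "pred_box": (5, 5, 25, 25), "true_box": (5, 5, 25, 25)},
]
cls_acc, loc_acc = top1_scores(samples)
print(cls_acc, loc_acc)
```

Because a localization hit requires a correct class, localization accuracy under this convention can never exceed classification accuracy, which is consistent with the gap between the two figures quoted above.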


Real-Time 3D Multi-Object Detection and Localization Based on Deep Learning for Road and Railway Smart Mobility

Journal of Imaging ◽

10.3390/jimaging7080145 ◽

2021 ◽

Vol 7 (8) ◽

pp. 145

Author(s):

Antoine Mauri ◽

Redouane Khemmar ◽

Benoit Decoux ◽

Madjid Haddad ◽

Rémi Boutteau

Keyword(s):

Deep Learning ◽

Object Detection ◽

Real Time ◽

Video Game ◽

Autonomous Vehicles ◽

Object Localization ◽

Driver Assistance Systems ◽

Smart Mobility ◽

Bounding Boxes ◽

Detection And Localization

For smart mobility, autonomous vehicles, and advanced driver-assistance systems (ADASs), perception of the environment is an important task in scene analysis and understanding. Better perception of the environment allows for enhanced decision making, which, in turn, enables very high-precision actions. To this end, we introduce in this work a new real-time deep learning approach for 3D multi-object detection for smart mobility not only on roads, but also on railways. To obtain the 3D bounding boxes of the objects, we modified a proven real-time 2D detector, YOLOv3, to predict 3D object localization, object dimensions, and object orientation. Our method has been evaluated on KITTI’s road dataset as well as on our own hybrid virtual road/rail dataset acquired from the video game Grand Theft Auto (GTA) V. The evaluation of our method on these two datasets shows good accuracy, but more importantly that it can be used in real-time conditions, in road and rail traffic environments. Through our experimental results, we also show the importance of the accuracy of prediction of the regions of interest (RoIs) used in the estimation of 3D bounding box parameters.
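A 3D bounding box described by a location, a set of dimensions, and an orientation can be expanded into its eight corner points. The sketch below illustrates that geometry under a simplified convention chosen for this example: a z-up coordinate frame and a single yaw angle about the vertical axis. The function name and the convention are assumptions; the paper and the KITTI dataset may use a different axis layout.

```python
import math


def box3d_corners(center, dims, yaw):
    """Return the 8 corners of a 3D box.

    center: (x, y, z) of the box centre
    dims:   (length, width, height) along the box's own x, y, z axes
    yaw:    rotation about the vertical (z) axis, in radians
    """
    cx, cy, cz = center
    length, width, height = dims
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (-length / 2, length / 2):
        for dy in (-width / 2, width / 2):
            for dz in (-height / 2, height / 2):
                # Rotate the offset in the ground plane, then translate.
                rx = cos_y * dx - sin_y * dy
                ry = sin_y * dx + cos_y * dy
                corners.append((cx + rx, cy + ry, cz + dz))
    return corners


# Hypothetical car-sized box, rotated a quarter turn.
corners = box3d_corners(center=(0.0, 0.0, 0.0), dims=(4.0, 2.0, 1.0),
                        yaw=math.pi / 2)
for corner in corners:
    print(tuple(round(v, 3) for v in corner))
```

Projecting these corners into the image with the camera calibration is what allows a predicted 3D box to be drawn over a 2D frame; that projection step is omitted here.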


A Weakly Supervised Method for Mud Detection in Ores Based on Deep Active Learning

Mathematical Problems in Engineering ◽

10.1155/2020/3510313 ◽

2020 ◽

Vol 2020 ◽

pp. 1-10

Author(s):

Zhijian Huang ◽

Fangmin Li ◽

Xidao Luan ◽

Zuowei Cai

Keyword(s):

Deep Learning ◽

Active Learning ◽

Learning Model ◽

Experimental Results ◽

Detection Accuracy ◽

Challenging Problem ◽

Real Scene ◽

Weakly Supervised ◽

Deep Learning Model

Automatically detecting mud in bauxite ores is important and valuable because it can improve productivity and reduce pollution. However, distinguishing mud from ores in a real scene is challenging because of their similarity in shape, color, and texture. Moreover, training a deep learning model requires a large number of exactly labeled samples, which are expensive and time-consuming to obtain. To address this challenging problem, this paper proposes a novel weakly supervised method based on deep active learning (AL), named YOLO-AL. The method uses the YOLO-v3 model as the basic detector, initialized with weights pretrained on the MS COCO dataset. An AL framework that embeds the YOLO-v3 model is then constructed. In the AL process, the last few layers of the YOLO-v3 model are iteratively fine-tuned with the most valuable samples, which are selected by a Less Confident (LC) strategy. Experimental results show that the proposed method can effectively detect mud in ores. More importantly, the proposed method can markedly reduce the number of labeled samples required without decreasing detection accuracy.
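The abstract names a Less Confident (LC) selection strategy but does not define it. The sketch below assumes it behaves like the standard least-confidence criterion from active learning: each unlabeled sample is scored by one minus the model's highest predicted probability, and the samples with the highest scores are sent for labeling. The function name, the `budget` parameter, and the example probabilities are hypothetical.

```python
def select_least_confident(probabilities, budget):
    """Pick the `budget` samples the model is least sure about.

    probabilities: dict mapping sample id -> list of class probabilities
    budget:        number of samples to send for manual labeling
    """
    # Uncertainty is 1 minus the confidence of the most likely class.
    scored = [(1.0 - max(p), sample_id)
              for sample_id, p in probabilities.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sample_id for _, sample_id in scored[:budget]]


# Hypothetical model outputs on four unlabeled images.
pool = {
    "img_a": [0.90, 0.10],
    "img_b": [0.55, 0.45],
    "img_c": [0.70, 0.30],
    "img_d": [0.51, 0.49],
}
to_label = select_least_confident(pool, budget=2)
print(to_label)
```

In an active learning loop of the kind the abstract describes, the selected samples would be labeled, added to the training set, and used to fine-tune the detector before the next round of selection.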


Application of Deep Learning and Unmanned Aerial Vehicle on Building Maintenance

Advances in Civil Engineering ◽

10.1155/2021/5598690 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Ren-Yi Kung ◽

Nai-Hsin Pan ◽

Charles C.N. Wang ◽

Pin-Chan Lee

Keyword(s):

Deep Learning ◽

Detection Efficiency ◽

Real Life ◽

Object Localization ◽

Safety Hazard ◽

External Wall ◽

Building Maintenance ◽

Aerial Vehicle ◽

Detection And Localization ◽

Activation Mapping

Several natural and human factors are responsible for the defacement of the external walls and tiles of buildings, and the related deterioration can be a public safety hazard. Active building maintenance and repair processes are therefore essential for ensuring building sustainability. However, conventional inspection methods are time-, cost-, and labor-intensive. This study therefore proposes a convolutional neural network (CNN) model for image-based automated detection and localization of key building defects (efflorescence, spalling, cracking, and defacement). Based on a pretrained CNN VGG-16 classifier, the model applies class activation mapping for object localization. After identifying its limitations in real-life applications, this study determined the model’s robustness and ability to accurately detect and localize defects in the external wall tiles of buildings. For real-time detection and localization, this study applied the model using mobile devices and drones. The results show that applying deep learning with UAVs can effectively detect various kinds of external wall defects and improve detection efficiency.
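Class activation mapping localizes an object using only a classifier: the feature maps of the last convolutional layer are combined in a weighted sum, using the weights that connect each globally pooled feature to the target class. The sketch below shows that weighted sum in plain Python, plus a simplified box extraction that thresholds the map at a fraction of its peak. The function names, the 0.5 threshold ratio, and the toy feature maps are assumptions; the box step here encloses all above-threshold cells rather than the largest connected region, and nothing below is taken from the study's implementation.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W) for one class."""
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    return cam


def cam_bounding_box(cam, threshold_ratio=0.5):
    """Tightest (x_min, y_min, x_max, y_max) box around strong activations."""
    peak = max(max(row) for row in cam)
    if peak <= 0:
        return None  # no positive evidence for this class
    cutoff = threshold_ratio * peak
    rows = [i for i, row in enumerate(cam) if any(v >= cutoff for v in row)]
    cols = [j for j in range(len(cam[0]))
            if any(row[j] >= cutoff for row in cam)]
    return (min(cols), min(rows), max(cols), max(rows))


# Two toy 3x3 feature maps and hypothetical class weights.
feature_maps = [
    [[0, 0, 0], [0, 1, 1], [0, 1, 1]],
    [[1, 0, 0], [0, 0, 0], [0, 0, 0]],
]
class_weights = [2.0, 0.5]
cam = class_activation_map(feature_maps, class_weights)
box = cam_bounding_box(cam)
print(cam)
print(box)
```

In practice the map is upsampled to the input image size before thresholding, so the resulting box is expressed in image pixels rather than feature-map cells.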


Orchard Mapping with Deep Learning Semantic Segmentation

Sensors ◽

10.3390/s21113813 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3813

Author(s):

Athanasios Anagnostis ◽

Aristotelis C. Tagarakis ◽

Dimitrios Kateris ◽

Vasileios Moysiadis ◽

Claus Grøn Sørensen ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Semantic Segmentation ◽

Automated Detection ◽

Aerial Images ◽

Training Dataset ◽

Field Boundary ◽

Different Seasons ◽

Detection And Localization ◽

Different Levels

This study proposes an approach for orchard tree segmentation from aerial images based on a deep learning convolutional neural network variant, namely the U-net network. The purpose was the automated detection and localization of the canopy of orchard trees under various conditions (i.e., different seasons, different tree ages, different levels of weed coverage). The dataset was composed of images from three different walnut orchards, and its variability yielded images that fell under seven different use cases. The best-trained model achieved 91%, 90%, and 87% accuracy for training, validation, and testing, respectively. The trained model was also tested on never-before-seen orthomosaic images of orchards using two methods (oversampling and undersampling) in order to tackle issues with transparent pixels outside the field boundary of the image. Even though the training dataset did not contain orthomosaic images, the model achieved performance levels of up to 99%, demonstrating the robustness of the proposed approach.


A Weakly Supervised WordNet-Guided Deep Learning Approach to Extracting Aspect Terms from Online Reviews

ACM Transactions on Management Information Systems ◽

10.1145/3399630 ◽

2020 ◽

Vol 11 (3) ◽

pp. 1-22

Author(s):

Jie Tao ◽

Lina Zhou

Keyword(s):

Deep Learning ◽

Online Reviews ◽

Learning Approach ◽

Weakly Supervised


Contrastive and consistent feature learning for weakly supervised object localization and semantic segmentation

Neurocomputing ◽

10.1016/j.neucom.2021.03.023 ◽

2021 ◽

Author(s):

Minsong Ki ◽

Youngjung Uh ◽

Wonyoung Lee ◽

Hyeran Byun

Keyword(s):

Feature Learning ◽

Semantic Segmentation ◽

Object Localization ◽

Consistent Feature ◽

Weakly Supervised


PyConvU-Net: a lightweight and multiscale network for biomedical image segmentation

BMC Bioinformatics ◽

10.1186/s12859-020-03943-2 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Changyong Li ◽

Yongxian Fan ◽

Xiaodong Cai

Keyword(s):

Image Segmentation ◽

Deep Learning ◽

State Of The Art ◽

Experimental Results ◽

Actual Situation ◽

Controlled Experiments ◽

Biomedical Image ◽

Segmentation Methods ◽

Art Performance

Background: With the development of deep learning (DL), more and more DL-based methods have been proposed and achieve state-of-the-art performance in biomedical image segmentation. However, these methods are usually complex and require the support of powerful computing resources. In practice, it is unrealistic to use huge computing resources in clinical situations. Thus, it is important to develop accurate DL-based biomedical image segmentation methods that can run on resource-constrained computing. Results: A lightweight and multiscale network called PyConvU-Net is proposed to potentially work with low-resource computing. In strictly controlled experiments, PyConvU-Net performs well on three biomedical image segmentation tasks with the fewest parameters. Conclusions: Our experimental results preliminarily demonstrate the potential of the proposed PyConvU-Net in biomedical image segmentation with resource-constrained computing.


Deep Learning Based Pavement Inspection Using Self-Reconfigurable Robot

Sensors ◽

10.3390/s21082595 ◽

2021 ◽

Vol 21 (8) ◽

pp. 2595

Author(s):

Balakrishnan Ramalingam ◽

Abdullah Aamir Hayat ◽

Mohan Rajesh Elara ◽

Braulio Félix Gómez ◽

Lim Yi ◽

...

Keyword(s):

Deep Learning ◽

Semantic Segmentation ◽

High Accuracy ◽

Experimental Results ◽

Mobile Mapping ◽

Mapping System ◽

Mobile Mapping System ◽

Reconfigurable Robot ◽

Nvidia Gpu ◽

Inspection Task

The pavement inspection task, which mainly includes crack and garbage detection, is essential and carried out frequently. Inspection, whether human-based or performed by a dedicated system, can easily be carried out by integrating it with pavement sweeping machines. This work proposes a deep learning-based pavement inspection framework for a self-reconfigurable robot named Panthera. The semantic segmentation framework SegNet was adopted to segment the pavement region from other objects. Deep Convolutional Neural Network (DCNN) based object detection is used to detect and localize pavement defects and garbage. Furthermore, a Mobile Mapping System (MMS) was adopted for geotagging the defects. The proposed system was implemented and tested on the Panthera robot equipped with NVIDIA GPU cards. The experimental results on crack and garbage detection are presented and show that the proposed technique identifies pavement defects and detects litter or garbage with high accuracy. The proposed technique is found to be suitable for real-time deployment for garbage detection and, eventually, sweeping or cleaning tasks.
