A Survey on “Object Detection Algorithms for Visually Impaired People”

Author(s):  
Prof. Pradnya Kasture ◽  
Aishwarya Kumkar ◽  
Yash Jagtap ◽  
Akshay Tangade ◽  
Aditya Pole

Vision is one of the most important human senses, and it plays a central role in how people interact with surrounding objects. Many papers have been published on this topic, describing computer vision products and services built into new electronic devices for visually impaired people. The aim of this survey is to study different object detection methods. Compared with other object detection methods, YOLO has several advantages. Algorithms such as CNN and Fast-CNN do not examine the image as a whole, whereas YOLO examines the full image, using a convolutional network to predict bounding boxes and the probabilities for those boxes, and it detects objects faster than the alternative algorithms.

2018 ◽  
Vol 232 ◽  
pp. 04036
Author(s):  
Jun Yin ◽  
Huadong Pan ◽  
Hui Su ◽  
Zhonggeng Liu ◽  
Zhirong Peng

We propose an object detection method that predicts oriented bounding boxes (OBB) to estimate object locations, scales and orientations based on YOLO (You Only Look Once), which is one of the top detection algorithms, performing well in both accuracy and speed. Horizontal bounding boxes (HBB), which are not robust to orientation variance, are used in existing object detection methods to detect targets. The proposed orientation-invariant YOLO (OIYOLO) detector can effectively deal with bird's-eye-view images in which the orientation angles of the objects are arbitrary. To estimate the rotation angle of objects, we design a new angle loss function. The training of OIYOLO therefore forces the network to learn the annotated orientation angle of objects, making OIYOLO orientation invariant. The proposed approach of predicting OBB can be applied in other detection frameworks. In addition, to evaluate the proposed OIYOLO detector, we create the UAV-DAHUA dataset, which is accurately annotated with object locations, scales and orientation angles. Extensive experiments conducted on the UAV-DAHUA and DOTA datasets demonstrate that OIYOLO achieves state-of-the-art detection performance with high efficiency compared with the baseline YOLO algorithms.
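To make the oriented-bounding-box representation concrete, the following minimal sketch maps a box given as center, size and rotation angle to its four corner points. The parameterization (cx, cy, w, h, theta in radians) and the function name obb_corners are assumptions made for illustration; the abstract does not state which angle convention OIYOLO uses.

```python
import math

def obb_corners(cx, cy, w, h, theta):
    """Return the four corners of a w-by-h box centered at (cx, cy), rotated by theta radians."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    corners = []
    # Half-extent offsets of the unrotated box, listed in order around the rectangle.
    for dx, dy in ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)):
        # Standard 2D rotation of the offset, followed by translation to the center.
        x = cx + dx * cos_t - dy * sin_t
        y = cy + dx * sin_t + dy * cos_t
        corners.append((x, y))
    return corners
```

With theta = 0 the result reduces to an ordinary horizontal bounding box, which is why an HBB can be seen as a special case of an OBB.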


Electronics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 279
Author(s):  
Rafael Padilla ◽  
Wesley L. Passos ◽  
Thadeu L. B. Dias ◽  
Sergio L. Netto ◽  
Eduardo A. B. da Silva

Recent outstanding results of supervised object detection in competitions and challenges are often associated with specific metrics and datasets. The evaluation of such methods applied in different contexts has increased the demand for annotated datasets. Annotation tools represent the location and size of objects in distinct formats, leading to a lack of consensus on the representation. Such a scenario often complicates the comparison of object detection methods. This work alleviates this problem along the following lines: (i) It provides an overview of the most relevant evaluation methods used in object detection competitions, highlighting their peculiarities, differences, and advantages; (ii) it examines the most used annotation formats, showing how different implementations may influence the assessment results; and (iii) it provides a novel open-source toolkit supporting different annotation formats and 15 performance metrics, making it easy for researchers to evaluate the performance of their detection algorithms in most known datasets. In addition, this work proposes a new metric, also included in the toolkit, for evaluating object detection in videos that is based on the spatio-temporal overlap between the ground-truth and detected bounding boxes.
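As an illustration of why annotation formats matter for evaluation, the sketch below computes intersection over union (IoU) for two boxes that are stored in different formats. It is not code from the toolkit described above; the helper names are hypothetical, and coordinates are treated as continuous values rather than inclusive pixel indices.

```python
def xywh_to_xyxy(box):
    """Convert (x_min, y_min, width, height) to (x_min, y_min, x_max, y_max)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)

def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

gt = (10, 10, 20, 20)   # ground truth stored as x, y, width, height
det = (10, 10, 30, 30)  # detection stored as corner coordinates

print(iou(xywh_to_xyxy(gt), det))  # formats reconciled: 1.0
print(iou(gt, det))                # formats mixed up: 0.25
```

The two boxes describe the same region, yet skipping the conversion silently drops the score from 1.0 to 0.25, which would turn a correct detection into a miss at the common 0.5 threshold.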


2020 ◽  
Author(s):  
HE Yang ◽  
Beibei Fan ◽  
Ling ling Guo

The anchor-free method based on key point detection has made great progress. However, it depends too heavily on a convolutional network to generate a rough heat map, which makes it difficult to detect objects with large size variation and dense, overlapping objects. To solve this problem, we first propose a mask attention mechanism for object detection methods, making full use of the advantages of attention to improve the accuracy of heat map generation. We then design an optimized fire model to reduce the size of the model. The fire model is an extension of grouped convolution; it allows each group of convolutional network features to learn the same feature through purposeful grouping. In this paper, the mask attention mechanism uses object segmentation images to guide the generation of corner heat maps. Our approach achieved an accuracy of 91.84% and a recall of 89.83% on the Tencent-100K dataset. Compared with popular object detection methods, the proposed method has advantages in model size and accuracy.


Author(s):  
Jiajia Liao ◽  
Yujun Liu ◽  
Yingchao Piao ◽  
Jinhe Su ◽  
Guorong Cai ◽  
...  

Recent advances in camera-equipped drone applications have increased the demand for deep learning-based visual object detection algorithms for aerial images. A single deep learning model has several limitations in accuracy. Inspired by the fact that ensemble learning can significantly improve the generalization ability of a model in the machine learning field, we introduce a novel integration strategy to combine the inference results of two different methods without non-maximum suppression. In this paper, a global and local ensemble network (GLE-Net) is proposed to increase the quality of predictions by considering the global weights for different models and adjusting the local weights for bounding boxes. Specifically, the global module assigns different weights to models. In the local module, we group the bounding boxes that correspond to the same object into a cluster. Each cluster generates a final predicted box, and the highest score in the cluster is assigned as the score of that box. Experiments on the VisDrone2019 benchmark show promising performance of GLE-Net compared with the baseline network.
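The cluster-and-fuse idea can be sketched as follows. This is only an illustration under stated assumptions, not the GLE-Net implementation: the greedy IoU grouping, the 0.5 threshold, and the weighted coordinate average (model weight times score) are hypothetical choices. Only the rule that the fused box takes the highest score in its cluster comes from the abstract. Scores are assumed to be positive.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def fuse(detections, model_weights, iou_thr=0.5):
    """Fuse detections given as (model_id, (x1, y1, x2, y2), score) tuples."""
    clusters = []
    # Visit boxes from most to least confident; each joins the first cluster whose seed it overlaps.
    for det in sorted(detections, key=lambda d: d[2], reverse=True):
        for cluster in clusters:
            if iou(cluster[0][1], det[1]) >= iou_thr:
                cluster.append(det)
                break
        else:
            clusters.append([det])  # no overlap found: start a new cluster
    fused = []
    for cluster in clusters:
        # Assumed local weight: global model weight multiplied by the box score.
        weights = [model_weights[m] * s for m, _, s in cluster]
        total = sum(weights)
        box = tuple(
            sum(w * b[i] for w, (_, b, _) in zip(weights, cluster)) / total
            for i in range(4)
        )
        # As in the abstract, the fused box keeps the highest score in the cluster.
        fused.append((box, max(s for _, _, s in cluster)))
    return fused
```

Because overlapping boxes are merged rather than discarded, no separate non-maximum suppression step is needed after fusion.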


2021 ◽  
Vol 13 (13) ◽  
pp. 2459
Author(s):  
Yangyang Li ◽  
Heting Mao ◽  
Ruijiao Liu ◽  
Xuan Pei ◽  
Licheng Jiao ◽  
...  

Object detection in remote sensing images has been widely used in military and civilian fields and is a challenging task due to complex backgrounds, large scale variation, and the dense arrangement of objects in arbitrary orientations. In addition, existing object detection methods rely on increasingly deep networks, which adds considerable computational overhead and parameters and makes deployment on edge devices difficult. In this paper, we propose a lightweight keypoint-based oriented object detector for remote sensing images. First, we propose a semantic transfer block (STB) for merging shallow and deep features, which reduces noise and restores semantic information. Then, the proposed adaptive Gaussian kernel (AGK) adapts to objects of different scales and further improves detection performance. Finally, we propose a distillation loss associated with object detection to obtain a lightweight student network. Experiments on the HRSC2016 and UCAS-AOD datasets show that the proposed method adapts to objects of different scales, obtains accurate bounding boxes, and reduces the influence of complex backgrounds. The comparison with mainstream methods shows that our method achieves comparable performance while remaining lightweight.
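Keypoint-based detectors are typically trained against a heat map with a Gaussian peak at each object center. The sketch below shows one way a kernel could adapt to object scale, by tying the standard deviation on each axis to the box size. It is an assumption for illustration, not the AGK formulation from the paper; the function name and the ratio parameter are hypothetical.

```python
import math

def gaussian_heatmap(height, width, cx, cy, box_w, box_h, ratio=0.1):
    """Heat map with a Gaussian peak at (cx, cy) whose spread scales with the box size."""
    # Assumed rule: sigma on each axis is a fixed fraction of the box extent.
    sigma_x = max(ratio * box_w, 1e-6)
    sigma_y = max(ratio * box_h, 1e-6)
    return [
        [
            math.exp(-((x - cx) ** 2 / (2 * sigma_x ** 2)
                       + (y - cy) ** 2 / (2 * sigma_y ** 2)))
            for x in range(width)
        ]
        for y in range(height)
    ]
```

A wide, flat object then gets a response that decays slowly along x and quickly along y, instead of the same circular blob used for every object.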


Author(s):  
Prof. S. G. Latake

This work aims to assist visually impaired people in reading text material and detecting objects in their surroundings. The input is an image captured from a web camera. This image is then processed either for text reading or for object detection, depending on the user's choice. The main aim of this project is to build a system that detects objects in an image or a stream of images supplied as previously recorded video or as real-time input from the camera. Bounding boxes are drawn around the detected objects, and the system also assigns each object to the class it belongs to. Python and a machine learning technique named YOLO (You Only Look Once), which uses a convolutional neural network, are used for object detection. The smart blind navigation system aims to fill a gap by providing accurate and contextually rich information about the environment around the user's current location, simplifying navigation, increasing the overall accuracy of the system, and keeping the user away from dangerous locations. Visually impaired people have very little information about self-velocity, objects, and direction, which is essential for travel. Existing navigation systems are costly and not affordable for most blind people, and they are heavy and complicated to operate.


2021 ◽  
Vol 1 (2) ◽  
Author(s):  
LOKESHKUMAR G ◽  
RAMAPRAKASH G

Object detection is used in almost every real-world application, including autonomous traversal, visual systems, and facial recognition, to name a few. The purpose of this study is to apply object detection algorithms to assist visually impaired people. It allows visually impaired people to be aware of their surroundings, enabling them to move freely. With promising findings, a prototype was developed on a Raspberry Pi 3 using OpenCV libraries. This research looks at several methods of detecting items with audio output using various object detection algorithms, such as a deep neural network for SSD built with the Caffe model. Along with vocal coaching, we have incorporated an emergency button to inform those nearby and a vibrator to alert deaf people to the obstruction in front of the camera.


2020 ◽  
Vol 16 (3) ◽  
pp. 227-243
Author(s):  
Shahid Karim ◽  
Ye Zhang ◽  
Shoulin Yin ◽  
Irfana Bibi ◽  
Ali Anwar Brohi

Traditional object detection algorithms and strategies struggle to meet the requirements of data processing efficiency, performance, speed and intelligence in object detection. By studying and imitating the cognitive ability of the brain, deep learning can analyze and process data features. It has a strong capacity for visualization and has become the mainstream approach in current object detection applications. Firstly, we discuss the development of traditional object detection methods. Secondly, the object detection frameworks that combine region proposals and convolutional neural networks (CNNs), e.g. Region-based CNN (R-CNN), Spatial Pyramid Pooling Network (SPP-NET), Fast-RCNN and Faster-RCNN, are briefly characterized for optical remote sensing applications. The You Only Look Once (YOLO) algorithm is representative of the object detection frameworks (e.g. YOLO and Single Shot MultiBox Detector (SSD)) that transform object detection into a regression problem. The limitations of remote sensing images and object detectors are highlighted and discussed. The feasibility and limitations of these approaches will lead researchers to prudently select appropriate image enhancements. Finally, the problems of object detection algorithms in deep learning are summarized, and recommendations for future work are offered.


Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1686 ◽  
Author(s):  
Feng Yang ◽  
Wentong Li ◽  
Haiwei Hu ◽  
Wanyi Li ◽  
Peng Wang

Accurate and robust detection of multi-class objects in very high resolution (VHR) aerial images has been playing a significant role in many real-world applications. The traditional detection methods have made remarkable progress with horizontal bounding boxes (HBBs) due to CNNs. However, HBB detection methods still exhibit limitations, including missed detections and redundant detection regions, especially for densely distributed and strip-like objects. Besides, large scale variations and diverse backgrounds also bring many challenges. Aiming to address these problems, an effective region-based object detection framework named Multi-scale Feature Integration Attention Rotation Network (MFIAR-Net) is proposed for aerial images with oriented bounding boxes (OBBs), which promotes the integration of the inherent multi-scale pyramid features to generate a discriminative feature map. Meanwhile, the double-path feature attention network supervised by the mask information of ground truth is introduced to guide the network to focus on object regions and suppress the irrelevant noise. To boost the rotation regression and classification performance, we present a robust Rotation Detection Network, which can generate efficient OBB representation. Extensive experiments and comprehensive evaluations on two publicly available datasets demonstrate the effectiveness of the proposed framework.


Electronics ◽  
2020 ◽  
Vol 9 (3) ◽  
pp. 537 ◽  
Author(s):  
Liquan Zhao ◽  
Shuaiyang Li

The ‘You Only Look Once’ v3 (YOLOv3) method is among the most widely used deep learning-based object detection methods. It uses the k-means cluster method to estimate the initial width and height of the predicted bounding boxes. With this method, the estimated width and height are sensitive to the initial cluster centers, and the processing of large-scale datasets is time-consuming. In order to address these problems, a new cluster method for estimating the initial width and height of the predicted bounding boxes has been developed. Firstly, it randomly selects a pair of width and height values as one initial cluster center separate from the width and height of the ground truth boxes. Secondly, it constructs Markov chains based on the selected initial cluster and uses the final points of every Markov chain as the other initial centers. In the construction of Markov chains, the intersection-over-union method is used to compute the distance between the selected initial clusters and each candidate point, instead of the square root method. Finally, this method can be used to continually update the cluster center with each new set of width and height values, which are only a part of the data selected from the datasets. Our simulation results show that the new method converges faster when initializing the width and height of the predicted bounding boxes and that it can select more representative initial widths and heights of the predicted bounding boxes. Our proposed method achieves better performance than the YOLOv3 method in terms of recall, mean average precision, and F1-score.
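The intersection-over-union distance used when clustering box sizes can be sketched as below. The form d = 1 - IoU, computed on widths and heights only, is the one commonly used for YOLO anchor clustering and is assumed here; the Markov-chain initialization described in the abstract is not reproduced, and the function names are hypothetical.

```python
def wh_iou(a, b):
    """IoU of two boxes given only as (width, height), aligned at a shared corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)

def iou_distance(box, center):
    """Distance that is 0 for identical shapes and approaches 1 for very different ones."""
    return 1.0 - wh_iou(box, center)

def assign(boxes, centers):
    """Index of the nearest cluster center for every (width, height) box."""
    return [
        min(range(len(centers)), key=lambda k: iou_distance(box, centers[k]))
        for box in boxes
    ]
```

Unlike a Euclidean distance on width and height, this distance does not let large boxes dominate the clustering simply because their absolute size differences are bigger.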

