Object Detection Techniques: A State-of-the-Art Survey and Challenges

In the last decade, object detection is one of the interesting topics that played an important role in revolutionizing the presentera. Especially when it comes to computervision, object detection is a challenging and most fundamental problem. Researchersin the last decade enhanced object detection and made many advance discoveries using thetechnological advancements. When wetalk about object detection, we also must talk about deep learning and its advancements over the time. This research work describes theadvancements in object detection over last10 years (2010-2020). Different papers published in last 10 years related to objectdetection and its types are discussed with respect to their role in advancement of object detection. This research work also describesdifferent types of object detection, which include text detection, face detection etc. It clearly describes the changes inobject detection techniques over the period of the last 10 years. The Objectdetection is divided into two groups. General detectionand Task based detection. General detection is discussed chronologically and with its different variants while task based detectionincludes many state of the art algorithms and techniques according to tasks. Wealso described the basic comparison of how somealgorithms and techniques have been updated and played a major role in advancements of different fields related to object detection.We conclude that the most important advancements happened in the last decade and the future is promising much more advancement inobject detection on the basis of work done in this decade.In the last decade, object detection is one of the interesting topics that played an important role in revolutionizing the presentera. Especially when it comes to computervision, object detection is the challenging and most fundamental problem. Researchersinlast decade enhanced object detection and made many advance discoveries using thetechnological advancements. When wetalk about object detection, we also must talk about deep learning and its advancements over the time. This research work describes theadvancements in object detection over last10 years (2010-2020). Different papers published in last 10 years related to objectdetection and its types are discussed with respect to their role in advancement of object detection. This research work also describesdifferent types of object detection, which include text detection, face detection etc. It clearly describes the changes inobject detection techniques over the period of last 10 years. The Objectdetection is divided into two groups. General detectionand Task based detection. General detection is discussed chronologically and with its different variants while task based detectionincludes many state of the art algorithms and techniques according to tasks. Wealso described the basic comparison of how somealgorithms and techniques have been updated and played a major role in advancements of different fields related to object detection.We conclude that the most important advancements happened in last decade and future is promising much more advancement inobject detection on the basis of work done in this decade.

Download Full-text

Deep Learning-Based Detection of Articulatory Features in Arabic and English Speech

Sensors ◽

10.3390/s21041205 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1205

Author(s):

Mohammed Algabri ◽

Hassan Mathkour ◽

Mansour M. Alsulaiman ◽

Mohamed A. Bencherif

Keyword(s):

Object Detection ◽

Speech Signal ◽

State Of The Art ◽

Detection System ◽

Phoneme Recognition ◽

Detection Techniques ◽

Articulatory Features ◽

Arabic And English ◽

Scale Detection ◽

Phoneme Detection

This study proposes using object detection techniques to recognize sequences of articulatory features (AFs) from speech utterances by treating AFs of phonemes as multi-label objects in speech spectrogram. The proposed system, called AFD-Obj, recognizes sequence of multi-label AFs in speech signal and localizes them. AFD-Obj consists of two main stages: firstly, we formulate the problem of AFs detection as an object detection problem and prepare the data to fulfill requirement of object detectors by generating a spectral three-channel image from the speech signal and creating the corresponding annotation for each utterance. Secondly, we use annotated images to train the proposed system to detect sequences of AFs and their boundaries. We test the system by feeding spectrogram images to the system, which will recognize and localize multi-label AFs. We investigated using these AFs to detect the utterance phonemes. YOLOv3-tiny detector is selected because of its real-time property and its support for multi-label detection. We test our AFD-Obj system on Arabic and English languages using KAPD and TIMIT corpora, respectively. Additionally, we propose using YOLOv3-tiny as an Arabic phoneme detection system (i.e., PD-Obj) to recognize and localize a sequence of Arabic phonemes from whole speech utterances. The proposed AFD-Obj and PD-Obj systems achieve excellent results for Arabic corpus and comparable to the state-of-the-art method for English corpus. Moreover, we showed that using only one-scale detection is suitable for AFs detection or phoneme recognition.

Download Full-text

Browser Security Attacks and Detection Techniques: A Case of Tabnabbing

Science & Technology Journal ◽

10.22232/stj.2020.08.01.03 ◽

2020 ◽

Vol 8 (1) ◽

pp. 33-41

Author(s):

Dr. S. Sarika ◽

Keyword(s):

Credit Card ◽

State Of The Art ◽

Experimental Results ◽

Detection Technique ◽

Security Attacks ◽

Agent Based ◽

Detection Techniques ◽

Browser Security ◽

Cyber Threats ◽

Multi Agent

Phishing is a malicious and deliberate act of sending counterfeit messages or mimicking a webpage. The goal is either to steal sensitive credentials like login information and credit card details or to install malware on a victim’s machine. Browser-based cyber threats have become one of the biggest concerns in networked architectures. The most prolific form of browser attack is tabnabbing which happens in inactive browser tabs. In a tabnabbing attack, a fake page disguises itself as a genuine page to steal data. This paper presents a multi agent based tabnabbing detection technique. The method detects heuristic changes in a webpage when a tabnabbing attack happens and give a warning to the user. Experimental results show that the method performs better when compared with state of the art tabnabbing detection techniques.

Download Full-text

Transcription Alignment of Historical Vietnamese Manuscripts without Human-Annotated Learning Samples

Applied Sciences ◽

10.3390/app11114894 ◽

2021 ◽

Vol 11 (11) ◽

pp. 4894

Author(s):

Anna Scius-Bertrand ◽

Michael Jungo ◽

Beat Wolf ◽

Andreas Fischer ◽

Marc Bui

Keyword(s):

Object Detection ◽

State Of The Art ◽

Positive Impact ◽

Detection System ◽

Training Data ◽

Detection Accuracy ◽

Current State ◽

Alignment Task ◽

Scanned Image ◽

Automatic Transcription

The current state of the art for automatic transcription of historical manuscripts is typically limited by the requirement of human-annotated learning samples, which are are necessary to train specific machine learning models for specific languages and scripts. Transcription alignment is a simpler task that aims to find a correspondence between text in the scanned image and its existing Unicode counterpart, a correspondence which can then be used as training data. The alignment task can be approached with heuristic methods dedicated to certain types of manuscripts, or with weakly trained systems reducing the required amount of annotations. In this article, we propose a novel learning-based alignment method based on fully convolutional object detection that does not require any human annotation at all. Instead, the object detection system is initially trained on synthetic printed pages using a font and then adapted to the real manuscripts by means of self-training. On a dataset of historical Vietnamese handwriting, we demonstrate the feasibility of annotation-free alignment as well as the positive impact of self-training on the character detection accuracy, reaching a detection accuracy of 96.4% with a YOLOv5m model without using any human annotation.

Download Full-text

A Set of Single YOLO Modalities to Detect Occluded Entities via Viewpoint Conversion

Applied Sciences ◽

10.3390/app11136016 ◽

2021 ◽

Vol 11 (13) ◽

pp. 6016

Author(s):

Jinsoo Kim ◽

Jeongho Cho

Keyword(s):

Object Detection ◽

Autonomous Vehicles ◽

Autonomous Driving ◽

Detection Algorithm ◽

Detection Accuracy ◽

Cloud Data ◽

Detection Techniques ◽

Bounding Boxes ◽

Partially Occluded ◽

Rgb Image

For autonomous vehicles, it is critical to be aware of the driving environment to avoid collisions and drive safely. The recent evolution of convolutional neural networks has contributed significantly to accelerating the development of object detection techniques that enable autonomous vehicles to handle rapid changes in various driving environments. However, collisions in an autonomous driving environment can still occur due to undetected obstacles and various perception problems, particularly occlusion. Thus, we propose a robust object detection algorithm for environments in which objects are truncated or occluded by employing RGB image and light detection and ranging (LiDAR) bird’s eye view (BEV) representations. This structure combines independent detection results obtained in parallel through “you only look once” networks using an RGB image and a height map converted from the BEV representations of LiDAR’s point cloud data (PCD). The region proposal of an object is determined via non-maximum suppression, which suppresses the bounding boxes of adjacent regions. A performance evaluation of the proposed scheme was performed using the KITTI vision benchmark suite dataset. The results demonstrate the detection accuracy in the case of integration of PCD BEV representations is superior to when only an RGB camera is used. In addition, robustness is improved by significantly enhancing detection accuracy even when the target objects are partially occluded when viewed from the front, which demonstrates that the proposed algorithm outperforms the conventional RGB-based model.

Download Full-text

Object Detection Under Rainy Conditions for Autonomous Vehicles: A Review of State-of-the-Art and Emerging Techniques

IEEE Signal Processing Magazine ◽

10.1109/msp.2020.2984801 ◽

2021 ◽

Vol 38 (1) ◽

pp. 53-67

Author(s):

Mazin Hnewa ◽

Hayder Radha

Keyword(s):

Object Detection ◽

Autonomous Vehicles ◽

State Of The Art

Download Full-text

Malaria parasite detection in thick blood smear microscopic images using modified YOLOV3 and YOLOV4 models

BMC Bioinformatics ◽

10.1186/s12859-021-04036-4 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Fetulhak Abdurahman ◽

Kinde Anlay Fante ◽

Mohammed Aliy

Keyword(s):

Object Detection ◽

Malaria Parasite ◽

Blood Smear ◽

Clustering Algorithm ◽

State Of The Art ◽

Detection Accuracy ◽

Small Object ◽

Thick Blood Smear ◽

Malaria Parasites ◽

Microscopic Images

Abstract Background Manual microscopic examination of Leishman/Giemsa stained thin and thick blood smear is still the “gold standard” for malaria diagnosis. One of the drawbacks of this method is that its accuracy, consistency, and diagnosis speed depend on microscopists’ diagnostic and technical skills. It is difficult to get highly skilled microscopists in remote areas of developing countries. To alleviate this problem, in this paper, we propose to investigate state-of-the-art one-stage and two-stage object detection algorithms for automated malaria parasite screening from microscopic image of thick blood slides. Results YOLOV3 and YOLOV4 models, which are state-of-the-art object detectors in accuracy and speed, are not optimized for detecting small objects such as malaria parasites in microscopic images. We modify these models by increasing feature scale and adding more detection layers to enhance their capability of detecting small objects without notably decreasing detection speed. We propose one modified YOLOV4 model, called YOLOV4-MOD and two modified models of YOLOV3, which are called YOLOV3-MOD1 and YOLOV3-MOD2. Besides, new anchor box sizes are generated using K-means clustering algorithm to exploit the potential of these models in small object detection. The performance of the modified YOLOV3 and YOLOV4 models were evaluated on a publicly available malaria dataset. These models have achieved state-of-the-art accuracy by exceeding performance of their original versions, Faster R-CNN, and SSD in terms of mean average precision (mAP), recall, precision, F1 score, and average IOU. YOLOV4-MOD has achieved the best detection accuracy among all the other models with a mAP of 96.32%. YOLOV3-MOD2 and YOLOV3-MOD1 have achieved mAP of 96.14% and 95.46%, respectively. Conclusions The experimental results of this study demonstrate that performance of modified YOLOV3 and YOLOV4 models are highly promising for detecting malaria parasites from images captured by a smartphone camera over the microscope eyepiece. The proposed system is suitable for deployment in low-resource setting areas.

Download Full-text

Analysis Of Various Object Detection Techniques for Self-Driving Cars

10.1109/asiancon51346.2021.9545034 ◽

2021 ◽

Author(s):

Yukta Lapsiya ◽

Dhruvi Jain ◽

Parshva Shah ◽

Atul Kachare

Keyword(s):

Object Detection ◽

Detection Techniques ◽

Self Driving Cars

Download Full-text

Dangerous Scenes Recognition During Hoisting Based on Faster Region-Based Convolutional Neural Network

Volume 2: Mechanics and Behavior of Active Materials; Structural Health Monitoring; Bioinspired Smart Materials and Systems; Energy Harvesting; Emerging Technologies ◽

10.1115/smasis2018-8226 ◽

2018 ◽

Author(s):

Hongguo Su ◽

Mingyuan Zhang ◽

Shengyuan Li ◽

Xuefeng Zhao

Keyword(s):

Neural Network ◽

Object Detection ◽

Convolutional Neural Network ◽

Image Classification ◽

State Of The Art ◽

Spatial Location ◽

Average Precision ◽

Practical Applications ◽

Security Enhancement ◽

Human Interactions

In the last couple of years, advancements in the deep learning, especially in convolutional neural networks, proved to be a boon for the image classification and recognition tasks. One of the important practical applications of object detection and image classification can be for security enhancement. If dangerous objects or scenes can be identified automatically, then a lot of accidents can be prevented. For this purpose, in this paper we made use of state-of-the-art implementation of Faster Region-based Convolutional Neural Network (Faster R-CNN) based on the monitoring video of hoisting sites to train a model to detect the dangerous object and the worker. By extracting the locations of them, object-human interactions during hoisting, mainly for changes in their spatial location relationship, can be understood whereby estimating whether the scene is safe or dangerous. Experimental results showed that the pre-trained model achieved good performance with a high mean average precision of 97.66% on object detection and the proposed method fulfilled the goal of dangerous scenes recognition perfectly.

Download Full-text