A Systematic Deep Learning Based Overhead Tracking and Counting System Using RGB-D Remote Cameras

Munkhjargal Gochoo; Syeda Amna Rizwan; Yazeed Yasin Ghadi; Ahmad Jalal; Kibum Kim

doi:10.3390/app11125503

A Systematic Deep Learning Based Overhead Tracking and Counting System Using RGB-D Remote Cameras

Applied Sciences ◽

10.3390/app11125503 ◽

2021 ◽

Vol 11 (12) ◽

pp. 5503

Author(s):

Munkhjargal Gochoo ◽

Syeda Amna Rizwan ◽

Yazeed Yasin Ghadi ◽

Ahmad Jalal ◽

Kibum Kim

Keyword(s):

Deep Learning ◽

Human Head ◽

Point Clouds ◽

Head Tracking ◽

Complex Environments ◽

People Tracking ◽

Practical Applications ◽

Remote Cameras ◽

Video Frames ◽

Benchmark Datasets

Automatic head tracking and counting using depth imagery has various practical applications in security, logistics, queue management, space utilization and visitor counting. However, no currently available system can clearly distinguish between a human head and other objects in order to track and count people accurately. For this reason, we propose a novel system that can track people by monitoring their heads and shoulders in complex environments and also count the number of people entering and exiting the scene. Our system is split into six phases; at first, preprocessing is done by converting videos of a scene into frames and removing the background from the video frames. Second, heads are detected using Hough Circular Gradient Transform, and shoulders are detected by HOG based symmetry methods. Third, three robust features, namely, fused joint HOG-LBP, Energy based Point clouds and Fused intra-inter trajectories are extracted. Fourth, the Apriori-Association is implemented to select the best features. Fifth, deep learning is used for accurate people tracking. Finally, heads are counted using Cross-line judgment. The system was tested on three benchmark datasets: the PCDS dataset, the MICC people counting dataset and the GOTPD dataset and counting accuracy of 98.40%, 98%, and 99% respectively was achieved. Our system obtained remarkable results.

Download Full-text

A unified framework for automated person re-indentification

Transport and Communication Science Journal ◽

10.25073/tcsj.71.7.11 ◽

2020 ◽

Vol 71 (7) ◽

pp. 868-880

Author(s):

Nguyen Hong-Quan ◽

Nguyen Thuy-Binh ◽

Tran Duc-Long ◽

Le Thi-Lan

Keyword(s):

Deep Learning ◽

Video Analysis ◽

Camera Network ◽

Unified Framework ◽

Person Detection ◽

Practical Applications ◽

Detection And Tracking ◽

Analysis System ◽

Bounding Boxes

Along with the strong development of camera networks, a video analysis system has been become more and more popular and has been applied in various practical applications. In this paper, we focus on person re-identification (person ReID) task that is a crucial step of video analysis systems. The purpose of person ReID is to associate multiple images of a given person when moving in a non-overlapping camera network. Many efforts have been made to person ReID. However, most of studies on person ReID only deal with well-alignment bounding boxes which are detected manually and considered as the perfect inputs for person ReID. In fact, when building a fully automated person ReID system the quality of the two previous steps that are person detection and tracking may have a strong effect on the person ReID performance. The contribution of this paper are two-folds. First, a unified framework for person ReID based on deep learning models is proposed. In this framework, the coupling of a deep neural network for person detection and a deep-learning-based tracking method is used. Besides, features extracted from an improved ResNet architecture are proposed for person representation to achieve a higher ReID accuracy. Second, our self-built dataset is introduced and employed for evaluation of all three steps in the fully automated person ReID framework.

Download Full-text

An Anatomy of a Hybrid Color Descriptor with a Neural Network Model to Enhance the Retrieval Accuracy of an Image Retrieval System

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813666191122113801 ◽

2019 ◽

Vol 13 ◽

Author(s):

Shikha Bhardwaj ◽

Gitanjali Pandove ◽

Pawan Kumar Dahiya

Keyword(s):

Neural Network ◽

Deep Learning ◽

Image Retrieval ◽

Hybrid System ◽

Back Propagation ◽

Back Propagation Neural Network ◽

Retrieval Accuracy ◽

Color Descriptor ◽

Benchmark Datasets ◽

Color Moment

Background: In order to retrieve a particular image from vast repository of images, an efficient system is required and such an eminent system is well-known by the name Content-based image retrieval (CBIR) system. Color is indeed an important attribute of an image and the proposed system consist of a hybrid color descriptor which is used for color feature extraction. Deep learning, has gained a prominent importance in the current era. So, the performance of this fusion based color descriptor is also analyzed in the presence of Deep learning classifiers. Method: This paper describes a comparative experimental analysis on various color descriptors and the best two are chosen to form an efficient color based hybrid system denoted as combined color moment-color autocorrelogram (Co-CMCAC). Then, to increase the retrieval accuracy of the hybrid system, a Cascade forward back propagation neural network (CFBPNN) is used. The classification accuracy obtained by using CFBPNN is also compared to Patternnet neural network. Results: The results of the hybrid color descriptor depict that the proposed system has superior results of the order of 95.4%, 88.2%, 84.4% and 96.05% on Corel-1K, Corel-5K, Corel-10K and Oxford flower benchmark datasets respectively as compared to many state-of-the-art related techniques. Conclusion: This paper depict an experimental and analytical analysis on different color feature descriptors namely, Color moment (CM), Color auto-correlogram (CAC), Color histogram (CH), Color coherence vector (CCV) and Dominant color descriptor (DCD). The proposed hybrid color descriptor (Co-CMCAC) is utilized for the withdrawal of color features with Cascade forward back propagation neural network (CFBPNN) is used as a classifier on four benchmark datasets namely Corel-1K, Corel-5K and Corel-10K and Oxford flower.

Download Full-text

Real-Time Environment Monitoring Using a Lightweight Image Super-Resolution Network

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18115890 ◽

2021 ◽

Vol 18 (11) ◽

pp. 5890

Author(s):

Qiang Yu ◽

Feiqiang Liu ◽

Long Xiao ◽

Zitao Liu ◽

Xiaomin Yang

Keyword(s):

Deep Learning ◽

Real Time ◽

Super Resolution ◽

Model Complexity ◽

Practical Application ◽

Single Image ◽

Feature Maps ◽

Benchmark Datasets ◽

Image Super Resolution ◽

Single Image Super Resolution

Deep-learning (DL)-based methods are of growing importance in the field of single image super-resolution (SISR). The practical application of these DL-based models is a remaining problem due to the requirement of heavy computation and huge storage resources. The powerful feature maps of hidden layers in convolutional neural networks (CNN) help the model learn useful information. However, there exists redundancy among feature maps, which can be further exploited. To address these issues, this paper proposes a lightweight efficient feature generating network (EFGN) for SISR by constructing the efficient feature generating block (EFGB). Specifically, the EFGB can conduct plain operations on the original features to produce more feature maps with parameters slightly increasing. With the help of these extra feature maps, the network can extract more useful information from low resolution (LR) images to reconstruct the desired high resolution (HR) images. Experiments conducted on the benchmark datasets demonstrate that the proposed EFGN can outperform other deep-learning based methods in most cases and possess relatively lower model complexity. Additionally, the running time measurement indicates the feasibility of real-time monitoring.

Download Full-text

Olympic Games Event Recognition via Transfer Learning with Photobombing Guided Data Augmentation

Journal of Imaging ◽

10.3390/jimaging7020012 ◽

2021 ◽

Vol 7 (2) ◽

pp. 12

Author(s):

Yousef I. Mohamad ◽

Samah S. Baraheem ◽

Tam V. Nguyen

Keyword(s):

Deep Learning ◽

Transfer Learning ◽

Data Augmentation ◽

Olympic Games ◽

Event Recognition ◽

Surveillance Systems ◽

Video Captioning ◽

Practical Applications ◽

Sport Events ◽

The Olympic Games

Automatic event recognition in sports photos is both an interesting and valuable research topic in the field of computer vision and deep learning. With the rapid increase and the explosive spread of data, which is being captured momentarily, the need for fast and precise access to the right information has become a challenging task with considerable importance for multiple practical applications, i.e., sports image and video search, sport data analysis, healthcare monitoring applications, monitoring and surveillance systems for indoor and outdoor activities, and video captioning. In this paper, we evaluate different deep learning models in recognizing and interpreting the sport events in the Olympic Games. To this end, we collect a dataset dubbed Olympic Games Event Image Dataset (OGED) including 10 different sport events scheduled for the Olympic Games Tokyo 2020. Then, the transfer learning is applied on three popular deep convolutional neural network architectures, namely, AlexNet, VGG-16 and ResNet-50 along with various data augmentation methods. Extensive experiments show that ResNet-50 with the proposed photobombing guided data augmentation achieves 90% in terms of accuracy.

Download Full-text

A Survey on Deep Learning Based Methods and Datasets for Monocular 3D Object Detection

Electronics ◽

10.3390/electronics10040517 ◽

2021 ◽

Vol 10 (4) ◽

pp. 517

Author(s):

Seong-heum Kim ◽

Youngbae Hwang

Keyword(s):

Deep Learning ◽

Object Detection ◽

Low Cost ◽

Detection Methods ◽

Future Research ◽

3D Object ◽

Practical Applications ◽

Depth Sensors ◽

Significant Research ◽

3D Object Detection

Owing to recent advancements in deep learning methods and relevant databases, it is becoming increasingly easier to recognize 3D objects using only RGB images from single viewpoints. This study investigates the major breakthroughs and current progress in deep learning-based monocular 3D object detection. For relatively low-cost data acquisition systems without depth sensors or cameras at multiple viewpoints, we first consider existing databases with 2D RGB photos and their relevant attributes. Based on this simple sensor modality for practical applications, deep learning-based monocular 3D object detection methods that overcome significant research challenges are categorized and summarized. We present the key concepts and detailed descriptions of representative single-stage and multiple-stage detection solutions. In addition, we discuss the effectiveness of the detection models on their baseline benchmarks. Finally, we explore several directions for future research on monocular 3D object detection.

Download Full-text

Multi-Dimensional Underwater Point Cloud Detection Based on Deep Learning

Sensors ◽

10.3390/s21030884 ◽

2021 ◽

Vol 21 (3) ◽

pp. 884

Author(s):

Chia-Ming Tsai ◽

Yi-Horng Lai ◽

Yung-Da Sun ◽

Yu-Jen Chung ◽

Jau-Woei Perng

Keyword(s):

Deep Learning ◽

Point Cloud ◽

Three Dimensional ◽

Point Clouds ◽

Training Data ◽

Network Architectures ◽

Point Cloud Data ◽

Data Types ◽

Raw Data ◽

Cloud Data

Numerous sensors can obtain images or point cloud data on land, however, the rapid attenuation of electromagnetic signals and the lack of light in water have been observed to restrict sensing functions. This study expands the utilization of two- and three-dimensional detection technologies in underwater applications to detect abandoned tires. A three-dimensional acoustic sensor, the BV5000, is used in this study to collect underwater point cloud data. Some pre-processing steps are proposed to remove noise and the seabed from raw data. Point clouds are then processed to obtain two data types: a 2D image and a 3D point cloud. Deep learning methods with different dimensions are used to train the models. In the two-dimensional method, the point cloud is transferred into a bird’s eye view image. The Faster R-CNN and YOLOv3 network architectures are used to detect tires. Meanwhile, in the three-dimensional method, the point cloud associated with a tire is cut out from the raw data and is used as training data. The PointNet and PointConv network architectures are then used for tire classification. The results show that both approaches provide good accuracy.

Download Full-text

Named Entity Recognition and Relation Extraction

ACM Computing Surveys ◽

10.1145/3445965 ◽

2021 ◽

Vol 54 (1) ◽

pp. 1-39

Author(s):

Zara Nasar ◽

Syed Waqar Jaffry ◽

Muhammad Kamran Malik

Keyword(s):

Deep Learning ◽

State Of The Art ◽

Named Entity Recognition ◽

Relation Extraction ◽

The State ◽

Entity Recognition ◽

Joint Models ◽

Named Entity ◽

Textual Data ◽

Benchmark Datasets

With the advent of Web 2.0, there exist many online platforms that result in massive textual-data production. With ever-increasing textual data at hand, it is of immense importance to extract information nuggets from this data. One approach towards effective harnessing of this unstructured textual data could be its transformation into structured text. Hence, this study aims to present an overview of approaches that can be applied to extract key insights from textual data in a structured way. For this, Named Entity Recognition and Relation Extraction are being majorly addressed in this review study. The former deals with identification of named entities, and the latter deals with problem of extracting relation between set of entities. This study covers early approaches as well as the developments made up till now using machine learning models. Survey findings conclude that deep-learning-based hybrid and joint models are currently governing the state-of-the-art. It is also observed that annotated benchmark datasets for various textual-data generators such as Twitter and other social forums are not available. This scarcity of dataset has resulted into relatively less progress in these domains. Additionally, the majority of the state-of-the-art techniques are offline and computationally expensive. Last, with increasing focus on deep-learning frameworks, there is need to understand and explain the under-going processes in deep architectures.

Download Full-text

Predicting AC Optimal Power Flows: Combining Deep Learning and Lagrangian Dual Methods

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i01.5403 ◽

2020 ◽

Vol 34 (01) ◽

pp. 630-637 ◽

Cited By ~ 2

Author(s):

Ferdinando Fioretto ◽

Terrence W.K. Mak ◽

Pascal Van Hentenryck

Keyword(s):

Deep Learning ◽

Power Systems ◽

Power Flow ◽

Optimal Power Flow ◽

Large Scale ◽

Renewable Energy Sources ◽

Electrical Power ◽

Practical Applications ◽

Fundamental Building Block ◽

Optimal Power

The Optimal Power Flow (OPF) problem is a fundamental building block for the optimization of electrical power systems. It is nonlinear and nonconvex and computes the generator setpoints for power and voltage, given a set of load demands. It is often solved repeatedly under various conditions, either in real-time or in large-scale studies. This need is further exacerbated by the increasing stochasticity of power systems due to renewable energy sources in front and behind the meter. To address these challenges, this paper presents a deep learning approach to the OPF. The learning model exploits the information available in the similar states of the system (which is commonly available in practical applications), as well as a dual Lagrangian method to satisfy the physical and engineering constraints present in the OPF. The proposed model is evaluated on a large collection of realistic medium-sized power systems. The experimental results show that its predictions are highly accurate with average errors as low as 0.2%. Additionally, the proposed approach is shown to improve the accuracy of the widely adopted linear DC approximation by at least two orders of magnitude.

Download Full-text

Deep Learning-Based Classification and Reconstruction of Residential Scenes From Large-Scale Point Clouds

IEEE Transactions on Geoscience and Remote Sensing ◽

10.1109/tgrs.2017.2769120 ◽

2018 ◽

Vol 56 (4) ◽

pp. 1887-1897 ◽

Cited By ~ 13

Author(s):

Liqiang Zhang ◽

Liang Zhang

Keyword(s):

Deep Learning ◽

Large Scale ◽

Point Clouds ◽

Scale Point

Download Full-text

Time-ResNeXt for epilepsy recognition based on EEG signals in wireless networks

EURASIP Journal on Wireless Communications and Networking ◽

10.1186/s13638-020-01810-5 ◽

2020 ◽

Vol 2020 (1) ◽

Cited By ~ 1

Author(s):

Shaoqiang Wang ◽

Shudong Wang ◽

Song Zhang ◽

Yifan Wang

Keyword(s):

Deep Learning ◽

Network Structure ◽

Signal Recognition ◽

Eeg Signals ◽

Real World Data ◽

Data Set ◽

Practical Applications ◽

Epilepsy Diagnosis ◽

Single Data ◽

Electroencephalogram Eeg

Abstract To automatically detect dynamic EEG signals to reduce the time cost of epilepsy diagnosis. In the signal recognition of electroencephalogram (EEG) of epilepsy, traditional machine learning and statistical methods require manual feature labeling engineering in order to show excellent results on a single data set. And the artificially selected features may carry a bias, and cannot guarantee the validity and expansibility in real-world data. In practical applications, deep learning methods can release people from feature engineering to a certain extent. As long as the focus is on the expansion of data quality and quantity, the algorithm model can learn automatically to get better improvements. In addition, the deep learning method can also extract many features that are difficult for humans to perceive, thereby making the algorithm more robust. Based on the design idea of ResNeXt deep neural network, this paper designs a Time-ResNeXt network structure suitable for time series EEG epilepsy detection to identify EEG signals. The accuracy rate of Time-ResNeXt in the detection of EEG epilepsy can reach 91.50%. The Time-ResNeXt network structure produces extremely advanced performance on the benchmark dataset (Berne-Barcelona dataset) and has great potential for improving clinical practice.

Download Full-text