Challenges and Limitations in Human Action Recognition on Unmanned Aerial Vehicles: A Comprehensive Survey

2021 ◽  
Vol 38 (5) ◽  
pp. 1403-1411
Author(s):  
Nashwan Adnan Othman ◽  
Ilhan Aydin

An Unmanned Aerial Vehicle (UAV), commonly called a drone, is an aircraft without a human pilot aboard. UAVs that can accurately detect individuals on the ground are important for applications such as searching for people and surveillance. Integrating UAVs into smart cities is challenging, however, because of concerns such as privacy, safety, and ethical and legal use. UAV-based human action recognition can draw on modern technologies and is therefore essential for the future development of these applications. UAV-based human action recognition is the process of classifying image sequences with action labels. This paper offers a comprehensive study of UAV-based human action recognition techniques. Furthermore, we conduct empirical studies to assess several factors that might influence the efficiency of human detection and action recognition techniques on UAVs. Benchmark datasets commonly used for UAV-based human action recognition are briefly described. Our findings reveal that existing human action recognition techniques can identify human actions from UAVs, with limitations in range, altitude, long distances, and large angles of depression.

Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3642
Author(s):  
Mohammad Farhad Bulbul ◽  
Sadiya Tabussum ◽  
Hazrat Ali ◽  
Wenli Zheng ◽  
Mi Young Lee ◽  
...  

This paper proposes an action recognition framework for depth map sequences using the 3D Space-Time Auto-Correlation of Gradients (STACOG) algorithm. First, each depth map sequence is split into two sets of sub-sequences, each with a different frame length. Second, a number of Depth Motion Map (DMM) sequences are generated from each set and fed into STACOG to obtain an auto-correlation feature vector. The two sets of sub-sequences thus yield two auto-correlation feature vectors, which are passed in turn to an L2-regularized Collaborative Representation Classifier (L2-CRC) to compute two sets of residual values. Next, the Logarithmic Opinion Pool (LOGP) rule is used to combine the two L2-CRC outcomes and assign an action label to the depth map sequence. Finally, the proposed framework is evaluated on three benchmark datasets: the MSR-action 3D dataset, the DHA dataset, and the UTD-MHAD dataset. We compare our experimental results with state-of-the-art approaches to demonstrate the effectiveness of the proposed framework. The computational efficiency of the framework is also analyzed on all the datasets to check whether it is suitable for real-time operation.
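The LOGP fusion step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how residuals are turned into probabilities or how the two classifiers are weighted, so the mapping `p(c) ∝ exp(-residual_c)`, the uniform weights, and the function names `log_posteriors` and `logp_fuse` are all assumptions made for illustration, not the authors' implementation.

```python
import math

def log_posteriors(residuals):
    """Turn per-class residuals into log-probabilities.

    ASSUMPTION: a smaller residual means a more likely class, modelled
    here as p(c) proportional to exp(-residual_c).
    """
    smallest = min(residuals)
    shifted = [-(r - smallest) for r in residuals]  # shift for numerical stability
    log_norm = math.log(sum(math.exp(s) for s in shifted))
    return [s - log_norm for s in shifted]

def logp_fuse(residual_sets, alphas=None):
    """Combine several classifiers with a logarithmic opinion pool.

    residual_sets: one list of per-class residuals per classifier.
    alphas: pool weights (ASSUMPTION: uniform when not given).
    Returns (predicted class index, fused class probabilities).
    """
    if alphas is None:
        alphas = [1.0 / len(residual_sets)] * len(residual_sets)
    n_classes = len(residual_sets[0])
    pooled = [0.0] * n_classes
    for alpha, residuals in zip(alphas, residual_sets):
        for c, lp in enumerate(log_posteriors(residuals)):
            pooled[c] += alpha * lp  # weighted sum of log-probabilities
    peak = max(pooled)
    unnorm = [math.exp(v - peak) for v in pooled]
    total = sum(unnorm)
    fused = [u / total for u in unnorm]
    label = max(range(n_classes), key=lambda c: fused[c])
    return label, fused

# Hypothetical residuals from two classifiers over three action classes.
label, fused = logp_fuse([[0.9, 0.2, 0.7], [0.8, 0.3, 0.1]])
print(label, fused)
```

With uniform weights the pooled score of a class depends only on the sum of its residuals, so the class with the smallest total residual is selected.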


2020 ◽  
Vol 10 (12) ◽  
pp. 4412
Author(s):  
Ammar Mohsin Butt ◽  
Muhammad Haroon Yousaf ◽  
Fiza Murtaza ◽  
Saima Nazir ◽  
Serestina Viriri ◽  
...  

Human action recognition has attracted significant attention in recent years due to its high demand in various application domains. In this work, we propose a novel codebook generation and hybrid encoding scheme for classification of action videos. The proposed scheme develops a discriminative codebook and a hybrid feature vector by encoding the features extracted from CNNs (convolutional neural networks). We explore different CNN architectures for extracting spatio-temporal features. We employ an agglomerative clustering approach for codebook generation, which aims to combine the advantages of global and class-specific codebooks. We propose a Residual Vector of Locally Aggregated Descriptors (R-VLAD) and fuse it with locality-based coding to form a hybrid feature vector. It provides a compact representation along with high-order statistics. We evaluated our work on two publicly available standard benchmark datasets, HMDB-51 and UCF-101. The proposed method achieves 72.6% and 96.2% on HMDB-51 and UCF-101, respectively. We conclude that the proposed scheme is able to boost recognition accuracy for human action recognition.
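For readers unfamiliar with the encoding family this abstract builds on, the following is a minimal sketch of plain VLAD encoding. It is not the authors' R-VLAD or their hybrid scheme, whose details the abstract does not give; the function name `vlad_encode`, the toy codebook, and the choice of a single global L2 normalization are assumptions for illustration.

```python
import math

def vlad_encode(descriptors, codebook):
    """Plain VLAD: sum the residuals of descriptors to their nearest codeword.

    descriptors: list of d-dimensional feature vectors (e.g. CNN features).
    codebook: list of k codewords, each d-dimensional.
    Returns a flat, L2-normalized vector of length k * d.
    """
    k, d = len(codebook), len(codebook[0])
    acc = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Hard-assign the descriptor to its nearest codeword (squared Euclidean).
        nearest = min(
            range(k),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, codebook[j])),
        )
        for i in range(d):
            acc[nearest][i] += x[i] - codebook[nearest][i]
    flat = [v for row in acc for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    # ASSUMPTION: one global L2 normalization; other normalizations exist.
    return [v / norm for v in flat] if norm > 0 else flat

# Hypothetical 2-D descriptors and a two-word codebook.
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [10.0, 11.0]]
print(vlad_encode(descriptors, codebook))
```

The output length is the number of codewords times the descriptor dimension, which is why VLAD-style encodings are considered compact relative to the raw set of local descriptors.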


Human Action Recognition (HAR) currently faces many challenges in security and surveillance. HAR spans many fields and many techniques for implementing modern action recognition systems. We have studied multiple parameters and techniques used in HAR, and we list the outcomes and drawbacks of each technique reported in different studies. This paper surveys the complete process of human activity recognition and reviews different Motion History Imaging (MHI) methods as well as model-based, multiview, and multiple-feature-extraction-based recognition methods.
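As an illustration of the motion history image idea mentioned in this abstract, here is a minimal sketch of the standard MHI update rule: pixels where motion is detected are set to a duration value, and all other pixels decay by one per frame. The frame-differencing motion test, the parameter names `tau` and `threshold`, and their default values are assumptions for illustration; the surveyed methods may differ.

```python
def update_mhi(mhi, prev_frame, curr_frame, tau=5, threshold=10):
    """Apply one motion history image update step.

    mhi, prev_frame, curr_frame: 2-D lists of equal shape.
    tau: value assigned to pixels where motion was just detected.
    threshold: ASSUMPTION, minimum absolute intensity change counted as motion.
    Returns a new MHI; recent motion is brighter, older motion fades to 0.
    """
    new_mhi = []
    for r in range(len(mhi)):
        row = []
        for c in range(len(mhi[0])):
            moving = abs(curr_frame[r][c] - prev_frame[r][c]) >= threshold
            row.append(tau if moving else max(0, mhi[r][c] - 1))
        new_mhi.append(row)
    return new_mhi

# Hypothetical 2x2 grayscale frames.
mhi = [[0, 3], [5, 0]]
prev_frame = [[0, 0], [0, 0]]
curr_frame = [[50, 0], [0, 5]]
print(update_mhi(mhi, prev_frame, curr_frame))
```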


Author(s):  
B. H. Shekar ◽  
P. Rathnakara Shetty ◽  
M. Sharmila Kumari ◽  
L. Mestetsky

Abstract. Accumulating the motion information from a video sequence is one of the most challenging and significant phases in Human Action Recognition. To achieve this, several classical and compact representations with proven applicability have been proposed by the research community. In this paper, we propose a compact Depth Motion Map (DMM) based representation methodology with hasty striding that concisely accumulates the motion information. We extract Undecimated Dual Tree Complex Wavelet Transform features from the proposed DMM to form an efficient feature descriptor. We use a Sequential Extreme Learning Machine to classify the human action sequences on two benchmark datasets, the MSR Action 3D dataset and the DHA dataset. We empirically demonstrate the feasibility of our method under standard protocols.
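The accumulation behind a depth motion map can be sketched as follows. This is a simplified, single-view version that sums absolute differences between sampled depth frames; the `stride` parameter is only a guess at what the abstract calls "hasty striding", and the function name `depth_motion_map` is hypothetical.

```python
def depth_motion_map(frames, stride=1):
    """Accumulate absolute differences between sampled depth frames.

    frames: list of 2-D depth maps (lists of rows) with identical shape.
    stride: ASSUMPTION, keep every `stride`-th frame before differencing.
    Returns a 2-D map where larger values mean more accumulated motion.
    """
    height, width = len(frames[0]), len(frames[0][0])
    dmm = [[0.0] * width for _ in range(height)]
    sampled = frames[::stride]
    for prev, curr in zip(sampled, sampled[1:]):
        for r in range(height):
            for c in range(width):
                dmm[r][c] += abs(curr[r][c] - prev[r][c])
    return dmm

# Hypothetical sequence of four 2x2 depth frames.
frames = [
    [[0, 0], [0, 0]],
    [[1, 0], [0, 0]],
    [[1, 2], [0, 0]],
    [[1, 2], [3, 0]],
]
print(depth_motion_map(frames))
print(depth_motion_map(frames, stride=2))
```

A larger stride processes fewer frame pairs, trading temporal detail for speed.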


Author(s):  
Lakhyadeep Konwar ◽  
Anjan Kumar Talukdar ◽  
Kandarpa Kumar Sarma ◽  
Navajit Saikia ◽  
Subhash Chandra Rajbangshi

Detection and classification of different objects for machine vision applications is a challenging task. As with other object detection and classification tasks, human detection plays a major role in advancing the design of an automatic visual surveillance system (AVSS). If future AVSS can include human detection and tracking, human action recognition, and recognition of usual as well as unusual events, it will be a great success in a changing world. In this paper we propose a human detection and tracking technique for human action recognition, aimed at the design of an AVSS. We use a median filter for noise removal, graph cut to segment the human images, mathematical morphology to refine the segmentation mask, HOG to extract selective feature points, an SVM with a polynomial kernel to classify human objects, and finally a particle filter to track the detected humans. With this combination, our system is independent of variations in lighting conditions, color, shape, size, clothing, etc., and can handle occlusion. Our system can detect and track humans in different indoor and outdoor environments, with an automatic multiple-human detection rate of 97.61% and a total multiple-human detection and tracking accuracy of about 92% for AVSS. Because HOG is used to extract features after graph cut segmentation, our system requires less memory to store the trained data; processing speed and the accuracy of detection and tracking are therefore better than with other techniques, which makes it suitable for action classification tasks.


Human Action Recognition from videos has been an active research area in computer vision due to its applicability in various real-time applications such as video retrieval, human-robot interaction, and visual surveillance. Although there are many surveys of Human Action Recognition, they are limited by various constraints, such as focusing only on methods in a few directions. Unlike the earlier ones, this paper provides a detailed survey organized according to the basic working methodology of a Human Action Recognition system. Initially, a detailed description is given of various standard benchmark datasets. Then, following the methodology, the survey is carried out in two phases: a survey of feature extraction approaches and a survey of action classification approaches. A fine-grained survey is also carried out within each phase based on the individual strategies.


Author(s):  
Pawan Kumar Singh ◽  
Soumalya Kundu ◽  
Titir Adhikary ◽  
Ram Sarkar ◽  
Debotosh Bhattacharjee

2020 ◽  
Vol 2020 ◽  
pp. 1-30
Author(s):  
Deepika Roselind Johnson ◽  
V. Rhymend Uthariaraj

Human action recognition is a trending topic in computer vision and its allied fields. The goal of human action recognition is to identify any human action that takes place in an image or a video dataset. For instance, the actions include walking, running, jumping, throwing, and many more. Existing human action recognition techniques have their own limitations in model accuracy and flexibility. To overcome these limitations, deep learning technologies were implemented. In the deep learning approach, a model learns by itself to improve its recognition accuracy and avoids problems such as exploding gradients, overfitting, and underfitting. In this paper, we propose a novel parameter initialization technique using the Maxout activation function. Firstly, human action is detected and tracked in the video dataset to learn the spatial-temporal features. Secondly, the extracted feature descriptors are trained using the RBM-NN. Thirdly, the local features are encoded into global features using an integrated forward and backward propagation process via RBM-NN. Finally, an SVM classifier recognizes the human actions in the video dataset. The experimental analysis performed on various benchmark datasets showed an improved recognition rate compared to other state-of-the-art learning models.
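The Maxout activation named in this abstract can be sketched in a few lines: each output unit takes the maximum over several affine "pieces" of the input. The sketch below shows only the activation itself, not the authors' parameter initialization technique or their RBM-NN, which the abstract does not detail; the function name `maxout` and the weight layout are assumptions for illustration.

```python
def maxout(x, weights, biases):
    """Compute a maxout layer for a single input vector.

    x: input vector of length d.
    weights: weights[i][j] is the length-d weight vector of piece j of unit i
             (ASSUMPTION: this nesting is chosen here for readability).
    biases: biases[i][j] is the bias of piece j of unit i.
    Returns one value per unit: the maximum over that unit's affine pieces.
    """
    outputs = []
    for unit_weights, unit_biases in zip(weights, biases):
        pieces = [
            sum(w_k * x_k for w_k, x_k in zip(w, x)) + b
            for w, b in zip(unit_weights, unit_biases)
        ]
        outputs.append(max(pieces))
    return outputs

# Hypothetical layer: two units, two pieces each, 2-D input.
x = [1.0, -2.0]
weights = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[-1.0, 0.0], [0.0, -1.0]],
]
biases = [[0.0, 0.0], [0.5, 0.0]]
print(maxout(x, weights, biases))
```

Because the output is a maximum of linear functions, a maxout unit is piecewise linear and convex in its input.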

