ARRNET: Action recognition through recurrent neural networks

Author(s):  
Kumaresh Krishnan ◽  
Nikita Prabhu ◽  
R. Venkatesh Babu
2017 ◽  
Vol 8 (4) ◽  
pp. 327-345
Author(s):  
Aleksandr Buyko ◽  
Andrey Vinogradov ◽  

2021 ◽  
Author(s):  
Fakhrul Aniq Hakimi Nasrul ’Alam ◽  
Mohd. Ibrahim Shapiai ◽  
Uzma Batool ◽  
Ahmad Kamal Ramli ◽  
Khairil Ashraf Elias

Recognition of human behavior is critical in video monitoring, human-computer interaction, video comprehension, and virtual reality. The key problem with behavior recognition in video surveillance is the high degree of variation between and within subjects. Numerous studies have suggested background-insensitive, skeleton-based approaches as a proven detection technique. The present state-of-the-art approaches to skeleton-based action recognition rely primarily on Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs). Both methods take the dynamic human skeleton as the input to the network. We chose to handle skeleton data differently, relying solely on the skeleton joint coordinates as the input. The skeleton joints’ positions are defined in (x, y) coordinates. In this paper, we investigate the incorporation of the Neural Oblivious Decision Ensemble (NODE) into our proposed action classifier network. The skeleton is extracted using a pose estimation technique based on the Residual Network (ResNet), which extracts the 2D skeleton of 18 joints for each detected body. The joint coordinates of the skeleton are stored in a table of rows and columns, where each row represents the positions of the joints. The structured data are fed into NODE for label prediction. With the proposed network, we obtain 97.5% accuracy on the RealWorld (HAR) dataset. Experimental results show that the proposed network outperforms one of the state-of-the-art approaches by 1.3%. In conclusion, NODE is a promising deep learning technique for structured data analysis compared with its machine learning counterparts, such as the GBDT packages CatBoost and XGBoost.
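The tabular layout described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes one row per detected skeleton, with the 18 (x, y) joint pairs flattened into 36 columns. The names `skeleton_to_row`, `column_names`, and `skeletons_to_table`, and the `joint{j}_{axis}` column naming, are hypothetical. Pose estimation and the NODE classifier are not shown.

```python
NUM_JOINTS = 18  # the abstract states 18 joints per detected body


def skeleton_to_row(joints):
    """Flatten one skeleton, given as 18 (x, y) pairs, into 36 floats.

    Assumed column order: x0, y0, x1, y1, ..., x17, y17.
    """
    joints = list(joints)
    if len(joints) != NUM_JOINTS:
        raise ValueError(f"expected {NUM_JOINTS} joints, got {len(joints)}")
    row = []
    for x, y in joints:
        row.extend((float(x), float(y)))
    return row


def column_names():
    """Hypothetical column labels that match the order used above."""
    return [f"joint{j}_{axis}" for j in range(NUM_JOINTS) for axis in ("x", "y")]


def skeletons_to_table(skeletons):
    """Stack several skeletons into a table with one row per skeleton."""
    return [skeleton_to_row(s) for s in skeletons]


if __name__ == "__main__":
    # Two made-up skeletons, used only to show the resulting table shape.
    demo = [[(j, j + 0.5) for j in range(NUM_JOINTS)] for _ in range(2)]
    table = skeletons_to_table(demo)
    print(len(table), "rows x", len(table[0]), "columns")
```

A table in this form could be passed to any classifier for structured data, together with one action label per row.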


Author(s):  
Yu Pan ◽  
Jing Xu ◽  
Maolin Wang ◽  
Jinmian Ye ◽  
Fei Wang ◽  
...  

Recurrent Neural Networks (RNNs) and their variants, such as Long Short-Term Memory (LSTM) networks and Gated Recurrent Unit (GRU) networks, have achieved promising performance in sequential data modeling. The hidden layers in RNNs can be regarded as memory units, which help store information in sequential contexts. However, when dealing with high-dimensional input data, such as video and text, the input-to-hidden linear transformation in RNNs incurs high memory usage and a huge computational cost. This makes training RNNs very difficult. To address this challenge, we propose a novel compact LSTM model, named TR-LSTM, which uses the low-rank tensor ring decomposition (TRD) to reformulate the input-to-hidden transformation. Compared with other tensor decomposition methods, TR-LSTM is more stable. In addition, TR-LSTM can be trained end to end and provides a fundamental building block for RNNs handling large input data. Experiments on real-world action recognition datasets demonstrate the promising performance of the proposed TR-LSTM compared with the tensor-train LSTM and other state-of-the-art competitors.
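To see why a tensor ring factorization reduces the cost of the input-to-hidden transformation, it helps to compare parameter counts. The sketch below assumes the weight matrix is reshaped into a tensor whose modes are the factors of the input and output sizes, with one core of shape (rank, mode, rank) per mode and a uniform rank. The mode sizes and the rank are hypothetical values chosen for illustration, not figures from the paper, and the count covers a single input-to-hidden matrix rather than all LSTM gates.

```python
from math import prod


def dense_params(in_modes, out_modes):
    """Parameters in an unfactorized input-to-hidden weight matrix."""
    return prod(in_modes) * prod(out_modes)


def tensor_ring_params(in_modes, out_modes, rank):
    """Parameters in a tensor ring format with one core per mode.

    Each core is assumed to have shape (rank, mode, rank), so it holds
    rank * mode * rank values; the ring closes because the first and
    last cores share the same rank.
    """
    return sum(rank * mode * rank for mode in list(in_modes) + list(out_modes))


if __name__ == "__main__":
    in_modes = (8, 20, 20, 18)  # hypothetical: 57,600 input features
    out_modes = (4, 4, 4, 4)    # hypothetical: 256 hidden units
    rank = 5                    # hypothetical uniform ring rank

    dense = dense_params(in_modes, out_modes)
    ring = tensor_ring_params(in_modes, out_modes, rank)
    print(f"dense: {dense:,} parameters")
    print(f"tensor ring: {ring:,} parameters")
    print(f"compression ratio: {dense / ring:,.0f}x")
```

The dense count grows with the product of the mode sizes, while the tensor ring count grows with their sum scaled by the squared rank, which is where the memory saving comes from.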

