Real-Time Continuous Action Recognition Using Pose Contexts With Depth Sensors

IEEE Access ◽  
2018 ◽  
Vol 6 ◽  
pp. 51708-51720 ◽  
Author(s):  
Hejun Wu ◽  
Zhenye Huang ◽  
Biao Hu ◽  
Zhi Yu ◽  
Xiying Li ◽  
...  
2021 ◽  
Vol 11 (11) ◽  
pp. 4940
Author(s):  
Jinsoo Kim ◽  
Jeongho Cho

Research on video data faces the difficulty of extracting not only spatial but also temporal features, and human action recognition (HAR) is a representative field that applies convolutional neural networks (CNNs) to video data. Action recognition performance has improved, but owing to model complexity, some limitations on real-time operation persist. Therefore, a lightweight CNN-based single-stream HAR model that can operate in real time is proposed. The proposed model extracts spatial feature maps by applying a CNN to the images that compose the video and uses the frame change rate of sequential images as temporal information. The spatial feature maps are weighted-averaged by frame change, transformed into spatiotemporal features, and input to multilayer perceptrons, which have lower complexity than other HAR models; thus, the method is well suited to a single embedded system connected to CCTV. Evaluation of action recognition accuracy and data processing speed on the challenging UCF-101 action recognition benchmark showed higher accuracy than a HAR model using long short-term memory when only a small number of video frames is available, and the fast data processing speed confirmed the feasibility of real-time operation. In addition, the proposed weighted-mean-based HAR model was verified on a Jetson Nano to confirm its applicability to low-cost GPU-based embedded systems.
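The weighting step described in this abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation: the function names are invented, and using the mean absolute pixel difference between consecutive frames as the "frame change rate" is an assumption.

```python
import numpy as np

def frame_change_weights(frames):
    # frames: list of (H, W) grayscale frames.
    # Assumed frame-change measure: mean absolute difference
    # between consecutive frames, normalized to sum to 1.
    diffs = [np.abs(frames[i] - frames[i - 1]).mean() for i in range(1, len(frames))]
    diffs = np.array([diffs[0]] + diffs)  # reuse first diff for frame 0
    return diffs / diffs.sum()

def weighted_mean_feature(feature_maps, weights):
    # feature_maps: (T, D) per-frame CNN feature vectors; weights: (T,).
    # Weighted average collapses the time axis into one
    # spatiotemporal feature vector for the MLP classifier.
    return (feature_maps * weights[:, None]).sum(axis=0)
```

The appeal of this scheme is that the only temporal computation is a per-frame scalar, so the time axis adds almost no cost on an embedded GPU.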


2021 ◽  
Vol 20 (3) ◽  
pp. 1-22
Author(s):  
David Langerman ◽  
Alan George

High-resolution, low-latency applications in computer vision are ubiquitous in today’s world of mixed-reality devices. These innovations provide a platform that can leverage the improving technology of depth sensors and embedded accelerators to enable higher-resolution, lower-latency processing of 3D scenes using depth-upsampling algorithms. This research demonstrates that filter-based upsampling algorithms are feasible for mixed-reality applications on low-power hardware accelerators. The authors parallelized and evaluated a depth-upsampling algorithm on two different devices: a reconfigurable-logic FPGA embedded within a low-power SoC, and a fixed-logic embedded graphics processing unit. Both accelerators are shown to meet the real-time requirement of 11 ms latency for mixed-reality applications.
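A common filter-based depth-upsampling approach is joint bilateral upsampling, where a high-resolution guide image steers the interpolation of a low-resolution depth map. The sketch below is a generic illustration of that family of filters, not the specific algorithm or accelerator kernel evaluated in this paper; all parameter names (`sigma_s`, `sigma_r`, `radius`) are assumptions.

```python
import numpy as np

def joint_bilateral_upsample(depth_lo, guide_hi, scale, sigma_s=1.0, sigma_r=0.1, radius=1):
    # depth_lo: (h, w) low-res depth; guide_hi: (H, W) high-res intensity guide.
    # Each high-res pixel averages nearby low-res depth samples, weighted by
    # spatial distance (sigma_s) and guide-intensity similarity (sigma_r).
    H, W = guide_hi.shape
    out = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            yl, xl = y / scale, x / scale  # position in low-res coordinates
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    py, px = int(round(yl)) + dy, int(round(xl)) + dx
                    if 0 <= py < depth_lo.shape[0] and 0 <= px < depth_lo.shape[1]:
                        gy, gx = min(py * scale, H - 1), min(px * scale, W - 1)
                        ws = np.exp(-((py - yl) ** 2 + (px - xl) ** 2) / (2 * sigma_s ** 2))
                        wr = np.exp(-((guide_hi[y, x] - guide_hi[gy, gx]) ** 2) / (2 * sigma_r ** 2))
                        num += ws * wr * depth_lo[py, px]
                        den += ws * wr
            out[y, x] = num / den
    return out
```

The per-pixel independence of this filter is what makes it attractive for FPGA and GPU parallelization: every output pixel can be computed concurrently.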


Author(s):  
Mingqin Liu ◽  
Xiaoguang Zhang ◽  
Guiyun Xu

Recognizing a continuous image sequence is more difficult than recognizing a single image, because both the classification of continuous image sequences and the recognition of action boundaries must be very accurate. Hence, a sequence-alignment-based method for action segmentation and classification is proposed, which reconstructs a template sequence by estimating the mean action of a class category and computes the distance between a single image and the template sequence by sparse coding within Dynamic Time Warping. The proposed method is compared with the methods of Kulkarni et al. [Continuous action recognition based on sequence alignment, Int. J. Comput. Vis., pp. 1–26] and Hoai et al. [Joint segmentation and classification of human actions in video, IEEE Conf. Computer Vision and Pattern Recognition, 2008, pp. 108–119] on the accuracy of both continuous and isolated recognition, and it clearly outperforms the other methods. When applied to continuous gesture classification, it not only recognizes gesture categories more quickly and accurately, but also solves continuous action recognition in video more realistically than other existing methods.
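The core matching step, comparing an observed sequence against a per-class mean template with Dynamic Time Warping, can be sketched as follows. This is the standard DTW recursion with a Euclidean local cost as a stand-in; the paper's actual local distance uses sparse coding, and the function names here are illustrative.

```python
import numpy as np

def dtw_distance(seq, template):
    # seq: (n, d) observed features; template: (m, d) class mean template.
    # D[i, j] is the minimal cumulative cost of aligning seq[:i] with template[:j].
    n, m = len(seq), len(template)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(seq[i - 1] - template[j - 1])  # local cost
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def classify(seq, templates):
    # templates: dict mapping class label -> mean template sequence.
    # Predict the class whose template aligns with the lowest DTW cost.
    return min(templates, key=lambda c: dtw_distance(seq, templates[c]))
```

Because DTW tolerates non-linear time stretching, a single mean template per class can match executions of the same gesture performed at different speeds.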

