Improving Real-Time Hand Gesture Recognition with Semantic Segmentation

Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 356
Author(s):  
Gibran Benitez-Garcia ◽  
Lidia Prudente-Tixteco ◽  
Luis Carlos Castro-Madrid ◽  
Rocio Toscano-Medina ◽  
Jesus Olivares-Mercado ◽  
...  

Hand gesture recognition (HGR) plays a central role in human–computer interaction, covering a wide range of applications in the automotive sector, consumer electronics, home automation, and others. In recent years, accurate and efficient deep learning models have been proposed for real-time applications. However, the most accurate approaches tend to employ multiple modalities derived from RGB input frames, such as optical flow, which limits real-time performance due to the heavy extra computational cost. In this paper, we avoid the optical flow computation by proposing a real-time hand gesture recognition method based on RGB frames combined with hand segmentation masks. We employ a lightweight semantic segmentation method (FASSD-Net) to boost the accuracy of two efficient HGR methods: Temporal Segment Networks (TSN) and Temporal Shift Modules (TSM). We demonstrate the efficiency of the proposal on our IPN Hand dataset, which includes thirteen different gestures focused on interaction with touchless screens. The experimental results show that our approach significantly surpasses the accuracy of the original TSN and TSM algorithms while maintaining real-time performance.
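The two ideas in this abstract can be illustrated with a minimal, framework-free sketch: TSN-style sparse sampling of frames, and replacing the optical-flow modality with a segmentation mask stacked as an extra input channel. Function names and data layouts here are illustrative assumptions, not the authors' implementation.

```python
def tsn_sample_indices(num_frames, num_segments):
    """TSN-style sparse sampling: split the clip into equal-length
    segments and take the centre frame of each segment."""
    seg = num_frames / num_segments
    return [int(seg * i + seg / 2) for i in range(num_segments)]

def attach_mask(rgb_frame, mask):
    """Append the hand-segmentation mask as a fourth channel, so the
    HGR backbone consumes RGB + mask instead of RGB + optical flow.
    rgb_frame: list of (r, g, b) pixel tuples; mask: list of values."""
    return [pixel + (m,) for pixel, m in zip(rgb_frame, mask)]
```

For a 30-frame clip and 3 segments, `tsn_sample_indices` picks frames 5, 15, and 25; the sampled frames would then be mask-augmented before entering the network.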

2020 ◽  
Vol 17 (4) ◽  
pp. 497-506
Author(s):  
Sunil Patel ◽  
Ramji Makwana

Automatic classification of dynamic hand gestures is challenging due to the large diversity within each gesture class, low resolution, and the fact that gestures are performed with the fingers. These challenges have drawn many researchers to the area. Recently, deep neural networks have been used for implicit feature extraction, with a softmax layer for classification. In this paper, we propose a method based on a two-dimensional convolutional neural network that performs detection and classification of hand gestures simultaneously from multimodal Red, Green, Blue, Depth (RGBD) and optical-flow data, and passes the resulting features to a Long Short-Term Memory (LSTM) recurrent network for frame-to-frame probability generation, with a Connectionist Temporal Classification (CTC) network for loss calculation. We compute optical flow from the Red, Green, Blue (RGB) data to capture the motion information present in the video. The CTC model efficiently evaluates all possible alignments of a hand gesture via dynamic programming and checks frame-to-frame consistency of visual similarity in the unsegmented input stream. The CTC network finds the most probable frame sequence for a gesture class; the frame with the highest probability value is selected from the CTC network by max decoding. The entire network is trained end-to-end with the CTC loss for gesture recognition. We evaluate on the challenging Vision for Intelligent Vehicles and Applications (VIVA) dataset for dynamic hand gesture recognition, captured with RGB and depth data. On the VIVA dataset, our proposed hand gesture recognition technique outperforms competing state-of-the-art algorithms, achieving an accuracy of 86%.
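The "max decoding" step the abstract describes is standard CTC greedy decoding: take the argmax class per frame, collapse consecutive repeats, and drop the blank symbol. A minimal stdlib sketch (the blank index and the probability layout are assumptions for illustration):

```python
BLANK = 0  # assumed index of the CTC blank symbol

def ctc_greedy_decode(frame_probs):
    """Greedy (max) CTC decoding: per-frame argmax, then collapse
    consecutive repeats and remove blanks.
    frame_probs: list of per-frame class-probability lists."""
    path = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    decoded, prev = [], None
    for label in path:
        if label != prev and label != BLANK:
            decoded.append(label)
        prev = label
    return decoded
```

For the per-frame argmax path `[1, 1, 0, 1]`, the repeats collapse to `1`, the blank `0` is dropped, and the second `1` survives because a blank separated it, yielding `[1, 1]`.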


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Samy Bakheet ◽  
Ayoub Al-Hamadi

Robust vision-based hand pose estimation is highly sought after but remains a challenging task, due to its inherent difficulty, partially caused by self-occlusion among the fingers. In this paper, an innovative framework for real-time static hand gesture recognition is introduced, based on an optimized shape representation built from multiple shape cues. The framework incorporates a specific module for hand pose estimation based on depth map data, where the hand silhouette is first extracted from the extremely detailed and accurate depth map captured by a time-of-flight (ToF) depth sensor. A hybrid multi-modal descriptor that integrates multiple affine-invariant boundary-based and region-based features is created from the hand silhouette to obtain a reliable and representative description of individual gestures. Finally, an ensemble of one-vs.-all support vector machines (SVMs) is independently trained on each of these learned feature representations to perform gesture classification. When evaluated on a publicly available dataset incorporating a relatively large and diverse collection of egocentric hand gestures, the approach yields encouraging results that compare very favorably with those reported in the literature, while maintaining real-time operation.
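The final classification stage described here, an ensemble of one-vs.-all SVMs, reduces at inference time to scoring the shape descriptor with each class's decision function and picking the highest score. A minimal sketch under that assumption (the scorers stand in for trained SVM decision functions):

```python
def ovr_classify(feature, scorers):
    """One-vs.-all ensemble inference: every gesture label owns a
    decision function; return the label with the highest score.
    scorers: dict mapping label -> callable(feature) -> float."""
    return max(scorers, key=lambda label: scorers[label](feature))
```

In practice each scorer would be a trained SVM's signed distance to its separating hyperplane; here simple lambdas illustrate the voting rule.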


2012 ◽  
Vol 6 ◽  
pp. 98-107 ◽  
Author(s):  
Amit Gupta ◽  
Vijay Kumar Sehrawat ◽  
Mamta Khosla

2021 ◽  
Vol 102 ◽  
pp. 04009
Author(s):  
Naoto Ageishi ◽  
Fukuchi Tomohide ◽  
Abderazek Ben Abdallah

Hand gestures are a form of nonverbal communication in which visible bodily actions convey important messages. Recently, hand gesture recognition has received significant attention from the research community for various applications, including advanced driver assistance systems, prosthetics, and robotic control. Accurate and fast classification of hand gestures is therefore required. In this research, we created a deep neural network as the first step toward developing a real-time, camera-only hand gesture recognition system that does not rely on electroencephalogram (EEG) signals. We present the system software architecture in a fair amount of detail. The proposed system was able to recognize hand signs with an accuracy of 97.31%.
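A deep network like the one described typically ends in a softmax layer that turns the per-class logits for a hand sign into probabilities, with the argmax taken as the prediction. A numerically stable stdlib sketch of that final step (the logits are illustrative):

```python
import math

def softmax(logits):
    """Numerically stable softmax: shift by the max logit before
    exponentiating, then normalize to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

The predicted gesture class is then `probs.index(max(probs))`; subtracting the max logit first avoids overflow for large activations without changing the result.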

