Convolutional and Recurrent Neural Network for Human Action Recognition: Application on American Sign Language

2019 ◽  
Author(s):  
Hernandez Vincent ◽  
Suzuki Tomoya ◽  
Venture Gentiane

Abstract: Human Action Recognition (HAR) is an important and difficult topic because of the large variability between repetitions of a task by the same subject and between subjects. This work is motivated by the need for time-series signal classification with robust validation and test procedures. This study proposes to classify 60 American Sign Language signs from data provided by the Leap Motion sensor, using a combined approach of a Convolutional Neural Network (ConvNet) and a Recurrent Neural Network with Long Short-Term Memory cells (LSTM), called ConvNet-LSTM. Moreover, a complete kinematic model of the right and left forearm/hand/fingers/thumb is proposed, along with a simple data augmentation technique to improve the generalization of the neural networks. Results showed an accuracy of 89.3% on a user-independent test set with data augmentation when using the ConvNet-LSTM, while the LSTM alone provided an accuracy of 85.0% on the same test set. Without data augmentation, the results dropped to 85.9% and 81.4%, respectively.
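The abstract mentions a "simple data augmentation technique" for the kinematic time-series but does not specify it here. A common choice for such signals is additive Gaussian jitter; the sketch below is a hypothetical illustration of that idea, not the authors' exact method, and the array shapes are made up for the example.

```python
import numpy as np

def jitter(batch, sigma=0.01, copies=4, seed=0):
    """Augment a batch of time-series samples with additive Gaussian noise.

    batch  : array of shape (n_samples, n_timesteps, n_channels)
    sigma  : noise standard deviation (relative to the signal scale)
    copies : number of noisy copies generated per original sample
    Returns the originals concatenated with the noisy copies.
    """
    rng = np.random.default_rng(seed)
    noisy = [batch + rng.normal(0.0, sigma, batch.shape) for _ in range(copies)]
    return np.concatenate([batch] + noisy, axis=0)

# Example: 10 recordings, 100 time steps, 26 hypothetical joint-angle channels
x = np.zeros((10, 100, 26))
x_aug = jitter(x)      # 10 originals + 4 * 10 noisy copies
print(x_aug.shape)     # (50, 100, 26)
```

Augmenting only the training split (never the user-independent test set) is what lets this kind of technique improve generalization without inflating the reported accuracy.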

Author(s):  
Anantha Prabha P ◽  
Srimathi R ◽  
Srividhya R ◽  
Sowmiya T G

Human Action Recognition has been an active research topic since the early 1980s due to its promising applications in many domains, such as video indexing, surveillance, gesture recognition, video retrieval, and human-computer interaction, where actions are recognized from videos or sensor data. Extracting relevant features from the video streams is the most challenging part. With the emergence of advanced artificial intelligence techniques, deep learning methods are adopted to achieve this goal. The proposed system presents a Recurrent Neural Network (RNN) methodology for Human Action Recognition using the star skeleton as a representative descriptor of human posture. The star skeleton is formed by joining the gross contour extremes of a body to its centroid. To use the star skeleton as a feature for action recognition, it is defined as a five-dimensional vector in star fashion, because the head and four limbs are usually local extremes of the human body. An action is assumed to be composed of a series of star skeletons over time, so time-sequential images expressing a human action are transformed into a feature vector sequence. The feature vector sequence is then transformed into a symbol sequence so that the RNN can model the action. An RNN is used because the extracted features are time-dependent.
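The star-skeleton descriptor described above can be sketched as follows: compute the contour centroid, then take the local maxima of the contour-to-centroid distance as the gross extremes (head and four limbs). This is a minimal NumPy illustration under simplified assumptions; the original method additionally smooths the distance signal to cope with noisy contours.

```python
import numpy as np

def star_skeleton(contour):
    """Return the (dx, dy) vectors from the contour centroid to its
    local distance maxima (the 'star' extremes: head and limbs).

    contour : array of shape (n_points, 2), ordered along the boundary.
    """
    centroid = contour.mean(axis=0)
    d = np.linalg.norm(contour - centroid, axis=1)
    # Local maxima of the centroid distance, treating the contour as circular
    is_peak = (d > np.roll(d, 1)) & (d > np.roll(d, -1))
    return contour[is_peak] - centroid

# Synthetic 5-pointed star: radius modulated by cos(5 * theta)
theta = np.linspace(0, 2 * np.pi, 100, endpoint=False)
r = 1.0 + 0.5 * np.cos(5 * theta)
star = np.stack([r * np.cos(theta), r * np.sin(theta)], axis=1)
print(len(star_skeleton(star)))  # 5 extremes -> the five-dimensional descriptor
```

On a real silhouette the number of detected extremes can vary between frames, which is one reason the feature sequence is quantized into symbols before the RNN models it.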


2021 ◽  
Vol 167 ◽  
pp. 114403
Author(s):  
C.K.M. Lee ◽  
Kam K.H. Ng ◽  
Chun-Hsien Chen ◽  
H.C.W. Lau ◽  
S.Y. Chung ◽  
...  

Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4720
Author(s):  
Yujia Zhang ◽  
Lai-Man Po ◽  
Jingjing Xiong ◽  
Yasar Abbas Ur REHMAN ◽  
Kwok-Wai Cheung

Human action recognition methods for videos based on deep convolutional neural networks usually use random cropping or its variants for data augmentation. However, this traditional data augmentation approach may generate many non-informative samples (video patches covering only a small part of the foreground, or only the background) that are not related to the specific action. These samples can be regarded as noisy samples with incorrect labels, which reduces overall action recognition performance. In this paper, we attempt to mitigate the impact of noisy samples by proposing an Auto-augmented Siamese Neural Network (ASNet). In this framework, we propose backpropagating salient patches and randomly cropped samples in the same iteration, performing gradient compensation to alleviate the adverse gradient effects of non-informative samples. Salient patches are samples containing critical information for human action recognition. The generation of salient patches is formulated as a Markov decision process, and a reinforcement learning agent called SPA (Salient Patch Agent) is introduced to extract patches in a weakly supervised manner without extra labels. Extensive experiments were conducted on two well-known datasets, UCF-101 and HMDB-51, to verify the effectiveness of the proposed SPA and ASNet.
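The paper's SPA is a reinforcement-learning agent and is too involved to reproduce here. As a hypothetical stand-in that conveys the same idea, informative crops can be distinguished from non-informative ones by how much temporal change they contain; the sketch below scores candidate patches by frame-difference energy and keeps the most informative one. This heuristic is only an illustration of "salient vs. non-informative" crops, not the paper's method.

```python
import numpy as np

def most_informative_crop(video, size):
    """Pick the crop position with the highest motion energy.

    video : array of shape (T, H, W), grayscale frames
    size  : side length of the square crop
    Returns (row, col) of the top-left corner of the best crop.
    Note: a simple heuristic proxy for saliency, not the paper's RL agent.
    """
    motion = np.abs(np.diff(video, axis=0)).sum(axis=0)  # (H, W) energy map
    t, h, w = video.shape
    best, best_pos = -1.0, (0, 0)
    for row in range(0, h - size + 1, size):
        for col in range(0, w - size + 1, size):
            energy = motion[row:row + size, col:col + size].sum()
            if energy > best:
                best, best_pos = energy, (row, col)
    return best_pos

# Synthetic clip: a square moving only inside the top-left quadrant
video = np.zeros((8, 32, 32))
for t in range(8):
    video[t, 2:8, t:t + 6] = 1.0
print(most_informative_crop(video, 16))  # (0, 0)
```

A random crop of this clip would land on the static background three times out of four, which is exactly the kind of label-noise sample the ASNet framework tries to compensate for.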


Teknik ◽  
2021 ◽  
Vol 42 (2) ◽  
pp. 137-148
Author(s):  
Vincentius Abdi Gunawan ◽  
Leonardus Sandy Ade Putra

Communication is essential in conveying information from one individual to another. However, not all individuals can communicate verbally. According to the WHO, 466 million people globally have disabling hearing loss, 34 million of whom are children. It is therefore necessary to have a non-verbal language learning method for people with hearing impairments. The purpose of this study is to build a system that can identify non-verbal language in real time so that it can be easily understood. A high success rate requires an appropriate method, such as machine learning supported by wavelet feature extraction combined with different classification methods from image processing. Machine learning was applied because of its ability to recognize gestures and to compare the classification results of four different methods. The four classifiers compared for recognizing hand gestures from American Sign Language are the Multi-Class SVM, the Backpropagation Neural Network, K-Nearest Neighbor (K-NN), and Naïve Bayes. Simulation tests of the four classification methods obtained success rates of 99.3%, 98.28%, 97.7%, and 95.98%, respectively. It can therefore be concluded that the Multi-Class SVM has the highest success rate in recognizing American Sign Language, reaching 99.3%. The whole system was designed and tested using MATLAB as supporting software for data processing.
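The wavelet feature-extraction stage mentioned above can be illustrated with a one-level 2D Haar decomposition. The abstract does not name the wavelet family or implementation the authors used, so this averaging-based Haar variant in NumPy is only a representative example: the approximation band gives a compact feature image, while the detail bands capture the edges of the hand gesture.

```python
import numpy as np

def haar2d(img):
    """One-level 2D Haar decomposition (averaging variant).

    img : array of shape (H, W) with even H and W.
    Returns (LL, LH, HL, HH): the approximation band and the
    horizontal/vertical/diagonal detail bands, each (H//2, W//2).
    """
    # Pairwise average / difference along columns (horizontal pass)
    a = (img[:, ::2] + img[:, 1::2]) / 2.0
    d = (img[:, ::2] - img[:, 1::2]) / 2.0
    # Same along rows (vertical pass) on each intermediate band
    LL = (a[::2] + a[1::2]) / 2.0
    LH = (a[::2] - a[1::2]) / 2.0
    HL = (d[::2] + d[1::2]) / 2.0
    HH = (d[::2] - d[1::2]) / 2.0
    return LL, LH, HL, HH

# A flat image has all of its energy in the approximation band
LL, LH, HL, HH = haar2d(np.full((8, 8), 3.0))
print(LL[0, 0], LH.max(), HL.max(), HH.max())  # 3.0 0.0 0.0 0.0
```

The flattened band coefficients (or statistics computed from them) would then form the feature vector fed to the SVM, K-NN, neural network, and Naïve Bayes classifiers being compared.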

