Facial Expression Recognition of Nonlinear Facial Variations Using Deep Locality De-Expression Residue Learning in the Wild

Asad Ullah; Jing Wang; M. Shahid Anwar; Usman Ahmad; Uzair Saeed; Zesong Fei

doi:10.3390/electronics8121487

Facial Expression Recognition of Nonlinear Facial Variations Using Deep Locality De-Expression Residue Learning in the Wild

Electronics ◽

10.3390/electronics8121487 ◽

2019 ◽

Vol 8 (12) ◽

pp. 1487 ◽

Cited By ~ 1

Author(s):

Asad Ullah ◽

Jing Wang ◽

M. Shahid Anwar ◽

Usman Ahmad ◽

Uzair Saeed ◽

...

Keyword(s):

Facial Expression ◽

Real World ◽

Facial Expression Recognition ◽

Eye Gaze ◽

Input Image ◽

Training Data ◽

Head Pose Estimation ◽

Expression Recognition ◽

Action Unit ◽

In The Wild

Automatic facial expression recognition is an emerging field. Moreover, the interest has been increased with the transition from laboratory-controlled conditions to in the wild scenarios. Most of the research has been done over nonoccluded faces under the constrained environment, while automatic facial expression is less understood/implemented for partial occlusion in the real world conditions. Apart from that, our research aims to tackle the issues of overfitting (caused by the shortage of adequate training data) and to alleviate the expression-unrelated/intraclass/nonlinear facial variations, such as head pose estimation, eye gaze estimation, intensity and microexpressions. In our research, we control the magnitude of each Action Unit (AU) and combine several of the Action Unit combinations to leverage learning from the generative and discriminative representations for automatic FER. We have also addressed the problem of diversification of expressions from lab controlled to real-world scenarios from our cross-database study and proposed a model for enhancement of the discriminative power of deep features while increasing the interclass scatters, by preserving the locality closeness. Furthermore, facial expression consists of an expressive component as well as neutral component, so we proposed a generative model which is capable of generating neutral expression from an input image using cGAN. The expressive component is filtered and passed to the intermediate layers and the process is called De-expression Residue Learning. The residue in the intermediate/middle layers is very important for learning through expressive components. Finally, we validate the effectiveness of our method (DLP-DeRL) through qualitative and quantitative experimental results using four databases. Our method is more accurate and robust, and outperforms all the existing methods (hand crafted features and deep learning) while dealing the images in the wild.

Download Full-text

Facial Expression Recognition Based on Weighted-Cluster Loss and Deep Transfer Learning Using a Highly Imbalanced Dataset

Sensors ◽

10.3390/s20092639 ◽

2020 ◽

Vol 20 (9) ◽

pp. 2639

Author(s):

Quan T. Ngo ◽

Seokhoon Yoon

Keyword(s):

Facial Expression ◽

Transfer Learning ◽

Loss Function ◽

Real World ◽

Facial Expression Recognition ◽

Training Data ◽

Fine Tuning ◽

Expression Recognition ◽

Recent Success ◽

Deep Cnn

Facial expression recognition (FER) is a challenging problem in the fields of pattern recognition and computer vision. The recent success of convolutional neural networks (CNNs) in object detection and object segmentation tasks has shown promise in building an automatic deep CNN-based FER model. However, in real-world scenarios, performance degrades dramatically owing to the great diversity of factors unrelated to facial expressions, and due to a lack of training data and an intrinsic imbalance in the existing facial emotion datasets. To tackle these problems, this paper not only applies deep transfer learning techniques, but also proposes a novel loss function called weighted-cluster loss, which is used during the fine-tuning phase. Specifically, the weighted-cluster loss function simultaneously improves the intra-class compactness and the inter-class separability by learning a class center for each emotion class. It also takes the imbalance in a facial expression dataset into account by giving each emotion class a weight based on its proportion of the total number of images. In addition, a recent, successful deep CNN architecture, pre-trained in the task of face identification with the VGGFace2 database from the Visual Geometry Group at Oxford University, is employed and fine-tuned using the proposed loss function to recognize eight basic facial emotions from the AffectNet database of facial expression, valence, and arousal computing in the wild. Experiments on an AffectNet real-world facial dataset demonstrate that our method outperforms the baseline CNN models that use either weighted-softmax loss or center loss.

Download Full-text

A Unified Framework of Deep Learning-Based Facial Expression Recognition System for Diversified Applications

Applied Sciences ◽

10.3390/app11199174 ◽

2021 ◽

Vol 11 (19) ◽

pp. 9174

Author(s):

Sanoar Hossain ◽

Saiyed Umer ◽

Vijayan Asari ◽

Ranjeet Kumar Rout

Keyword(s):

Facial Expression ◽

Facial Expression Recognition ◽

Data Augmentation ◽

Recognition System ◽

Input Image ◽

Fine Tuning ◽

Expression Recognition ◽

Unified Framework ◽

Face Region ◽

In The Wild

This work proposes a facial expression recognition system for a diversified field of applications. The purpose of the proposed system is to predict the type of expressions in a human face region. The implementation of the proposed method is fragmented into three components. In the first component, from the given input image, a tree-structured part model has been applied that predicts some landmark points on the input image to detect facial regions. The detected face region was normalized to its fixed size and then down-sampled to its varying sizes such that the advantages, due to the effect of multi-resolution images, can be introduced. Then, some convolutional neural network (CNN) architectures were proposed in the second component to analyze the texture patterns in the facial regions. To enhance the proposed CNN model’s performance, some advanced techniques, such data augmentation, progressive image resizing, transfer-learning, and fine-tuning of the parameters, were employed in the third component to extract more distinctive and discriminant features for the proposed facial expression recognition system. The performance of the proposed system, due to different CNN models, is fused to achieve better performance than the existing state-of-the-art methods and for this reason, extensive experimentation has been carried out using the Karolinska-directed emotional faces (KDEF), GENKI-4k, Cohn-Kanade (CK+), and Static Facial Expressions in the Wild (SFEW) benchmark databases. The performance has been compared with some existing methods concerning these databases, which shows that the proposed facial expression recognition system outperforms other competing methods.

Download Full-text

A Compact Deep Learning Model for Robust Facial Expression Recognition

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f8724.088619 ◽

2019 ◽

Vol 8 (6) ◽

pp. 2956-2960

Keyword(s):

Facial Expression ◽

Facial Expressions ◽

Facial Expression Recognition ◽

Input Image ◽

Image Resolution ◽

Training Data ◽

Expression Recognition ◽

Human Beings ◽

The Public ◽

Low Intensity

In this paper we are proposing a compact CNN model for facial expression recognition. Expression recognition on the low quality images are much more challenging and interesting due to the presence of low-intensity expressions. These low intensity expressions are difficult to distinguish with insufficient image resolution. Data collection for FER is expensive and time-consuming. Researches indicates the fact that downloaded images from the Internet is very useful to model and train expression recognition problem. We use extra datasets to improve the training of facial expression recognition, each representing specific data source. Moreover, to prevent subjective annotation, each dataset is labeled with different approaches to ensure annotation qualities. Recognizing the precise and exact expression from a variety of expressions of different people is a huge problem. To solve this problem, we proposed an Emotion Detection Model to extract emotions from the given input image. This work mainly focuses on the psychological approach of color circle-emotion relation[1] to find the accurate emotion from the input image. Initially the whole image is preprocessed and pixel by pixel data is studied. And the combinations of the circles based on combined data will result in a new color. This resulted color will be directly correlated to a particular emotion. Based on the psychological aspects the output will be of reasonable accuracy. The major application of our work is to predict a person’s emotion based on his face images or video frames This can even be applied for evaluating the public opinion relating to a particular movie, form the video reaction posts on social Medias. One of the diverse applications of our system is to understand the students learning from their emotions. Human beings shows their emotional states and intentions through facial expressions.. Facial expressions are powerful and natural methods that emphasize the emotional status of humans .The approach used in this work successfully exploits temporal information and it improves the accuracies on the public benchmarking databases. The basic facial expressions are happiness, fear, anger, disgust sadness, and surprise[2]. Contempt was subsequently added as one of the basic emotions. Having sufficient well labeled training data with variations of the populations and environments is important for the design of a deep expression recognition system .Behaviors, poses, facial expressions, actions and speech are considered as channels, which convey human emotions. Lot of research works are going on in this field to explore the correlation between the above mentioned channels and emotions. This paper highlights on the development of a system which automatically recognizes the

Download Full-text

Facial Expression Recognition in the Wild

2018 ACM Multimedia Conference on Multimedia Conference - MM '18 ◽

10.1145/3240508.3240574 ◽

2018 ◽

Cited By ~ 3

Author(s):

Feifei Zhang ◽

Tianzhu Zhang ◽

Qirong Mao ◽

Lingyu Duan ◽

Changsheng Xu

Keyword(s):

Facial Expression ◽

Facial Expression Recognition ◽

Expression Recognition ◽

In The Wild

Download Full-text

Hybrid Attention Cascade Network for Facial Expression Recognition

Sensors ◽

10.3390/s21062003 ◽

2021 ◽

Vol 21 (6) ◽

pp. 2003 ◽

Cited By ~ 1

Author(s):

Xiaoliang Zhu ◽

Shihao Ye ◽

Liang Zhao ◽

Zhicheng Dai

Keyword(s):

Facial Expression ◽

Facial Expressions ◽

Facial Expression Recognition ◽

Expression Recognition ◽

Spatial Features ◽

Face Images ◽

Temporal Features ◽

The Face ◽

In The Wild ◽

Fusion Features

As a sub-challenge of EmotiW (the Emotion Recognition in the Wild challenge), how to improve performance on the AFEW (Acted Facial Expressions in the wild) dataset is a popular benchmark for emotion recognition tasks with various constraints, including uneven illumination, head deflection, and facial posture. In this paper, we propose a convenient facial expression recognition cascade network comprising spatial feature extraction, hybrid attention, and temporal feature extraction. First, in a video sequence, faces in each frame are detected, and the corresponding face ROI (range of interest) is extracted to obtain the face images. Then, the face images in each frame are aligned based on the position information of the facial feature points in the images. Second, the aligned face images are input to the residual neural network to extract the spatial features of facial expressions corresponding to the face images. The spatial features are input to the hybrid attention module to obtain the fusion features of facial expressions. Finally, the fusion features are input in the gate control loop unit to extract the temporal features of facial expressions. The temporal features are input to the fully connected layer to classify and recognize facial expressions. Experiments using the CK+ (the extended Cohn Kanade), Oulu-CASIA (Institute of Automation, Chinese Academy of Sciences) and AFEW datasets obtained recognition accuracy rates of 98.46%, 87.31%, and 53.44%, respectively. This demonstrated that the proposed method achieves not only competitive performance comparable to state-of-the-art methods but also greater than 2% performance improvement on the AFEW dataset, proving the significant outperformance of facial expression recognition in the natural environment.

Download Full-text

Facial Expression Recognition using Residual Convnet with Image Augmentations

Jurnal Ilmu Komputer dan Informasi ◽

10.21609/jiki.v14i2.968 ◽

2021 ◽

Vol 14 (2) ◽

pp. 127-135

Author(s):

Fadhil Yusuf Rahadika ◽

Novanto Yudistira ◽

Yuita Arum Sari

Keyword(s):

Neural Networks ◽

Facial Expression ◽

Test Data ◽

Facial Expression Recognition ◽

Optimization Method ◽

Training Data ◽

Batch Size ◽

Online Video ◽

Expression Recognition ◽

Validation Data

During the COVID-19 pandemic, many offline activities are turned into online activities via video meetings to prevent the spread of the COVID 19 virus. In the online video meeting, some micro-interactions are missing when compared to direct social interactions. The use of machines to assist facial expression recognition in online video meetings is expected to increase understanding of the interactions among users. Many studies have shown that CNN-based neural networks are quite effective and accurate in image classification. In this study, some open facial expression datasets were used to train CNN-based neural networks with a total number of training data of 342,497 images. This study gets the best results using ResNet-50 architecture with Mish activation function and Accuracy Booster Plus block. This architecture is trained using the Ranger and Gradient Centralization optimization method for 60000 steps with a batch size of 256. The best results from the training result in accuracy of AffectNet validation data of 0.5972, FERPlus validation data of 0.8636, FERPlus test data of 0.8488, and RAF-DB test data of 0.8879. From this study, the proposed method outperformed plain ResNet in all test scenarios without transfer learning, and there is a potential for better performance with the pre-training model. The code is available at https://github.com/yusufrahadika-facial-expressions-essay.

Download Full-text

FERNet: A Deep CNN Architecture for Facial Expression Recognition in the Wild

Journal of The Institution of Engineers (India) Series B ◽

10.1007/s40031-021-00681-8 ◽

2021 ◽

Author(s):

Jyostna Devi Bodapati ◽

U. Srilakshmi ◽

N. Veeranjaneyulu

Keyword(s):

Facial Expression ◽

Facial Expression Recognition ◽

Expression Recognition ◽

In The Wild ◽

Deep Cnn

Download Full-text

Toward Unbiased Facial Expression Recognition in the Wild via Cross-Dataset Adaptation

IEEE Access ◽

10.1109/access.2020.3018738 ◽

2020 ◽

Vol 8 ◽

pp. 159172-159181

Author(s):

Byungok Han ◽

Woo-Han Yun ◽

Jang-Hee Yoo ◽

Won Hwa Kim

Keyword(s):

Facial Expression ◽

Facial Expression Recognition ◽

Expression Recognition ◽

In The Wild

Download Full-text

Pyramid With Super Resolution for In-the-Wild Facial Expression Recognition

IEEE Access ◽

10.1109/access.2020.3010018 ◽

2020 ◽

Vol 8 ◽

pp. 131988-132001 ◽

Cited By ~ 1

Author(s):

Thanh-Hung Vo ◽

Guee-Sang Lee ◽

Hyung-Jeong Yang ◽

Soo-Hyung Kim

Keyword(s):

Facial Expression ◽

Facial Expression Recognition ◽

Super Resolution ◽

Expression Recognition ◽

In The Wild

Download Full-text

eXnet: An Efficient Approach for Emotion Recognition in the Wild

Sensors ◽

10.3390/s20041087 ◽

2020 ◽

Vol 20 (4) ◽

pp. 1087

Author(s):

Muhammad Naveed Riaz ◽

Yao Shen ◽

Muhammad Sohail ◽

Minyi Guo

Keyword(s):

Facial Expression ◽

Emotion Recognition ◽

Facial Expression Recognition ◽

Data Augmentation ◽

Raspberry Pi ◽

Expression Recognition ◽

Extensive Evaluation ◽

Benchmark Datasets ◽

Augmentation Techniques ◽

In The Wild

Facial expression recognition has been well studied for its great importance in the areas of human–computer interaction and social sciences. With the evolution of deep learning, there have been significant advances in this area that also surpass human-level accuracy. Although these methods have achieved good accuracy, they are still suffering from two constraints (high computational power and memory), which are incredibly critical for small hardware-constrained devices. To alleviate this issue, we propose a new Convolutional Neural Network (CNN) architecture eXnet (Expression Net) based on parallel feature extraction which surpasses current methods in accuracy and contains a much smaller number of parameters (eXnet: 4.57 million, VGG19: 14.72 million), making it more efficient and lightweight for real-time systems. Several modern data augmentation techniques are applied for generalization of eXnet; these techniques improve the accuracy of the network by overcoming the problem of overfitting while containing the same size. We provide an extensive evaluation of our network against key methods on Facial Expression Recognition 2013 (FER-2013), Extended Cohn-Kanade Dataset (CK+), and Real-world Affective Faces Database (RAF-DB) benchmark datasets. We also perform ablation evaluation to show the importance of different components of our architecture. To evaluate the efficiency of eXnet on embedded systems, we deploy it on Raspberry Pi 4B. All these evaluations show the superiority of eXnet for emotion recognition in the wild in terms of accuracy, the number of parameters, and size on disk.

Download Full-text