A Multimodal Facial Emotion Recognition Framework through the Fusion of Speech with Visible and Infrared Images

2020 ◽  
Vol 4 (3) ◽  
pp. 46
Author(s):  
Mohammad Faridul Haque Siddiqui ◽  
Ahmad Y. Javaid

The exigency of emotion recognition is pushing the envelope for meticulous strategies of discerning actual emotions through the use of superior multimodal techniques. This work presents a multimodal automatic emotion recognition (AER) framework capable of differentiating between expressed emotions with high accuracy. The contribution involves implementing an ensemble-based approach for AER through the fusion of visible images and infrared (IR) images with speech. The framework is implemented in two layers, where the first layer detects emotions using single modalities while the second layer combines the modalities and classifies emotions. Convolutional Neural Networks (CNN) have been used for feature extraction and classification. A hybrid fusion approach comprising early (feature-level) and late (decision-level) fusion was applied to combine the features and the decisions at different stages. The output of the CNN trained with voice samples of the RAVDESS database was combined with the image classifier's output using decision-level fusion to obtain the final decision. An accuracy of 86.36% and similar recall (0.86), precision (0.88), and F-measure (0.87) scores were obtained. A comparison with contemporary work confirmed the competitiveness of the framework, which is distinctive in attaining this accuracy in wild backgrounds and under light-invariant conditions.
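The decision-level step described above can be sketched minimally as a weighted average of the per-class probabilities emitted by the two classifiers. This is an illustrative assumption, not the paper's exact rule: the function name, the equal default weights, and the four-class example probabilities below are all hypothetical.

```python
import numpy as np

def decision_level_fusion(prob_speech, prob_image, w_speech=0.5, w_image=0.5):
    """Fuse two classifiers' softmax outputs by weighted averaging,
    then return the index of the winning class."""
    fused = w_speech * np.asarray(prob_speech) + w_image * np.asarray(prob_image)
    return int(np.argmax(fused))

# Hypothetical softmax outputs over 4 emotion classes
p_speech = [0.10, 0.60, 0.20, 0.10]  # speech CNN favors class 1
p_image  = [0.05, 0.40, 0.45, 0.10]  # image CNN slightly favors class 2
print(decision_level_fusion(p_speech, p_image))  # -> 1 (class 1 wins after fusion)
```

In practice the weights could be tuned on a validation split to reflect each modality's reliability.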

Author(s):  
Priti Shivaji Sanjekar ◽  
Jayantrao B. Patil

Multimodal biometrics extends unimodal biometrics by integrating the information obtained from multiple biometric sources at various fusion levels, i.e., sensor level, feature extraction level, match score level, or decision level. In this article, fingerprint, palmprint, and iris are used for verification of an individual. The wavelet transformation is used to extract features from fingerprint, palmprint, and iris. Further, PCA is used for dimensionality reduction. The fusion of traits is employed at three levels: feature level; feature level combined with match score level; and feature level combined with decision level. The main objective of this research is to observe the effect of combined fusion levels on verification of an individual. The performance of the three cases of fusion is measured in terms of EER and represented with ROC curves. The experiments performed on 100 different subjects from publicly available databases demonstrate that combining feature level with match score level and feature level with decision level fusion both outperform fusion at only the feature level.
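The pipeline above can be illustrated with a minimal sketch: a one-level Haar wavelet transform standing in for the feature extraction, a distance-based match score, and a weighted sum for score-level fusion. The helper names, the Haar choice, and the weights are assumptions for illustration; the paper's actual wavelet family and fusion weights are not specified here.

```python
import numpy as np

def haar_dwt_1d(x):
    """One-level 1-D Haar wavelet transform: returns approximation
    and detail coefficients (a stand-in for the feature extractor)."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)
    return approx, detail

def match_score(feat_a, feat_b):
    """Similarity score from Euclidean distance (higher = better match)."""
    return 1.0 / (1.0 + np.linalg.norm(np.asarray(feat_a) - np.asarray(feat_b)))

def fuse_scores(scores, weights):
    """Match-score-level fusion: weighted sum across modalities
    (e.g., fingerprint, palmprint, iris)."""
    return float(np.dot(scores, weights))
```

Feature-level fusion would concatenate the per-modality feature vectors before matching, whereas the score-level step above combines one score per modality after matching.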


2020 ◽  
Vol 40 (2) ◽  
pp. 149-157 ◽  
Author(s):  
Değer Ayata ◽  
Yusuf Yaslan ◽  
Mustafa E. Kamasak

Abstract
Purpose: The purpose of this paper is to propose a novel emotion recognition algorithm from multimodal physiological signals for emotion-aware healthcare systems. In this work, physiological signals are collected from respiratory belt (RB), photoplethysmography (PPG), and fingertip temperature (FTT) sensors. These signals are used because their collection has become easy with advances in ergonomic wearable technologies.
Methods: Arousal and valence levels are recognized from the fused physiological signals using the relationship between physiological signals and emotions. This recognition is performed using various machine learning methods such as random forest, support vector machine, and logistic regression, and the performance of these methods is studied.
Results: Using decision-level fusion, the accuracy improved from 69.86% to 73.08% for arousal, and from 69.53% to 72.18% for valence. Results indicate that using multiple sources of physiological signals and their fusion increases the accuracy rate of emotion recognition.
Conclusion: This study demonstrated a framework for emotion recognition using multimodal physiological signals from respiratory belt, photoplethysmography, and fingertip temperature sensors. It is shown that decision-level fusion from multiple classifiers (one per signal source) improved the accuracy rate of emotion recognition for both arousal and valence dimensions.
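With one classifier per signal source, a common decision-level fusion rule is a simple majority vote over the per-signal predictions. The sketch below assumes binary high/low labels and three classifiers (RB, PPG, FTT); the voting rule is a plausible illustration, not necessarily the exact combiner used in the paper.

```python
from collections import Counter

def majority_vote(decisions):
    """Decision-level fusion: return the label predicted by the most
    per-signal classifiers (ties resolve to the earliest-seen label)."""
    return Counter(decisions).most_common(1)[0][0]

# Hypothetical binary arousal decisions (1 = high) from RB, PPG, FTT classifiers
print(majority_vote([1, 0, 1]))  # -> 1
```

An odd number of classifiers, as here, avoids ties for binary labels.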

