A Multimodal Facial Emotion Recognition Framework through the Fusion of Speech with Visible and Infrared Images

2020 ◽  
Vol 4 (3) ◽  
pp. 46
Author(s):  
Mohammad Faridul Haque Siddiqui ◽  
Ahmad Y. Javaid

The exigency of emotion recognition is pushing the envelope for meticulous strategies of discerning actual emotions through the use of superior multimodal techniques. This work presents a multimodal automatic emotion recognition (AER) framework capable of differentiating between expressed emotions with high accuracy. The contribution involves implementing an ensemble-based approach for AER through the fusion of visible images and infrared (IR) images with speech. The framework is implemented in two layers, where the first layer detects emotions using single modalities while the second layer combines the modalities and classifies emotions. Convolutional Neural Networks (CNN) have been used for feature extraction and classification. A hybrid fusion approach comprising early (feature-level) and late (decision-level) fusion was applied to combine the features and the decisions at different stages. The output of the CNN trained with voice samples of the RAVDESS database was combined with the image classifier's output using decision-level fusion to obtain the final decision. An accuracy of 86.36% and similar recall (0.86), precision (0.88), and F-measure (0.87) scores were obtained. A comparison with contemporary work confirmed the competitiveness of the framework, which is distinctive in attaining this accuracy in wild backgrounds and under light-invariant conditions.
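The decision-level step described above can be sketched minimally as a weighted average of the per-class probabilities emitted by the two classifiers. This is an illustrative assumption, not the paper's exact rule: the function name, the equal default weights, and the four-class example probabilities below are all hypothetical.

```python
import numpy as np

def decision_level_fusion(prob_speech, prob_image, w_speech=0.5, w_image=0.5):
    """Fuse two classifiers' softmax outputs by weighted averaging,
    then return the index of the winning class."""
    fused = w_speech * np.asarray(prob_speech) + w_image * np.asarray(prob_image)
    return int(np.argmax(fused))

# Hypothetical softmax outputs over 4 emotion classes
p_speech = [0.10, 0.60, 0.20, 0.10]  # speech CNN favors class 1
p_image  = [0.05, 0.40, 0.45, 0.10]  # image CNN slightly favors class 2
print(decision_level_fusion(p_speech, p_image))  # -> 1 (class 1 wins after fusion)
```

In practice the weights could be tuned on a validation split to reflect each modality's reliability.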

Author(s):  
Priti Shivaji Sanjekar ◽  
Jayantrao B. Patil

Multimodal biometrics extends unimodal biometrics by integrating the information obtained from multiple biometric sources at various fusion levels, i.e., sensor level, feature extraction level, match score level, or decision level. In this article, fingerprint, palmprint, and iris are used for verification of an individual. The wavelet transformation is used to extract features from fingerprint, palmprint, and iris. Further, PCA is used for dimensionality reduction. The fusion of traits is employed at three levels: feature level; feature level combined with match score level; and feature level combined with decision level. The main objective of this research is to observe the effect of combined fusion levels on verification of an individual. The performance of the three cases of fusion is measured in terms of EER and represented with ROC curves. The experiments performed on 100 different subjects from publicly available databases demonstrate that combining feature level with match score level and feature level with decision level fusion both outperform fusion at only the feature level.
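The pipeline above can be illustrated with a minimal sketch: a one-level Haar wavelet transform standing in for the feature extraction, a distance-based match score, and a weighted sum for score-level fusion. The helper names, the Haar choice, and the weights are assumptions for illustration; the paper's actual wavelet family and fusion weights are not specified here.

```python
import numpy as np

def haar_dwt_1d(x):
    """One-level 1-D Haar wavelet transform: returns approximation
    and detail coefficients (a stand-in for the feature extractor)."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)
    return approx, detail

def match_score(feat_a, feat_b):
    """Similarity score from Euclidean distance (higher = better match)."""
    return 1.0 / (1.0 + np.linalg.norm(np.asarray(feat_a) - np.asarray(feat_b)))

def fuse_scores(scores, weights):
    """Match-score-level fusion: weighted sum across modalities
    (e.g., fingerprint, palmprint, iris)."""
    return float(np.dot(scores, weights))
```

Feature-level fusion would concatenate the per-modality feature vectors before matching, whereas the score-level step above combines one score per modality after matching.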


2020 ◽  
Vol 40 (2) ◽  
pp. 149-157 ◽  
Author(s):  
Değer Ayata ◽  
Yusuf Yaslan ◽  
Mustafa E. Kamasak

Abstract
Purpose: The purpose of this paper is to propose a novel emotion recognition algorithm from multimodal physiological signals for emotion-aware healthcare systems. In this work, physiological signals are collected from respiratory belt (RB), photoplethysmography (PPG), and fingertip temperature (FTT) sensors. These signals are used because their collection has become easy with advances in ergonomic wearable technologies.
Methods: Arousal and valence levels are recognized from the fused physiological signals using the relationship between physiological signals and emotions. This recognition is performed using various machine learning methods such as random forest, support vector machine, and logistic regression, and the performance of these methods is studied.
Results: Using decision-level fusion, the accuracy improved from 69.86% to 73.08% for arousal, and from 69.53% to 72.18% for valence. Results indicate that using multiple sources of physiological signals and their fusion increases the accuracy rate of emotion recognition.
Conclusion: This study demonstrated a framework for emotion recognition using multimodal physiological signals from respiratory belt, photoplethysmography, and fingertip temperature sensors. It is shown that decision-level fusion from multiple classifiers (one per signal source) improved the accuracy rate of emotion recognition for both arousal and valence dimensions.
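With one classifier per signal source, a common decision-level fusion rule is a simple majority vote over the per-signal predictions. The sketch below assumes binary high/low labels and three classifiers (RB, PPG, FTT); the voting rule is a plausible illustration, not necessarily the exact combiner used in the paper.

```python
from collections import Counter

def majority_vote(decisions):
    """Decision-level fusion: return the label predicted by the most
    per-signal classifiers (ties resolve to the earliest-seen label)."""
    return Counter(decisions).most_common(1)[0][0]

# Hypothetical binary arousal decisions (1 = high) from RB, PPG, FTT classifiers
print(majority_vote([1, 0, 1]))  # -> 1
```

An odd number of classifiers, as here, avoids ties for binary labels.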

