Multi-Modality Emotion Recognition Model with GAT-Based Multi-Head Inter-Modality Attention

Sensors ◽  
2020 ◽  
Vol 20 (17) ◽  
pp. 4894 ◽  
Author(s):  
Changzeng Fu ◽  
Chaoran Liu ◽  
Carlos Toshinori Ishi ◽  
Hiroshi Ishiguro

Emotion recognition has been gaining attention in recent years due to its applications in artificial agents. To achieve good performance on this task, much research has been conducted on multi-modality emotion recognition models that leverage the different strengths of each modality. However, a research question remains: what exactly is the most appropriate way to fuse the information from different modalities? In this paper, we propose audio sample augmentation and an emotion-oriented encoder-decoder to improve the performance of emotion recognition, and we discuss an inter-modality, decision-level fusion method based on a graph attention network (GAT). Compared to the baseline, our model improved the weighted average F1-score from 64.18% to 68.31% and the weighted average accuracy from 65.25% to 69.88%.
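The abstract does not spell out the fusion architecture, but the following is a minimal sketch of what GAT-style multi-head inter-modality attention at the decision level could look like, assuming three modalities that each emit a fixed-size feature vector. The `InterModalityGAT` module, its dimensions, and the head count are illustrative assumptions, not the authors' implementation.

```python
# Sketch: modalities are nodes of a complete graph; multi-head graph attention
# lets each modality attend to all others before a final classification layer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InterModalityGAT(nn.Module):
    def __init__(self, in_dim=128, head_dim=32, num_heads=4, num_classes=4):
        super().__init__()
        self.num_heads, self.head_dim = num_heads, head_dim
        self.proj = nn.Linear(in_dim, num_heads * head_dim, bias=False)
        # Per-head attention vectors for source and destination nodes.
        self.attn_src = nn.Parameter(torch.empty(num_heads, head_dim))
        self.attn_dst = nn.Parameter(torch.empty(num_heads, head_dim))
        nn.init.xavier_uniform_(self.attn_src)
        nn.init.xavier_uniform_(self.attn_dst)
        self.classifier = nn.Linear(num_heads * head_dim, num_classes)

    def forward(self, x):
        # x: (batch, num_modalities, in_dim); modalities form a complete graph.
        B, N, _ = x.shape
        h = self.proj(x).view(B, N, self.num_heads, self.head_dim)
        h = h.permute(0, 2, 1, 3)                       # (B, heads, N, d)
        # GAT-style logits: e_ij = LeakyReLU(a_src . h_i + a_dst . h_j)
        src = (h * self.attn_src.view(1, self.num_heads, 1, self.head_dim)).sum(-1)
        dst = (h * self.attn_dst.view(1, self.num_heads, 1, self.head_dim)).sum(-1)
        e = F.leaky_relu(src.unsqueeze(-1) + dst.unsqueeze(-2), 0.2)  # (B, heads, N, N)
        alpha = torch.softmax(e, dim=-1)                # attend over all modalities
        out = torch.matmul(alpha, h)                    # (B, heads, N, d)
        out = out.permute(0, 2, 1, 3).reshape(B, N, -1)
        # Mean-pool the attended modality nodes, then classify.
        return self.classifier(out.mean(dim=1))

# Example: fuse audio/text/video embeddings for a 4-class emotion task.
fused_logits = InterModalityGAT()(torch.randn(8, 3, 128))  # -> (8, 4)
```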

Sensors ◽  
2020 ◽  
Vol 20 (20) ◽  
pp. 5732 ◽
Author(s):  
Shih-Wei Sun ◽  
Bao-Yun Liu ◽  
Pao-Chi Chang

We propose a violin bowing action recognition system that can accurately recognize distinct bowing actions in classical violin performance. This system recognizes bowing actions by analyzing signals from a depth camera and from inertial sensors worn by a violinist. The contribution of this study is threefold: (1) a dataset comprising violin bowing actions was constructed from data captured by a depth camera and multiple inertial sensors; (2) data augmentation was achieved for depth-frame data through rotation in three-dimensional world coordinates and for inertial sensing data through yaw, pitch, and roll angle transformations; and (3) bowing action classifiers were trained on the different modalities, to exploit the strengths and offset the weaknesses of each modality, using deep learning methods with a decision-level fusion process. In experiments, both the large external motions and the subtle local motions produced by violin bow manipulations were accurately recognized by the proposed system (average accuracy > 80%).
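As a concrete illustration of contribution (2), here is a minimal sketch of yaw/pitch/roll augmentation for inertial sensing data, assuming raw samples as (N, 3) arrays of x/y/z readings. The angle range and the helper names (`rotation_matrix`, `augment_imu`) are assumptions for illustration, not the authors' code.

```python
# Sketch: augment accelerometer data by applying small random 3D rotations.
import numpy as np

def rotation_matrix(yaw, pitch, roll):
    """Compose Z (yaw), Y (pitch), X (roll) rotations; angles in radians."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cr, sr = np.cos(roll), np.sin(roll)
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    return Rz @ Ry @ Rx

def augment_imu(samples, max_angle_deg=15.0, rng=None):
    """Return a rotated copy of (N, 3) inertial samples with random angles."""
    rng = rng or np.random.default_rng()
    yaw, pitch, roll = np.deg2rad(rng.uniform(-max_angle_deg, max_angle_deg, 3))
    return samples @ rotation_matrix(yaw, pitch, roll).T

# Example: double a small accelerometer dataset with rotated variants.
acc = np.random.randn(200, 3)
augmented = np.vstack([acc, augment_imu(acc)])
```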


2020 ◽  
Vol 40 (2) ◽  
pp. 149-157 ◽  
Author(s):  
Değer Ayata ◽  
Yusuf Yaslan ◽  
Mustafa E. Kamasak

Abstract
Purpose: The purpose of this paper is to propose a novel emotion recognition algorithm based on multimodal physiological signals for emotion-aware healthcare systems. In this work, physiological signals are collected from respiratory belt (RB), photoplethysmography (PPG), and fingertip temperature (FTT) sensors. These signals were chosen because advances in ergonomic wearable technologies have made them easy to collect.
Methods: Arousal and valence levels are recognized from the fused physiological signals by exploiting the relationship between physiological signals and emotions. Recognition is performed with several machine learning methods, such as random forest, support vector machine, and logistic regression, and the performance of these methods is compared.
Results: Using decision-level fusion, accuracy improved from 69.86% to 73.08% for arousal and from 69.53% to 72.18% for valence. The results indicate that using multiple sources of physiological signals and fusing them increases the accuracy of emotion recognition.
Conclusion: This study demonstrated a framework for emotion recognition using multimodal physiological signals from a respiratory belt, photoplethysmography, and fingertip temperature. Decision-level fusion of multiple classifiers (one per signal source) improved the accuracy of emotion recognition for both the arousal and valence dimensions.
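As an illustration of the decision-level fusion this abstract describes, the sketch below trains one classifier per signal source (RB, PPG, FTT) and averages their class probabilities. The synthetic features and the soft-vote (probability-averaging) rule are assumptions, since the abstract does not specify the exact fusion rule.

```python
# Sketch: per-sensor classifiers fused at the decision level by averaging
# predicted class probabilities (soft voting).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 300)                 # e.g. low/high arousal labels
features = {                                # one feature matrix per sensor
    "RB": rng.normal(size=(300, 8)) + y[:, None] * 0.5,
    "PPG": rng.normal(size=(300, 12)) + y[:, None] * 0.4,
    "FTT": rng.normal(size=(300, 4)) + y[:, None] * 0.3,
}
models = {
    "RB": RandomForestClassifier(n_estimators=100, random_state=0),
    "PPG": SVC(probability=True, random_state=0),
    "FTT": LogisticRegression(max_iter=1000),
}

train, test = np.arange(0, 240), np.arange(240, 300)
probas = []
for name, model in models.items():
    X = features[name]
    model.fit(X[train], y[train])            # train each modality separately
    probas.append(model.predict_proba(X[test]))

# Decision-level fusion: average the per-classifier class probabilities.
fused = np.mean(probas, axis=0)
pred = fused.argmax(axis=1)
print("fused accuracy:", (pred == y[test]).mean())
```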

