Xiamen University
Intelligent Media Innovation Laboratory
A Visual-Audio Based Emotion Recognition System Integrating Dimensional Analysis
IEEE Transactions on Computational Social Systems
Abstract
Dimensional emotion recognition is an important branch of affective computing that uses continuous values to represent complex human emotions. In this study, we propose a visual–audio emotion recognition system that integrates emotional dimensions. For the visual part, a rule-based correspondence between emotion categories and emotion dimension intervals is established, and the respective classifiers are trained and fused using machine learning methods. For the audio part, emotion-related features are extracted, and a 128-D global feature is extracted through a deep convolutional neural network (DCNN). We combine Bayesian methods and machine learning to integrate the information from the visual and audio modalities. We tested the proposed system and its single modalities on the standard databases CK+ and eNTERFACE ’05, and the experimental results and comparisons show the efficiency of the proposed system. Furthermore, the proposed system represents emotion with a category label and dimension values simultaneously, which provides strong interpretability and extensibility for emotion recognition and goes beyond methods that offer only classification or only dimensional values.
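The abstract does not spell out the fusion rule, so the following is only a minimal sketch of one common Bayesian way to combine two modalities, not the authors' actual method. It assumes each single-modality classifier outputs a posterior over the same emotion classes and that the two modalities are conditionally independent given the emotion. The function name `fuse_posteriors`, the `EMOTIONS` list, and the example probabilities are all hypothetical and chosen for illustration.

```python
import numpy as np

# Illustrative label set (assumption: six basic emotion classes).
EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]


def fuse_posteriors(p_visual, p_audio, prior=None):
    """Fuse two per-class posteriors with a naive-Bayes product rule.

    Assuming conditional independence of the modalities given the class c:
        P(c | visual, audio) is proportional to P(c | visual) * P(c | audio) / P(c)
    """
    p_visual = np.asarray(p_visual, dtype=float)
    p_audio = np.asarray(p_audio, dtype=float)
    if prior is None:
        # Uniform class prior when none is supplied (an assumption).
        prior = np.full(p_visual.shape, 1.0 / p_visual.size)
    else:
        prior = np.asarray(prior, dtype=float)
    unnormalized = p_visual * p_audio / prior
    return unnormalized / unnormalized.sum()


# Made-up classifier outputs, for illustration only.
visual = [0.10, 0.10, 0.10, 0.50, 0.10, 0.10]
audio = [0.10, 0.10, 0.10, 0.40, 0.20, 0.10]
fused = fuse_posteriors(visual, audio)
predicted = EMOTIONS[int(np.argmax(fused))]
```

When both modalities lean toward the same class, the product rule sharpens the fused posterior; a learned fusion model could replace this rule where the independence assumption is too strong.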
Research Architecture



