Context-aware emotion recognition in the wild using spatio-temporal and temporal-pyramid models

Do, N-T; Kim, S-H; Yang, H-J; Lee, G-S; Yeom, Soonja

journal contribution

Posted on 2023-05-20, 22:16; authored by Do, N-T; Kim, S-H; Yang, H-J; Lee, G-S; Yeom, Soonja

Emotion recognition plays an important role in human–computer interaction. Recent studies have focused on video emotion recognition in the wild and have run into difficulties related to occlusion, illumination, complex behavior over time, and auditory cues. State-of-the-art methods combine multiple modalities, such as frame-level, spatiotemporal, and audio approaches. However, such methods have difficulty exploiting long-term dependencies in temporal information, capturing contextual information, and integrating multi-modal information. In this paper, we introduce a flexible multi-modal system for video-based emotion recognition in the wild. Our system tracks the significant faces corresponding to persons of interest in a video and votes over them to classify seven basic emotions. The key contribution of this study is the use of face feature extraction together with context-aware and statistical information for emotion recognition. We also build two model architectures to exploit long-term dependencies in temporal information: a temporal-pyramid model and a spatiotemporal model with a “Conv2D+LSTM+3DCNN+Classify” architecture. Finally, we propose a best selection ensemble to improve the accuracy of multi-modal fusion. The best selection ensemble chooses the combination of spatiotemporal and temporal-pyramid models that achieves the highest accuracy in classifying the seven basic emotions. In our experiments, we benchmark the system on the AFEW dataset and obtain high accuracy.
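The abstract does not spell out how the best selection ensemble picks its combination, so the following is only a minimal sketch under stated assumptions: each candidate model outputs per-class probabilities on a held-out validation set, a combination is fused by averaging those probabilities, and every non-empty subset of models is scored by validation accuracy. The function names (`ensemble_predict`, `best_selection`) and the averaging rule are illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations


def ensemble_predict(prob_sets):
    """Average per-model class probabilities; return the argmax label per sample.

    prob_sets: list of models, each a list of per-sample probability lists.
    Averaging is an assumed fusion rule, not taken from the paper.
    """
    n_models = len(prob_sets)
    preds = []
    for i in range(len(prob_sets[0])):
        n_classes = len(prob_sets[0][i])
        avg = [sum(p[i][c] for p in prob_sets) / n_models for c in range(n_classes)]
        preds.append(max(range(n_classes), key=avg.__getitem__))
    return preds


def accuracy(preds, labels):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def best_selection(model_probs, labels):
    """Exhaustively score every non-empty subset of models.

    model_probs: dict mapping a (hypothetical) model name to its probabilities.
    Returns (names_of_best_subset, validation_accuracy). On ties, the first
    subset encountered (smaller subsets first) is kept.
    """
    names = sorted(model_probs)
    best_subset, best_acc = (), -1.0
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            acc = accuracy(ensemble_predict([model_probs[n] for n in subset]), labels)
            if acc > best_acc:
                best_subset, best_acc = subset, acc
    return best_subset, best_acc
```

Exhaustive search grows as 2^n in the number of candidate models, which is reasonable for a handful of spatiotemporal and temporal-pyramid variants; a larger pool would call for a greedy or otherwise pruned search.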

History

Publication title

Sensors

Volume

21

Issue

7

Article number

2344

Number

2344

Pagination

1-29

ISSN

1424-8220

Department/School

School of Information and Communication Technology

Publisher

Molecular Diversity Preservation International

Place of publication

Matthaeusstrasse 11, Basel, Switzerland, Ch-4057

Rights statement

Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Repository Status

Open

Socio-economic Objectives

Artificial intelligence

Keywords

video emotion recognition; spatiotemporal; temporal-pyramid; best selection ensemble; facial emotion recognition

Licence

In Copyright
