eCite Digital Repository

Combination of CNN and RNN for speech emotion recognition


Vo, TH and Yeom, S and Na, IS and Oh, A and Lee, GS and Yang, HJ and Kim, SH, Combination of CNN and RNN for speech emotion recognition, SMA2019 - The International Conference on Smart Media & Applications, 5-7 December 2019, Hilton Guam Resort & Spa, Guam, pp. 163-166. ISSN 2287-4348 (2019) [Refereed Conference Paper]

Copyright Statement

Copyright unknown

Speech emotion recognition has several potential applications in health care and human-computer interaction. In recent years, some studies have applied deep learning to this task, but most use shallow neural networks because of the overfitting problem. We propose an architecture that combines a CNN and an RNN to capture both spatial and time-series information. Experiments on VGG-16-like and ResNet-101-like networks gave results comparable to those of some recent studies.
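
The CNN-then-RNN idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper's networks are VGG-16-like and ResNet-101-like, whereas this toy version uses a single convolution layer and an Elman RNN with random, untrained weights. All names here (`cnn_features`, `rnn_last_state`, `classify`, `random_params`) and all sizes (3x3 kernels, 4 hidden units, 4 emotion classes) are assumptions chosen for illustration only.

```python
import math
import random


def conv2d_valid(image, kernel):
    """Valid 2D cross-correlation of a [frequency][time] grid with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def cnn_features(spectrogram, kernels):
    """Spatial stage: convolve, apply ReLU, then average over frequency.

    Returns one feature vector per remaining time step, with one value
    per kernel, so the time axis is preserved for the RNN.
    """
    maps = [
        [[max(0.0, v) for v in row] for row in conv2d_valid(spectrogram, k)]
        for k in kernels
    ]
    n_steps = len(maps[0][0])
    return [
        [sum(row[t] for row in fmap) / len(fmap) for fmap in maps]
        for t in range(n_steps)
    ]


def rnn_last_state(sequence, w_in, w_rec):
    """Temporal stage: Elman RNN, h_t = tanh(W_in x_t + W_rec h_{t-1})."""
    hidden = [0.0] * len(w_rec)
    for x in sequence:
        hidden = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(len(x)))
                + sum(w_rec[i][j] * hidden[j] for j in range(len(hidden)))
            )
            for i in range(len(hidden))
        ]
    return hidden


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def classify(spectrogram, params):
    """CNN features -> RNN summary -> linear layer -> class probabilities."""
    features = cnn_features(spectrogram, params["kernels"])
    hidden = rnn_last_state(features, params["w_in"], params["w_rec"])
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in params["w_out"]]
    return softmax(logits)


def random_params(n_kernels=2, kernel_size=3, hidden=4, n_classes=4, seed=0):
    """Untrained random weights; every size here is an illustrative assumption."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "kernels": [matrix(kernel_size, kernel_size) for _ in range(n_kernels)],
        "w_in": matrix(hidden, n_kernels),
        "w_rec": matrix(hidden, hidden),
        "w_out": matrix(n_classes, hidden),
    }
```

Calling `classify(spectrogram, random_params())` on a nested list indexed as `[frequency][time]` returns one probability per assumed emotion class. Because the weights are untrained, the output only demonstrates how data flows through the two stages, not a meaningful prediction.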

Item Details

Item Type: Refereed Conference Paper
Keywords: emotion, speech, human computer interaction, RNN, CNN, Deep learning
Research Division: Information and Computing Sciences
Research Group: Artificial intelligence
Research Field: Artificial life and complex adaptive systems
Objective Division: Information and Communication Services
Objective Group: Information systems, technologies and services
Objective Field: Artificial intelligence
UTAS Author: Yeom, S (Dr Soonja Yeom)
ID Code: 149125
Year Published: 2019
Deposited By: Information and Communication Technology
Deposited On: 2022-03-11
Last Modified: 2022-07-13
