Shared acoustic codes underlie emotional communication in music and speech: Evidence from deep transfer learning.

Coutinho, Eduardo; Schuller, Björn

Affiliation

Coutinho E; Department of Music, University of Liverpool, Liverpool, United Kingdom.
Schuller B; Department of Computing, Imperial College London, London, United Kingdom.

PLoS One; 12(6): e0179289, 2017.

Article in English | MEDLINE | ID: mdl-28658285

ABSTRACT

Music and speech exhibit striking similarities in how emotions are communicated in the acoustic domain: the communication of specific emotions is achieved, at least to a certain extent, by means of shared acoustic patterns. From an Affective Sciences point of view, determining the degree of overlap between both domains is fundamental to understanding the shared mechanisms underlying this phenomenon. From a Machine Learning perspective, the overlap between acoustic codes for emotional expression in music and speech opens new possibilities for enlarging the amount of data available to develop music and speech emotion recognition systems. In this article, we investigate time-continuous predictions of emotion (Arousal and Valence) in music and speech, and Transfer Learning between these domains. We establish a comparative framework including intra-domain experiments (i.e., models trained and tested on the same modality, either music or speech) and cross-domain experiments (i.e., models trained on one modality and tested on the other). In the cross-domain context, we evaluated two strategies: direct transfer between domains, and the contribution of Transfer Learning techniques (feature-representation transfer based on Denoising Auto-Encoders) for reducing the gap in the feature space distributions. Our results demonstrate excellent cross-domain generalisation performance with and without feature-representation transfer in both directions. In the case of music, cross-domain approaches outperformed intra-domain models for Valence estimation, whereas for speech, intra-domain models achieved the best performance. This is the first demonstration of shared acoustic codes for emotional expression in music and speech in the time-continuous domain.

Subject(s)

Acoustics; Emotions; Music/psychology; Speech; Female; Humans; Learning; Male

Full text: 1
Collection: 01-international
Database: MEDLINE
Main subject: Speech / Acoustics / Emotions / Music
Study type: Prognostic_studies
Limits: Female / Humans / Male
Language: En
Journal: PLoS One
Journal subject: SCIENCE / MEDICINE
Year: 2017
Document type: Article
Country of affiliation: United Kingdom