Deep Modelling Strategies for Human Confidence Classification using Audio-visual Data.
Article in English | MEDLINE | ID: mdl-38083410
ABSTRACT
Human behavioral expressions, such as confidence, are time-varying entities. Both the vocal and facial cues that convey human confidence vary throughout the duration of analysis. Although the cues from these two modalities are not always in synchrony, they affect each other and the fused outcome. In this paper, we present a deep fusion technique that combines the two modalities and derives a single outcome for inferring human confidence. The fused outcome improves classification performance by capturing temporal information from both modalities. We also analyze the time-varying nature of these expressions in conversations captured in an interview setup. We collected data from 51 speakers who participated in interview sessions. In a 5-fold cross-validation analysis, the average area under the curve (AUC) for classifying confident videos from non-confident ones is 70.6% for the uni-modal speech model and 69.4% for the uni-modal facial-expression model. Our deep fusion model improves on this, achieving an average AUC of 76.8%.
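The abstract does not specify the exact network architecture, so the following is only a minimal illustrative sketch of one common way to realize such a deep fusion of two temporal modalities: each modality is encoded by its own recurrent branch, and the temporal summaries are concatenated and passed through a joint classifier. All names, layer sizes, and feature dimensions here are assumptions (e.g., 40 acoustic features per frame, 136 facial-landmark coordinates), not the authors' actual design.

```python
import torch
import torch.nn as nn

class DeepFusionConfidenceClassifier(nn.Module):
    """Hypothetical sketch of audio-visual deep fusion: two LSTM branches
    (speech, face) whose temporal encodings are fused into one binary
    confidence prediction. Dimensions are illustrative assumptions."""

    def __init__(self, audio_dim=40, video_dim=136, hidden_dim=64):
        super().__init__()
        # Each branch models the time-varying cues of one modality.
        self.audio_rnn = nn.LSTM(audio_dim, hidden_dim, batch_first=True)
        self.video_rnn = nn.LSTM(video_dim, hidden_dim, batch_first=True)
        # Fusion head combines both temporal summaries into a single outcome.
        self.fusion = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),  # logit: confident vs. non-confident
        )

    def forward(self, audio_seq, video_seq):
        # Use each branch's final hidden state as a summary of that
        # modality's dynamics over the clip.
        _, (audio_h, _) = self.audio_rnn(audio_seq)
        _, (video_h, _) = self.video_rnn(video_seq)
        fused = torch.cat([audio_h[-1], video_h[-1]], dim=-1)
        return self.fusion(fused).squeeze(-1)

# Usage: a batch of 8 clips, 100 frames each, with random stand-in features.
model = DeepFusionConfidenceClassifier()
logits = model(torch.randn(8, 100, 40), torch.randn(8, 100, 136))
probs = torch.sigmoid(logits)  # per-clip probability of "confident"
```

In a setup like this, the reported per-fold AUC would be computed from the predicted probabilities against the clip labels within each of the 5 cross-validation folds, then averaged; the fusion gain over the uni-modal models comes from the classifier seeing both modalities' temporal summaries jointly rather than in isolation.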
Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Speech Perception / Voice Limits: Humans Language: English Journal: Annu Int Conf IEEE Eng Med Biol Soc Year: 2023 Document type: Article
