Audio-visual emotion recognition: A dynamic, multimodal approach - Sorbonne Université
Conference poster, Year: 2014

Abstract

Designing systems able to interact with students in a natural manner is a complex and far from solved problem. A key aspect of natural interaction is the ability to understand and respond appropriately to human emotions. This paper details our response to the continuous Audio/Visual Emotion Challenge (AVEC'12), whose goal is to predict four affective signals describing human emotions. The proposed method uses Fourier spectra to extract multi-scale dynamic descriptions of signals characterizing face appearance, head movements and voice. We perform kernel regression with very few representative samples, selected via supervised weighted-distance-based clustering, which leads to high generalization power. We also propose a particularly fast regressor-level fusion framework to merge systems based on different modalities. Experiments demonstrate the efficiency of each key point of the proposed method, and our results on the challenge data were the highest among 10 international research teams.
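The three ingredients named in the abstract (multi-scale Fourier descriptors, kernel regression on a few representative samples, and regressor-level fusion) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration and is not taken from the paper: the window sizes, the number of retained coefficients, the Gaussian (Nadaraya-Watson) kernel, the least-squares fusion, and all function names are hypothetical. The supervised clustering that selects the representative samples is not reproduced; the prototypes are simply passed in.

```python
import numpy as np

def multiscale_spectra(signal, t, window_sizes=(8, 16, 32), n_coeffs=4):
    """Concatenate low-frequency Fourier magnitudes of the windows ending at frame t.

    Window sizes and n_coeffs are illustrative choices, not the paper's settings.
    Requires t >= max(window_sizes).
    """
    feats = []
    for w in window_sizes:
        window = signal[t - w:t]              # the w frames preceding frame t
        mags = np.abs(np.fft.rfft(window))    # magnitude spectrum of that window
        feats.append(mags[:n_coeffs] / w)     # keep a few coefficients, scale by length
    return np.concatenate(feats)

def kernel_regression(x, prototypes, targets, bandwidth=1.0):
    """Nadaraya-Watson estimate from a small set of representative samples.

    A Gaussian kernel is assumed here; the prototypes are given, not learned.
    """
    d2 = np.sum((prototypes - x) ** 2, axis=1)       # squared distances to prototypes
    weights = np.exp(-d2 / (2.0 * bandwidth ** 2))   # Gaussian kernel weights
    return float(weights @ targets / weights.sum())

def fit_fusion(per_modality_preds, labels):
    """Least-squares weights for combining per-modality predictions (one column each).

    Linear fusion is a stand-in for the paper's regressor-level fusion.
    """
    weights, *_ = np.linalg.lstsq(per_modality_preds, labels, rcond=None)
    return weights
```

In this sketch, a descriptor for frame `t` of a one-dimensional signal would be obtained with `multiscale_spectra(signal, t)`, and fused predictions with `per_modality_preds @ fit_fusion(per_modality_preds, labels)`.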
Main file
p44-nicole.pdf (436.52 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01089628, version 1 (02-12-2014)

Identifiers

  • HAL Id: hal-01089628, version 1

Cite

Jérémie Nicolle, Vincent Rapp, Kevin Bailly, Lionel Prevost, Mohamed Chetouani. Audio-visual emotion recognition: A dynamic, multimodal approach. IHM'14, 26e conférence francophone sur l'Interaction Homme-Machine, Oct 2014, Lille, France. pp.44-51, 2014. ⟨hal-01089628⟩
352 views
274 downloads
