ActionSpotter: Deep Reinforcement Learning Framework for Temporal Action Spotting in Videos
Preprint, Working Paper. Year: 2020

Abstract

Summarizing video content is an important task in many applications. It can be defined as computing the ordered list of actions present in a video. Such a list could be extracted using action detection algorithms; however, determining the temporal boundaries of actions is not necessary to establish their existence, and localizing precise boundaries usually requires dense video analysis to be effective. In this work, we propose to compute this ordered list directly by sparsely browsing the video and selecting one frame per action instance, a task known as action spotting in the literature. To do so, we propose ActionSpotter, a spotting algorithm that takes advantage of deep reinforcement learning to spot actions efficiently while adapting its video browsing speed, without additional supervision. Experiments performed on the THUMOS14 and ActivityNet datasets show that our framework outperforms state-of-the-art detection methods. In particular, the spotting mean Average Precision on THUMOS14 is significantly improved from 59.7% to 65.6% while skipping 23% of the video.
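To make the browsing strategy concrete, here is a minimal sketch in Python/PyTorch of the kind of agent loop the abstract describes: at each visited frame, a policy decides whether to spot the current frame as a new action instance and how far to jump next. All names, network sizes, the skip-size set, and the greedy decoding below are illustrative assumptions, not the authors' implementation.

# Hypothetical sketch: an agent sparsely browses precomputed frame features,
# deciding at each step whether to spot the current frame as a new action
# instance and how far to jump (its "browsing speed"). Names, sizes, and the
# skip set are assumptions for illustration, not the paper's code.
import torch
import torch.nn as nn

SKIP_SIZES = [1, 5, 10, 25]  # assumed set of allowed frame jumps

class SpottingPolicy(nn.Module):
    def __init__(self, feat_dim=2048, hidden=256, n_classes=20):
        super().__init__()
        self.rnn = nn.GRUCell(feat_dim, hidden)              # memory over visited frames
        self.spot_head = nn.Linear(hidden, 2)                # spot here or not
        self.skip_head = nn.Linear(hidden, len(SKIP_SIZES))  # adaptive browsing speed
        self.cls_head = nn.Linear(hidden, n_classes)         # action label if spotted

    def forward(self, feat, h):
        h = self.rnn(feat, h)
        return (torch.softmax(self.spot_head(h), dim=-1),
                torch.softmax(self.skip_head(h), dim=-1),
                torch.softmax(self.cls_head(h), dim=-1),
                h)

def spot_video(policy, frame_feats):
    """Return the ordered list of (frame_index, class_id) spots for one video.

    Greedy decoding for illustration only; training would instead optimize
    the spot/skip decisions with a policy-gradient objective."""
    h = torch.zeros(1, 256)
    t, spots = 0, []
    with torch.no_grad():
        while t < len(frame_feats):
            p_spot, p_skip, p_cls, h = policy(frame_feats[t:t + 1], h)
            if p_spot[0, 1] > 0.5:                    # emit a spot at this frame
                spots.append((t, int(p_cls.argmax())))
            t += SKIP_SIZES[int(p_skip.argmax())]     # jump ahead adaptively
    return spots

policy = SpottingPolicy()
feats = torch.randn(300, 2048)  # stand-in for per-frame CNN features
print(spot_video(policy, feats))

Training such a policy would typically use a policy-gradient method (e.g. REINFORCE) with a reward encouraging exactly one spot per ground-truth action instance while penalizing dense browsing; the sketch above only shows inference.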

Dates and versions

hal-02534615 , version 1 (14-04-2020)
hal-02534615 , version 2 (05-11-2020)

Identifiers

hal-02534615

Cite

Guillaume Vaudaux-Ruth, Adrien Chan-Hon-Tong, Catherine Achard. ActionSpotter: Deep Reinforcement Learning Framework for Temporal Action Spotting in Videos. 2020. ⟨hal-02534615v1⟩