Policy Search with Rare Significant Events: Choosing the Right Partner to Cooperate with

Paul Ecoffet; Nicolas Fontbonne; Jean-Baptiste André; Nicolas Bredeche

doi:10.1371/journal.pone.0266841

Journal article, PLoS ONE, Year: 2022



Paul Ecoffet

Role: Author
PersonId: 1096041

Institut des Systèmes Intelligents et de Robotique

Nicolas Fontbonne

Role: Author
PersonId: 754562
IdHAL: nicolas-fontbonne

Institut des Systèmes Intelligents et de Robotique

Jean-Baptiste André

Role: Author
PersonId: 741564
IdHAL: jean-baptiste-andre
ORCID: 0000-0001-9069-447X
IdRef: 204171806

Institut Jean-Nicod

Nicolas Bredeche

Role: Author
PersonId: 184446
IdHAL: nicolas-bredeche
ORCID: 0000-0002-8241-7461
IdRef: 070019452

Institut des Systèmes Intelligents et de Robotique

Abstract

This paper focuses on a class of reinforcement learning problems where significant events are rare and limited to a single positive reward per episode. A typical example is that of an agent who has to choose a partner to cooperate with, while a large number of partners are simply not interested in cooperating, regardless of what the agent has to offer. We address this problem in a continuous state and action space with two different kinds of search methods: a gradient policy search method and a direct policy search method using an evolution strategy. We show that when significant events are rare, gradient information is also scarce, making it difficult for policy gradient search methods to find an optimal policy, with or without a deep neural architecture. On the other hand, we show that direct policy search methods are invariant to the rarity of significant events, which is yet another confirmation of the unique role evolutionary algorithms have to play as a reinforcement learning method.

Keywords

reinforcement learning; rare significant events; on-policy; on-line; continuous state and action spaces; cooperation and partner choice; gradient policy search; direct policy search; evolutionary algorithms; PPO; CMAES

Domains

Machine Learning [cs.LG]; Neural Networks [cs.NE]; Artificial Intelligence [cs.AI]

Main file

2021 - Policy search Arxiv 2103.06846.pdf (1.52 MB)

Origin: Files produced by the author(s)

Contributor: Nicolas Bredeche

https://hal.sorbonne-universite.fr/hal-03315730

Submitted on: Thursday, August 5, 2021, 17:22:32

Last modified on: Friday, April 19, 2024, 16:18:55

Long-term archiving on: Saturday, November 6, 2021, 18:41:44

Dates and versions

hal-03315730, version 1 (05-08-2021)

Identifiers

HAL Id: hal-03315730, version 1
DOI: 10.1371/journal.pone.0266841

Cite

Paul Ecoffet, Nicolas Fontbonne, Jean-Baptiste André, Nicolas Bredeche. Policy Search with Rare Significant Events: Choosing the Right Partner to Cooperate with. PLoS ONE, 2022, PLoS ONE, 17 (4), pp.e0266841. ⟨10.1371/journal.pone.0266841⟩. ⟨hal-03315730⟩


Collections

ENS-PARIS CNRS CDF EHESS DEC ISIR PSL SORBONNE-UNIVERSITE SU-SCIENCES JEAN-NICOD ANR ISIR_AMAC

51 views

42 downloads
