Sorbonne Université, conference paper, 2020

PARENTing via Model-Agnostic Reinforcement Learning to Correct Pathological Behaviors in Data-to-Text Generation

Abstract

In language generation models conditioned on structured data, classical training via maximum likelihood almost always leads models to pick up on dataset divergences (i.e., hallucinations or omissions) and to reproduce them erroneously in their own generations at inference time. In this work, we build on top of previous Reinforcement Learning-based approaches and show that a model-agnostic framework relying on the recently introduced PARENT metric is effective at reducing both hallucinations and omissions. Evaluations on the widely used WikiBIO and WebNLG benchmarks demonstrate the effectiveness of this framework compared to state-of-the-art models.
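The core idea — fine-tuning a generator with a policy-gradient objective whose reward measures agreement with the input table — can be caricatured in a few lines. The sketch below is a toy REINFORCE loop over a categorical policy; the vocabulary, table, learning rate, and the precision-against-the-table reward are all illustrative stand-ins (the real framework uses the full PARENT metric, which also rewards coverage of the table, over sequences produced by a neural generator):

```python
import math
import random

random.seed(0)

# Toy setup: a categorical policy over a tiny vocabulary; the "table" is the
# set of tokens supported by the structured input.
VOCAB = ["paris", "london", "born", "1890", "actor"]
TABLE = {"paris", "born", "1890"}  # facts available in the structured input

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def sample(probs):
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i
    return len(probs) - 1

def reward(tokens):
    # Simplified table-grounded reward (NOT the real PARENT metric):
    # precision of generated tokens against the table. Low reward for
    # hallucinated tokens such as "london" or "actor".
    return sum(t in TABLE for t in tokens) / len(tokens)

logits = [0.0] * len(VOCAB)
lr, seq_len = 0.5, 3

for step in range(300):
    probs = softmax(logits)
    idxs = [sample(probs) for _ in range(seq_len)]
    R = reward([VOCAB[i] for i in idxs])
    # REINFORCE update: grad of log p(a) is one_hot(a) - probs, scaled by R.
    for i in idxs:
        for j in range(len(logits)):
            g = (1.0 if j == i else 0.0) - probs[j]
            logits[j] += lr * R * g

probs = softmax(logits)
# Probability mass on table-supported tokens should grow during training,
# i.e., the policy learns to avoid hallucinated tokens.
supported = sum(p for t, p in zip(VOCAB, probs) if t in TABLE)
print(round(supported, 2))
```

Because the reward is higher for sequences grounded in the table, the gradient pushes probability mass toward table-supported tokens — the same mechanism, at toy scale, by which a PARENT-based reward discourages hallucination.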
Main file: 2010.10866.pdf (390.44 KB). Origin: files produced by the author(s).

Dates and versions

hal-03479883, version 1 (14-12-2021)

Cite

Clément Rebuffel, Laure Soulier, Geoffrey Scoutheeten, Patrick Gallinari. PARENTing via Model-Agnostic Reinforcement Learning to Correct Pathological Behaviors in Data-to-Text Generation. Proceedings of the 13th International Conference on Natural Language Generation, INLG 2020, Dec 2020, Dublin, Ireland. ⟨hal-03479883⟩