Detecting Potential Local Adversarial Examples for Human-Interpretable Defense
Conference paper, 2018


Xavier Renard
Thibault Laugel
Marie-Jeanne Lesot
Christophe Marsala
Marcin Detyniecki

Abstract

Machine learning models are increasingly used in industry to make decisions such as credit insurance approval. Applicants may be tempted to manipulate specific variables, such as their age or salary, to improve their chances of approval. In this ongoing work, we discuss, with a first proposition, the problem of detecting a potential local adversarial example on classical tabular data: we provide a human expert with the features that are locally critical to the classifier's decision, so that the expert can verify the supplied information and prevent fraud.
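The paper itself includes no code; the idea of surfacing locally critical features can be sketched with a simple perturbation-based sensitivity analysis. Everything below is a hypothetical illustration under that assumption: the toy scorer `predict_proba`, the `(age, salary)` features, and the threshold values are invented for this sketch and are not the authors' actual method or model.

```python
import math

# Hypothetical toy credit-approval scorer over (age, salary);
# invented for illustration, not the paper's classifier.
def predict_proba(x):
    age, salary = x
    score = 0.03 * age + 0.00002 * salary - 2.0
    return 1.0 / (1.0 + math.exp(-score))

def local_importances(x, eps=1e-4):
    """Finite-difference sensitivity of the approval score to each feature."""
    base = predict_proba(x)
    imps = []
    for i in range(len(x)):
        perturbed = list(x)
        perturbed[i] += eps  # nudge one feature, hold the others fixed
        imps.append((predict_proba(perturbed) - base) / eps)
    return imps

applicant = [35.0, 42000.0]  # (age, salary), hypothetical values
imps = local_importances(applicant)

# The feature with the largest local influence on the decision is the one
# a human expert would verify first against potential manipulation.
critical = max(range(len(imps)), key=lambda i: abs(imps[i]))
```

In this sketch, the expert reviewing the application is pointed at the most locally influential feature, since a small manipulation of that variable moves the classifier's decision the most.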

Dates and versions

hal-01905948, version 1 (26-10-2018)


Cite

Xavier Renard, Thibault Laugel, Marie-Jeanne Lesot, Christophe Marsala, Marcin Detyniecki. Detecting Potential Local Adversarial Examples for Human-Interpretable Defense. Workshop on Recent Advances in Adversarial Learning (Nemesis) of the European Conference on Machine Learning and Principles of Practice of Knowledge Discovery in Databases (ECML-PKDD), Sep 2018, Dublin, Ireland. ⟨hal-01905948⟩