Unsupervised Word Representations Learning with Bilinear Convolutional Network on Characters

Thomas Luka; Laure Soulier; David Picard

doi:10.14428/esann/2021.ES2021-38

Conference paper, Year: 2021

Thomas Luka

Role: Author
PersonId : 1106646

École nationale des ponts et chaussées

Laure Soulier

Role: Author
PersonId : 8070
IdHAL : soulierl
ORCID : 0000-0001-9827-7400
IdRef : 189293683

Machine Learning and Information Access

David Picard

Role: Author
PersonId : 741
IdHAL : david-picard
ORCID : 0000-0002-6296-4222
IdRef : 133005216

École nationale des ponts et chaussées

Abstract

In this paper, we propose a new unsupervised method for learning word embeddings with raw characters as input representations, bypassing the problems that arise from the use of a dictionary. To this end, we translate the distributional hypothesis into an unsupervised metric learning objective, which allows us to use only an encoder instead of an encoder-decoder architecture. We propose a convolutional neural network with bilinear product blocks and residual connections to encode co-occurrence patterns. We show the efficiency of our approach by comparing it with classical word embedding methods such as fastText and GloVe on several benchmarks.
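
The abstract does not give the architecture in detail, so the following is only an illustrative sketch of what a bilinear product block with a residual connection over character inputs might look like. It assumes one-hot character inputs, two same-padded 1D convolutions whose outputs are multiplied element-wise, and mean pooling to obtain a word vector. All function names, the kernel size, and the pooling choice are hypothetical and are not taken from the paper. The code uses only the Python standard library so that it runs without dependencies.

```python
import random


def encode_chars(word, alphabet):
    """One-hot encode each character; unknown characters map to all zeros (assumption)."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    rows = []
    for ch in word:
        row = [0.0] * len(alphabet)
        if ch in index:
            row[index[ch]] = 1.0
        rows.append(row)
    return rows


def conv1d(x, weights, bias):
    """Same-padded 1D convolution over positions.

    x: list of T vectors with C_in channels.
    weights: nested list shaped [K][C_in][C_out]; bias: list of C_out values.
    """
    k = len(weights)
    pad = k // 2
    out = []
    for t in range(len(x)):
        row = list(bias)
        for j in range(k):
            s = t + j - pad  # input position seen by kernel tap j
            if 0 <= s < len(x):  # positions outside the word act as zero padding
                for ci, xv in enumerate(x[s]):
                    for co in range(len(row)):
                        row[co] += xv * weights[j][ci][co]
        out.append(row)
    return out


def bilinear_residual_block(x, w_a, b_a, w_b, b_b):
    """Hypothetical block: y = x + conv_a(x) * conv_b(x), with an element-wise product."""
    a = conv1d(x, w_a, b_a)
    b = conv1d(x, w_b, b_b)
    return [
        [xi + ai * bi for xi, ai, bi in zip(xr, ar, br)]
        for xr, ar, br in zip(x, a, b)
    ]


def mean_pool(x):
    """Average over character positions to get one vector per word (assumption)."""
    return [sum(col) / len(x) for col in zip(*x)]


def random_kernel(rng, k, c):
    """Small random [K][C][C] kernel, for demonstration only."""
    return [[[rng.gauss(0.0, 0.1) for _ in range(c)] for _ in range(c)] for _ in range(k)]


if __name__ == "__main__":
    rng = random.Random(0)
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    c = len(alphabet)
    x = encode_chars("word", alphabet)
    y = bilinear_residual_block(
        x, random_kernel(rng, 3, c), [0.0] * c, random_kernel(rng, 3, c), [0.0] * c
    )
    print(len(mean_pool(y)))  # dimensionality of the pooled word vector
```

The product of two linear branches lets the block model multiplicative interactions between character features, while the residual term keeps the input signal intact when both branches are near zero. A trained model would learn the kernels from the metric learning objective rather than sample them at random.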

Domains

Artificial Intelligence [cs.AI]

https://hal.sorbonne-universite.fr/hal-03479768

Submitted on: Tuesday, December 14, 2021, 14:53:43

Last modified on: Thursday, December 19, 2024, 16:50:04

Dates and versions

hal-03479768 , version 1 (14-12-2021)

Identifiers

HAL Id : hal-03479768 , version 1
DOI : 10.14428/esann/2021.ES2021-38

Cite

Thomas Luka, Laure Soulier, David Picard. Unsupervised Word Representations Learning with Bilinear Convolutional Network on Characters. The 29th European Symposium on Artificial Neural Networks, Oct 2021, Online, Belgium. pp.251-256, ⟨10.14428/esann/2021.ES2021-38⟩. ⟨hal-03479768⟩

Collections

ENPC CNRS LIGM_A3SI PARISTECH LIGM LIP6 SORBONNE-UNIVERSITE SU-SCIENCES