REDEN: Named Entity Linking in Digital Literary Editions Using Linked Data Sets

Carmen Brando; Francesca Frontini; Jean-Gabriel Ganascia

doi:10.7250/csimq.2016-7.04

Journal article, Complex Systems Informatics and Modeling Quarterly, Year: 2016


Carmen Brando

Role: Author
PersonId: 183296
IdHAL: carmen-brando
ORCID: 0000-0001-7098-3522
IdRef: 176424148

Approches historiques des mondes contemporains/Equipe CRH

Francesca Frontini

Role: Author
PersonId: 2695
IdHAL: francesca-frontini
ORCID: 0000-0002-8126-6294
IdRef: 227593766

Istituto di Linguistica Computazionale "Antonio Zampolli"

Jean-Gabriel Ganascia

Role: Author

Agents Cognitifs et Apprentissage Symbolique Automatique

Abstract

This paper proposes REDEN, a graph-based Named Entity Linking (NEL) algorithm for disambiguating authors' names in French literary criticism texts and scientific essays from the 19th and early 20th centuries. The algorithm is described and evaluated according to the two phases of NEL reported in the current state of the art, namely candidate retrieval and candidate selection. REDEN leverages knowledge from different Linked Data sources to select candidates for each author mention, then crawls data from other Linked Data sets using equivalence links (e.g., owl:sameAs), and finally fuses the graphs of homologous individuals into a non-redundant graph well suited to graph centrality calculation; the resulting graph is used to choose the best referent. REDEN is distributed as open-source software and follows current standards for digital editions (TEI) and the Semantic Web (RDF). Its integration into the editorial workflow of digital editions in digital humanities and cultural heritage projects is entirely plausible. Experiments and a corresponding error analysis are conducted to test the approach and to study the strengths and weaknesses of the algorithm, with a view to further improving REDEN.
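The candidate selection phase summarized in the abstract can be illustrated with a minimal Python sketch. It is not REDEN's implementation: the `ex:` URIs, the sample triples, the function names, and the scoring rule (a degree-style count of neighbours shared with candidates of at least one other mention) are all assumptions made for illustration. The sketch fuses equivalent URIs linked by owl:sameAs with a union-find structure, removes duplicate triples, and picks the highest-scoring candidate for each mention.

```python
from collections import defaultdict

def canonical_map(same_as_pairs):
    """Return a function mapping each URI to one representative per
    equivalence class, built with union-find over sameAs pairs."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for a, b in same_as_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller URI as representative.
            if root_b < root_a:
                root_a, root_b = root_b, root_a
            parent[root_b] = root_a
    return find

def select_referents(candidates, triples, same_as_pairs):
    """Pick one candidate per mention using a degree-style score on the
    fused graph (hypothetical scoring, for illustration only)."""
    find = canonical_map(same_as_pairs)
    # Fusion: rewrite URIs to their representative; the set drops duplicates.
    fused = {(find(s), p, find(o)) for s, p, o in triples}
    cand = {m: {find(u) for u in uris} for m, uris in candidates.items()}

    mentions_of = defaultdict(set)  # neighbour -> mentions reaching it
    neighbours = defaultdict(set)   # candidate -> its neighbours
    for s, _p, o in fused:
        for mention, uris in cand.items():
            if s in uris:
                mentions_of[o].add(mention)
                neighbours[s].add(o)

    def score(uri):
        # Count neighbours shared with candidates of another mention.
        return sum(1 for o in neighbours[uri] if len(mentions_of[o]) >= 2)

    return {m: max(sorted(uris), key=score) for m, uris in cand.items() if uris}

# Hypothetical sample data; none of these URIs come from a real data set.
candidates = {
    "Hugo": ["ex:a/Victor_Hugo", "ex:a/Hugo_Grotius"],
    "Balzac": ["ex:a/Balzac"],
}
same_as = [("ex:a/Victor_Hugo", "ex:b/Victor_Hugo")]
triples = [
    ("ex:a/Victor_Hugo", "ex:movement", "ex:Romanticism"),
    ("ex:b/Victor_Hugo", "ex:occupation", "ex:Novelist"),
    ("ex:a/Hugo_Grotius", "ex:occupation", "ex:Jurist"),
    ("ex:a/Balzac", "ex:occupation", "ex:Novelist"),
    ("ex:a/Balzac", "ex:movement", "ex:Realism"),
]

print(select_referents(candidates, triples, same_as))
```

In this toy input, the occupation triple is attached to the second `Victor_Hugo` URI, so it only contributes to the score once the two equivalent URIs have been fused; this is the reason for fusing before computing centrality.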

Keywords

digital humanities; Named Entity Linking; graph centrality; linked data; data fusion

Domains

Artificial Intelligence [cs.AI]; Computation and Language [cs.CL]; Document and Text Processing; Data Structures and Algorithms [cs.DS]

Main file

1401-4407-1-PB.pdf (599)

Origin: Publisher files allowed on an open archive

https://hal.sorbonne-universite.fr/hal-01396037

Submitted on: Sunday, 13 November 2016, 16:58:09

Last modified on: Friday, 7 February 2025, 14:06:04

Long-term archiving on: Tuesday, 21 March 2017, 00:47:22

Dates and versions

hal-01396037 , version 1 (13-11-2016)

Identifiers

HAL Id : hal-01396037 , version 1
DOI : 10.7250/csimq.2016-7.04

Cite

Carmen Brando, Francesca Frontini, Jean-Gabriel Ganascia. REDEN: Named Entity Linking in Digital Literary Editions Using Linked Data Sets. Complex Systems Informatics and Modeling Quarterly, 2016, 7, pp.60 - 80. ⟨10.7250/csimq.2016-7.04⟩. ⟨hal-01396037⟩

Collections

ENS-PARIS UPMC CNRS EHESS LADEHIS AHMOC CRH LIP6 SORBONNE-UNIVERSITE SU-SCIENCES ANR ODHN