Sparse Coding of Pitch Contours with Deep Auto-Encoders

Nicolas Obin; Julie Beliao

doi:10.21437/SpeechProsody.2018-161

Conference paper, Year: 2018


Nicolas Obin

Role: Author
PersonId: 7042
IdHAL: nicolas-obin
ORCID: 0000-0002-5236-5306
IdRef: 157523799

Analyse et synthèse sonores [Paris]

Julie Beliao

Role: Author
PersonId: 438
IdHAL: julie-beliao

Modèles, Dynamiques, Corpus

Abstract

This paper presents a sparse coding algorithm based on deep auto-encoders for the stylization and clustering of pitch contours. The main objective of the proposed algorithm is to learn a set of pitch templates that humans can easily interpret and whose combination can efficiently approximate the observed pitch contours. The learning architecture relies on deep auto-encoders, which are commonly used to learn non-linear, low-dimensional latent representations that approximate the observed data. The deep architecture is built from stacked auto-encoders, and the sparsity of the network is investigated (dropout, denoising auto-encoder, sparsity regularization) in order to learn a more robust and general representation of the pitch contours. The deep auto-encoding of pitch contours is illustrated and discussed on the TIMIT American-English speech database and compared with other existing stylization and clustering algorithms.
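The idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single hidden layer rather than stacked auto-encoders, omits dropout, uses synthetic rise, fall, and rise-fall shapes in place of TIMIT pitch contours, and trains with plain gradient descent in NumPy. All names (make_contours, encode, train_sparse_dae) and all hyperparameter values are hypothetical. It combines two of the ingredients listed above, input corruption (denoising) and an L1 sparsity penalty on the latent codes.

```python
import numpy as np


def make_contours(n, length, rng):
    """Synthetic stand-ins for pitch contours: rise, fall, and rise-fall shapes plus noise."""
    t = np.linspace(0.0, 1.0, length)
    shapes = np.stack([t - 0.5, 0.5 - t, 0.5 - 2.0 * np.abs(t - 0.5)])
    labels = rng.integers(0, len(shapes), size=n)
    return shapes[labels] + 0.05 * rng.standard_normal((n, length))


def encode(x, params):
    """Map contours to latent codes with a tanh encoder."""
    return np.tanh(x @ params["w_enc"] + params["b_enc"])


def train_sparse_dae(x, n_hidden=4, noise_std=0.1, l1=1e-3, lr=0.05, epochs=500, seed=0):
    """Train a one-hidden-layer denoising auto-encoder with an L1 penalty on the codes."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    params = {
        "w_enc": 0.1 * rng.standard_normal((d, n_hidden)),
        "b_enc": np.zeros(n_hidden),
        "w_dec": 0.1 * rng.standard_normal((n_hidden, d)),
        "b_dec": np.zeros(d),
    }
    losses = []
    for _ in range(epochs):
        # Denoising: corrupt the input, but reconstruct the clean contour.
        x_noisy = x + noise_std * rng.standard_normal(x.shape)
        h = encode(x_noisy, params)
        x_hat = h @ params["w_dec"] + params["b_dec"]
        err = x_hat - x
        # Loss = mean squared reconstruction error + L1 sparsity penalty on the codes.
        loss = np.mean(np.sum(err ** 2, axis=1)) + l1 * np.mean(np.sum(np.abs(h), axis=1))
        losses.append(float(loss))
        # Manual backpropagation through the decoder, the penalty, and the tanh encoder.
        g_xhat = 2.0 * err / n
        g_h = g_xhat @ params["w_dec"].T + l1 * np.sign(h) / n
        g_pre = g_h * (1.0 - h ** 2)
        grads = {
            "w_dec": h.T @ g_xhat,
            "b_dec": g_xhat.sum(axis=0),
            "w_enc": x_noisy.T @ g_pre,
            "b_enc": g_pre.sum(axis=0),
        }
        for name in params:
            params[name] -= lr * grads[name]
    return params, losses


# Usage on synthetic data (assumed sizes: 300 contours of 20 samples each).
rng = np.random.default_rng(0)
contours = make_contours(300, 20, rng)
params, losses = train_sparse_dae(contours)
codes = encode(contours, params)
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
print("decoder rows (one per latent unit):", params["w_dec"].shape)
```

Because the decoder in this sketch is linear, each reconstruction is a weighted sum of the rows of w_dec, so those rows are loosely analogous to the pitch templates the abstract refers to. The paper's deeper, non-linear architecture may behave differently.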

Keywords

speech prosody, pitch contour, sparse coding, deep auto-encoders

Domains

Signal and Image Processing [eess.SP]; Machine Learning [stat.ML]

Main file

Sparse_Coding_of_Pitch_Contours_with_Dee.pdf (1.12 MB)

Origin: Files produced by the author(s)


https://hal.sorbonne-universite.fr/hal-01722007

Submitted on: Friday, March 2, 2018, 18:45:08

Last modified on: Wednesday, October 30, 2024, 13:29:04

Long-term archiving on: Thursday, May 31, 2018, 20:22:25

Dates and versions

hal-01722007 , version 1 (02-03-2018)

Identifiers

HAL Id : hal-01722007 , version 1
DOI : 10.21437/SpeechProsody.2018-161

Cite

Nicolas Obin, Julie Beliao. Sparse Coding of Pitch Contours with Deep Auto-Encoders. Speech Prosody, Mar 2018, Poznan, Poland. pp.799-803, ⟨10.21437/SpeechProsody.2018-161⟩. ⟨hal-01722007⟩


Collections

CNRS MODYCO IRCAM STMS SORBONNE-UNIVERSITE SU-SCIENCES UNIV-PARIS-LUMIERES UNIV-PARIS-NANTERRE

153 Views

364 Downloads
