Journal articles

Similarity Search of Acted Voices for Automatic Voice Casting

Nicolas Obin (1), Axel Roebel (1)
(1) Analyse et synthèse sonores [Paris], STMS - Sciences et Technologies de la Musique et du Son
Abstract: This paper presents a large-scale similarity search of professionally acted voices for computer-aided voice casting. The proposed voice casting system explores Gaussian mixture model-based acoustic models and multilabel recognition of perceived paralinguistic content (speaker states and speaker traits, e.g., age/gender, voice quality, emotion). First, acoustic models (universal background model, super-vector, i-vector) are constructed to model the acoustic space of voices, so that the similarity between voices can be measured directly in the acoustic space. Second, multiple binary classifications of speaker traits and states are added to the acoustic models in order to represent the vocal signature of a voice, which is then used to measure the similarity between voices in the paralinguistic space. Finally, a similarity search is performed to determine the set of target actors whose voices are most similar to the voice of a source actor. In a subjective experiment conducted in the real context of cross-language voice casting, the multilabel scoring system significantly outperforms the acoustic scoring system. This constitutes a proof of concept for the role of perceived paralinguistic categories in the perception of voice similarity.
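The final step described in the abstract, ranking target actors by their similarity to a source actor, can be illustrated with a minimal sketch. The abstract does not state which similarity measure the authors use, so cosine similarity is an assumption here, and the function names, the four-dimensional "signature" vectors, and the actor labels are all hypothetical toy values rather than data from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def rank_targets(source, targets, top_k=3):
    """Return the top_k (name, score) pairs, most similar to source first."""
    scored = [(name, cosine_similarity(source, vec))
              for name, vec in targets.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical signatures: each entry could stand for one per-label score
# (for example, one binary classifier output per paralinguistic category).
source_signature = [0.9, 0.1, 0.8, 0.2]
target_signatures = {
    "target_a": [0.8, 0.2, 0.9, 0.1],
    "target_b": [0.1, 0.9, 0.2, 0.8],
    "target_c": [0.5, 0.5, 0.5, 0.5],
}

ranking = rank_targets(source_signature, target_signatures, top_k=2)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The same ranking function would apply whether the vectors live in an acoustic space (such as i-vectors) or in a paralinguistic space built from classifier scores; only the representation passed in changes.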

https://hal.sorbonne-universite.fr/hal-01464715
Contributor: Nicolas Obin
Submitted on: Friday, February 10, 2017 - 2:11:14 PM
Last modification on: Thursday, March 21, 2019 - 1:15:08 PM
Long-term archiving on: Thursday, May 11, 2017 - 1:54:56 PM

File

taslp-obin-2580302-proof.pdf
Files produced by the author(s)

Citation

Nicolas Obin, Axel Roebel. Similarity Search of Acted Voices for Automatic Voice Casting. IEEE/ACM Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2016, 24 (9), pp. 1642-1651. ⟨10.1109/TASLP.2016.2580302⟩. ⟨hal-01464715⟩
