Bridging Text and Image for Artist Style Transfer via Contrastive Learning

Zhi-Song Liu; Li-Wen Wang; Jun Xiao; Vicky Kalogeiton

Conference paper, Year: 2024

Zhi-Song Liu

Role: Author

Lappeenranta–Lahti University of Technology [Finland]

Li-Wen Wang

Role: Author

The Hong Kong Polytechnic University [Hong Kong]

Jun Xiao

Role: Author

The Hong Kong Polytechnic University [Hong Kong]

Vicky Kalogeiton

Role: Author

Laboratoire d'informatique de l'École polytechnique [Palaiseau]

Abstract

Image style transfer has attracted widespread attention in the past few years. Despite its remarkable results, it requires additional style images as references, which makes it less flexible and less convenient. Text is the most natural way to describe a style. More importantly, text can describe implicit, abstract styles, such as those of specific artists or art movements. In this paper, we propose Contrastive Learning for Artistic Style Transfer (CLAST), which leverages advanced image-text encoders to control arbitrary style transfer. We introduce a supervised contrastive training strategy to effectively extract style descriptions from the image-text model (i.e., CLIP), which aligns stylization with the text description. To this end, we also propose a novel and efficient adaLN-based state space model that explores style-content fusion. Finally, we achieve text-driven image style transfer. Extensive experiments demonstrate that our approach outperforms state-of-the-art methods in artistic style transfer. More importantly, it does not require online fine-tuning and can render a 512x512 image in 0.03 s.
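To make the idea of supervised contrastive training more concrete, the following is a minimal sketch of a generic supervised contrastive (SupCon-style) loss in plain Python. It is not the CLAST objective from the paper, whose exact formulation is not given in this record. The function name `supervised_contrastive_loss`, the `temperature` value, the toy 2-D embeddings, and the style labels are all hypothetical; in practice the embeddings would come from an image-text encoder such as CLIP.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic SupCon-style loss (an illustrative assumption, not the paper's loss).

    Each sample acts as an anchor. Samples sharing its label are positives;
    all other samples are negatives. The loss is averaged over anchors that
    have at least one positive; it is 0.0 if no anchor has a positive.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, counted = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled cosine similarity to every other sample.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(v - max_logit) for v in logits.values())
        )
        # Mean negative log-probability of the positives for this anchor.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


# Toy usage: two tight clusters of hypothetical "style" embeddings.
embs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supervised_contrastive_loss(embs, ["monet", "monet", "cubism", "cubism"])
mixed = supervised_contrastive_loss(embs, ["monet", "cubism", "monet", "cubism"])
print(aligned < mixed)
```

In this toy setting, the loss is lower when the labels agree with the clusters in embedding space than when they cut across them, which is the property a contrastive objective uses to pull same-style features together and push different styles apart.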

Keywords

vision and language; text guidance; domain transfer; contrastive learning; style transfer; multimodal learning

Domains

Computer Vision and Pattern Recognition [cs.CV]

Main file

2410.09566v1.pdf (14.44 MB)

Bridging Text and Image for Artist Style Transfer.png (214.97 KB)

Origin: Files produced by the author(s)

Contributor: Lucas Degeorge

https://hal.science/hal-04822965

Submitted on: Friday, December 6, 2024, 11:35:22

Last modified on: Thursday, December 12, 2024, 03:25:44

Dates and versions

hal-04822965, version 1 (06-12-2024)

Identifiers

HAL Id: hal-04822965, version 1
arXiv: 2410.09566

Cite

Zhi-Song Liu, Li-Wen Wang, Jun Xiao, Vicky Kalogeiton. Bridging Text and Image for Artist Style Transfer via Contrastive Learning. European Conference on Computer Vision Workshop (ECCV-W) 2024, European Computer Vision Association, Sep 2024, Milan, Italy. ⟨hal-04822965⟩

Collections

X CNRS LIX X-LIX X-DEP-INFO IP_PARIS ANR
