Abstract: We present a method for transferring the style of a set of images to a 3D object. The texture appearance of an asset is optimized with a differentiable renderer in a pipeline whose losses are computed with pretrained deep neural networks. More specifically, we use a nearest-neighbor feature matching loss with CLIP-ResNet50 to extract the style from images. We show that a CLIP-based style loss yields a different appearance than a VGG-based loss, focusing more on texture than on geometric shapes. Additionally, we extend the loss to support multiple images, and we enable loss-based control over the color palette, combined with automatic color palette extraction from style images.
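The abstract does not spell out how the nearest-neighbor feature matching loss is computed; the sketch below is one plausible reading, assuming PyTorch, features flattened to per-location vectors, and cosine distance as the matching metric. The feature extractor (e.g. hooked activations from CLIP's ResNet-50 image encoder) is left abstract.

```python
import torch
import torch.nn.functional as F

def nn_feature_matching_loss(render_feats: torch.Tensor,
                             style_feats: torch.Tensor) -> torch.Tensor:
    """Nearest-neighbor feature matching loss (illustrative sketch).

    Each feature vector from the rendered view is matched to its
    closest style feature under cosine distance, and the mean
    distance over all matches is minimized.

    render_feats: (N, C) features from the differentiably rendered view
    style_feats:  (M, C) features from the style image(s); features
                  from multiple style images can simply be concatenated
                  along the first dimension.
    """
    r = F.normalize(render_feats, dim=-1)
    s = F.normalize(style_feats, dim=-1)
    sim = r @ s.t()            # (N, M) pairwise cosine similarity
    best, _ = sim.max(dim=1)   # best style match per rendered feature
    return (1.0 - best).mean() # mean cosine distance to nearest neighbor
```

Because the loss is differentiable with respect to `render_feats`, gradients flow through the feature extractor and the differentiable renderer back to the texture being optimized.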