Hypernetworks for image recontextualization

Maciej Zieba; Jakub Balicki; Tomasz Drozdz; Konrad Karanowski; Pawel Lorek; Hong Lyu; Aleksander Piotr Skorupa; Tomasz Trzcinski; Oriol Caudevilla; Jakub M. Tomczak

Published: 10 Oct 2024, Last Modified: 28 Oct 2024

Venue: UniReps

License: CC BY 4.0

Track: Proceedings Track

Keywords: diffusion models, recontextualization, image generation

TL;DR: HyperLoRA uses a hypernetwork to predict LoRA parameters for image recontextualization, removing the need for image-specific fine-tuning.

Abstract: Image recontextualization, the task of placing a subject from an image into a new context to serve a specific purpose, has become increasingly important in fields like art, media, marketing, and e-commerce. Recent advancements in deep generative modeling, such as text-to-image and image-to-image synthesis via diffusion models, have significantly improved recontextualization capabilities. However, current methods, like DreamBooth and LoRA, require time-consuming fine-tuning per individual image, resulting in inefficiencies and often suboptimal outputs. Other approaches to recontextualization, like MagicClothing, require reorganization of the architecture of the base model and a time-consuming training process in a particular domain. In this work, we propose HyperLoRA, a novel framework that leverages hypernetworks to predict LoRA parameters, allowing for more efficient image recontextualization without the need for image-specific fine-tuning. HyperLoRA utilizes domain pairs of context images and target objects, enabling instant adaptation to new contexts while significantly reducing computational costs. Our method outperforms traditional techniques by offering more accurate adjustments, broader applicability across multiple modalities (e.g., text, video, sound, and structured data), and scalable deployment. Experimental results demonstrate the effectiveness of our approach in garment-to-model recontextualization, highlighting the potential for broader applications.
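
The core idea in the abstract, a hypernetwork that predicts LoRA parameters instead of fine-tuning them per image, can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the dimensions, the single-linear-map hypernetwork, the zero initialization of the `B` factor, and the names `predict_lora` and `adapted_forward` are all assumptions made for illustration. The conditioning embedding stands in for features of a target object such as a garment image.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
EMBED_DIM = 16          # size of the conditioning embedding
D_IN, D_OUT = 32, 24    # shape of one frozen linear layer in the base model
RANK = 4                # LoRA rank

# Frozen base weight, standing in for one layer of a pretrained model.
W_base = rng.normal(size=(D_OUT, D_IN))

# Hypernetwork parameters: one linear map per LoRA factor (an assumption;
# a practical hypernetwork would likely be deeper and trained on domain pairs).
H_A = rng.normal(scale=0.01, size=(RANK * D_IN, EMBED_DIM))
H_B = np.zeros((D_OUT * RANK, EMBED_DIM))  # zero init: the update starts at 0


def predict_lora(embedding):
    """Map an embedding to LoRA factors A (RANK x D_IN) and B (D_OUT x RANK)."""
    A = (H_A @ embedding).reshape(RANK, D_IN)
    B = (H_B @ embedding).reshape(D_OUT, RANK)
    return A, B


def adapted_forward(x, embedding, scale=1.0):
    """Frozen layer plus predicted low-rank update: (W_base + scale * B A) x."""
    A, B = predict_lora(embedding)
    return W_base @ x + scale * (B @ (A @ x))


embedding = rng.normal(size=EMBED_DIM)  # placeholder for an image embedding
x = rng.normal(size=D_IN)               # placeholder layer input
y = adapted_forward(x, embedding)
```

In this sketch only `H_A` and `H_B` would be trained, while `W_base` stays frozen. Adapting to a new object then costs one forward pass through the hypernetwork rather than a separate fine-tuning run.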

Submission Number: 35
