KnobGen: Controlling the Sophistication of Artwork in Sketch-Based Diffusion Models

Published: 20 Dec 2025, Last Modified: 20 Dec 2025 · CVPR 2025 · CC BY 4.0
Keywords: Image Generation, Diffusion Models, Sketch Conditioning
TL;DR: A sketch-based image generation model that caters to a wide range of users
Abstract: Recent advances in diffusion models have significantly improved text-to-image (T2I) generation, but they often struggle to balance fine-grained precision with high-level control. Methods like ControlNet and T2I-Adapter excel at following sketches by seasoned artists but tend to replicate unintentional flaws in sketches from novice users. Meanwhile, coarse-grained methods, such as sketch-based abstraction frameworks, offer more accessible input handling but lack the precise control needed for professional use. To address these limitations, we propose KnobGen, a dual-pathway framework that democratizes sketch-based image generation by adapting to varying levels of sketch complexity and user skill. KnobGen uses a Coarse-Grained Controller (CGC) module for high-level semantics and a Fine-Grained Controller (FGC) module for detailed refinement. The relative strength of these two modules can be adjusted through our knob inference mechanism to match the user's specific needs. Together, these mechanisms let KnobGen flexibly generate images from both novice sketches and those drawn by seasoned artists, maintaining control over the final output while preserving the natural appearance of the image, as evidenced on the MultiGen-20M dataset and a newly collected sketch dataset.
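To make the knob mechanism concrete, below is a minimal PyTorch sketch of one plausible reading of it: a scalar knob that linearly interpolates between the contributions of the two controller pathways. The names `cgc_feat`, `fgc_feat`, and `knob` are illustrative assumptions, not the paper's actual API or implementation.

```python
import torch
import torch.nn as nn


class KnobBlend(nn.Module):
    """Hypothetical blend of coarse- and fine-grained control signals.

    knob in [0, 1]: 0 leans fully on the coarse (CGC) pathway's
    high-level semantics; 1 leans fully on the fine-grained (FGC)
    pathway's detailed refinement. This is an assumed linear-mixing
    interpretation of KnobGen's knob, not its published implementation.
    """

    def forward(self, cgc_feat: torch.Tensor, fgc_feat: torch.Tensor,
                knob: float) -> torch.Tensor:
        knob = max(0.0, min(1.0, knob))  # clamp knob to [0, 1]
        return (1.0 - knob) * cgc_feat + knob * fgc_feat


# Usage sketch: a novice sketch might use a low knob (trust high-level
# semantics), while a seasoned artist's sketch might use a high knob
# (follow fine detail closely). Feature shapes are placeholders.
blend = KnobBlend()
cgc = torch.randn(1, 320, 64, 64)  # stand-in CGC control features
fgc = torch.randn(1, 320, 64, 64)  # stand-in FGC control features
conditioning = blend(cgc, fgc, knob=0.3)
```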
Supplementary Material: zip
Camera Ready Version: zip
Submission Number: 5