Generative Models: What Do They Know? Do They Know Things? Let's Find Out!

TMLR Paper2992 Authors

11 Jul 2024 (modified: 20 Sept 2024)Rejected by TMLREveryoneRevisionsBibTeXCC BY 4.0
Abstract: Generative models excel at creating images that closely mimic real scenes, suggesting they inherently encode scene representations. We introduce Intrinsic LoRA (I-LoRA), a general approach that uses Low-Rank Adaptation (LoRA) to discover scene intrinsics such as normals, depth, albedo, and shading from a wide array of generative models. I-LoRA is lightweight, adding minimally to the model's parameters and requiring very small datasets for this knowledge discovery. Our approach, applicable to Diffusion models, GANs, and Autoregressive models alike, generates intrinsics using the same output head as the original images. We show a correlation between the generative model's quality and the extracted intrinsics' accuracy through control experiments. Finally, scene intrinsics obtained by our method with just hundreds to thousands of labeled images, perform on par with those from supervised methods trained on millions of labeled examples.
Submission Length: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=TJNQh9qqa3
Changes Since Last Submission: We have moved the teaser image to a later section to comply with the style requirements. We have also made a few minor wording adjustments.
Assigned Action Editor: ~Gabriel_Loaiza-Ganem1
Submission Number: 2992
Loading