Adversarially Injected Diagnosis for Coherent Visual Autoregressive Generation

18 Sept 2025 (modified: 13 Nov 2025) | ICLR 2026 Conference Withdrawn Submission | CC BY 4.0
Keywords: Visual Autoregressive Models (VARs); Image Generation; Adversarial Learning; Generative Models
TL;DR: This paper introduces a lightweight, plug-and-play module that uses adversarial guidance to improve the global coherence of large, frozen generative models.
Abstract: Visual Autoregressive (VAR) models, despite their formidable generative capabilities, accumulate local prediction errors across scales, leading to detail loss and local distortions. To address this, we introduce AID-VAR, a plug-and-play method that improves pretrained VARs via Adversarially Injected Diagnosis. Inspired by GANs, we train a discriminator to detect visual errors in generated samples and use an adversarial objective to pull generations toward the manifold of real images. To avoid the computational and stability issues of directly updating the VAR, we attach a lightweight guidance injector that conditions on the previously generated scales of a pretrained, frozen VAR and injects adversarial features to guide the next scale. To quantify reductions in cross-scale errors, we introduce the Inter-Scale Consistency Score (ISCS), which measures the fidelity of transitions between consecutive scales. Across standard VAR backbones, AID-VAR delivers sharper details, fewer local distortions, and stronger global coherence while adding negligible parameters and minimal computational overhead. Our results establish AID-VAR as a practical pathway for upgrading large VAR generators with adversarial feedback, without modifying training data, base architecture, or sampling schedules. For instance, our AID-VAR-d20 improves FID by 16% with only a 3% increase in parameters.
Primary Area: generative models
Submission Number: 11397