CountLoop: Training-Free High-Instance Image Generation via Iterative Agent Guidance

TMLR Paper9259 Authors

27 May 2026 (modified: 14 Jun 2026) | Under review for TMLR | CC BY 4.0
Abstract: Diffusion models excel at photorealistic synthesis but struggle to produce precise object counts, especially in high-density settings. We introduce COUNTLOOP, a training-free framework that achieves precise instance control through iterative, structured feedback. Our method alternates between synthesis and evaluation: a VLM-based planner generates structured scene layouts, while a VLM-based critic provides explicit feedback on object counts, spatial arrangements, and visual quality, which is used to refine the layout iteratively. Instance-driven attention masking and cumulative attention composition further prevent semantic leakage, ensuring clear object separation even in densely occluded scenes. Evaluations on COCO-Count, T2I-CompBench, and two newly introduced high-instance benchmarks show that COUNTLOOP reduces counting error by up to 57% and achieves the highest or comparable spatial quality scores across all benchmarks, while maintaining photorealism.
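The alternation between synthesis and evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `plan`, `synthesize`, and `critique` callables, the `Feedback` record, the box format, the stopping rule, and the choice to return the lowest-error image are all assumptions made for illustration, and the attention masking and composition steps are not modelled.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple

# Hypothetical layout format: one (x0, y0, x1, y1) box per object instance.
Box = Tuple[float, float, float, float]


@dataclass
class Feedback:
    """Hypothetical critic output: a detected count plus free-form notes."""
    detected_count: int
    notes: str


def refine_until_count_matches(
    prompt: str,
    target_count: int,
    plan: Callable[[str, int, str], List[Box]],
    synthesize: Callable[[str, List[Box]], Any],
    critique: Callable[[Any, str], Feedback],
    max_rounds: int = 5,
) -> Tuple[Any, Optional[int]]:
    """Alternate planning, synthesis, and critique.

    Returns the image with the smallest absolute count error seen so far,
    together with that error. Stopping at zero error or after `max_rounds`
    is an assumption, not something stated in the abstract.
    """
    notes = ""  # critic notes fed back to the planner on the next round
    best_image: Any = None
    best_error: Optional[int] = None

    for _ in range(max_rounds):
        layout = plan(prompt, target_count, notes)
        image = synthesize(prompt, layout)
        feedback = critique(image, prompt)

        error = abs(feedback.detected_count - target_count)
        if best_error is None or error < best_error:
            best_image, best_error = image, error
        if error == 0:
            break
        notes = feedback.notes

    return best_image, best_error
```

In this sketch the planner and critic are plain callables so that any VLM client or stub can be plugged in; the loop itself only tracks the count error and passes the critic's notes back to the planner.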
Submission Type: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=B7lXtvKImq
Changes Since Last Submission: The previous submission was desk-rejected for template non-compliance ("Modified template, please revisit and resubmit"). We have addressed all violations:
- tmlr.sty: Restored one missing line (\lhead{Under review as submission to TMLR}) to match the official template exactly.
- preamble.tex: Removed all layout-affecting overrides: global \captionsetup{skip=2pt}, \captionsetup[figure], \captionsetup[table], \usepackage[font=small,labelfont=bf]{caption}, \setlength{\dbltextfloatsep}, and \setlength{\dblfloatsep}.
- tmlr.tex: Removed \large from the title command.
- Negative \vspace: Removed all active negative vertical spacing from both the main paper and supplementary.
- Supplementary: Added a Broader Impact section.
No changes were made to experimental results, methodology, or claims.
Assigned Action Editor: ~Ning_Yu2
Submission Number: 9259