Reinforcement Learning for Symbolic Graphics Code with Visual Feedback

18 Sept 2025 (modified: 05 Jan 2026), ICLR 2026 Conference Withdrawn Submission, Readers: Everyone, License: CC BY 4.0
Keywords: Large Language Models, Code Generation, Reinforcement Learning, Applications
Abstract: Symbolic graphics code generation, particularly text-to-SVG generation, plays a critical role in numerous practical applications, including web design, digital publishing, and user interface prototyping. However, current open large language models (LLMs) struggle with these visually intricate and structurally precise tasks, often exhibiting a considerable performance gap compared to leading proprietary models. In this paper, we present a novel approach aimed at substantially improving the capabilities of open LLMs on text-to-SVG tasks. Our main contributions are threefold. First, we propose a reinforcement learning (RL) framework that leverages vision-language models (VLMs) as visual reward models, providing comprehensive visual feedback that guides LLMs towards generating more accurate and visually coherent SVG outputs. Second, we investigate inference-time scaling through long Chain-of-Thought (CoT) reasoning combined with large-scale RL, revealing that such methods inherently counteract reward hacking by refining prompt engineering and making task objectives more explicit and concrete. Third, we introduce a new, high-quality benchmark alongside a rigorously curated training dataset dedicated to text-to-SVG generation, addressing the notable absence of specialized benchmarks and datasets in this domain. Experiments on an open model, Qwen3, demonstrate that our approach achieves results comparable to state-of-the-art proprietary and larger models. This work substantially narrows the performance gap and provides both methods and resources to advance symbolic code generation research.
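To make the first contribution concrete, the following is a minimal Python sketch of how a visual reward for generated SVG code might be wired up. It is an illustration under stated assumptions, not the paper's implementation: the names `is_valid_svg`, `svg_reward`, `visual_scorer`, and `invalid_penalty` are hypothetical, the penalty value is arbitrary, and the VLM is represented only by a caller-supplied function that is assumed to return a score in [0, 1] for a prompt and its SVG (in practice it would presumably rate a rendered image).

```python
import xml.etree.ElementTree as ET
from typing import Callable

# Namespace prefix that ElementTree attaches to tags of namespaced SVG documents.
SVG_NS = "{http://www.w3.org/2000/svg}"


def is_valid_svg(code: str) -> bool:
    """Return True if `code` parses as XML and its root element is <svg>."""
    try:
        root = ET.fromstring(code)
    except ET.ParseError:
        return False
    return root.tag in ("svg", SVG_NS + "svg")


def svg_reward(
    prompt: str,
    code: str,
    visual_scorer: Callable[[str, str], float],  # hypothetical stand-in for a VLM judge
    invalid_penalty: float = -1.0,               # assumed value, not taken from the paper
) -> float:
    """Combine a structural check with a visual score.

    Malformed SVG gets a fixed penalty, since it cannot be rendered and judged.
    Otherwise the scorer's output is clamped to [0, 1] so that an out-of-range
    score cannot dominate the reward signal.
    """
    if not is_valid_svg(code):
        return invalid_penalty
    score = visual_scorer(prompt, code)
    return min(1.0, max(0.0, score))


if __name__ == "__main__":
    sample = '<svg xmlns="http://www.w3.org/2000/svg"><circle r="5"/></svg>'
    # A constant scorer stands in for the VLM here, purely for demonstration.
    print(svg_reward("a small circle", sample, lambda p, c: 0.8))
    print(svg_reward("a small circle", "<svg><circle", lambda p, c: 0.8))
```

Separating the structural check from the visual score keeps the expensive VLM call off outputs that cannot be rendered at all; whether the authors structure their reward this way is not stated in the abstract.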
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 11840