SOLA: Text-based animated vector graphics generation with agentic orchestration

Donghee Shin; Jong-Seok Lee

13 Sept 2025 (modified: 11 Feb 2026). Submitted to ICLR 2026. License: CC BY 4.0

Keywords: SVG, animation, generation

Abstract: We introduce SOLA (SVG-Oriented Language-to-Animation), a novel end-to-end generative pipeline that produces animated scalable vector graphics (SVGs) directly from natural language prompts. Unlike prior systems such as Keyframer, which requires a static SVG image as input and uses GPT-4 to generate CSS animations for that image, our approach constructs the entire animated SVG sequence from scratch. SOLA uses an agentic pipeline architecture (LangGraph) to orchestrate multiple modules and ensure coherent results. Given a text description, it first synthesizes an initial sequence of video frames, then vectorizes each frame into SVG path shapes, and aligns corresponding shapes across frames via greedy bipartite matching. We normalize all shape outlines to a consistent polyline representation and convert them into cubic Bézier curves so that shapes morph smoothly between frames. This shape-level processing is the key to resolution-independent animation with coherent motion. Because no benchmark exists for text-to-SVG animation, we design an evaluation protocol with a prompt test set and diverse performance metrics. Experimental results under this protocol show that our approach outperforms state-of-the-art LLM-based methods in translating high-level language descriptions into fully vectorized animations.
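
The shape alignment and outline normalization steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the matching cost or the resampling scheme, so the centroid-distance cost, the arc-length resampling, and the function names greedy_match and resample are all assumptions made for illustration.

```python
import math

def centroid(poly):
    """Mean of the vertices of a polyline given as (x, y) tuples."""
    return (sum(p[0] for p in poly) / len(poly),
            sum(p[1] for p in poly) / len(poly))

def greedy_match(shapes_a, shapes_b):
    """Greedily pair shapes of two frames, cheapest pair first.

    Assumption: cost is the Euclidean distance between centroids.
    Each shape is used at most once; leftovers stay unmatched.
    """
    ca = [centroid(s) for s in shapes_a]
    cb = [centroid(s) for s in shapes_b]
    candidates = sorted(
        (math.dist(ca[i], cb[j]), i, j)
        for i in range(len(ca)) for j in range(len(cb))
    )
    used_a, used_b, matches = set(), set(), []
    for _, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            matches.append((i, j))
    return sorted(matches)

def resample(poly, n):
    """Resample a closed polyline to n points evenly spaced by arc length,
    so matched shapes share the same vertex count before morphing."""
    pts = list(poly) + [poly[0]]          # close the outline
    seg = [math.dist(pts[k], pts[k + 1]) for k in range(len(poly))]
    total = sum(seg)
    out, k, acc = [], 0, 0.0
    for m in range(n):
        target = total * m / n
        # advance to the segment that contains the target arc length
        while k < len(seg) - 1 and acc + seg[k] < target:
            acc += seg[k]
            k += 1
        t = (target - acc) / seg[k] if seg[k] > 0 else 0.0
        (x0, y0), (x1, y1) = pts[k], pts[k + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out
```

Once two matched outlines have the same number of points, intermediate frames can be obtained by interpolating corresponding points, and a curve-fitting step (cubic Bézier curves in the paper) can then smooth the result. Greedy matching is not guaranteed to minimize the total cost the way an optimal assignment solver would; it trades that guarantee for simplicity.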

Primary Area: generative models

Submission Number: 4766
