ReCAD: Reinforcement Learning Enhanced Parametric CAD Model Generation with Vision-Language Models

Published: 07 Nov 2025, Last Modified: 26 Apr 2026 · AAAI 2026 Oral · CC BY 4.0
Abstract: We present ReCAD, a reinforcement learning framework that bootstraps pretrained large models (PLMs) to generate precise parametric computer-aided design (CAD) models from multimodal inputs by leveraging their inherent generation capabilities. With access only to simple functional interfaces (e.g., point coordinates), our approach enables the emergence of complex CAD operations (e.g., pattern replication and mirroring). This stands in contrast to previous methods, which typically rely on knowledge injected through supervised fine-tuning (SFT), offer limited support for editability, and fail to fully exploit the strong generative priors of PLMs. Specifically, the ReCAD framework begins by fine-tuning vision-language models (VLMs) to equip them with basic CAD model generation capabilities: we first rewrite hardcoded CAD scripts into parameterized code, which is then used to generate accurate textual descriptions for supervision. We then propose a novel reinforcement learning strategy that incorporates parameterized code as guidance to strengthen the model's reasoning on challenging questions, and further employ a curriculum-based hierarchical primitive learning process that progressively teaches structured and compositional skills under a unified reward function ensuring both geometric accuracy and semantic fidelity. ReCAD establishes a new state of the art on both text-to-CAD and image-to-CAD tasks, significantly improving geometric accuracy in both in-distribution and out-of-distribution settings: mean Chamfer Distance (CD) drops from 73.47 to 29.61 in distribution and from 272.06 to 80.23 out of distribution, outperforming all existing baselines by a substantial margin.
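To make the geometric part of such a unified reward concrete, the following is a minimal sketch, not the paper's actual implementation: it computes the standard symmetric Chamfer Distance between sampled point clouds of the predicted and reference CAD models and combines it with a semantic-fidelity score. The function names, the exponential shaping, and the weight `alpha` are illustrative assumptions.

```python
import numpy as np

def chamfer_distance(p: np.ndarray, q: np.ndarray) -> float:
    """Symmetric Chamfer Distance between point clouds p (N, 3) and q (M, 3)."""
    # Pairwise squared distances between every point in p and every point in q.
    d2 = np.sum((p[:, None, :] - q[None, :, :]) ** 2, axis=-1)
    # Nearest-neighbor distance from p to q plus from q to p.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())

def unified_reward(pred_points: np.ndarray,
                   gt_points: np.ndarray,
                   semantic_score: float,
                   alpha: float = 0.5) -> float:
    """Hypothetical reward mixing geometric accuracy and semantic fidelity.

    The geometric term decays with Chamfer Distance; semantic_score is assumed
    to be a separately computed fidelity score in [0, 1].
    """
    geometric = np.exp(-chamfer_distance(pred_points, gt_points))
    return alpha * geometric + (1.0 - alpha) * semantic_score
```

In this sketch, a lower Chamfer Distance yields a geometric term closer to 1, so rewards increase as the generated parametric model more closely matches the target geometry while remaining consistent with the textual or visual description.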