CAD-Editor: A Locate-then-Infill Framework with Automated Training Data Synthesis for Text-Based CAD Editing

Yu Yuan; Shizhao Sun; Qi Liu; Jiang Bian

Published: 01 May 2025, Last Modified: 23 Jul 2025. Venue: ICML 2025 poster. License: CC BY-NC 4.0

TL;DR: This paper introduces CAD-Editor, a novel generative model that enables precise text-based editing of CAD designs.

Abstract: Computer Aided Design (CAD) is indispensable across various industries. Text-based CAD editing, which automates the modification of CAD models based on textual instructions, holds great potential but remains underexplored. Existing methods primarily focus on design variation generation or text-based CAD generation, either lacking support for text-based control or neglecting existing CAD models as constraints. We introduce CAD-Editor, the first framework for text-based CAD editing. To address the challenge of demanding triplet data with accurate correspondence for training, we propose an automated data synthesis pipeline. This pipeline utilizes design variation models to generate pairs of original and edited CAD models and employs Large Vision-Language Models (LVLMs) to summarize their differences into editing instructions. To tackle the composite nature of text-based CAD editing, we propose a locate-then-infill framework that decomposes the task into two focused sub-tasks: locating regions requiring modification and infilling these regions with appropriate edits. Large Language Models (LLMs) serve as the backbone for both sub-tasks, leveraging their capabilities in natural language understanding and CAD knowledge. Experiments show that CAD-Editor achieves superior performance both quantitatively and qualitatively.
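The locate-then-infill decomposition described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the token format (`"height=10"`), the `<mask>` placeholder, and the helper names `locate`, `infill`, `toy_needs_edit`, and `toy_propose` are all hypothetical, and the two LLM calls are replaced by trivial stand-in functions so the example runs on its own.

```python
from typing import Callable, List

MASK = "<mask>"  # hypothetical placeholder token, not taken from the paper


def locate(tokens: List[str], instruction: str,
           needs_edit: Callable[[str, str], bool]) -> List[str]:
    """Stage 1: replace every token judged to need modification with MASK."""
    return [MASK if needs_edit(tok, instruction) else tok for tok in tokens]


def infill(masked: List[str], instruction: str,
           propose: Callable[[List[str], int, str], str]) -> List[str]:
    """Stage 2: fill each MASK position, leaving all other tokens untouched."""
    return [propose(masked, i, instruction) if tok == MASK else tok
            for i, tok in enumerate(masked)]


# Toy stand-ins for the two LLM calls (assumptions for illustration only).
def toy_needs_edit(tok: str, instruction: str) -> bool:
    return tok.startswith("height=") and "height" in instruction


def toy_propose(masked: List[str], index: int, instruction: str) -> str:
    return "height=5"


# Hypothetical flattened CAD sequence for a cylinder.
original = ["circle", "radius=4", "extrude", "height=10"]
instruction = "reduce the cylinder's height by half"

masked = locate(original, instruction, toy_needs_edit)
edited = infill(masked, instruction, toy_propose)
print(masked)
print(edited)
```

Splitting the task this way means the second stage only rewrites the masked positions, so the parts of the original model that the instruction does not mention are preserved by construction.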

Lay Summary: Designing 3D models for things like cars, machines, or buildings usually requires complex software and skilled human effort. But what if we could just tell the computer, in plain language, how to change a design — like saying “reduce the cylinder's height by half” or “drill four smaller holes through corners”? Our research introduces a new system called CAD-Editor that makes this possible. It allows people to modify existing 3D models using simple text instructions. To train this system, we created a large dataset by automatically generating design variations and describing the changes with the help of advanced AI models that understand both images and language. CAD-Editor breaks the editing task into two steps: first, it finds which part of the design needs to change, and then it figures out how to change it correctly. We use powerful large language models to handle both parts of the task. This work brings us closer to making computer-aided design more accessible, efficient, and intuitive — even for non-experts.

Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.

Link To Code: https://github.com/microsoft/CAD-Editor

Primary Area: Applications->Computer Vision

Keywords: Computer Aided Design, Generative Models, Text-based Editing, Large Language Models

Submission Number: 9572
