CAD Translator: An Effective Drive for Text to 3D Parametric Computer-Aided Design Generative Modeling
Abstract: Computer-Aided Design (CAD) generative modeling is widely applicable in industrial engineering. Recently, text-to-3D generation has progressed rapidly for point clouds, meshes, and other non-parametric representations. In contrast, text-to-3D parametric CAD generative modeling, in which a shape is defined by a sequence of editable parametric commands, is a practical task that remains underexplored. To investigate it, we design an encoder-decoder framework, CAD Translator, that incorporates awareness of parametric CAD sequences into texts with only one-stage training. We first align texts and parametric CAD sequences in the latent space via a Cascading Contrastive Strategy. We then propose CT-Mix, which applies a random mask operation to the text and CAD embeddings separately and combines them into a fusion embedding via linear interpolation. This effectively strengthens the connection between texts and parametric CAD sequences. To train CAD Translator, we create a Text2CAD dataset with the help of a Large Multimodal Model (LMM) and conduct thorough experiments to demonstrate the effectiveness of our method.
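As a rough illustration of the mask-then-interpolate idea that the abstract attributes to CT-Mix, the following is a minimal sketch and not the authors' implementation. The function names, the element-wise zeroing used as the mask, the mask ratio, and the interpolation weight `lam` are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np


def random_mask(emb, mask_ratio, rng):
    """Zero out each entry independently with probability mask_ratio (assumed masking scheme)."""
    keep = rng.random(emb.shape) >= mask_ratio
    return emb * keep


def ct_mix_sketch(text_emb, cad_emb, mask_ratio=0.15, lam=0.5, rng=None):
    """Hypothetical sketch: mask each embedding separately, then linearly interpolate.

    text_emb, cad_emb: arrays of identical shape, assumed to be already aligned
    in a shared latent space. mask_ratio and lam are illustrative defaults.
    """
    rng = np.random.default_rng() if rng is None else rng
    masked_text = random_mask(text_emb, mask_ratio, rng)
    masked_cad = random_mask(cad_emb, mask_ratio, rng)
    # Linear interpolation between the two masked embeddings.
    return lam * masked_text + (1.0 - lam) * masked_cad


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    text_emb = rng.normal(size=(4, 8))  # placeholder text embeddings
    cad_emb = rng.normal(size=(4, 8))   # placeholder CAD-sequence embeddings
    fused = ct_mix_sketch(text_emb, cad_emb, rng=rng)
    print(fused.shape)
```

With `mask_ratio=0.0` the sketch reduces to plain interpolation of the two embeddings, which makes the role of the mask easy to isolate when experimenting.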
Primary Subject Area: [Content] Multimodal Fusion
Secondary Subject Area: [Content] Vision and Language, [Experience] Multimedia Applications, [Generation] Generative Multimedia
Relevance To Conference: A parametric CAD sequence can be seen as a kind of multimodal representation, in which each command and its parameters describe the creation process of a real-world object. This work aims to learn a mapping between natural-language text and parametric CAD sequences, which could open new possibilities in the CAD industry (e.g., patching or generating parametric CAD sequences from texts).
Submission Number: 4588