## `cadrille`: Multi-modal CAD Reconstruction with Reinforcement Learning


This repository contains an implementation of `cadrille`, a multi-modal (point clouds / images / text) 3D CAD reconstruction method.



### Installation

Install the Python packages listed in our [Dockerfile](Dockerfile). We support the DeepCAD (test), Fusion360 (test), Text2CAD (train / val / test), and CAD-Recode (train / val) datasets. Follow our [instructions](data/README.md) to download and preprocess the data.

### Supervised Finetuning

To start training, run the *train.py* script:
```shell
python train.py --mode pc_img --use-text
```
To disable some of the modalities, set *--mode* to *img* or *pc*, or omit *--use-text*.

#### Training Data


- CAD-Recode (train / val): <https://huggingface.co/datasets/filapro/cad-recode-v1.5>. Convert the CadQuery programs to meshes before training by running the `cadrecode2mesh.py` script.
- DeepCAD (test). Meshes are produced by the official DeepCAD [script](https://github.com/ChrisWu1997/DeepCAD/blob/master/dataset/json2pc.py) and normalized to the unit cube.
- Fusion360 (test). Meshes are downloaded from this [link](https://github.com/AutodeskAILab/Fusion360GalleryDataset/blob/master/docs/reconstruction.md#traintest-split) and normalized to the unit cube.
- Text2CAD (train / val / test). Text prompts are downloaded from this [link](https://github.com/SadilKhan/Text2CAD?tab=readme-ov-file#-data-preparation) and slightly shortened. We also provide CadQuery code for almost all DeepCAD examples.
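
The unit-cube normalization mentioned above can be sketched as follows. This is a minimal illustration, not the preprocessing code used in this repository: the function name `normalize_to_unit_cube` is hypothetical, and it assumes the target cube is [0, 1]^3 with a uniform scale that preserves the aspect ratio. Check [data/README.md](data/README.md) for the convention actually used.

```python
import numpy as np


def normalize_to_unit_cube(vertices):
    """Translate and uniformly scale (N, 3) vertices into [0, 1]^3.

    Assumption: the unit cube is [0, 1]^3 and the longest bounding-box
    side is mapped to length 1 (aspect ratio is preserved).
    """
    vertices = np.asarray(vertices, dtype=np.float64)
    lo = vertices.min(axis=0)
    hi = vertices.max(axis=0)
    extent = float((hi - lo).max())  # longest bounding-box side
    if extent == 0.0:
        raise ValueError("degenerate mesh: all vertices coincide")
    center = (lo + hi) / 2.0
    # Move the bounding-box center to (0.5, 0.5, 0.5) after scaling.
    return (vertices - center) / extent + 0.5
```

The same array of vertices can be taken from any mesh loader; only the vertex coordinates change, so face indices stay valid.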


The overall data structure should be as follows:
```
data
├── cad-recode-v1.5
│   ├── train
│   │   ├── batch_00
│   │   │   ├── 0.py
│   │   │   ├── 0.stl
│   │   │   └── ...
│   │   └── ...
│   ├── val
│   │   ├── 0.py
│   │   ├── 0.stl
│   │   └── ...
│   ├── train.pkl
│   └── val.pkl
├── text2cad
│   ├── cadquery
│   │   ├── 0.py
│   │   └── ...
│   ├── train.pkl
│   ├── val.pkl
│   └── test.pkl
├── deepcad_test_mesh
│   ├── 0.stl
│   └── ...
└── fusion360_test_mesh
    ├── 0.stl
    └── ...
```
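
Before training, it can help to confirm that the layout is in place. The sketch below is a hypothetical helper (it is not part of this repository) that lists expected entries that are missing. It assumes that `text2cad`, `deepcad_test_mesh`, and `fusion360_test_mesh` sit directly under `data`, next to `cad-recode-v1.5`.

```python
from pathlib import Path

# Paths relative to the data root. Assumption: all four dataset
# folders are direct children of the data root.
EXPECTED = [
    "cad-recode-v1.5/train",
    "cad-recode-v1.5/val",
    "cad-recode-v1.5/train.pkl",
    "cad-recode-v1.5/val.pkl",
    "text2cad/cadquery",
    "text2cad/train.pkl",
    "text2cad/val.pkl",
    "text2cad/test.pkl",
    "deepcad_test_mesh",
    "fusion360_test_mesh",
]


def missing_entries(data_root, expected=EXPECTED):
    """Return the expected files or folders that do not exist."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    for rel in missing_entries("data"):
        print(f"missing: data/{rel}")
```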


### RL Fine-Tuning (GRPO)

This repo includes an online RL stage (GRPO / Dr. CPPO style) under `rl_finetune/` that improves CAD code generation via task rewards.
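
For readers unfamiliar with GRPO-style training: several completions are sampled for the same input, each is scored with a task reward, and the advantage of a completion is its reward relative to the rest of its group. The sketch below illustrates only that step. The function name and arguments are hypothetical, and it is not necessarily how the code under `rl_finetune/` computes advantages. With `normalize_std=False` it corresponds to the Dr. GRPO variant, which drops the division by the group standard deviation.

```python
import numpy as np


def group_advantages(rewards, normalize_std=True, eps=1e-6):
    """Group-relative advantages for completions of ONE input.

    rewards: 1-D sequence of scalar task rewards, one per completion.
    normalize_std=True  -> (r - mean) / (std + eps), as in GRPO.
    normalize_std=False -> r - mean, as in Dr. GRPO.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    centered = rewards - rewards.mean()
    if normalize_std:
        return centered / (rewards.std() + eps)
    return centered
```

Completions that score above the group mean receive a positive advantage and are reinforced; those below the mean are penalized.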

**Data layout**
- Eval (preprocessed `.pkl`):
  - `data/deepcad_test/test.pkl`
  - `data/fusion360_test/test.pkl`
- Train (DeepCAD+Fusion mix):
  - `data/deepcad_fusion_train/train.pkl` (or `train_small.pkl`)

Each `.pkl` contains dicts with the keys `description`, `mesh_path`, `mesh`, `idx`, and optionally `point_cloud` (N×3, N=256) or `video`.
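
A quick sanity check of a `.pkl` file against the fields listed above might look like the sketch below. The helper names are hypothetical, and it assumes that the pickle holds a sequence (for example a list) of dicts; adjust it if the files are organized differently.

```python
import pickle

import numpy as np

REQUIRED_KEYS = ("description", "mesh_path", "mesh", "idx")


def check_record(record):
    """Return a list of problems found in one record (empty if none)."""
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if k not in record]
    pc = record.get("point_cloud")  # optional field
    if pc is not None and np.asarray(pc).shape != (256, 3):
        problems.append(
            f"point_cloud has shape {np.asarray(pc).shape}, expected (256, 3)"
        )
    return problems


def check_pkl(path):
    """Map record position -> problems, for records that have any."""
    with open(path, "rb") as f:
        records = pickle.load(f)  # assumption: a sequence of dicts
    report = {}
    for i, record in enumerate(records):
        problems = check_record(record)
        if problems:
            report[i] = problems
    return report
```

Only unpickle files from a source you trust, since `pickle.load` can execute arbitrary code.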

**Launch (multi-GPU, DDP)**
```bash
cd rl_finetune
torchrun --standalone --nnodes=1 --nproc-per-node=8 \
  train_cadrille_grpo.py \
  --sft_path /path/to/sft_or_base_ckpt 
```

### Inference

To predict CadQuery code, run the *test.py* script:
```shell
python test.py --split deepcad_test_mesh --mode pc
```
To run on other datasets and modalities, use *--split fusion360_test_mesh* or set *--mode* to *img* or *text*.

### Evaluation

To evaluate IoU, invalidity ratio, and Chamfer distance, run the *evaluate.py* script:
```shell
python evaluate.py
```
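
As a rough guide to two of these metrics, the sketch below computes a Chamfer distance between two sampled point clouds and an invalidity ratio over a list of predictions. It is an illustration with hypothetical function names, not the code in *evaluate.py*: it assumes the Chamfer distance is the sum of the two mean squared nearest-neighbor distances, and that a failed prediction is represented by `None`. The actual script may use a different definition or scaling. IoU is omitted because it requires volumetric operations on the solids.

```python
import numpy as np


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Assumption: squared Euclidean distances, averaged in each direction
    and then summed.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    # Pairwise squared distances, shape (N, M).
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return d2.min(axis=1).mean() + d2.min(axis=0).mean()


def invalidity_ratio(predictions):
    """Fraction of predictions that are None (no valid shape produced)."""
    return sum(p is None for p in predictions) / len(predictions)
```

The pairwise matrix uses O(N·M) memory, which is fine for a few thousand points per shape; larger clouds would call for a nearest-neighbor index instead.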



