Keywords: 3D foundation model, model specialization, robust optimization, low rank adaptation, self-supervised learning
TL;DR: An efficient test-time self-calibration pipeline to specialize a 3D foundation model to a target scene in 5 min on a single GPU.
Abstract: Emerging 3D geometric foundation models, such as DUSt3R, offer a promising approach for in-the-wild 3D vision tasks.
However, due to the high-dimensional nature of the problem space and the scarcity of high-quality 3D data,
these pre-trained models still struggle to generalize to many challenging conditions,
such as limited view overlap or low lighting.
To address this, we propose LoRA3D, an efficient self-calibration pipeline to *specialize* the pre-trained models to target scenes using their own multi-view predictions.
Taking sparse RGB images as input, we leverage robust optimization techniques to refine multi-view predictions and align them into a global coordinate frame.
In particular, we incorporate prediction confidence into the geometric optimization process,
automatically re-weighting the confidences so they better reflect point-estimation accuracy.
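The confidence re-weighted alignment can be illustrated with a minimal iteratively re-weighted least-squares sketch. This is not the paper's implementation: the `reweighted_align` routine, the Cauchy weighting function, and the use of a weighted Kabsch solve are illustrative assumptions standing in for the actual robust global-alignment optimizer.

```python
import numpy as np

def weighted_rigid_align(P, Q, w):
    # Weighted Kabsch: rigid (R, t) minimizing sum_i w_i ||R P_i + t - Q_i||^2
    w = w / w.sum()
    mu_p = (w[:, None] * P).sum(0)
    mu_q = (w[:, None] * Q).sum(0)
    Pc, Qc = P - mu_p, Q - mu_q
    H = (w[:, None] * Pc).T @ Qc                 # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_q - R @ mu_p
    return R, t

def reweighted_align(P, Q, conf, iters=5):
    # Start from the model's own confidences, then down-weight points whose
    # alignment residuals are large (Cauchy weights with a MAD scale estimate).
    w = conf.copy()
    for _ in range(iters):
        R, t = weighted_rigid_align(P, Q, w)
        res = np.linalg.norm(P @ R.T + t - Q, axis=1)
        s = 1.4826 * np.median(res) + 1e-12
        w = conf / (1.0 + (res / s) ** 2)
    return R, t, w
```

After a few iterations, confidently predicted but geometrically inconsistent points receive low weights, so the final weights track point accuracy rather than raw model confidence.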
We use the calibrated confidence to generate high-quality pseudo labels for the calibrating views and fine-tune the models using low-rank adaptation (LoRA) on the pseudo-labeled data.
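The low-rank adaptation step keeps the pre-trained weights frozen and trains only a small low-rank update per layer. The sketch below is a generic LoRA linear layer, not the paper's code; the class name, rank, and scaling follow the standard LoRA formulation `W + (alpha / r) * B @ A` as an assumption about the adapter structure.

```python
import numpy as np

class LoRALinear:
    """Frozen dense layer plus a trainable low-rank update (illustrative sketch).

    Effective weight: W + (alpha / r) * B @ A; only A and B are updated
    during fine-tuning, so stored adapters are tiny compared to W."""

    def __init__(self, W, r=16, alpha=16, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = W.shape
        self.W = W                                          # frozen pre-trained weight
        self.A = rng.normal(scale=0.01, size=(r, in_dim))   # low-rank down-projection
        self.B = np.zeros((out_dim, r))                     # zero-init: adapter starts as a no-op
        self.scale = alpha / r

    def __call__(self, x):
        return x @ self.W.T + self.scale * (x @ self.A.T) @ self.B.T

    def merged_weight(self):
        # Fold the adapter into a single dense weight for deployment.
        return self.W + self.scale * self.B @ self.A
```

Storing only `A` and `B` costs `2 * r * d` parameters per `d x d` layer instead of `d^2`, which is why each scene-specific adapter stays small (the 18MB figure above).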
Our method does not require any external priors or manual labels. It completes the self-calibration process on a **single standard GPU within just 5 minutes**.
Each low-rank adapter requires only **18MB** of storage.
We evaluate our method on **more than 160 scenes** from the Replica, TUM, and Waymo Open datasets,
achieving up to **88% performance improvement** on 3D reconstruction, multi-view pose estimation, and novel-view rendering.
For more details, please visit our project page at https://520xyxyzq.github.io/lora3d/.
Primary Area: foundation or frontier models, including LLMs
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10864