R³DC: Reliability-Guided Reveal-to-Revise Depth Completion for Cross-Domain Sparse Perception

Published: 03 May 2026, Last Modified: 03 May 2026, Venue: CVPR 2026 Workshop 3D4S Poster, Readers: Everyone, License: CC BY 4.0
Keywords: Depth completion, reliability-guided, reveal-to-revise framework, cross-domain generalization, sparse perception, aleatoric uncertainty, dual-stream encoder, geometry-adaptive deformable convolutions, cross-modal attention, Convolutional Spatial Propagation Network (CSPN++), RADI (Reliability-Aware Depth Index), confidence calibration, 3D perception, indoor calibration head
TL;DR: R³DC is a depth completion framework that uses a reliability-guided "Reveal-to-Revise" architecture to improve cross-domain generalization and provide calibrated per-pixel confidence estimates.
Abstract: Depth completion is fundamental to 3D perception, yet practical deployment is hindered by three challenges: the absence of calibrated per-pixel confidence, poor cross-domain generalization, and benchmarks that evaluate accuracy while ignoring the trustworthiness of uncertainty estimates. We introduce $R^3DC$, an end-to-end Reveal-to-Revise framework that jointly predicts dense metric depth, per-pixel reliability, and aleatoric uncertainty. The architecture integrates a dual-stream encoder with geometry-adaptive deformable convolutions, hierarchical cross-modal attention, and Convolutional Spatial Propagation Network (CSPN++) refinement that is explicitly gated by the learned reliability. A seven-term composite objective stabilizes training across a wide range of depths. To assess the confidence estimates rigorously, we propose RADI (Reliability-Aware Depth Index), an evaluation framework that measures reliability-error correlation (REC), revision benefit score (RBS), and calibration error (CAL). Across four heterogeneous benchmarks (KITTI, VisDrone, Drone-Videos, and NYU Depth V2), $R^3DC$ achieves competitive accuracy, including 0.24 m RMSE on KITTI and $\delta_{1}=0.927$ on NYU Depth V2, while its core architecture requires $10\times$ to $170\times$ fewer parameters than existing baselines. https://pmlrbd.github.io/r3dc/
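For readers unfamiliar with the two accuracy figures quoted in the abstract, the sketch below shows how RMSE and $\delta_{1}$ are conventionally computed in the depth estimation literature. It is not taken from the $R^3DC$ code: the function names, the toy values, and the rule that a pixel is valid only when its ground-truth depth is positive are assumptions made for illustration. The threshold of 1.25 is the standard definition of $\delta_{1}$.

```python
import math

def valid_pairs(pred, gt):
    """Keep (prediction, ground truth) pairs whose ground-truth depth is positive.

    Treating gt <= 0 as "no measurement" is an assumed convention, common for
    sparse depth maps; the paper's own masking rule may differ.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    return pairs

def rmse(pred, gt):
    """Root mean squared error over valid pixels, in the units of the input."""
    pairs = valid_pairs(pred, gt)
    return math.sqrt(sum((p - g) ** 2 for p, g in pairs) / len(pairs))

def delta1(pred, gt, threshold=1.25):
    """Fraction of valid pixels with max(pred/gt, gt/pred) below the threshold."""
    pairs = valid_pairs(pred, gt)
    hits = sum(1 for p, g in pairs if p > 0 and max(p / g, g / p) < threshold)
    return hits / len(pairs)

# Toy flattened depth maps in metres (hypothetical values, not paper data).
# The last pixel has no ground truth and is ignored by both metrics.
pred = [2.0, 4.5, 12.0, 7.0]
gt = [2.0, 5.0, 8.0, 0.0]

print("RMSE:", rmse(pred, gt))
print("delta_1:", delta1(pred, gt))
```

Lower RMSE and higher $\delta_{1}$ are better. Neither metric says anything about whether the predicted confidence is trustworthy, which is the gap the abstract's RADI measures are meant to address.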
Supplementary Material: pdf
Submission Number: 6