Abstract: Successful depth completion from a single RGB-D image requires both extracting rich 2D and 3D features and merging these heterogeneous features appropriately. We propose a novel depth completion framework, CostDCNet, based on the cost volume-based depth estimation approach that has been successfully employed for multi-view stereo (MVS). The key to high-quality depth map estimation in this approach is constructing an accurate cost volume. To produce a high-quality cost volume tailored to single-view depth completion, we present a simple but effective architecture that can fully exploit the 3D information, three options for constructing an RGB-D feature volume, and per-plane pixel shuffle for efficient volume upsampling. Our CostDCNet framework consists of lightweight deep neural networks (~1.8M parameters) and runs in real time (~30 ms). Despite its small size, thanks to this simple but effective design, CostDCNet demonstrates depth completion results comparable to or better than those of state-of-the-art methods.
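To illustrate how a depth map can be read out of a cost volume, the following is a minimal sketch of the soft-argmax depth regression commonly used in cost volume-based MVS pipelines. The abstract does not state which readout CostDCNet uses, so treat this as an assumption. The names `softmax`, `regress_depth`, `cost_volume`, and `depth_planes` are hypothetical, and the volume is assumed to store matching scores (higher means more likely) indexed as `cost_volume[d][y][x]`.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def regress_depth(cost_volume, depth_planes):
    """Soft-argmax readout: per-pixel expected depth over the depth planes.

    cost_volume[d][y][x] is an (assumed) matching score for depth plane d
    at pixel (y, x); depth_planes[d] is the depth value of plane d.
    """
    height = len(cost_volume[0])
    width = len(cost_volume[0][0])
    depth = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect this pixel's scores across all depth planes.
            scores = [plane[y][x] for plane in cost_volume]
            probs = softmax(scores)
            # Expected depth under the per-pixel plane distribution.
            depth[y][x] = sum(p * z for p, z in zip(probs, depth_planes))
    return depth


if __name__ == "__main__":
    planes = [1.0, 2.0, 3.0]
    # Three planes over a 1x2 image: the first pixel is ambiguous,
    # the second strongly prefers the nearest plane.
    volume = [[[0.0, 50.0]], [[0.0, 0.0]], [[0.0, 0.0]]]
    print(regress_depth(volume, planes))
```

Because the readout is a weighted average over planes rather than a hard argmax, it can yield depths between plane values, which is one reason the accuracy of the cost volume matters so much in this family of methods. If the volume stores costs instead of scores, negate the values before the softmax.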