Sub-pixel Convolution and Edge Detection for Multi-view Stereo

Fanqi Yu

12 May 2023, OpenReview Archive Direct Upload, Readers: Everyone

Abstract: Deep multi-view stereo (MVS) approaches generally construct a cost volume pyramid in a coarse-to-fine manner to regularize and regress depth or disparity. This pyramid is often built upon a feature pyramid that encodes geometry, or upon an image pyramid. A pyramid is an effective way to reduce memory consumption, and many papers argue that even low-resolution images or features contain enough information to estimate low-resolution depth maps. However, recent papers show that the higher the image resolution, the better the output depth map, which means that the resolution of the depth map at each stage affects the final output. We therefore argue that a low-resolution depth map may not provide enough information for estimating the high-resolution depth map. In this paper, we propose a sub-pixel upsampling module that post-processes the cost volume to generate a higher-resolution depth map at each stage. We also propose an edge-weighted loss function for optimizing the inaccurate depth values in the edge regions of objects. Finally, we implement both on CasMVSNet and show the effectiveness of the proposed method.
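The abstract does not give the details of either component, so the following is only a minimal sketch of the two general ideas, under stated assumptions. `pixel_shuffle` shows the standard sub-pixel rearrangement, in which `r*r` low-resolution channels are interleaved into one channel that is `r` times larger in each spatial dimension. `edge_weighted_l1` is a hypothetical edge-weighted loss that up-weights pixels with a large ground-truth depth gradient. The function names, the `alpha` parameter, and the gradient-based weighting are illustrative assumptions, not the paper's formulation.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Standard sub-pixel rearrangement; how the paper applies it to the
    cost volume is not specified in the abstract.
    """
    channels_in = len(x)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    h, w = len(x[0]), len(x[0][0])
    out = [[[0.0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # row offset inside each r x r block
            for j in range(r):      # column offset inside each r x r block
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out


def edge_weighted_l1(pred, gt, alpha=1.0):
    """Hypothetical edge-weighted L1 loss on 2-D depth maps (nested lists).

    Assumption: the weight is 1 + alpha * |gradient of ground-truth depth|,
    so pixels near depth discontinuities contribute more to the loss.
    """
    h, w = len(gt), len(gt[0])
    total, weight_sum = 0.0, 0.0
    for row in range(h):
        for col in range(w):
            # Clamped finite differences stand in for an edge detector.
            gx = gt[row][min(col + 1, w - 1)] - gt[row][max(col - 1, 0)]
            gy = gt[min(row + 1, h - 1)][col] - gt[max(row - 1, 0)][col]
            weight = 1.0 + alpha * (abs(gx) + abs(gy))
            total += weight * abs(pred[row][col] - gt[row][col])
            weight_sum += weight
    return total / weight_sum
```

With `r = 2`, four 1x1 channels become a single 2x2 channel, which is how a stage could emit a depth map at twice the resolution of its cost volume without a larger 3D regularization network.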
