High Fidelity Aggregated Planar Prior Assisted PatchMatch Multi-View Stereo

Published: 20 Jul 2024 · Last Modified: 21 Jul 2024 · MM 2024 Poster · CC BY 4.0
Abstract:

The quality of 3D models reconstructed by PatchMatch Multi-View Stereo remains limited by unreliable photometric consistency at object boundaries and in textureless areas. Since textureless areas usually exhibit strong planarity, previous methods introduced planar priors and significantly improved reconstruction performance. However, their planar priors ignore the depth discontinuities at object boundaries, making the reconstructed boundaries inaccurate (not sharp). In addition, because planar models are unreliable on large-scale low-textured objects, the reconstruction results are incomplete. To address these issues, we introduce segmentation generated by the Segment Anything Model into PatchMatch pipelines for the first time. Exploiting the fact that segmentation and depth share boundaries, we use segmentation to determine whether depth is continuous. We then segment planes at object boundaries and enhance the consistency of planes within objects. Specifically, we construct a $\textbf{Boundary Plane}$ that fits the object boundary and an $\textbf{Object Plane}$ that increases the consistency of planes in large-scale textureless objects. Finally, we use a probabilistic graphical model to compute the $\textbf{Aggregated Prior guided by Multiple Planes}$ and embed it into the matching cost. Experimental results show that our method achieves state-of-the-art boundary sharpness on ETH3D and significantly improves the completeness of weakly textured objects. We also validate the generalization of our method on Tanks & Temples.
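The abstract describes aggregating a prior from multiple candidate planes and embedding it into the PatchMatch matching cost. The sketch below is a minimal illustration of that general idea, not the paper's actual formulation: it assumes each candidate plane (e.g., a boundary plane and an object plane) contributes a Gaussian likelihood around the depth it induces at a pixel, aggregates the likelihoods by taking their maximum, and uses the result to penalize depth hypotheses that agree with no plane. The function name, parameters `sigma` and `lam`, and the additive penalty are all illustrative assumptions.

```python
import numpy as np

def aggregated_cost(photo_cost, depth_hyp, plane_depths, sigma=0.5, lam=0.2):
    """Illustrative sketch only (not the paper's exact method): combine a
    photometric matching cost with a prior aggregated over several candidate
    planes, so that in textureless areas a hypothesis close to some plane's
    induced depth is preferred."""
    # Gaussian likelihood of the depth hypothesis under each plane's depth.
    likelihoods = np.exp(-0.5 * ((depth_hyp - np.asarray(plane_depths)) / sigma) ** 2)
    # Aggregate over the candidate planes (here: take the best-matching plane).
    prior = likelihoods.max()
    # Embed the prior into the cost: a low prior inflates the matching cost.
    return photo_cost + lam * (1.0 - prior)

# A hypothesis lying on one of the planes keeps the pure photometric cost,
# while a hypothesis far from every plane is penalized.
on_plane = aggregated_cost(1.0, depth_hyp=5.0, plane_depths=[5.0, 8.0])
off_plane = aggregated_cost(1.0, depth_hyp=6.5, plane_depths=[5.0, 8.0])
```

In an actual PatchMatch pipeline this combination would be evaluated per pixel for each sampled depth/normal hypothesis; the penalty weight (here `lam`) controls how strongly the planar prior dominates where photometric evidence is weak.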

Primary Subject Area: [Experience] Interactions and Quality of Experience
Relevance To Conference: Enhancing the user experience of multimedia applications requires interactive technologies. Virtual reality (VR) and augmented reality (AR), two cutting-edge interactive media that offer three-dimensional interactive experiences, have attracted much attention recently. A fundamental challenge in creating 3D content for VR and AR is obtaining a precise, high-quality, dense depth map. Depth maps are easy to transmit and parallelize, which makes them widely used in multimedia interaction technology. However, for large-scale scenes it is difficult to obtain dense depth information with LiDAR. Multi-View Stereo is therefore often used in multimedia content generation, since it recovers depth from a set of photographs and camera parameters. To improve the quality of depth maps on large-scale datasets, we propose a new PatchMatch-based MVS method that combines modalities such as images, segmentation, and initial depth to generate high-fidelity depth maps.
Supplementary Material: zip
Submission Number: 1917