A hybrid architecture of sparse convolutional neural network-transformer for enhanced spatial-geometric feature learning in surface reconstruction
Abstract: Highlights
• We propose SDF Utrans, the first architecture to integrate dedicated 3D CNN and SDF Transformer branches in an indoor scene reconstruction pipeline. Leveraging the powerful feature extraction capabilities of these branches, our network achieves state-of-the-art reconstruction results on the ScanNet dataset.
• We introduce Sparse Position Attention and Mixed Feature Fusion strategies for skip connections. By efficiently integrating the extracted spatial and semantic features, SDF Utrans effectively preserves local details.
• We introduce novel lightweight Sparse Channel Decoding Blocks (SCDBs) to construct the decoder. Leveraging rich spatial and semantic information, SCDBs efficiently enable the reconstruction of high-quality surfaces with rich local details and coherent global structures.