Abstract: Point-cloud-based place recognition is a key component of re-localization in large-scale outdoor Simultaneous Localization And Mapping (SLAM). However, most existing methods generalize poorly to unseen environments. To address this issue, a Hybrid Voxel- and Point-wise network, named HVP-Net, is proposed. The network uses sparse convolutions to learn local details in the voxel-wise features, and the proposed lightweight grouped efficient attention mechanisms to capture global representations of the point-wise features. To make the global descriptors more discriminative, the two kinds of features are fused interactively, combining point-wise features, which incur no information loss, with voxel-wise features, which are robust to local noise. In addition, a positive-ranking guided triplet loss is proposed, which additionally accounts for the consistency of the distance ranking of different anchor-positive pairs between Euclidean space and feature space. Experiments on the benchmark, KITTI, NCLT, and one self-collected dataset show that HVP-Net achieves state-of-the-art results and effectively improves generalization to unseen environments.
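To illustrate the idea behind a ranking-guided triplet loss, the following is a minimal sketch and not the formulation used by HVP-Net, whose exact definition is not given in the abstract. It assumes one anchor, several positives, and one negative, and it adds a hinge penalty whenever a positive that is physically closer to the anchor ends up farther away in feature space than a physically more distant positive. The function name `triplet_with_ranking`, the parameters `margin` and `weight`, and the choice of the hardest positive are all hypothetical.

```python
import numpy as np

def triplet_with_ranking(anchor, positives, negative, geo_dists,
                         margin=0.5, weight=1.0):
    """Illustrative loss (an assumption, not the paper's definition):
    a standard triplet hinge plus a ranking-consistency hinge.

    anchor:    (D,) descriptor of the query scan
    positives: (P, D) descriptors of scans taken near the anchor
    negative:  (D,) descriptor of a scan taken far from the anchor
    geo_dists: (P,) physical (Euclidean) distance of each positive to the anchor
    """
    anchor = np.asarray(anchor, dtype=float)
    positives = np.asarray(positives, dtype=float)
    negative = np.asarray(negative, dtype=float)
    geo_dists = np.asarray(geo_dists, dtype=float)

    # Feature-space distances from the anchor to each positive and to the negative.
    d_pos = np.linalg.norm(positives - anchor, axis=1)
    d_neg = np.linalg.norm(negative - anchor)

    # Standard triplet hinge, using the positive that is farthest in feature space.
    triplet = max(0.0, d_pos.max() - d_neg + margin)

    # Ranking term: for every pair of positives (i, j) where i is physically
    # closer than j, penalize feature distances ordered the other way round.
    ranking = 0.0
    n = len(d_pos)
    for i in range(n):
        for j in range(n):
            if geo_dists[i] < geo_dists[j]:
                ranking += max(0.0, d_pos[i] - d_pos[j])

    return triplet + weight * ranking
```

With an anchor at `[0, 0]`, positives at `[[1, 0], [3, 0]]`, and a negative at `[10, 0]`, the loss is zero when `geo_dists=[2.0, 5.0]`, because the feature-space ordering matches the physical ordering. Swapping the physical distances to `[5.0, 2.0]` makes the ranking term positive.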