TT-LCD: Tensorized-Transformer based Loop Closure Detection for Robotic Visual SLAM on Edge

Published: 01 Jan 2023, Last Modified: 04 Feb 2024, ICARM 2023, Readers: Everyone
Abstract: Visual simultaneous localization and mapping (VSLAM) is one of the core technologies in autonomous driving, intelligent robots, the metaverse, and other fields. Loop closure detection (LCD) is an essential component of VSLAM: it corrects the drift and accumulated errors caused by the visual odometry (VO) front-end and helps the robot build a globally consistent map. Over the years, several deep-learning methods have been proposed to address this task. However, previously proposed neural network-based LCD models are large and difficult to deploy on edge devices. In this paper, we propose TT-LCD, an LCD module based on a tensorized transformer model. To obtain a tensorized transformer model with accuracy-complexity co-awareness that can be deployed effectively, we propose a construction method for a tensor-compressed transformer model using tensor-train (TT) decomposition and a differential neural network architecture search (NAS) method for tensor rank selection. Experiments demonstrate that TT-LCD achieves a model size 6.04× smaller than the uncompressed transformer model and 32.1× smaller than the VGG model, with a lower memory cost of about 134M on an edge CPU. It incurs little loss of accuracy on the pre-training dataset, yet achieves 2.13% higher average accuracy on the NewCollege dataset than the uncompressed DeiT-based model in the LCD task.
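To illustrate why tensor-train (TT) decomposition shrinks a transformer layer, the following minimal sketch counts the parameters of a weight matrix stored as TT cores and compares that with the dense count. The layer size (768 × 3072), the mode factorizations, and the TT ranks are hypothetical values chosen for illustration; they are not taken from the paper, and the helper names `tt_matrix_params` and `dense_params` are likewise assumptions.

```python
from math import prod


def tt_matrix_params(in_modes, out_modes, ranks):
    """Count parameters of a weight matrix stored in TT-matrix format.

    Core k is assumed to have shape
    (ranks[k], in_modes[k], out_modes[k], ranks[k + 1]),
    with boundary ranks ranks[0] == ranks[-1] == 1.
    """
    if len(in_modes) != len(out_modes):
        raise ValueError("in_modes and out_modes must have the same length")
    if len(ranks) != len(in_modes) + 1 or ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("ranks must have length d + 1 and start/end with 1")
    return sum(
        ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
        for k in range(len(in_modes))
    )


def dense_params(in_modes, out_modes):
    """Count parameters of the equivalent dense (uncompressed) matrix."""
    return prod(in_modes) * prod(out_modes)


# Hypothetical layer: 768 inputs, 3072 outputs (illustrative only).
in_modes = (8, 8, 12)     # 8 * 8 * 12 = 768
out_modes = (12, 16, 16)  # 12 * 16 * 16 = 3072
ranks = (1, 16, 16, 1)    # assumed TT ranks; the paper selects ranks via NAS

dense = dense_params(in_modes, out_modes)
tt = tt_matrix_params(in_modes, out_modes, ranks)
print(f"dense: {dense} parameters, TT: {tt} parameters")
```

The TT count grows with the squares of the ranks, so rank selection directly trades model size against accuracy; the abstract states that this choice is made by an architecture search rather than fixed by hand as it is here.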