3DCTN: 3D Convolution-Transformer Network for Point Cloud Classification

Published: 2022 | Last Modified: 05 Nov 2023 | IEEE Trans. Intell. Transp. Syst. 2022 | Readers: Everyone
Abstract: Point cloud classification is a fundamental task in 3D applications. However, effective feature learning is challenging because point clouds are irregular and unordered. Recently, 3D Transformers have been adopted to improve point cloud processing. Nevertheless, stacking many Transformer layers tends to incur large computational and memory costs. This paper presents a novel hierarchical framework that combines convolutions with Transformers for point cloud classification, named 3D Convolution-Transformer Network (3DCTN). It pairs the strong local feature learning ability of convolutions with the global context modeling capability of Transformers. Our method has two main modules operating on the downsampled point sets. Each module consists of a multi-scale local feature aggregating (LFA) block and a global feature learning (GFL) block, which are implemented using graph convolution and a Transformer, respectively. We also conduct a detailed investigation of a series of self-attention variants to explore better performance for our network. Experiments on the ModelNet40 and ScanObjectNN datasets demonstrate that our method achieves state-of-the-art classification performance with a lightweight design. The code is publicly available at https://github.com/d62lu/3DCTN.
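To make the module structure in the abstract concrete, the following is a minimal, weight-free sketch of one "LFA then GFL" stage. It is not the authors' implementation (see the repository for that). Every function name here is hypothetical, and several choices are assumptions made only for illustration: strided downsampling, k-nearest-neighbor grouping with max-pooling of neighbor offsets as a stand-in for graph convolution, two neighborhood sizes as the "multi-scale" part, and single-head self-attention with identity projections as a stand-in for the Transformer block.

```python
import math

def knn(points, center, k):
    """Indices of the k points nearest to center (Euclidean distance)."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], center))
    return order[:k]

def local_feature(points, center, k):
    """Max-pool the neighbor offsets (neighbor - center) over a k-neighborhood."""
    idx = knn(points, center, k)
    offsets = [[p - c for p, c in zip(points[i], center)] for i in idx]
    return [max(col) for col in zip(*offsets)]

def lfa_block(points, centers, scales=(2, 4)):
    """Assumed multi-scale aggregation: concatenate pooled features per scale."""
    return [sum((local_feature(points, c, k) for k in scales), []) for c in centers]

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def gfl_block(feats):
    """Assumed global block: scaled dot-product self-attention, no learned weights."""
    d = len(feats[0])
    out = []
    for q in feats:
        scores = [sum(a * b for a, b in zip(q, key)) / math.sqrt(d) for key in feats]
        w = softmax(scores)
        out.append([sum(wi * v[j] for wi, v in zip(w, feats)) for j in range(d)])
    return out

def module(points, stride=2):
    """One stage: downsample (assumed: strided), aggregate locally, mix globally."""
    centers = points[::stride]
    local = lfa_block(points, centers)
    return centers, gfl_block(local)

# Toy input: 8 points in 3D; one stage keeps every second point.
points = [[float(i), float(i % 3), float(i % 2)] for i in range(8)]
centers, feats = module(points)
```

In this toy run, each of the 4 retained points carries a 6-dimensional feature (3 coordinates per scale, 2 scales). A real network would replace the pooling and attention stand-ins with learned layers and stack two such stages before a classification head.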