Abstract: Transformers play an increasingly important role in computer vision and have achieved remarkable results in point cloud analysis. Since existing methods mainly focus on point-wise Transformers, this paper proposes an adaptive channel-wise Transformer. Specifically, a channel encoding Transformer called the Transformer Channel Encoder (TCE) is designed to encode the coordinate channels by capturing the potential relationship between coordinates and features. The encoded channels yield features with stronger representation ability. Rather than simply assigning an attention weight to each channel, our method encodes the channels adaptively. Moreover, our method can be extended to other frameworks to improve their performance. Our network also searches for neighbors by feature similarity, which forms semantic receptive fields, to further improve performance. Extensive experiments show that our method outperforms state-of-the-art point cloud classification and segmentation methods on three benchmark datasets.
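
The neighborhood search by feature similarity mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `knn_in_feature_space`, the use of Euclidean distance, and the toy feature vectors are all assumptions made for illustration. The idea shown is only that neighbors are chosen by closeness of feature vectors rather than closeness of 3D coordinates.

```python
import math

def knn_in_feature_space(features, k):
    """Hypothetical helper: for each point, return the indices of its k
    nearest neighbors measured by Euclidean distance between feature
    vectors (the point itself is excluded)."""
    neighbors = []
    for i, fi in enumerate(features):
        # Distance from point i to every other point, in feature space.
        dists = [(math.dist(fi, fj), j) for j, fj in enumerate(features) if j != i]
        dists.sort()
        neighbors.append([j for _, j in dists[:k]])
    return neighbors

# Toy per-point feature vectors (assumed values, not real data).
features = [
    [0.0, 0.0],
    [0.1, 0.0],
    [5.0, 5.0],
    [5.1, 5.0],
]

print(knn_in_feature_space(features, k=1))  # [[1], [0], [3], [2]]
```

Points 0 and 1 end up as each other's neighbors because their features are similar, regardless of where they sit in space. A practical version would use batched tensor operations instead of Python loops.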