Dual-axial self-attention network for text classification

Published: 01 Jan 2021, Last Modified: 12 Jun 2025 · Sci. China Inf. Sci. 2021 · CC BY-SA 4.0
Abstract: Text classification is an important task in natural language processing, and numerous studies aim to improve the accuracy and efficiency of text classification models. In this study, we propose an effective and efficient text classification model based solely on self-attention. The recently proposed multi-dimensional self-attention significantly improves the performance of self-attention, but existing models suffer from two major limitations: (1) previous multi-dimensional self-attention models are quite time-consuming; and (2) dependencies between elements along the feature axis are not taken into account. To overcome these problems, this paper proposes a much more computationally efficient multi-dimensional self-attention model and applies two parallel self-attention modules, called dual-axial self-attention, to capture rich dependencies along the feature axis as well as the text axis. A text classification model is then derived from these modules. Experimental results on eight representative datasets show that the proposed model achieves state-of-the-art results and that the proposed self-attention outperforms conventional self-attention models.
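The sketch below illustrates the dual-axial idea described in the abstract: two parallel self-attention modules, one attending over the text (token) axis and one over the feature axis, whose outputs are combined. This is a minimal PyTorch illustration under stated assumptions, not the paper's implementation: the scoring function (scaled dot-product here), the combination rule (summation here), and all class and parameter names (`AxialSelfAttention`, `DualAxialSelfAttention`, `seq_len`, `dim`) are assumptions for illustration; the paper's exact multi-dimensional formulation may differ.

```python
# Hedged sketch of dual-axial self-attention: attention is applied in
# parallel along the text axis and the feature axis. Details such as the
# scoring function and the combination rule are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AxialSelfAttention(nn.Module):
    """Scaled dot-product self-attention over the second-to-last axis."""

    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n, dim); attention mixes the n positions.
        q, k, v = self.query(x), self.key(x), self.value(x)
        attn = F.softmax(q @ k.transpose(-2, -1) * self.scale, dim=-1)
        return attn @ v


class DualAxialSelfAttention(nn.Module):
    """Two parallel branches: one over tokens, one over features."""

    def __init__(self, seq_len: int, dim: int):
        super().__init__()
        self.text_attn = AxialSelfAttention(dim)         # mixes tokens
        self.feature_attn = AxialSelfAttention(seq_len)  # mixes features

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim)
        text_out = self.text_attn(x)
        # Transpose so the feature axis becomes the "position" axis.
        feat_out = self.feature_attn(x.transpose(1, 2)).transpose(1, 2)
        return text_out + feat_out  # combination rule is an assumption


x = torch.randn(2, 16, 64)  # (batch, tokens, features)
out = DualAxialSelfAttention(seq_len=16, dim=64)(x)
print(out.shape)  # torch.Size([2, 16, 64])
```

Note that tying the feature-axis branch to a fixed `seq_len` is a simplification made for brevity; a practical model would need to handle variable-length inputs, e.g. via padding to a fixed length.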