Hierarchical and lateral multiple timescales gated recurrent units with pre-trained encoder for long text classification

Expert Systems with Applications, 2021 (modified: 10 Nov 2021)
Highlights
• The performance of current text classifiers degrades with longer input sequences.
• The proposed model can classify texts of diverse input lengths.
• A hierarchical and lateral architecture is proposed to enhance performance.
• The model uses rich features extracted by pre-trained bidirectional encoders.
• Our model outperforms existing models on various long text classification datasets.

Abstract
Text classification using deep learning techniques has become a research challenge in natural language processing. Most existing deep learning models for text classification face difficulties as the length of the input text increases: they work well on shorter inputs, but their performance degrades as the input grows longer. In this work, we introduce a model for text classification that alleviates this problem. We present the hierarchical and lateral multiple timescales gated recurrent units (HL-MTGRU), combined with pre-trained encoders, to address the long text classification problem. HL-MTGRU can represent dependencies at multiple temporal scales for the discrimination task. By combining the slow and fast units of the HL-MTGRU, our model effectively classifies long multi-sentence texts into the desired classes. We also show that the HL-MTGRU structure prevents performance degradation on longer text inputs. We demonstrate that the proposed network, using the latest pre-trained encoders for feature extraction, outperforms conventional models on various long text classification benchmark datasets.
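The abstract does not include an implementation, so the sketch below is only a rough illustration of the multiple-timescale idea: a GRU-style cell whose hidden state is blended with a leaky-integrator time constant tau, so a large tau gives a "slow" unit and tau = 1 recovers a standard GRU, with a fast and a slow layer stacked over pre-computed encoder features. The exact gating equations, the hierarchical and lateral wiring, the tau values, and the choice of pre-trained encoder are assumptions here, not taken from the paper.

```python
# Hypothetical sketch of a multiple-timescale GRU (MTGRU) cell and a
# two-timescale classifier in PyTorch. The tau-based leaky update and the
# fast/slow stacking are illustrative assumptions, not the authors' exact
# HL-MTGRU formulation.
import torch
import torch.nn as nn


class MTGRUCell(nn.Module):
    """GRU-style cell whose state changes with time constant tau (tau >= 1)."""

    def __init__(self, input_size: int, hidden_size: int, tau: float = 1.0):
        super().__init__()
        self.gates = nn.Linear(input_size + hidden_size, 2 * hidden_size)  # update/reset gates
        self.candidate = nn.Linear(input_size + hidden_size, hidden_size)  # candidate state
        self.tau = tau

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], dim=-1))).chunk(2, dim=-1)
        h_tilde = torch.tanh(self.candidate(torch.cat([x, r * h], dim=-1)))
        gru_out = (1 - z) * h + z * h_tilde
        # Leaky-integrator update: larger tau -> slower-changing ("slow") unit.
        return (1 - 1 / self.tau) * h + (1 / self.tau) * gru_out


class TwoTimescaleClassifier(nn.Module):
    """Fast and slow MTGRU layers over features from a pre-trained encoder."""

    def __init__(self, feat_dim: int, hidden: int, num_classes: int):
        super().__init__()
        self.fast = MTGRUCell(feat_dim, hidden, tau=1.0)  # fast timescale
        self.slow = MTGRUCell(hidden, hidden, tau=4.0)    # slow timescale (assumed value)
        self.head = nn.Linear(2 * hidden, num_classes)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, seq_len, feat_dim) sentence/segment embeddings, e.g.
        # produced offline by a pre-trained bidirectional encoder (assumed).
        b, t, _ = feats.shape
        h_f = feats.new_zeros(b, self.fast.candidate.out_features)
        h_s = feats.new_zeros(b, self.slow.candidate.out_features)
        for i in range(t):
            h_f = self.fast(feats[:, i], h_f)   # fast units see every step
            h_s = self.slow(h_f, h_s)           # slow units integrate over longer spans
        return self.head(torch.cat([h_f, h_s], dim=-1))
```

In this sketch the lateral connections between units of different timescales are omitted for brevity; the point is only the tau-gated update that lets slow units retain context over long inputs while fast units track local detail.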