A New Multi-input Model with the Attention Mechanism for Text Classification

25 Sep 2019 (modified: 24 Dec 2019) · ICLR 2020 Conference Blind Submission · Readers: Everyone
  • Abstract: Recently, deep learning has made extraordinary achievements in text classification. However, most existing models, especially convolutional neural networks (CNNs), do not extract long-range associations, global representations, or hierarchical features well because of their relatively shallow and simple structures, which hurts classification performance. Moreover, texts can be represented in multiple ways (e.g., as words or as characters), so a multi-input model is a natural way to improve classification, yet most text classification models take only words or only characters as input. Inspired by these observations and by Densenet (Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q Weinberger. Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4700–4708, 2017.), we propose a new text classification model that takes words, characters, and labels as input. The model, a deep CNN with a novel attention mechanism, can effectively leverage the input information and overcome the above limitations of shallow models. We conduct experiments on six large text classification datasets. Our model achieves state-of-the-art results on all datasets compared to multiple baseline models.
  • Keywords: Natural Language Processing, Text Classification, Densenet, Multi-input Model, Attention Mechanism
  • TL;DR: We propose a new multi-input model with a novel attention mechanism that effectively addresses the issues of shallow text classification models, such as failing to extract long-range associations, global representations, and hierarchical features.
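The abstract describes attending over text features using label embeddings as one of the model's inputs. The paper's exact mechanism is not given here, so the following is only an illustrative numpy sketch of that general idea: token features are scored against label embeddings, and the resulting attention weights pool the tokens into a document vector. All names and shapes are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def label_attention(H, C):
    """Pool token features H (n, d) with attention guided by label embeddings C (k, d).

    This is a hypothetical sketch of a word/label attention interaction,
    not the submission's actual mechanism.
    """
    scores = H @ C.T                        # (n, k) token-label compatibility
    weights = softmax(scores.max(axis=1))   # (n,) per-token attention weight
    return weights @ H                      # (d,) attended document vector

# Toy usage: 7 tokens with 16-dim features, 4 candidate labels.
rng = np.random.default_rng(0)
H = rng.normal(size=(7, 16))
C = rng.normal(size=(4, 16))
doc_vec = label_attention(H, C)
```

In a full model, `doc_vec` would feed a classifier head; the same pooling could be applied separately to word-level and character-level feature maps before combining them.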
4 Replies
