Character-aware Attention Residual Network for Sentence Representation

Xin Zheng, Zhenzhou Wu

Nov 05, 2016 (modified: Jan 04, 2017) ICLR 2017 conference submission readers: everyone
  • Abstract: Text classification in general is a well studied area. However, classifying short and noisy text remains challenging. Feature sparsity is a major issue. The quality of document representation here has a great impact on the classification accuracy. Existing methods represent text using bag-of-word model, with TFIDF or other weighting schemes. Recently word embedding and even document embedding are proposed to represent text. The purpose is to capture features at both word level and sentence level. However, the character level information are usually ignored. In this paper, we take word morphology and word semantic meaning into consideration, which are represented by character-aware embedding and word distributed embedding. By concatenating both character-level and word distributed embedding together and arranging words in order, a sentence representation matrix could be obtained. To overcome data sparsity problem of short text, sentence representation vector is then derived based on different views from sentence representation matrix. The various views contributes to the construction of an enriched sentence embedding. We employ a residual network on the sentence embedding to get a consistent and refined sentence representation. Evaluated on a few short text datasets, our model outperforms state-of-the-art models.
  • TL;DR: We propose a character-aware attention residual network for short text representation.
  • Keywords: Deep learning
  • Conflicts: