Character-aware Attention Residual Network for Sentence Representation

Xin Zheng, Zhenzhou Wu

Nov 05, 2016 (modified: Jan 04, 2017) ICLR 2017 conference submission readers: everyone
  • Abstract: Text classification in general is a well-studied area. However, classifying short and noisy text remains challenging, and feature sparsity is a major issue: the quality of the document representation has a great impact on classification accuracy. Existing methods represent text using the bag-of-words model with TF-IDF or other weighting schemes. Recently, word embeddings and even document embeddings have been proposed to represent text, aiming to capture features at both the word level and the sentence level. However, character-level information is usually ignored. In this paper, we take both word morphology and word semantics into consideration, represented by a character-aware embedding and a distributed word embedding respectively. By concatenating the character-level and distributed word embeddings and arranging the words in order, a sentence representation matrix can be obtained. To overcome the data-sparsity problem of short text, a sentence representation vector is then derived from different views of the sentence representation matrix. The various views contribute to the construction of an enriched sentence embedding. We apply a residual network to the sentence embedding to obtain a consistent and refined sentence representation. Evaluated on several short-text datasets, our model outperforms state-of-the-art models.
  • TL;DR: We propose a character-aware attention residual network for short text representation.
  • Keywords: Deep learning
  • Conflicts: cs.nyu.edu
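
The core representation step described in the abstract — concatenating a character-aware embedding with a distributed word embedding per token and stacking the results into a sentence matrix — can be sketched as follows. This is a minimal illustration, not the authors' implementation: the embedding dimensions are arbitrary, and random vectors stand in for the character-CNN and pretrained word embeddings the paper presumes.

```python
import numpy as np

# Hypothetical dimensions (not from the paper): a 50-d character-aware
# embedding and a 100-d distributed word embedding per token.
CHAR_DIM, WORD_DIM = 50, 100

rng = np.random.default_rng(0)

def sentence_matrix(tokens):
    """Stack per-word [char-aware ; word] embeddings into a matrix.

    Stand-ins: a real model would obtain char_emb from a character-level
    encoder and word_emb from pretrained word vectors; here both are
    random vectors for illustration only.
    """
    rows = []
    for _tok in tokens:
        char_emb = rng.standard_normal(CHAR_DIM)  # character-aware embedding stand-in
        word_emb = rng.standard_normal(WORD_DIM)  # word embedding stand-in
        rows.append(np.concatenate([char_emb, word_emb]))
    # One row per word, in sentence order: shape (num_words, CHAR_DIM + WORD_DIM)
    return np.stack(rows)

M = sentence_matrix("short noisy text".split())
print(M.shape)  # (3, 150)
```

Downstream, the paper derives a sentence vector from multiple views of this matrix and refines it with a residual network; those components are omitted here.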