Keywords: Short Text Classification, Directed Graph, Topological Feature, Graph Attention Network
Abstract: Prior work on short text classification has mainly focused on augmenting short texts with features obtained from external sources, which often leads to topic drift and ambiguity. This paper instead explores internal knowledge within the dataset to enhance short text classification. In particular, we construct a single directed graph over the dataset, in which nodes denote words and edges represent the order between words. We treat each short text as a path in this graph and propose a novel graph-based model that aggregates the graph topological features of each word into the word's own representation. Compared with previous work, we focus on enhancing the representation of short texts using geometry-based neighbours, which we regard as internal knowledge from the dataset. Furthermore, we construct two new Chinese short text datasets and develop a simple method for short text classification. Experimental results on nine benchmark datasets validate the effectiveness of the proposed method and show improvements in classification accuracy.
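To make the graph construction concrete, the following is a minimal sketch of one way a single directed word graph could be built over a whole dataset, with each short text read back as a path. It is an illustration only, not the paper's implementation: the function names `build_word_graph` and `text_as_path`, the whitespace tokenization, and the use of edge counts as weights are all assumptions, since the abstract specifies only that nodes are words and edges encode word order.

```python
from collections import defaultdict


def build_word_graph(texts):
    """Build one directed graph over the whole corpus.

    Nodes are words. An edge (u, v) exists whenever v directly follows u
    in at least one text. Edge values count how often that happens
    (the counting is an assumption, not stated in the abstract).
    """
    successors = defaultdict(lambda: defaultdict(int))
    for text in texts:
        tokens = text.split()  # assumption: whitespace tokenization
        for token in tokens:
            successors[token]  # register every word, even with no out-edges
        for u, v in zip(tokens, tokens[1:]):
            successors[u][v] += 1
    return successors


def text_as_path(text):
    """Return the sequence of directed edges a short text traces in the graph."""
    tokens = text.split()
    return list(zip(tokens, tokens[1:]))


corpus = ["cheap flights to paris", "cheap hotels in paris"]
graph = build_word_graph(corpus)
path = text_as_path(corpus[0])
```

In this toy corpus the word "cheap" has two successors ("flights" and "hotels"), so a word's neighbourhood in the shared graph carries information from other texts in the dataset, which is the sense in which the graph acts as internal knowledge.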