Question Classification by Approximating Semantics

Guangyu Feng, Kun Xiong, Yang Tang, Anqi Cui, Jing Bai, Hang Li, Qiang Yang, Ming Li

2015 (modified: 12 Nov 2022) | WWW (Companion Volume) 2015 | Readers: Everyone

Abstract: A central task of computational linguistics is to decide whether two pieces of text have similar meanings. Ideally, this decision depends on an intuitive notion of semantic distance. While this semantic distance is most likely undefinable and uncomputable, in practice it is approximated heuristically, consciously or unconsciously. In this paper, we present a theory, and its implementation, that approximates the elusive semantic distance by the well-defined information distance. It is mathematically proven that any computable approximation of the intuitive concept of semantic distance is "covered" by our theory. We have applied our theory to question answering (QA) and performed experiments based on data extracted from over 35 million question-answer pairs. Experiments demonstrate that our initial implementation of the theory produces convincingly fewer classification errors than other academic models and commercial systems.
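
As a rough illustration of how an information distance can be approximated in practice, the sketch below computes the normalized compression distance (NCD), a common compression-based stand-in for the uncomputable Kolmogorov-complexity terms. This is not the implementation described in the paper: the function names `compressed_size` and `ncd`, the choice of zlib as the compressor, and the idea of applying it directly to raw question text are all assumptions made for illustration.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a crude proxy for Kolmogorov complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two texts (illustrative assumption,
    not the paper's method): (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Lower values suggest that one text adds little information beyond the other. A general-purpose compressor is a weak approximation on short inputs such as single questions, where fixed compression overhead dominates, so a practical system would likely need a stronger, domain-aware estimate of the shared information.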
