Abstract: Question retrieval aims to find semantically equivalent questions in question archives for a given user question. Recently, Transformer-based models have significantly advanced question retrieval, but they mainly focus on capturing the content-based semantic relations between two questions. They do not capture the category-based semantic relations between two questions well, even though question categories are very important for identifying the semantic equivalence of two questions. To capture both the content-based and category-based semantic relations, we study how to improve the Transformer by highlighting and incorporating category information. To this end, we propose the Category-Highlighting Transformer Network (CHT). Because questions are not equipped with explicit categories, CHT first uses a category identification unit to construct category-based semantic representations for the question and its embedded words. Second, to “deeply” capture the category-based and content-based semantic relations, we develop the category-highlighting Transformer by improving the self-attention unit with the category-based representations. Cascaded category-highlighting Transformers are used to model the “individual” semantics of a question and the “joint” semantics of two questions. Extensive experiments on three public datasets show that CHT outperforms state-of-the-art solutions.