Abstract: Community-based Question and Answering (CQA) platforms serve very large user bases, so different users frequently post duplicate questions with the same intent. Detecting duplicate questions effectively makes content on these platforms easier to find and improves the experience of both readers and writers. Existing state-of-the-art methods focus on designing the structure of multi-layer interaction networks, but overlook two problems: error propagation and the loss of low-level semantics. In this paper, we propose a novel Interaction-based Siamese Network (ISN) to address these issues. ISN uses a siamese structure to learn the original semantics of each question and captures interaction information with question interactive units. During the interaction, each interactive unit takes the original semantic representation of the other question as an input, which mitigates the effect of error propagation. Furthermore, we propose an aggregation strategy that propagates low-level interaction features to higher levels in order to preserve low-level semantic information, and we introduce self-attention to strengthen the model’s ability to learn global interaction information. Experimental results on a real-world CQA dataset show that ISN outperforms state-of-the-art models for duplicate question detection.
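The interaction scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the encoder, the form of attention, the aggregation operator, or the final classifier, so the code below assumes plain dot-product attention, a residual update, mean aggregation over layers, and a cosine score, with no learned weights. All function names (`attend`, `interactive_stack`, `duplicate_score`) are hypothetical. The point it shows is structural: every layer attends to the other question's original encoding rather than to the other branch's intermediate output, and the outputs of all layers are aggregated before self-attention.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(queries, keys):
    """Dot-product attention: each query becomes a weighted mix of the keys."""
    out = []
    for q in queries:
        weights = softmax([dot(q, k) for k in keys])
        out.append([sum(w * k[d] for w, k in zip(weights, keys))
                    for d in range(len(q))])
    return out


def interactive_stack(own, other_original, num_layers=2):
    """Assumed stack of interactive units for one question.

    Every layer attends to the OTHER question's original encoding, so an
    error made in one layer of the other branch cannot propagate here.
    """
    state = own
    layer_outputs = []
    for _ in range(num_layers):
        interaction = attend(state, other_original)
        # Residual update (an assumption, not stated in the abstract).
        state = [[s + i for s, i in zip(srow, irow)]
                 for srow, irow in zip(state, interaction)]
        layer_outputs.append(state)
    # Aggregation (assumed to be a mean): low-level layers reach the top.
    n = len(layer_outputs)
    aggregated = [[sum(layer[t][d] for layer in layer_outputs) / n
                   for d in range(len(own[0]))]
                  for t in range(len(own))]
    # Self-attention over the aggregated features for global context.
    return attend(aggregated, aggregated)


def mean_pool(seq):
    return [sum(row[d] for row in seq) / len(seq) for d in range(len(seq[0]))]


def cosine(u, v):
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def duplicate_score(q1, q2):
    """Symmetric (siamese) scoring: the same stack is applied to both sides."""
    r1 = mean_pool(interactive_stack(q1, q2))
    r2 = mean_pool(interactive_stack(q2, q1))
    return cosine(r1, r2)
```

Here each question is a list of token vectors of equal dimension, for example `duplicate_score([[1.0, 0.0], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]])`. A trained model would replace the raw vectors with learned encodings and the cosine score with a learned classifier.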
External IDs: dblp:journals/mta/GaoYXZHZ24