Thread: A Logic-Based Data Organization Paradigm for How-To Question Answering with Retrieval Augmented Generation
Abstract: Current question answering systems leveraging retrieval augmented generation perform well in answering factoid questions but face challenges with non-factoid questions, particularly how-to queries requiring detailed step-by-step instructions and explanations. In this paper, we introduce Thread, a novel data organization paradigm that transforms documents into logic units based on their inter-connectivity. Extensive experiments across open-domain and industrial scenarios demonstrate that Thread outperforms existing data organization paradigms in RAG-based QA systems, significantly improving the handling of how-to questions.
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: How-to Questions, Retrieval Augmented Generation, Large Language Models
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: English
Submission Number: 5221