COMPACT: Compressing Retrieved Documents Actively for Question Answering

ACL ARR 2024 June Submission 5152 Authors

16 Jun 2024 (modified: 02 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Retrieval-augmented generation allows language models to strengthen their factual grounding by providing external context. However, language models often struggle to locate and integrate information spread across long inputs, which limits their effectiveness on complex questions. Query-focused compression addresses this issue by filtering out information irrelevant to the query, but current methods still falter in realistic scenarios where crucial information cannot be located in a single step. To overcome this limitation, we introduce COMPACT, a novel framework that employs an active strategy to condense extensive documents without losing key information. COMPACT operates flexibly as a cost-efficient plug-in module with any off-the-shelf retriever or reader model, achieving extremely high compression rates (44x). Our experiments demonstrate that COMPACT brings significant improvements in both compression rate and QA performance on multi-hop question-answering datasets.
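As a rough illustration of how such a plug-in compressor might sit between an off-the-shelf retriever and reader, the sketch below processes retrieved documents in segments, folding each segment into a running query-focused summary and stopping early once the summary appears sufficient. The helper callables (`summarize`, `is_complete`), the segment size, and the termination heuristic are illustrative assumptions for this sketch, not the paper's actual implementation.

```python
from typing import Callable, List


def active_compress(
    question: str,
    documents: List[str],
    summarize: Callable[[str, str, str], str],
    is_complete: Callable[[str, str], bool],
    segment_size: int = 5,
    max_iterations: int = 4,
) -> str:
    """Iteratively condense retrieved documents into a compact, query-focused summary.

    `summarize` and `is_complete` stand in for LLM calls: the first condenses a
    segment of documents together with the summary so far, the second judges
    whether the summary already holds enough evidence to answer the question.
    """
    summary = ""
    for iteration, start in enumerate(range(0, len(documents), segment_size)):
        segment = "\n\n".join(documents[start : start + segment_size])
        # Fold the new segment into the existing summary, keeping only
        # information relevant to the question.
        summary = summarize(question, summary, segment)
        # Terminate early once the summary seems sufficient, or after a budget
        # of iterations is exhausted.
        if is_complete(question, summary) or iteration + 1 >= max_iterations:
            break
    return summary
```

In use, the returned summary would simply replace the raw retrieved documents in the reader's prompt, which is what makes the compressor a drop-in module for existing retriever-reader pipelines.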
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: multihop QA, open-domain QA
Contribution Types: Approaches for low compute settings-efficiency
Languages Studied: English
Submission Number: 5152