Enhancing Pre-Trained Generative Language Models with Question Attended Span Extraction on Machine Reading Comprehension

ACL ARR 2024 June Submission564 Authors

12 Jun 2024 (modified: 02 Jul 2024) · ACL ARR 2024 June Submission · Readers: Everyone · License: CC BY 4.0
Abstract: Machine Reading Comprehension (MRC) poses a significant challenge in the field of Natural Language Processing (NLP). While mainstream MRC methods predominantly leverage extractive strategies using encoder-only models such as BERT, generative approaches face the issue of $\textit{out-of-control generation}$ -- a critical problem where answers generated are often incorrect, irrelevant, or unfaithful to the source text. To address these limitations in generative models for extractive MRC, we introduce the $\textbf{Q}$uestion-$\textbf{A}$ttended $\textbf{S}$pan $\textbf{E}$xtraction ($\textit{QASE}$) module. Integrated during the fine-tuning phase of pre-trained generative language models (PLMs), $\textit{QASE}$ significantly enhances their performance, allowing them to surpass the extractive capabilities of advanced Large Language Models (LLMs) such as GPT-4 in few-shot settings. Notably, these gains in performance do not come with an increase in computational demands. The efficacy of the $\textit{QASE}$ module has been rigorously tested across various datasets, consistently achieving or even surpassing state-of-the-art (SOTA) results, thereby bridging the gap between generative and extractive models in extractive MRC tasks. Our code is available at this anonymous repo link: https://anonymous.4open.science/r/QASE-7753/README.md.
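
The abstract describes QASE only at a high level: a span-extraction module, conditioned on the question, that is attached to a generative PLM during fine-tuning. As a rough illustration (not the authors' implementation, which is in the linked anonymous repository), the sketch below shows one plausible form such a module could take in PyTorch: context token states attend over question token states, a linear layer predicts span start/end logits, and the span loss is added to the generation loss. All names, shapes, and the weighting coefficient `alpha` are hypothetical.

```python
# Hypothetical sketch of a question-attended span extraction head.
# Module/function names and the loss weighting are illustrative assumptions,
# not the paper's actual architecture.
import torch
import torch.nn as nn


class QASEHead(nn.Module):
    """Span-extraction head whose context representations attend to the question."""

    def __init__(self, hidden_size: int, num_heads: int = 8):
        super().__init__()
        # Context tokens (queries) attend over question tokens (keys/values).
        self.question_attention = nn.MultiheadAttention(
            hidden_size, num_heads, batch_first=True
        )
        # Two logits per context token: span start and span end.
        self.span_classifier = nn.Linear(hidden_size, 2)

    def forward(self, context_states, question_states):
        # context_states:  (batch, ctx_len, hidden) from the PLM
        # question_states: (batch, q_len,  hidden) from the PLM
        attended, _ = self.question_attention(
            context_states, question_states, question_states
        )
        logits = self.span_classifier(attended)            # (batch, ctx_len, 2)
        start_logits, end_logits = logits.unbind(dim=-1)   # each (batch, ctx_len)
        return start_logits, end_logits


def joint_loss(gen_loss, start_logits, end_logits, start_pos, end_pos, alpha=1.0):
    """Multi-task objective: generation loss plus span-extraction loss."""
    ce = nn.CrossEntropyLoss()
    span_loss = ce(start_logits, start_pos) + ce(end_logits, end_pos)
    return gen_loss + alpha * span_loss  # alpha is a hypothetical trade-off weight
```

In such a setup the head adds only a small number of parameters on top of the PLM and is used only during fine-tuning, which is consistent with the abstract's claim that the gains come without increased computational demands at inference.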
Paper Type: Long
Research Area: Generation
Research Area Keywords: controlled text generation, model architectures, efficient models
Contribution Types: NLP engineering experiment, Approaches low compute settings-efficiency, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 564