Enhancing Pre-Trained Generative Language Models with Question Attended Span Extraction on Machine Reading Comprehension

ACL ARR 2024 April Submission679 Authors

Published: 16 Apr 2024 (modified: 02 May 2024)
License: CC BY 4.0
Abstract: Machine Reading Comprehension (MRC) poses a significant challenge in the field of Natural Language Processing (NLP). While mainstream MRC methods predominantly rely on extractive strategies built on encoder-only models such as BERT, generative approaches suffer from $\textit{out-of-control generation}$ -- a critical problem where the generated answers are often incorrect, irrelevant, or unfaithful to the source text. To address these limitations of generative models for MRC, we introduce the $\textbf{Q}uestion$-$\textbf{A}ttended$ $\textbf{S}pan$ $\textbf{E}xtraction$ $(\textit{QASE})$ module. Integrated during the fine-tuning phase of pre-trained generative language models (PLMs), $\textit{QASE}$ significantly enhances their performance, allowing them to surpass the extractive capabilities of advanced Large Language Models (LLMs) such as GPT-4. Notably, these performance gains come with no increase in computational demands. We rigorously evaluate the $\textit{QASE}$ module across multiple datasets, where it consistently matches or surpasses state-of-the-art (SOTA) results. Our code is available at this anonymous repo link: https://anonymous.4open.science/r/QASE-7753/README.md.
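To make the core idea concrete, the sketch below shows one plausible way to attach a question-attended span-extraction head to a pre-trained generative LM and train it jointly with the generation loss during fine-tuning. This is a minimal illustration assuming a T5-style encoder-decoder backbone; the class name, attention setup, loss weight `alpha`, and all hyperparameters are hypothetical and are not the paper's exact QASE architecture.

```python
# Illustrative sketch only: an auxiliary span-extraction head over question-attended
# encoder states, added to a generative PLM during fine-tuning. Not the authors' code.
import torch
import torch.nn as nn
from transformers import AutoModelForSeq2SeqLM


class GenWithSpanExtraction(nn.Module):
    def __init__(self, model_name="google/flan-t5-base", alpha=0.5):
        super().__init__()
        self.plm = AutoModelForSeq2SeqLM.from_pretrained(model_name)
        hidden = self.plm.config.d_model
        # Encoder states attend to question tokens to become question-aware.
        self.q_attn = nn.MultiheadAttention(hidden, num_heads=8, batch_first=True)
        # Two logits per context token: span start and span end.
        self.span_head = nn.Linear(hidden, 2)
        self.alpha = alpha  # weight of the auxiliary span-extraction loss

    def forward(self, input_ids, attention_mask, labels, question_mask,
                start_positions=None, end_positions=None):
        out = self.plm(input_ids=input_ids, attention_mask=attention_mask,
                       labels=labels)
        enc = out.encoder_last_hidden_state                     # (B, L, H)
        # Queries are all encoder states; keys/values are restricted to
        # question tokens via the key_padding_mask (True = ignore position).
        attended, _ = self.q_attn(enc, enc, enc,
                                  key_padding_mask=~question_mask.bool())
        start_logits, end_logits = self.span_head(attended).split(1, dim=-1)
        loss = out.loss  # standard generation (cross-entropy) loss
        if start_positions is not None and end_positions is not None:
            ce = nn.CrossEntropyLoss()
            span_loss = (ce(start_logits.squeeze(-1), start_positions) +
                         ce(end_logits.squeeze(-1), end_positions)) / 2
            loss = loss + self.alpha * span_loss  # joint training objective
        return loss, out.logits
```

At inference time only the generative PLM is used, which is consistent with the abstract's claim that the gains come without extra computational demands; the auxiliary head serves purely as a fine-tuning signal that grounds generated answers in the source text.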
Paper Type: Long
Research Area: Generation
Research Area Keywords: controlled text generation,model architectures,efficient models
Contribution Types: NLP engineering experiment, Approaches low compute settings-efficiency, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 679