Semantically Enriched Text Generation for QA through Dense Paraphrasing

Published: 19 Oct 2024 · Last Modified: 09 Apr 2025 · ICNLSP 2024 · CC BY 4.0
Abstract: Large Language Models (LLMs) are very effective at extractive language tasks such as Question Answering (QA). While LLMs can improve their performance on these tasks through increases in model size (via massive pretraining) and/or in-context prompting strategies (one-shot, few-shot, chain-of-thought), we explore less resource-intensive and more efficient forms of data augmentation that yield similar boosts in performance. We define multiple forms of Dense Paraphrasing (DP) and obtain DP-enriched versions of different contexts. We demonstrate that performing QA over these semantically enriched contexts improves performance for models of various sizes and across task domains, without needing to increase model size.
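To make the idea concrete, here is a minimal sketch of the context-enrichment pipeline the abstract describes: enrich a context before answering, then run an off-the-shelf extractive QA model over both the original and the enriched version. The dense_paraphrase function below is a hypothetical stand-in (the abstract does not specify the paper's actual DP procedures, so the hard-coded enrichment and the model choice are illustrative assumptions only).

```python
# Sketch: extractive QA over an original vs. a DP-enriched context.
# Assumes the Hugging Face `transformers` library is installed.
from transformers import pipeline


def dense_paraphrase(context: str) -> str:
    """Hypothetical Dense Paraphrasing step: make implicit semantic
    content (e.g., an unstated instrument of an event) explicit.
    Here a hand-written enrichment stands in for the paper's method."""
    enrichment = " That is, the chef cut the bread with the knife."
    return context + enrichment


# Small extractive QA model fine-tuned on SQuAD.
qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")

question = "What did the chef use to cut the bread?"
context = "The chef cut the bread. A knife lay on the counter."

# The answer is only implicit in the original context...
print(qa(question=question, context=context))
# ...but explicit in the DP-enriched context, so the extractive
# model can locate it as a span.
print(qa(question=question, context=dense_paraphrase(context)))
```

The design point is that Dense Paraphrasing surfaces information the model would otherwise have to infer, so even a small extractive model can find the answer span directly, which is how enrichment can substitute for added model capacity.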