Augmentation for Context in Financial Numerical Reasoning over Textual and Tabular Data with Large-Scale Language Model
Keywords: Data Augmentation, Numerical Reasoning, Hybrid QA, Financial QA
Abstract: Constructing large-scale datasets for numerical reasoning over tabular and textual data in the financial domain is particularly challenging. Moreover, commonly used augmentation techniques prove ineffective for financial datasets. To address this challenge, this paper proposes a context augmentation methodology that generates new contexts for the original questions of a financial dataset. To do this, we leverage the hallucination capability of large-scale generative language models. Specifically, we give the language model a prompt that combines constrained instructions for context generation with the questions and arithmetic programs of the original dataset, and thereby create plausible contexts that provide evidence for the given questions. Experimental results show that reasoning performance improves when a model is trained on the FinQA dataset augmented with our methodology.
Submission Number: 29
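
As an illustration of the prompt construction described in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes a FinQA-style program string such as `subtract(5829, 5735), divide(#0, 5735)`; the constraint wording, the function names `program_operands` and `build_prompt`, and the sample question are all hypothetical, since the abstract does not give the actual instructions.

```python
import re

# Hypothetical constraint wording; the actual instructions are not given in the abstract.
CONSTRAINTS = (
    "Write a short financial report passage and a small table.",
    "Every number used in the program must appear in the passage or table.",
    "Do not state the final answer explicitly.",
)


def program_operands(program: str) -> list[str]:
    """Return the numeric literals of a FinQA-style program, in order, without duplicates.

    Step references such as #0 and named constants such as const_100 are skipped.
    """
    found = re.findall(r"(?<![#\w.])-?\d+(?:\.\d+)?", program)
    return list(dict.fromkeys(found))  # de-duplicate while keeping first-seen order


def build_prompt(question: str, program: str) -> str:
    """Combine the instruction, constraints, question, and program into one prompt string."""
    lines = [
        "Generate a new context that provides evidence for the question below.",
        "Constraints:",
    ]
    lines += [f"- {c}" for c in CONSTRAINTS]
    lines.append(f"- Required values: {', '.join(program_operands(program))}")
    lines += [f"Question: {question}", f"Program: {program}", "Context:"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical question paired with the example program above.
    print(build_prompt(
        "What was the percentage change in net revenue from 2017 to 2018?",
        "subtract(5829, 5735), divide(#0, 5735)",
    ))
```

Listing the required values explicitly is one assumed way to express a constraint that ties the generated context to the arithmetic program; the resulting string would then be sent to whichever generative language model is used.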