Abstract: Understanding a quantity in text is a challenging task, as it requires comprehending numbers within their numerical context. This complexity limits applications such as language model training and numerical fact extraction. As a result, current question-answering (QA) systems often struggle to answer quantity-related questions that go beyond simple factual lookup. This limitation stems from the datasets used to train neural QA models. To address this challenge, we aim to develop a QA dataset by generating questions that involve quantities mentioned in a text, together with valid comparative constraints within specific contexts. Our intuition is that formulating appropriate quantity-focused questions from a text will assist in creating a language model that better comprehends numerical contexts. This work proposes a bootstrapping framework that fine-tunes Large Language Models for quantity-focused question generation. Additionally, we introduce an answer generation module that combines an existing QA model with hand-crafted rules to handle quantity constraints. With this pipeline, we have created a new quantity-focused QA dataset. Experimental results indicate that this dataset improves the effectiveness of answering quantity-focused questions.