What is Missing in Existing Multi-hop Datasets? Toward Deeper Multi-hop Reasoning Task


16 Jun 2021 (modified: 05 May 2023) | ACL ARR 2021 Jun Blind Submission | Readers: Everyone
Abstract: Multi-hop machine reading comprehension (MRC) is a task that requires models to read and perform multi-hop reasoning over multiple paragraphs to answer a question. The task can be used to evaluate reasoning skills and to check the explainability of models, and it is useful in applications (e.g., QA systems). However, the current definition of a hop (also called a step) in multi-hop MRC is ambiguous; moreover, previous studies have demonstrated that many multi-hop examples contain reasoning shortcuts, where the question can be answered without performing multi-hop reasoning. In this opinion paper, we redefine multi-hop MRC to resolve the ambiguity of its current definition by providing three different definitions of a step. Inspired by the assessment of student learning in education, we introduce a new term, the In-depth multi-hop reasoning task, with three additional evaluations: step evaluation, coreference evaluation, and entity linking evaluation. We also examine existing multi-hop datasets against our proposed definitions and observe that they could be extended by adding more intermediate evaluations to the task. To prevent reasoning shortcuts, multi-hop MRC datasets should focus more on providing a clear definition of the steps in the reasoning process and on preparing gold data to evaluate them.
