Are Machine Reading Comprehension Systems Robust to Context Paraphrasing?

Anonymous

17 Apr 2023, ACL ARR 2023 April Blind Submission, Readers: Everyone

Abstract: Pre-trained Machine Reading Comprehension (MRC) models have achieved superhuman performance on existing benchmark datasets, yet investigating their behaviour under various types of test-time perturbations can shed light on how to improve their robustness and generalisation capability. In this paper, we study the robustness of contemporary MRC systems to context paraphrasing, i.e., whether these models can still answer the questions correctly once the reading passages have been paraphrased. To this end, we systematically design a pipeline that semi-automatically generates perturbed MRC instances, which ultimately leads to the creation of a paraphrased test set. We conduct experiments on this test set with six state-of-the-art neural MRC models and find that even the smallest performance drop among these models exceeds 28%, whereas human performance remains high. These results demonstrate that existing high-performing MRC systems are still far from real language understanding.

Paper Type: short

Research Area: Question Answering
