Probing Difficulty and Discrimination of Natural Language Questions With Item Response Theory

Anonymous

16 Nov 2021 (modified: 05 May 2023) · ACL ARR 2021 November Blind Submission · Readers: Everyone
Abstract: Item Response Theory (IRT) has been used extensively to characterize question difficulty for human subjects in domains including cognitive psychology and education (Primi et al., 2014; Downing, 2003). In this work, we use IRT to characterize the difficulty and discrimination of natural language questions in question-answering datasets, using HotpotQA for illustration. Our analysis reveals significant variation along both traits, as well as interdependence between them. We further explore predictive models for estimating these traits directly from the text of the questions and answers. Our experiments show that both difficulty and discrimination parameters can be predicted for new questions, and that these traits correlate with features of the questions, answers, and associated contexts. These findings have implications for the creation of new datasets and tests on the one hand, and for strategies such as active learning and curriculum learning on the other.
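The difficulty and discrimination traits discussed in the abstract come from the standard two-parameter logistic (2PL) IRT model, in which an item's difficulty shifts the response curve and its discrimination controls the curve's steepness. A minimal sketch (the function name and example values are illustrative, not taken from the paper):

```python
import math

def p_correct(theta, a, b):
    """2PL IRT model: probability that a subject with ability `theta`
    answers an item with discrimination `a` and difficulty `b` correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# At theta == b the subject has a 50% chance of answering correctly.
p_mid = p_correct(0.0, a=1.0, b=0.0)   # 0.5

# Higher difficulty b lowers the probability for the same ability.
p_easy = p_correct(0.0, a=1.0, b=-1.0)
p_hard = p_correct(0.0, a=1.0, b=1.0)

# Higher discrimination a makes the item separate abilities more sharply.
gap_low_a = p_correct(0.5, 0.5, 0.0) - p_correct(-0.5, 0.5, 0.0)
gap_high_a = p_correct(0.5, 2.0, 0.0) - p_correct(-0.5, 2.0, 0.0)
```

In the paper's setting, "subjects" are QA models rather than humans, but the item parameters are interpreted the same way.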