HindiRC: A Dataset for Reading Comprehension in Hindi
Abstract: We release HindiRC, a new publicly available reading comprehension dataset organized by school grade. Popular work on reading comprehension does not distinguish between the levels of difficulty of understanding across grades. In HindiRC, by contrast, passages are divided according to the grades of primary education. The questions are curated by hand and are natural. This is the first dataset of its kind to be released for Hindi reading comprehension and the first with naturally curated questions for question answering in Hindi. We compare the performance of several similarity metrics on the data and give a class-wise analysis. We also use the similarity metrics to characterize the linguistic features of each grade.