Curriculum: A Broad-Coverage Benchmark for Linguistic Phenomena in Natural Language Understanding

Anonymous

16 Jan 2022 (modified: 05 May 2023) | ACL ARR 2022 January Blind Submission | Readers: Everyone
Abstract: In the age of large transformer language models, linguistic benchmarks play an important role in diagnosing models' abilities and limitations in natural language understanding. However, current benchmarks have significant shortcomings. In particular, they do not provide insight into how well a language model captures the distinct linguistic phenomena essential for language understanding and reasoning. In this paper, we introduce Curriculum, a new large-scale NLI benchmark for evaluating broad-coverage linguistic phenomena. We show that our benchmark poses a more difficult challenge for current state-of-the-art models. Our experiments also provide insight into the limitations of existing benchmark datasets. In addition, we find that sequential training on selected linguistic phenomena effectively improves generalization performance on adversarial NLI when training examples are limited.
Paper Type: long