Abstract: Natural language inference (NLI) has been widely used as a task to train and evaluate models for language understanding. However, the ability of NLI models to perform inferences that require understanding figurative language, such as idioms and metaphors, remains understudied. We introduce the IMPLI (Idiomatic and Metaphoric Paired Language Inference) dataset, consisting of over 25K semi-automatically generated and 1.5K hand-written English sentence pairs based on idiomatic and metaphoric phrases. We use IMPLI to evaluate NLI models based on RoBERTa fine-tuned on the MNLI dataset, and show that while they can reliably detect entailment relationships between figurative phrases and their literal definitions, they perform poorly on examples where the phrases are designed not to entail the paired definition. These results suggest the limits of current NLI models with regard to understanding figurative language, and the dataset provides a benchmark for future improvements in this direction.