AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models

Eric Wallace, Jens Tuyls, Junlin Wang, Sanjay Subramanian, Matt Gardner, Sameer Singh

14 May 2023 | OpenReview Archive Direct Upload | Readers: Everyone

Abstract: Neural NLP models are increasingly accurate but are imperfect and opaque: they break in counterintuitive ways and leave end users puzzled at their behavior. Model interpretation methods ameliorate this opacity by providing explanations for specific model predictions. Unfortunately, existing interpretation codebases make it difficult to apply these methods to new models and tasks, which hinders adoption for practitioners and burdens interpretability researchers. We introduce AllenNLP Interpret, a flexible framework for interpreting NLP models. The toolkit provides interpretation primitives (e.g., input gradients) for any AllenNLP model and task, a suite of built-in interpretation methods, and a library of front-end visualization components. We demonstrate the toolkit's flexibility and utility by implementing live demos for five interpretation methods (e.g., saliency maps and adversarial attacks) on a variety of models and tasks (e.g., masked language modeling using BERT and reading comprehension using BiDAF). These demos, alongside our code and tutorials, are available at https://allennlp.org/interpret.
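To make the "input gradients" primitive and the saliency maps mentioned in the abstract concrete, the following is a minimal sketch of a gradient-times-input saliency map. It does not use the toolkit's API: the embedding table, the weights, and the function names (`predict`, `input_gradients`, `saliency`) are all hypothetical, and the model is an assumed toy classifier (a logistic layer over mean-pooled token embeddings) chosen so that the gradient can be written in closed form.

```python
import math

# Hypothetical toy embeddings and classifier weights (illustrative only,
# not taken from the paper or from any AllenNLP model).
EMBEDDINGS = {
    "the": [0.1, 0.0],
    "movie": [0.2, 0.1],
    "was": [0.0, 0.1],
    "great": [1.5, -0.5],
}
WEIGHTS = [2.0, -1.0]
BIAS = -0.5


def predict(vectors):
    """Positive-class probability: sigmoid(w . mean(embeddings) + b)."""
    n = len(vectors)
    mean = [sum(v[k] for v in vectors) / n for k in range(len(WEIGHTS))]
    score = sum(w * m for w, m in zip(WEIGHTS, mean)) + BIAS
    return 1.0 / (1.0 + math.exp(-score))


def input_gradients(vectors):
    """Analytic gradient of the probability w.r.t. each token embedding.

    For this toy model, d p / d e_i = p * (1 - p) * w / n for every token i.
    """
    p = predict(vectors)
    scale = p * (1.0 - p) / len(vectors)
    return [[scale * w for w in WEIGHTS] for _ in vectors]


def saliency(tokens):
    """Gradient-times-input score per token, L1-normalized to sum to 1."""
    vectors = [EMBEDDINGS[t] for t in tokens]
    grads = input_gradients(vectors)
    raw = [
        abs(sum(g * x for g, x in zip(grad, vec)))
        for grad, vec in zip(grads, vectors)
    ]
    total = sum(raw)
    return [r / total for r in raw]


tokens = ["the", "movie", "was", "great"]
scores = saliency(tokens)
for token, s in zip(tokens, scores):
    print(f"{token:>6s}  {s:.3f}")
```

In a real neural model the gradient would come from backpropagation rather than a closed-form expression, which is the kind of primitive the abstract says the toolkit exposes for any AllenNLP model.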
