Keywords: Explainable AI, Named Entity Recognition, Language Models, Natural Language Processing
Abstract: In recent decades, deep learning approaches have achieved impressive results in many research fields,
such as Computer Vision and Natural Language Processing (NLP).
NLP in particular has greatly benefited from unsupervised methods that learn distributed representations of language.
In the race for better performance, Language Models have now reached hundreds of billions of parameters.
Despite these remarkable results, deep models are still far from being fully exploited in real-world applications.
Indeed, these approaches are black boxes, i.e., they are neither interpretable by design nor explainable, properties that are often crucial for decision-making in business.
Several task-agnostic methods have been proposed in the literature to explain models' decisions.
Most techniques rely on the "local" assumption, i.e., explanations are produced example-wise.
In this paper, instead, we present a post-hoc method that produces highly interpretable global rules to explain NLP classifiers.
Rules are extracted with a data mining approach from a semantically enriched input representation, rather than from words/wordpieces alone.
Semantic information yields more abstract and general rules that are both more explanatory and less complex, while also better reflecting the model's behaviour.
In the experiments we focus on Named Entity Recognition (NER), an NLP task whose explainability is under-investigated.
We explain the predictions of BERT NER classifiers trained on two popular benchmarks, CoNLL03 and OntoNotes, and compare our method against LIME.
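For illustration only, and not the authors' implementation: the sketch below mines simple global "feature -> predicted label" rules from semantically enriched tokens, in the spirit of the data-mining approach described above. All feature names, labels, thresholds, and data are hypothetical toy values; a real run would use the predictions of a trained NER model such as BERT fine-tuned on CoNLL03.

```python
# Minimal sketch (hypothetical, not the paper's method): mine global
# "feature -> predicted label" rules from semantically enriched tokens.
from collections import Counter

# Each token is a set of enriched features (POS, shape, gazetteer hits,
# etc.) paired with the label the model predicted for it. Toy data only.
tokens = [
    ({"pos=PROPN", "shape=Xxxxx", "in_gazetteer=city"},   "LOC"),
    ({"pos=PROPN", "shape=Xxxxx", "in_gazetteer=person"}, "PER"),
    ({"pos=PROPN", "shape=XXX",   "in_gazetteer=org"},    "ORG"),
    ({"pos=PROPN", "shape=Xxxxx", "in_gazetteer=city"},   "LOC"),
    ({"pos=NOUN",  "shape=xxxx"},                         "O"),
]

MIN_SUPPORT, MIN_CONFIDENCE = 2, 0.9  # illustrative thresholds

feature_counts = Counter()  # how often each feature occurs
rule_counts = Counter()     # how often feature and label co-occur
for features, label in tokens:
    for f in features:
        feature_counts[f] += 1
        rule_counts[(f, label)] += 1

# Keep frequent, high-confidence rules: they describe the model's
# behaviour globally, across the whole dataset.
for (f, label), n in sorted(rule_counts.items()):
    support, confidence = n, n / feature_counts[f]
    if support >= MIN_SUPPORT and confidence >= MIN_CONFIDENCE:
        print(f"{f} -> {label} (support={support}, confidence={confidence:.2f})")
```

Unlike LIME, which perturbs a single input to explain one prediction locally, rules of this form summarize the classifier's behaviour over the entire dataset, which is the contrast the abstract draws.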
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Social Aspects of Machine Learning (eg, AI safety, fairness, privacy, interpretability, human-AI interaction, ethics)