Keywords: Explainability, Orthogonalization, Diversity, LSTMs, Transformers
Abstract: Attention-based mechanisms [6] have recently gained popularity due to the boost in performance and robustness they offer across problems in the Natural Language Processing domain. Another convenient feature of these models is that the attention distribution can be visually analyzed to flag the words and phrases that drive the model's decisions. Although this approach has been widely adopted by the research community [7] as a way to improve model explainability, some researchers argue that it provides neither faithful nor plausible insights [8]. The authors of [9] suggest that this might be due to a lack of variability in the hidden state representations. To overcome this problem, they introduce diversity-driven training and orthogonalization for LSTMs to increase the variability of the hidden states.
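To make these two ideas concrete, the sketch below shows one way they could be realized; it is our own minimal interpretation under stated assumptions, not the exact implementation from [9]. The `OrthogonalLSTM` class, the running-mean projection, and the `conicity` helper are hypothetical names introduced here for illustration: each hidden state is projected away from the mean of the previous hidden states, and conicity (mean cosine similarity of the hidden states to their mean) can be penalized in the loss for diversity-driven training.

```python
import torch
import torch.nn as nn


class OrthogonalLSTM(nn.Module):
    # Hypothetical sketch: an LSTM whose hidden state at each step is projected
    # away from the running mean of the previous hidden states, one way to
    # realize the orthogonalization idea described in [9].
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        self.hidden_size = hidden_size

    def forward(self, x):  # x: (batch, seq_len, input_size)
        batch, seq_len, _ = x.shape
        h = x.new_zeros(batch, self.hidden_size)
        c = x.new_zeros(batch, self.hidden_size)
        mean_h = x.new_zeros(batch, self.hidden_size)  # mean of past hidden states
        outputs = []
        for t in range(seq_len):
            h, c = self.cell(x[:, t, :], (h, c))
            if t > 0:
                # Gram-Schmidt-style step: remove the component of h along mean_h.
                dot = (h * mean_h).sum(dim=1, keepdim=True)
                norm_sq = (mean_h * mean_h).sum(dim=1, keepdim=True).clamp_min(1e-8)
                h = h - (dot / norm_sq) * mean_h
            mean_h = (mean_h * t + h) / (t + 1)
            outputs.append(h)
        return torch.stack(outputs, dim=1)  # (batch, seq_len, hidden_size)


def conicity(hidden_states):
    # Mean cosine similarity of each hidden state to the mean hidden state.
    # Diversity-driven training penalizes high conicity so that the hidden
    # states spread out instead of collapsing onto one direction.
    mean = hidden_states.mean(dim=1, keepdim=True)
    return nn.functional.cosine_similarity(hidden_states, mean, dim=-1).mean()
```

In training, the total objective could then take a form such as `task_loss + lam * conicity(hidden_states)`, where `lam` is a placeholder weight of our own choosing rather than a value taken from [9].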
In this work, we attempt to reproduce the results reported in [9] and extend the experiments with multilingual datasets [10]. Additionally, we apply their idea to Transformer models [11] and ensure a reproducible environment by containerizing the software with Docker [12]. We managed to reproduce most of their results; however, some datasets were not readily available due to licensing issues or were too large to process.
Paper Url: https://openreview.net/forum?id=r3R_osip5G