Abstract: In many application scenarios, practitioners aim not only to maximize predictive performance but also to obtain faithful explanations for the predictions. Rationales selected by faithful feature attribution methods provide insights into how different parts of the input contribute to the model prediction. Previous studies have explored how different factors affect faithfulness; however, these studies have mainly considered monolingual English models. The differences in explanation faithfulness between multilingual and monolingual models have yet to be explored. In this paper, we provide a comprehensive study comparing the faithfulness of these two types of models. Our extensive experiments, covering five languages and five popular feature attribution methods, show that faithfulness varies between multilingual and monolingual models. For example, multilingual mBERT is more faithful than monolingual BERT, while multilingual RoBERTa is less faithful than monolingual RoBERTa. We show that the larger the multilingual model, the less faithful its rationales are compared to those of its monolingual counterpart. Finally, we find that the faithfulness disparity is related to differences between multilingual and monolingual tokenizers: when the tokenizers of multilingual models split words more aggressively, the models' faithfulness is closer to that of their monolingual counterparts. Our code will be publicly released for reproducibility.
Paper Type: long
Research Area: Multilinguality and Language Diversity
Contribution Types: Model analysis & interpretability
Languages Studied: English, Chinese, Hindi, Spanish, French, and Multilingual models
Consent To Share Submission Details: On behalf of all authors, we agree to the terms above to share our submission details.