Keywords: rationalization, causal inference, probability of causation, interpretability and trustworthiness, natural language processing, electronic health records
TL;DR: This paper leverages two causal desiderata, non-spuriousness and efficiency, to identify true rationales in the real-world natural language processing tasks.
Abstract: With recent advances in natural language processing, rationalization becomes an essential self-explaining diagram to disentangle the black box by selecting a subset of input texts to account for the major variation in prediction. Yet, existing association-based approaches on rationalization cannot identify true rationales when two or more rationales are highly intercorrelated, and thus provide a similar contribution to prediction accuracy, so-called spuriousness. To address this limitation, we novelly leverage two causal desiderata, non-spuriousness and efficiency, into rationalization from the causal inference perspective. We formally define the probability of causation in the rationale model with its identification established as the main component of learning necessary and sufficient rationales. The superior performance of our causal rationalization is demonstrated on real-world review and medical datasets with extensive experiments compared to state-of the-art methods.