Keywords: explainability, interpretability, language model
TL;DR: We investigate how effectively LM-generated free-text rationales can provide human utility for human-AI collaboration, i.e., assist humans in solving NLP tasks.
Abstract: Recently, there has been growing interest in using language models (LMs) for human-AI collaboration. To explain their reasoning processes to humans, state-of-the-art LMs have been shown to fluently generate free-text rationales (FTRs) in natural language, e.g., via chain-of-thought prompting. Still, it remains unclear how effectively these generated FTRs can provide human utility for human-AI collaboration, i.e., assist humans in solving NLP tasks. To investigate what makes an FTR useful to humans, this paper analyzes the relationships between human utility and various LM/FTR properties. First, although LMs are often finetuned/prompted to jointly generate task labels and FTRs, we find that LMs’ task performance has little correlation with human utility, whereas LM size is a positive predictor of human utility. Second, we observe that certain FTR property pairs are strong positive predictors of human utility, e.g., high-utility FTRs tend to both be concise and contain novel information. Third, we show that high-utility FTRs for a given task instance can provide transferable knowledge that helps humans generalize to solving new instances. By shedding light on the nature of FTRs’ human utility in practical settings, our findings can help guide future work on designing LMs and FTR generation strategies for stronger human-AI collaboration.