Tell me why!—Explanations support learning relational and causal structure

Published: 28 Jan 2022, Last Modified: 13 Feb 2023 · ICLR 2022 Submitted
Keywords: Explanation, RL, Language, Relations, Causality
Abstract: Explanations play a considerable role in human learning, especially in areas that remain major challenges for AI—forming abstractions, and learning about the relational and causal structure of the world. Here, we explore whether machine learning models might likewise benefit from explanations. We outline a family of relational tasks that involve selecting an object that is the odd one out in a set (i.e., unique along one of many possible feature dimensions). Odd-one-out tasks require agents to reason over multi-dimensional relationships among a set of objects. We show that agents do not learn these tasks well from reward alone, but achieve >90% performance when they are also trained to generate language explaining object properties or why a choice is correct or incorrect. In further experiments, we show how predicting explanations enables agents to generalize appropriately from ambiguous, causally-confounded training, and even to meta-learn to perform experimental interventions to identify causal structure. We show that explanations help overcome the tendency of agents to fixate on simple features, and explore which aspects of explanations make them most beneficial. Our results suggest that learning from explanations is a powerful principle that could offer a promising path towards training more robust and general machine learning systems.
One-sentence Summary: We show that explanations help agents to understand relational tasks, to generalize appropriately out of distribution, and to meta-learn to identify causal structure.
Supplementary Material: zip
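
The abstract describes agents that are trained on the odd-one-out task while also being trained to generate language explanations of object properties or of why a choice is correct. The sketch below is illustrative only and is not the authors' code: it uses a supervised stand-in for the RL reward, and all names (OddOneOutAgent, score_head, expl_head), dimensions, vocabulary, and the loss weighting are assumptions made for the example.

```python
# Minimal sketch of adding an auxiliary explanation-prediction loss to an
# odd-one-out task loss (PyTorch; all names and sizes are illustrative).
import torch
import torch.nn as nn
import torch.nn.functional as F

class OddOneOutAgent(nn.Module):
    def __init__(self, feat_dim=8, hidden=64, vocab_size=32, expl_len=6):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU())
        self.score_head = nn.Linear(hidden, 1)                       # which object is the odd one out
        self.expl_head = nn.Linear(hidden, expl_len * vocab_size)    # auxiliary explanation tokens
        self.vocab_size, self.expl_len = vocab_size, expl_len

    def forward(self, objects):                    # objects: (batch, n_objects, feat_dim)
        h = self.encoder(objects)                  # (batch, n_objects, hidden)
        choice_logits = self.score_head(h).squeeze(-1)   # (batch, n_objects)
        pooled = h.mean(dim=1)                     # shared context for the explanation
        expl_logits = self.expl_head(pooled).view(-1, self.expl_len, self.vocab_size)
        return choice_logits, expl_logits

agent = OddOneOutAgent()
opt = torch.optim.Adam(agent.parameters(), lr=1e-3)

# Toy batch: 4 episodes, 5 objects each, with a known odd-one-out index and a
# tokenized target explanation (in practice these come from the environment).
objects = torch.randn(4, 5, 8)
target_choice = torch.randint(0, 5, (4,))
target_expl = torch.randint(0, 32, (4, 6))

choice_logits, expl_logits = agent(objects)
task_loss = F.cross_entropy(choice_logits, target_choice)   # supervised stand-in for the reward signal
expl_loss = F.cross_entropy(expl_logits.reshape(-1, 32), target_expl.reshape(-1))
loss = task_loss + 1.0 * expl_loss                           # explanation loss added as an auxiliary term

opt.zero_grad()
loss.backward()
opt.step()
```

In the paper the choice is learned from reward via RL and explanations are generated as language rather than from a fixed-length head; the sketch only shows the general shape of combining a task objective with an explanation-prediction objective.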