Abstract: The ideas of aleatoric and epistemic uncertainty are widely used to reason about the probabilistic predictions of machine-learning models. We identify incoherence in existing discussions of these ideas and suggest that this stems from the aleatoric-epistemic view being insufficiently expressive to capture all the distinct quantities that researchers are interested in. To address this, we present a decision-theoretic perspective that relates rigorous notions of uncertainty, predictive performance and statistical dispersion in data. This serves to support clearer thinking as the field moves forward. Additionally, we provide insights into popular information-theoretic quantities, showing that they can be poor estimators of what they are often purported to measure, while also explaining how they can still be useful in guiding data acquisition.
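The abstract does not name the information-theoretic quantities in question, but a common instance in the literature is the decomposition of predictive entropy into conditional entropy (often labelled "aleatoric") and mutual information (often labelled "epistemic", as in the BALD acquisition score). Below is a minimal numerical sketch of that decomposition, assuming an ensemble approximation to the posterior over model parameters; the probabilities are illustrative only and are not taken from the paper.

```python
import numpy as np

def entropy(p, axis=-1):
    """Shannon entropy in nats, clipping to guard against log(0)."""
    return -np.sum(p * np.log(np.clip(p, 1e-12, None)), axis=axis)

# Hypothetical ensemble: each row is p(y | x, theta_m) for one sampled model theta_m.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.4, 0.4, 0.2],
])

total = entropy(probs.mean(axis=0))   # H[E_theta p(y|x,theta)], "total uncertainty"
conditional = entropy(probs).mean()   # E_theta H[p(y|x,theta)], often labelled "aleatoric"
mutual_info = total - conditional     # I(y; theta | x), often labelled "epistemic" (BALD)

print(f"total={total:.3f}  conditional={conditional:.3f}  mutual_info={mutual_info:.3f}")
```

The paper's claim is that such labels can be misleading as estimators of the quantities they purport to measure, even where the decomposition itself remains useful for guiding data acquisition.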
Lay Summary: If a machine-learning model has some predictive uncertainty, we might want to reason about where that uncertainty comes from. A popular thought is that we can break down predictive uncertainty into two parts: an "aleatoric" (or "chance") part relating to a sense of inherent unpredictability in the world, and an "epistemic" (or "knowledge") part relating to the model's lack of knowledge. We show that this perspective is too simplistic in the context of machine learning, preventing an appropriately nuanced understanding of key ideas. To address this, we revisit foundational ideas from decision theory and use them to provide a new perspective on uncertainty in machine learning. Our hope is that this will help avoid some of the confusions that we believe have been caused by the existing perspective, and in turn improve researchers' ability to understand and design machine-learning methods.
Link To Code: https://github.com/fbickfordsmith/rethinking-aleatoric-epistemic
Primary Area: Probabilistic Methods
Keywords: aleatoric, epistemic, uncertainty
Submission Number: 6990