Position: Use Sparse Autoencoders to Discover Unknowns

Kenny Peng; Rajiv Movva; Jon Kleinberg; Emma Pierson; Nikhil Garg

Published: 30 Apr 2026, Last Modified: 24 Jun 2026

Venue: ICML 2026 Position Paper Track (regular)

License: CC BY 4.0

TL;DR: We draw a conceptual distinction: negative SAE results study tasks that act on known concepts, while positive results come from discovering unknown concepts.

Abstract: While sparse autoencoders (SAEs) have generated significant excitement, a series of negative results have added to skepticism about their usefulness. Here, we establish a conceptual distinction that reconciles competing narratives surrounding SAEs. We argue that even if SAEs may be less effective for *acting on known concepts*, SAEs are especially powerful tools for *discovering unknown concepts*. This distinction separates existing negative results from positive results, and suggests several classes of SAE applications. Specifically, we outline use cases for SAEs in (i) ML interpretability, explainability, fairness, auditing, and safety, and (ii) social and health sciences.

Primary Area: Model Understanding, Explainability, Interpretability, and Trust

Keywords: interpretability, sparse autoencoders

Submission Number: 639
