What Does a Chemical Language Model Know About Molecules?

Christian Kenneth; Etowah Adams; Liam Bai; Gerard JP van Westen

Published: 11 Jun 2026, Last Modified: 20 Jun 2026

Venue: Mech Interp Workshop ICML 2026 (Poster)

License: CC BY 4.0

Keywords: Concept Discovery (e.g., SAEs, dictionary learning), Interpretability for Knowledge Discovery

Other Keywords: chemical language model

TL;DR: We use sparse autoencoders to decode the underlying molecular representations learned by MolFormer.

Abstract: Chemical language models (cLMs) are widely assumed to learn surface-level syntactic patterns rather than meaningful molecular semantics. Here, we apply sparse autoencoders (SAEs) to MolFormer, an encoder-only cLM, to mechanistically examine how molecular representations are built across layers. We find that early layers rely on position-tracking latents to parse molecular grammar, while later layers encode atom-in-substructure and pharmacologically relevant features. Additionally, we show that non-canonical SMILES produce more disruptive representation shifts than invalid SMILES, an effect driven by position-latent disruption that propagates across layers. To support further exploration, we develop InterMol, an interactive visualizer for SAE activations on molecular strings and structures.

Submission Number: 263
