Learning monosemantic features in multitask DNA regulatory sequence models via sparse autoencoder decomposition

Published: 06 Oct 2025, Last Modified: 06 Oct 2025
Venue: NeurIPS 2025 2nd Workshop FM4LS Poster
License: CC BY 4.0
Keywords: genomics, interpretability, gene regulation, sparse autoencoders
TL;DR: We applied sparse autoencoders to decompose the learned representations of a multitask DNA regulatory sequence model and discovered monosemantic concepts corresponding to known regulatory motifs; an interactive web interface is available for exploration.
Abstract: Deep learning models for regulatory genomics achieve high predictive performance across diverse molecular phenotypes, yet their internal representations remain opaque. Here, we apply sparse autoencoders (SAEs) to decompose the learned representations of Borzoi, a state-of-the-art CNN-transformer that predicts genome-wide transcriptional and epigenetic profiles from DNA sequence. Training TopK-SAEs on activations from Borzoi's early convolutional layers, we discover monosemantic regulatory features that correspond to transcription factor (TF) motifs, RNA-binding protein (RBP) motifs, and transposable element sequences. Through motif discovery with the MEME Suite and comparison against known TF databases, we identify hundreds of significant position weight matrices that map SAE-discovered features to established TF binding sites. This work demonstrates that SAEs can systematically decompose regulatory genomics models into biologically interpretable components.
Submission Number: 55
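
Illustrative sketch (not the authors' implementation): the abstract names TopK-SAEs without giving details, so the NumPy snippet below shows one common formulation of a TopK sparse autoencoder forward pass. Everything in it is an assumption made for illustration: the function name topk_sae_forward, the toy dimensions, the tied random weights, the ReLU applied to the kept values, and the random matrix standing in for Borzoi convolutional-layer activations.

```python
import numpy as np

def topk_sae_forward(x, W_enc, b_enc, W_dec, b_dec, k):
    """Forward pass of a TopK sparse autoencoder (one common variant).

    x: (n, d) activations; returns (latents, reconstruction).
    All names and shapes here are hypothetical, chosen for illustration.
    """
    # Encoder pre-activations, computed after subtracting the decoder bias.
    pre = (x - b_dec) @ W_enc + b_enc                 # shape (n, m)
    # Column indices of the k largest pre-activations in each row.
    idx = np.argpartition(pre, -k, axis=1)[:, -k:]
    rows = np.arange(pre.shape[0])[:, None]
    # Keep only those k values (clipped at zero); all other latents stay zero.
    latents = np.zeros_like(pre)
    latents[rows, idx] = np.maximum(pre[rows, idx], 0.0)
    # Linear decoder maps the sparse code back to activation space.
    recon = latents @ W_dec + b_dec
    return latents, recon

# Toy stand-in data: random vectors instead of real model activations.
rng = np.random.default_rng(0)
n, d, m, k = 8, 16, 64, 4
x = rng.normal(size=(n, d))
W_enc = rng.normal(size=(d, m)) / np.sqrt(d)
W_dec = W_enc.T.copy()            # tied initialisation, an assumption
b_enc = np.zeros(m)
b_dec = np.zeros(d)

latents, recon = topk_sae_forward(x, W_enc, b_enc, W_dec, b_dec, k)
mse = float(np.mean((x - recon) ** 2))   # reconstruction error one would minimise
```

In this formulation each input activates at most k latent features, which is what makes individual features candidates for a single interpretable concept such as one motif. Training would update the weights to reduce mse; that loop is omitted here.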