A mechanistically interpretable neural network for regulatory genomics

Alex M Tseng; Gökcen Eraslan; Tommaso Biancalani; Gabriele Scalia

26 Sept 2024 (modified: 05 Feb 2025). Submitted to ICLR 2025. License: CC BY 4.0

Keywords: interpretability, mechanistic interpretability, attention, convolution, regulatory genomics

TL;DR: We designed a neural network for DNA sequences whose weights and activations, after training, directly reveal motifs and their syntactical rules.

Abstract: Deep neural networks excel at mapping genomic DNA sequences to associated readouts (e.g., protein–DNA binding). Beyond prediction, the goal of these networks is to reveal to scientists the underlying motifs (and their syntax) that drive genome regulation. Traditional methods that extract motifs from convolutional filters suffer from the uninterpretable dispersion of information across filters and layers. Other methods, which rely on importance scores, can be unstable and unreliable. Instead, we designed a novel mechanistically interpretable architecture for regulatory genomics, in which motifs and their syntax are directly encoded in, and readable from, the learned weights and activations. We provide theoretical and empirical evidence that our architecture is fully expressive while remaining highly interpretable. Through several experiments, we show that our architecture excels at de novo motif discovery and motif instance calling, is robust to variable sequence contexts, and enables fully interpretable generation of novel functional sequences.

Supplementary Material: zip

Primary Area: interpretability and explainable AI

Submission Number: 8095