Enigma: An Efficient Model for Deciphering Regulatory Genomics

Published: 02 Mar 2026, Last Modified: 02 Mar 2026 · MLGenX 2026 Poster · CC BY 4.0
Abstract: Genomic sequence-to-function models have emerged as powerful tools for deciphering cis-regulatory grammar to advance our understanding of disease biology and guide therapeutic development. Recent advances have been driven by multi-task training of large transformer-based models on thousands of genome tracks. However, these performance gains have come at significant computational cost for both training and inference, hindering large-scale applications and slowing future model development. Here, rather than continuing to scale model size and add more training tracks, we focus on architectural efficiency and train on a substantially smaller, curated set of genome tracks. Our model, Enigma, achieves performance competitive with current state-of-the-art models at single-base resolution while substantially reducing computational cost. On zero-shot variant effect prediction benchmarks, Enigma outperforms the leading open-source model, the Borzoi ensemble, while using 10.9% of its compute and improving resolution from 32 bases to a single base. Compared to AlphaGenome, Enigma achieves 90.4-97.3% of its performance using 7.5% of its estimated compute. These improvements in efficiency can facilitate further development of models for regulatory genomics. We demonstrate this by fine-tuning Enigma to predict three new molecular phenotypes (ChIP-seq, RNA half-life, and translation efficiency), achieving or exceeding the performance of state-of-the-art task-specific models. We provide Enigma for non-commercial use to benefit the broader research field.
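To make the zero-shot variant effect prediction setting concrete, the sketch below illustrates the standard recipe: run the model on the reference and alternate alleles and compare the predicted tracks. This is a minimal, hedged illustration, not Enigma's actual API; `dummy_predict`, `one_hot`, and `variant_effect_score` are hypothetical names introduced here, and the aggregation (summed absolute per-base difference) is one common choice among several.

```python
import numpy as np

def one_hot(seq: str) -> np.ndarray:
    """One-hot encode a DNA sequence as an array of shape (length, 4)."""
    mapping = {"A": 0, "C": 1, "G": 2, "T": 3}
    arr = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        if base in mapping:
            arr[i, mapping[base]] = 1.0
    return arr

def variant_effect_score(predict, ref_seq: str, alt_seq: str) -> float:
    """Score a variant as the summed absolute difference between
    predicted tracks for the reference and alternate sequences.

    `predict` maps a one-hot sequence to per-base track predictions of
    shape (length, n_tracks); single-base resolution means differences
    can be localized to individual positions.
    """
    ref_pred = predict(one_hot(ref_seq))
    alt_pred = predict(one_hot(alt_seq))
    return float(np.abs(alt_pred - ref_pred).sum())

# Stand-in for a trained model's forward pass; a real run would
# substitute the model's prediction function for `dummy_predict`.
def dummy_predict(x: np.ndarray) -> np.ndarray:
    rng = np.random.default_rng(int(x.sum()))  # deterministic per input
    return rng.normal(size=(x.shape[0], 8)).astype(np.float32)

ref = "ACGTACGTAC"
alt = "ACGTTCGTAC"  # single-base substitution at position 4
print(variant_effect_score(dummy_predict, ref, alt))
```

Because no fine-tuning on variant labels is involved, the score above is "zero-shot": the benchmark only tests whether the pretrained model's track predictions shift in an informative way when a single base changes.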
Track: Main track
Keywords: sequence-to-function modeling, genomics, regulatory genomics
AI Policy Confirmation: I confirm that this submission clearly discloses the role of AI systems and human contributors and complies with the ICLR 2026 Policies on Large Language Model Usage and the ICLR Code of Ethics.
Submission Number: 60