Back to BERT in 2026: ModernGENA as a Strong, Efficient Baseline for DNA Foundation Models

Published: 03 Mar 2026 · Last Modified: 09 Mar 2026 · ICLR 2026 Workshop FM4Science Poster · CC BY 4.0
Keywords: genomic language models, DNA foundation models, ModernGENA, ModernBERT, Transformer encoder, masked language modeling, long-context genomics, inference efficiency, FlashAttention, BPE tokenization, Nucleotide Transformer benchmark, genomics representation learning, foundation models
TL;DR: ModernGENA shows that going “back to BERT” with modern engineering yields a strong accuracy–speed trade-off baseline for DNA foundation models.
Abstract: Recent progress in DNA language models has been increasingly driven by large and complex systems, which can obscure the impact of improvements to standard NLP architectures. In this work, we study whether and how a modernized BERT-style backbone (ModernBERT) can be adapted to genomic sequence modeling to improve computational efficiency, training stability, and long-context handling. Under controlled experimental settings, we benchmark efficiency across a range of sequence lengths and evaluate downstream performance on the Nucleotide Transformer benchmark. The resulting model, ModernGENA, achieves a strong efficiency–quality trade-off and ranks among the top-performing models in our evaluation suite. To support reproducibility and to provide a solid default reference point for future architectural work in genomics, we release the full implementation and configuration of ModernGENA as an open, reusable baseline.
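The abstract's core objective, masked language modeling over tokenized DNA, can be sketched in a few lines. The snippet below is an illustrative assumption, not ModernGENA's actual pipeline: it uses non-overlapping k-mer tokens as a stand-in for the paper's BPE tokenization, and a conventional 15% mask rate that is not stated in the abstract.

```python
import random

def tokenize_kmers(seq, k=3):
    """Split a DNA string into non-overlapping k-mer tokens
    (a simple stand-in for learned BPE tokenization)."""
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, k)]

def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Corrupt a token sequence for the MLM objective: replace a
    fraction of tokens with a mask symbol and return the corrupted
    sequence plus (position, original token) prediction targets."""
    rng = random.Random(seed)
    corrupted, targets = list(tokens), []
    n_mask = max(1, int(len(tokens) * mask_rate))
    for pos in rng.sample(range(len(tokens)), n_mask):
        targets.append((pos, tokens[pos]))
        corrupted[pos] = mask_token
    return corrupted, targets

tokens = tokenize_kmers("ACGTACGTACGTACGTACGTACGT")  # 8 3-mer tokens
corrupted, targets = mask_tokens(tokens)
```

A model trained under this objective sees `corrupted` and is scored on recovering the original tokens at the `targets` positions; the encoder backbone (here, a ModernBERT-style Transformer) is what the paper varies.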
Submission Number: 133