Back to BERT in 2026: ModernGENA as a Strong, Efficient Baseline for DNA Foundation Models

Published: 02 Mar 2026, Last Modified: 10 Mar 2026
Venue: Gen² 2026 Poster
License: CC BY 4.0
Track: Full / long paper (5-8 pages)
Keywords: genomic language models, DNA foundation models, ModernGENA, ModernBERT, Transformer encoder, masked language modeling, long-context genomics, inference efficiency, FlashAttention, BPE tokenization, Nucleotide Transformer benchmark, genomics representation learning, foundation models
TL;DR: ModernGENA shows that going “back to BERT” with modern engineering yields a strong accuracy–speed trade-off baseline for DNA foundation models.
Abstract: Recent progress in DNA language models has been increasingly driven by large and complex systems, which can obscure the impact of improvements to standard NLP architectures. In this work, we study whether and how a modernized BERT-style backbone (ModernBERT) can be adapted to genomic sequence modeling to improve computational efficiency, training stability, and long-context handling. Under controlled experimental settings, we benchmark efficiency across a range of sequence lengths and evaluate downstream performance on the Nucleotide Transformer benchmark. The resulting model, ModernGENA, achieves a strong efficiency–quality trade-off and ranks among the top-performing models in our evaluation suite. To support reproducibility and to provide a solid default reference point for future architectural work in genomics, we release the full implementation and configuration of ModernGENA as an open, reusable baseline.
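The abstract describes pretraining a BERT-style encoder on DNA with masked language modeling. As a purely illustrative sketch (not the authors' code, and with function names of our own choosing), the standard BERT-style 80/10/10 token-corruption scheme applied to a nucleotide sequence looks like:

```python
import random

# Hypothetical illustration of BERT-style masked language modeling
# corruption on a DNA token sequence; this is NOT the paper's code.
MASK = "[MASK]"
NUCLEOTIDES = ["A", "C", "G", "T"]

def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Return (corrupted tokens, labels); labels are None where no loss applies."""
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # model must predict the original token here
            r = rng.random()
            if r < 0.8:
                corrupted.append(MASK)                    # 80%: replace with [MASK]
            elif r < 0.9:
                corrupted.append(rng.choice(NUCLEOTIDES)) # 10%: random token
            else:
                corrupted.append(tok)                     # 10%: keep unchanged
        else:
            corrupted.append(tok)
            labels.append(None)  # position excluded from the loss

    return corrupted, labels

corrupted, labels = mask_tokens(list("ACGTACGTACGTACGT"))
```

In practice a genomic model would operate on BPE subword tokens rather than single nucleotides (the abstract mentions BPE tokenization), but the corruption logic is the same.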
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 56