Keywords: Genomics, Hyena, Foundation Models, Large Language Models, Mixture of Experts
TL;DR: HyenaMoE: A Hybrid and Scalable Architecture for Efficient Genomic Modeling
Abstract: DNA sequences serve as the fundamental blueprint of cellular life, encoding critical information for gene regulation, protein synthesis, and a broad spectrum of essential biological processes. Owing to their sequential structure, DNA sequences bear similarities to natural language, motivating the adaptation of large language model architectures and the pretraining–finetuning paradigm to genomics. This has led to the emergence of genomic foundation models that perform well across a wide range of downstream tasks. Nonetheless, current approaches face structural limitations. Transformer-based models have strong representational capacity for local context, making them well suited to tasks involving short sequences, but their scalability is limited by the quadratic complexity of self-attention. In contrast, methods based on state space models offer high computational efficiency and can process long-range genomic inputs, but they generally underperform Transformer counterparts on shorter sequences. To address these limitations, we introduce HyenaMoE, a unified hybrid architecture for genomic modeling built on 3-mer tokenization. HyenaMoE combines efficient HyenaLite blocks for long-range dependency modeling with attention layers enhanced by Mixture-of-Experts routing, enabling scalable capacity expansion and more efficient allocation of model resources across diverse inputs. This design strikes a favorable balance between model expressiveness and computational efficiency. Experiments on three representative benchmarks demonstrate that HyenaMoE achieves state-of-the-art performance across a diverse array of genomic prediction tasks.
Supplementary Material: pdf
Primary Area: foundation or frontier models, including LLMs
Submission Number: 6114
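Illustrative note: the abstract names two standard ingredients, 3-mer tokenization and Mixture-of-Experts routing, without specifying their implementation; the paper's HyenaLite blocks and exact routing scheme are not described on this page. The PyTorch sketch below is therefore only a minimal, assumption-laden illustration of those two generic ingredients (all class and function names here, such as kmer_tokenize and MoEFeedForward, are hypothetical and not the authors' code).

# Minimal sketch, assuming overlapping 3-mer tokens over the A/C/G/T alphabet
# and a token-wise top-1 MoE feed-forward layer; not the HyenaMoE implementation.
import itertools
import torch
import torch.nn as nn
import torch.nn.functional as F

def kmer_tokenize(seq: str, k: int = 3) -> list[int]:
    """Map a DNA string to overlapping k-mer token ids (assumes A/C/G/T only)."""
    vocab = {"".join(p): i for i, p in enumerate(itertools.product("ACGT", repeat=k))}
    return [vocab[seq[i:i + k]] for i in range(len(seq) - k + 1)]

class MoEFeedForward(nn.Module):
    """Token-wise top-1 MoE: a linear router picks one expert MLP per token."""

    def __init__(self, d_model: int, d_ff: int, num_experts: int = 4):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, seq, d_model)
        gate = F.softmax(self.router(x), dim=-1)          # (batch, seq, num_experts)
        top_w, top_idx = gate.max(dim=-1)                 # winning expert per token
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = top_idx == e                           # tokens routed to expert e
            if mask.any():
                out[mask] = top_w[mask].unsqueeze(-1) * expert(x[mask])
        return out

if __name__ == "__main__":
    ids = kmer_tokenize("ACGTACGTAC")                     # 8 overlapping 3-mer tokens
    emb = nn.Embedding(4 ** 3, 64)                        # 64 = hypothetical model width
    h = emb(torch.tensor(ids)).unsqueeze(0)               # (1, 8, 64)
    print(MoEFeedForward(64, 128)(h).shape)               # torch.Size([1, 8, 64])

In a hybrid stack of the kind the abstract describes, a layer such as MoEFeedForward would sit alongside attention and long-convolution (Hyena-style) blocks; how HyenaMoE interleaves and routes between them is specified only in the paper itself.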