TL;DR: Applying Mamba's selective scan along the sequence dimension of DNA multiple sequence alignments achieves state-of-the-art performance on long-context variant effect prediction and genomic benchmark tasks with reduced computational complexity.
Abstract: We introduce MSAMamba, a novel architecture designed to address the context-length limitation of existing DNA multiple sequence alignment (MSA) models. Traditional transformers struggle with the vast context lengths inherent in genomic MSA data, mainly due to the quadratic complexity of self-attention at large batch sizes. MSAMamba leverages a selective scan operation along the sequence dimension and separates processing along the sequence-length and MSA dimensions to improve efficiency while accounting for MSA-level inductive biases. This architecture enables scalable analysis of long DNA sequences, increasing the training context length of previous methods by 8x. In addition, we develop a row-sparse training method that significantly reduces the computational overhead of the selective scan operation during model training. We demonstrate that MSAMamba matches state-of-the-art (SOTA) transformer-based models on variant effect prediction tasks and exceeds their performance at larger context lengths. We also demonstrate that our model excels on GenomicBenchmarks tasks. Our results indicate that MSAMamba mitigates the computational challenges of long-context DNA MSA analysis and sets a new standard for scalability and efficiency in genomic modeling.
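To make the factorized design concrete, the following is a minimal sketch of the core idea described in the abstract: a (heavily simplified) gated linear recurrence standing in for the selective scan, applied independently to each MSA row along the sequence dimension, followed by a separate mixing step across the MSA dimension. All function names, the mean-based MSA mixing, and the residual combination are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def selective_scan_1d(x, decay, drive):
    """Simplified stand-in for a selective scan along the sequence axis.

    Computes the gated linear recurrence h[t] = decay[t] * h[t-1] + drive[t] * x[t].
    x, decay, drive: arrays of shape (seq_len, dim).
    """
    seq_len, dim = x.shape
    h = np.zeros(dim)
    out = np.empty_like(x)
    for t in range(seq_len):
        h = decay[t] * h + drive[t] * x[t]
        out[t] = h
    return out

def msa_block(msa, decay, drive):
    """Illustrative factorized MSA block.

    msa: (rows, seq_len, dim). Each alignment row is scanned independently
    along the sequence dimension; a simple mean across rows then serves as
    a stand-in for the separate MSA-dimension processing.
    """
    scanned = np.stack([selective_scan_1d(row, decay, drive) for row in msa])
    mixed = scanned.mean(axis=0, keepdims=True)  # MSA-axis mixing (hypothetical)
    return scanned + mixed  # residual-style combination (illustrative)
```

Because the recurrence touches each position once, the cost is linear in sequence length per MSA row, in contrast to the quadratic cost of self-attention over the same axis.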
Submission Number: 151