TL;DR: Applying Mamba's selective scan along the sequence dimension of DNA multiple sequence alignments achieves state-of-the-art performance on long-context variant effect prediction and genomic benchmark tasks with reduced computational complexity.
Abstract: We introduce MSAMamba, a novel architecture designed to address the context-length limitation of existing DNA multiple sequence alignment (MSA) models. Traditional transformers struggle with the vast context lengths inherent in genomic MSA data, mainly due to the quadratic complexity of self-attention at large batch sizes. MSAMamba leverages a selective scan operation along the sequence dimension and separates processing along the sequence-length and MSA dimensions to improve efficiency while accounting for MSA-level inductive biases. This architecture enables scalable analysis of long DNA sequences, increasing the training context length of previous methods by 8x. In addition, we develop a row-sparse training method that significantly reduces the computational overhead of the selective scan operation during model training. We demonstrate that MSAMamba matches state-of-the-art (SOTA) transformer-based models on variant effect prediction tasks and exceeds their performance at larger context lengths. We also demonstrate that our model excels on GenomicBenchmarks tasks. Our results indicate that MSAMamba mitigates the computational challenges of long-context DNA MSA analysis and sets a new standard for scalability and efficiency in genomic modeling.
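To make the factorized design concrete, the following is a minimal sketch of the core idea described in the abstract: a (heavily simplified) gated linear recurrence standing in for the selective scan, applied independently to each MSA row along the sequence dimension, followed by a separate mixing step across the MSA dimension. All function names, the mean-based MSA mixing, and the residual combination are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def selective_scan_1d(x, decay, drive):
    """Simplified stand-in for a selective scan along the sequence axis.

    Computes the gated linear recurrence h[t] = decay[t] * h[t-1] + drive[t] * x[t].
    x, decay, drive: arrays of shape (seq_len, dim).
    """
    seq_len, dim = x.shape
    h = np.zeros(dim)
    out = np.empty_like(x)
    for t in range(seq_len):
        h = decay[t] * h + drive[t] * x[t]
        out[t] = h
    return out

def msa_block(msa, decay, drive):
    """Illustrative factorized MSA block.

    msa: (rows, seq_len, dim). Each alignment row is scanned independently
    along the sequence dimension; a simple mean across rows then serves as
    a stand-in for the separate MSA-dimension processing.
    """
    scanned = np.stack([selective_scan_1d(row, decay, drive) for row in msa])
    mixed = scanned.mean(axis=0, keepdims=True)  # MSA-axis mixing (hypothetical)
    return scanned + mixed  # residual-style combination (illustrative)
```

Because the recurrence touches each position once, the cost is linear in sequence length per MSA row, in contrast to the quadratic cost of self-attention over the same axis.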
Submission Number: 151