Zero-Shot Extrapolation in State-Space Models for Long-Range Genomics

Published: 05 Mar 2025, Last Modified: 16 Apr 2025 · ICLR 2025 AI4NA Poster · CC BY 4.0
Track: tiny / short paper (up to 3 pages)
Keywords: Genomics, Long Range, Extrapolation, Zero-Shot, State Space Models, Transformers, Large Language Models, DNA Language Model
TL;DR: We show that SSMs can zero-shot extrapolate two orders of magnitude beyond their original context length without performance loss, unlike their transformer counterparts. We demonstrate this up to 1 Mbp, enabled by our hidden-state transfer mechanism.
Abstract: Long-range dependencies are crucial for interpreting genomic structure and function, yet conventional transformer-based genomics models often fail to generalize beyond their training window, even when employing sophisticated positional embeddings. We show that State-Space Models (SSMs) can zero-shot extrapolate two orders of magnitude beyond their original context length, thus capturing the distal regulatory interactions required for gene expression without specialized fine-tuning. With our hidden-state transfer mechanism, we can efficiently process ultra-long genomic sequences (1 Mbp) on a single GPU, providing a scalable, generalizable, and resource-efficient alternative to transformers.
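The hidden-state transfer idea can be sketched as follows: because an SSM is a recurrence, a long sequence can be processed in fixed-size chunks, with the recurrent hidden state carried across chunk boundaries so that memory is bounded by the chunk length rather than the full sequence. This is a minimal illustrative sketch with a toy linear SSM; all function names, shapes, and parameters are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch (assumed, not the paper's code): chunked SSM
# processing with hidden-state transfer across chunk boundaries.
import numpy as np

def ssm_chunk(x, h, A, B, C):
    """Run the recurrence h_t = A @ h_{t-1} + B @ x_t over one chunk,
    returning outputs y_t = C @ h_t and the final hidden state."""
    ys = []
    for x_t in x:
        h = A @ h + B @ x_t
        ys.append(C @ h)
    return np.stack(ys), h

def process_long_sequence(x, A, B, C, chunk_len=4):
    """Process an arbitrarily long sequence chunk by chunk,
    transferring the hidden state between chunks."""
    h = np.zeros(A.shape[0])
    outs = []
    for start in range(0, len(x), chunk_len):
        y, h = ssm_chunk(x[start:start + chunk_len], h, A, B, C)
        outs.append(y)
    return np.concatenate(outs)

rng = np.random.default_rng(0)
d_in, d_state = 2, 3
A = 0.5 * np.eye(d_state)                 # stable state transition
B = rng.standard_normal((d_state, d_in))
C = rng.standard_normal((1, d_state))
x = rng.standard_normal((16, d_in))       # stand-in for a token sequence

# Chunked processing with state transfer matches one full-length pass,
# so arbitrarily long inputs fit on a single device.
y_chunked = process_long_sequence(x, A, B, C, chunk_len=4)
y_full, _ = ssm_chunk(x, np.zeros(d_state), A, B, C)
assert np.allclose(y_chunked, y_full)
```

Because the recurrence is exact under state transfer, the chunked pass reproduces the full-context computation, which is what lets an SSM trained on short windows be applied zero-shot to megabase-scale inputs.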
Submission Number: 21