Janus: An Efficient and Expressive Subquadratic Architecture for Modeling Biological Sequences

Published: 21 Jun 2024, Last Modified: 26 Jul 2024, ES-FoMo-II 2024 Poster, CC BY 4.0
Keywords: State Space Models, Biological Sequence Modeling, Proteomics, Genomics, Gated Convolutions
TL;DR: Janus is an efficient subquadratic model that achieves SOTA performance across genomic, CRISPR, and proteomic domains with up to 23,636x fewer parameters.
Abstract: Deep learning tools such as convolutional neural networks (CNNs) and transformers have spurred great advances in computational biology. However, existing methods face architectural constraints on context length, computational complexity, and model size. This paper introduces Janus, a subquadratic architecture for sequence modeling that combines projected gated convolutions and structured state spaces to capture both local and global context at single-nucleotide resolution. Janus outperforms CNN-, GPT-, BERT-, and long-convolution-based models on many of the tested genomics tasks without pre-training and with 4x-781x fewer parameters. In the proteomics domain, Janus similarly outperforms pre-trained attention-based models, including ESM-1b and TAPE-BERT, on remote homology prediction without pre-training and while using 3,308x-23,636x fewer parameters. Janus couples these performance improvements with reduced wall-clock times, showing up to a 50x speed-up over ESM-1b and a 7x speed-up over DistilProtBert for sequences of length up to 16,384.
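To make the abstract's architectural description more concrete, below is a minimal, hypothetical PyTorch sketch of a block that pairs a projected gated short convolution (local context) with a diagonal structured state-space recurrence (global context). The class name, parameterization, and dimensions are illustrative assumptions, not the authors' implementation; a real subquadratic model would replace the explicit time loop with a parallel scan or FFT-based convolution.

```python
# Illustrative sketch only: "GatedConvSSMBlock" and its parameters are assumptions,
# not the Janus reference implementation.
import torch
import torch.nn as nn


class GatedConvSSMBlock(nn.Module):
    """Projected gated short convolution (local mixing) followed by a
    diagonal state-space recurrence (global mixing)."""

    def __init__(self, d_model: int, d_state: int = 16, kernel_size: int = 3):
        super().__init__()
        # Input projection producing a value branch and a gate branch.
        self.in_proj = nn.Linear(d_model, 2 * d_model)
        # Depthwise causal convolution captures local (e.g. nucleotide-level) context.
        self.short_conv = nn.Conv1d(
            d_model, d_model, kernel_size,
            padding=kernel_size - 1, groups=d_model,
        )
        # Diagonal SSM parameters: per-channel decay plus input/output maps.
        self.log_decay = nn.Parameter(torch.rand(d_model, d_state))
        self.B = nn.Parameter(torch.randn(d_model, d_state) / d_state**0.5)
        self.C = nn.Parameter(torch.randn(d_model, d_state) / d_state**0.5)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, length, d_model)
        b, l, d = x.shape
        v, gate = self.in_proj(x).chunk(2, dim=-1)
        # Local mixing: causal depthwise convolution along the sequence axis.
        v = self.short_conv(v.transpose(1, 2))[..., :l].transpose(1, 2)
        v = v * torch.sigmoid(gate)  # projected gating
        # Global mixing: diagonal SSM recurrence (shown sequentially for clarity;
        # a parallel scan / FFT convolution keeps the cost subquadratic in practice).
        decay = torch.exp(-torch.exp(self.log_decay))  # (d, n), values in (0, 1)
        state = x.new_zeros(b, d, decay.shape[-1])
        ys = []
        for t in range(l):
            state = decay * state + self.B * v[:, t].unsqueeze(-1)
            ys.append((state * self.C).sum(-1))
        y = torch.stack(ys, dim=1)
        return self.out_proj(y) + x  # residual connection


if __name__ == "__main__":
    # Usage example on a random single-nucleotide-resolution embedding.
    block = GatedConvSSMBlock(d_model=64)
    tokens = torch.randn(2, 128, 64)
    print(block(tokens).shape)  # torch.Size([2, 128, 64])
```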
Submission Number: 50