ShortListing Model: A Streamlined Simplex Diffusion for Biological Sequence Generation

Published: 05 Mar 2025, Last Modified: 05 Mar 2025
Venue: MLGenX 2025
License: CC BY 4.0
Track: Main track (up to 8 pages)
Abstract:

Generative modeling of discrete variables is challenging yet crucial for applications in natural language processing and biological sequence design. We introduce the Shortlisting Model (SLM), a novel simplex-based diffusion model inspired by progressive candidate pruning. SLM operates on simplex centroids, reducing complexity and enhancing scalability. Additionally, SLM incorporates a flexible implementation of classifier-free guidance, enhancing unconditional generation performance. Extensive experiments in DNA promoter and enhancer design, and protein design demonstrate SLM's competitive performance and scalability.
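For intuition about the terms "simplex centroid" and "progressive candidate pruning", the sketch below shows one way a shrinking shortlist of candidate tokens maps to points on the probability simplex. It is an illustration only, not the SLM algorithm from the paper: the function names, the fixed `scores` vector, and the rule of dropping the lowest-scoring candidate at each step are all assumptions made for this example. The one fact it relies on is that the centroid of the simplex face spanned by a set of categories is the uniform distribution over that set.

```python
import numpy as np


def simplex_centroid(shortlist, vocab_size):
    """Return the centroid of the simplex face spanned by `shortlist`.

    This is the uniform distribution over the shortlisted categories,
    with zero mass on every category that has been pruned.
    """
    point = np.zeros(vocab_size)
    point[list(shortlist)] = 1.0 / len(shortlist)
    return point


def prune_once(shortlist, scores):
    """Drop the lowest-scoring candidate (hypothetical pruning rule)."""
    if len(shortlist) == 1:
        return list(shortlist)
    worst = min(shortlist, key=lambda k: scores[k])
    return [k for k in shortlist if k != worst]


def shortlist_trajectory(scores):
    """Prune from the full vocabulary down to one candidate.

    Returns the list of simplex centroids visited along the way,
    starting at the centroid of the whole simplex and ending at a vertex.
    """
    vocab_size = len(scores)
    shortlist = list(range(vocab_size))
    trajectory = [simplex_centroid(shortlist, vocab_size)]
    while len(shortlist) > 1:
        shortlist = prune_once(shortlist, scores)
        trajectory.append(simplex_centroid(shortlist, vocab_size))
    return trajectory


if __name__ == "__main__":
    # Hypothetical scores for the four DNA bases A, C, G, T at one position.
    for step, point in enumerate(shortlist_trajectory([0.1, 0.7, 0.15, 0.05])):
        print(step, point)
```

With a four-letter DNA alphabet the trajectory has four points: it starts at the uniform distribution over all bases and ends at a one-hot vector for the single remaining base. In an actual model the pruning decisions would come from a learned network rather than a fixed score vector.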

Submission Number: 28