Token-Level Guided Discrete Diffusion for Membrane Protein Design

NeurIPS 2025 Workshop FPI, Submission 19

Published: 23 Sept 2025, Last Modified: 25 Nov 2025, FPI-NEURIPS2025 Poster, CC BY 4.0
Track: Main Track
Keywords: Discrete diffusion, per-token sampling, membrane protein design
TL;DR: MeMDLM designs functional membrane proteins with per-token diffusion guidance and is the first diffusion model experimentally validated for transmembrane insertion
Abstract: Reparameterized diffusion models (RDMs) have recently matched autoregressive methods in protein generation, motivating their application to membrane proteins, which contain interleaved soluble and transmembrane (TM) regions. We present the MeMbrane Diffusion Language Model (MeMDLM), a fine-tuned RDM-based protein language model for controllable membrane protein design. MeMDLM-generated sequences recapitulate the TM residue density and structural features of natural proteins and outperform state-of-the-art diffusion baselines in motif scaffolding with lower perplexity, higher BLOSUM-62 scores, and improved pLDDT confidence. To introduce specific functional properties in our designs, we develop Per-Token Guidance (PET), a classifier-guided sampling strategy that solubilizes sequences while preserving conserved TM domains, reducing TM density without disrupting functional cores. Importantly, MeMDLM designs validated in TOXCAT β-lactamase wet lab assays insert into membranes, distinguishing high-quality from poor designs. Together, MeMDLM and PET establish the first experimentally validated diffusion-based framework for rational membrane protein generation.
Submission Number: 19