Demystifying Protein Generation with Hierarchical Conditional Diffusion Models

Published: 24 Sept 2025, Last Modified: 15 Oct 2025NeurIPS2025-AI4Science PosterEveryoneRevisionsBibTeXCC BY 4.0
Track: Track 1: Original Research/Position/Education/Attention Track
Keywords: Hierarchical Conditional Diffusion
Abstract: Generating novel and functional protein sequences is critical to a wide range of applications in biology. Recent advancements in conditional diffusion models have shown impressive empirical performance in protein generation tasks. However, reliable generation of proteins remains an open research question in de novo protein design, especially when it comes to conditional diffusion models. Considering the biological function of a protein is determined by multi-level structures, we propose a novel multi-level conditional diffusion model that integrates both sequence-based and structure-based information for efficient end-to-end protein design guided by specified functions. By generating representations at different levels simultaneously, our framework can effectively model the inherent hierarchical relations between different levels, resulting in an informative and discriminative representation of the generated protein. We also propose Protein-MMD (Maximum Mean Discrepancy), a new reliable evaluation metric, to evaluate the quality of generated protein with conditional diffusion models. Our new metric is able to capture both distributional and functional similarities between real and generated protein sequences while ensuring conditional consistency. Using conditional protein generation tasks with benchmark datasets, we demonstrate the efficacy of the proposed protein generation framework and evaluation metric.
Submission Number: 223
Loading