TL;DR: We address the difficulty of training and sampling from discrete energy-based models by using a smoothed data manifold and single-step denoising, and we apply the method to antibody protein generation.
Abstract: We resolve difficulties in training and sampling from discrete energy-based models (EBMs) by learning a smoothed energy landscape, sampling the smoothed data manifold with Langevin Markov chain Monte Carlo, and projecting back to the true data manifold with one-step denoising. Our Smoothed Discrete Sampling formalism combines the attractive properties of EBMs with the improved sample quality of score-based models, while simplifying training and sampling by requiring only a single noise scale. We demonstrate the robustness of our approach on generative modeling of antibody proteins and successfully express and purify 97% of generated designs in a single round of laboratory experiments.
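
The sampling loop described in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: in place of a learned energy network it uses the closed-form score of a Gaussian-smoothed empirical distribution over a few one-hot sequences, it uses plain (overdamped, unadjusted) Langevin updates, and it starts the chain from a noised training point. All names here (`smoothed_score`, `walk`, `jump`, `sample`, `one_hot_data`) and the values of `sigma`, `step`, and `n_steps` are hypothetical choices made for illustration.

```python
import numpy as np


def smoothed_score(y, data, sigma):
    """Gradient of log p_sigma(y), where p_sigma is the empirical data
    distribution convolved with isotropic Gaussian noise of scale sigma.
    (Toy stand-in for the gradient of a learned smoothed energy.)"""
    diffs = data - y                                   # (n, d)
    logits = -np.sum(diffs ** 2, axis=1) / (2 * sigma ** 2)
    weights = np.exp(logits - logits.max())            # stable softmax
    weights /= weights.sum()
    return (weights[:, None] * diffs).sum(axis=0) / sigma ** 2


def walk(y, data, sigma, step, n_steps, rng):
    """Unadjusted Langevin MCMC on the smoothed (noisy) manifold."""
    for _ in range(n_steps):
        noise = rng.standard_normal(y.shape)
        y = y + step * smoothed_score(y, data, sigma) + np.sqrt(2 * step) * noise
    return y


def jump(y, data, sigma):
    """One-step denoising: the empirical Bayes estimate of the clean x
    given noisy y, x_hat = y + sigma^2 * grad log p_sigma(y)."""
    return y + sigma ** 2 * smoothed_score(y, data, sigma)


def sample(one_hot, sigma=0.5, step=0.01, n_steps=200, seed=0):
    """Walk on the smoothed manifold, jump back, then take the argmax per
    position to return a discrete token sequence."""
    rng = np.random.default_rng(seed)
    n, length, vocab = one_hot.shape
    flat = one_hot.reshape(n, length * vocab).astype(float)
    # Assumption: initialise the chain from a noised training example.
    start = flat[rng.integers(n)] + sigma * rng.standard_normal(length * vocab)
    y = walk(start, flat, sigma, step, n_steps, rng)
    x_hat = jump(y, flat, sigma)
    return x_hat.reshape(length, vocab).argmax(axis=1)


# Toy "sequences" over a 4-letter vocabulary, one-hot encoded: (3, 4, 4).
toy_tokens = np.array([[0, 1, 2, 3], [3, 2, 1, 0], [0, 0, 1, 1]])
one_hot_data = np.eye(4)[toy_tokens]

if __name__ == "__main__":
    print(sample(one_hot_data, seed=0))
```

Only one noise scale, `sigma`, appears in the sketch, which mirrors the single-noise-scale property the abstract highlights. In a realistic setting the analytic `smoothed_score` would be replaced by the gradient of a trained model.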