Abstract: Developing models that excel simultaneously at robust classification and high-fidelity generative modeling remains a significant challenge. While hybrid approaches like Joint Energy-Based Models (JEM) offer a path by interpreting classifiers as energy-based models (EBMs), they often rely on SGLD-based training for the generative component, which suffers from instability and poor sample quality. To address this, we propose a novel training framework that integrates adversarial training principles for both discriminative robustness and stable generative learning within a unified JEM-based architecture. Our approach introduces two key innovations: (1) replacing traditional SGLD-based EBM learning with a more stable AT-based strategy that optimizes the energy function using a Binary Cross-Entropy objective discriminating real data from contrastive samples generated via PGD attacks, and (2) a two-stage training procedure with decoupled data augmentation strategies for the discriminative and generative components. Extensive experiments on CIFAR-10, CIFAR-100, and RestrictedImageNet demonstrate that our method consistently maintains competitive robust accuracy while substantially improving generative quality compared to existing hybrid models. In addition, our model's improved generative capabilities transfer directly to producing higher-quality counterfactual examples, contributing to better model explainability. Our work presents a promising direction for building robust, stable, and high-performing joint discriminative and generative models.
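The AT-based EBM objective sketched in the abstract, a BCE discriminator between real data and PGD-generated contrastive samples, can be illustrated as follows. This is a minimal PyTorch sketch under stated assumptions, not the paper's exact procedure: the energy is taken as the negative log-sum-exp of the classifier logits (the standard JEM definition), the PGD loop and its hyperparameters (`steps`, `step_size`, `eps`), and the helper names `pgd_contrastive` and `ebm_bce_loss` are all hypothetical choices for illustration.

```python
# Hypothetical sketch: AT-based EBM training via a BCE objective that
# discriminates real data from PGD-generated contrastive samples.
# Assumes E(x) = -logsumexp(f(x)), the usual JEM energy from classifier logits.
import torch
import torch.nn.functional as F


def energy(logits):
    # E(x) = -log sum_y exp(f(x)[y]); lower energy = model considers x more "real"
    return -torch.logsumexp(logits, dim=1)


def pgd_contrastive(model, x, steps=10, step_size=2 / 255, eps=8 / 255):
    # Start from perturbed copies of real data and run PGD that *lowers*
    # the energy, yielding hard negative ("contrastive") samples near the data.
    # steps / step_size / eps are illustrative hyperparameters.
    x0 = x.detach()
    delta = torch.empty_like(x0).uniform_(-eps, eps)
    for _ in range(steps):
        delta.requires_grad_(True)
        e = energy(model(x0 + delta)).sum()
        (grad,) = torch.autograd.grad(e, delta)
        # Signed gradient step that decreases energy, projected back to the eps-ball.
        delta = (delta.detach() - step_size * grad.sign()).clamp(-eps, eps)
    return (x0 + delta).clamp(0, 1).detach()


def ebm_bce_loss(model, x_real):
    # Binary cross-entropy: label real data 1, PGD contrastive samples 0,
    # using -E(x) as the discriminator logit (higher = more real).
    x_fake = pgd_contrastive(model, x_real)
    logit_real = -energy(model(x_real))
    logit_fake = -energy(model(x_fake))
    logits = torch.cat([logit_real, logit_fake])
    labels = torch.cat(
        [torch.ones_like(logit_real), torch.zeros_like(logit_fake)]
    )
    return F.binary_cross_entropy_with_logits(logits, labels)
```

In practice this loss would be added to the usual cross-entropy classification term (and, in the two-stage procedure, trained with its own data augmentation), but those details are beyond this sketch.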