Sum-Product-Attention Networks: Leveraging Self-Attention in Energy-Based Probabilistic Circuits

Published: 26 Jul 2022 · Last Modified: 03 Nov 2024 · TPM 2022 · Readers: Everyone
Keywords: sum-product networks, attention, energy-based models
TL;DR: We propose a novel energy-based generative model that integrates probabilistic circuits with the self-attention mechanism of Transformers
Abstract: Energy-based models (EBMs) have been hugely successful both as generative models and as likelihood estimators. However, the standard sampling procedure for EBMs is inefficient and highly dependent on initialization. We introduce Sum-Product-Attention Networks (SPAN), a novel energy-based generative model that integrates probabilistic circuits with the self-attention mechanism of Transformers. SPAN uses self-attention to select the most relevant parts of probabilistic circuits (PCs), here sum-product networks (SPNs), to improve the modeling capability of EBMs. We show that, during modeling, SPAN focuses on a specific set of independence assumptions in every product layer of the SPN. Our empirical evaluations show that SPAN outperforms energy-based and classical generative models, as well as state-of-the-art probabilistic circuit models, in out-of-distribution detection. Further evaluations show that SPAN also generates higher-quality images than EBMs and PCs.
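To make the attention-to-circuit coupling concrete, here is a minimal PyTorch sketch of the idea described in the abstract, not the authors' implementation: self-attention over the input produces data-dependent mixture weights for an SPN sum node, so attention effectively selects which product-layer components (i.e., which sets of independence assumptions) are active for a given input. The class name `AttentiveSPN`, the Gaussian-leaf structure, and all sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttentiveSPN(nn.Module):
    """Sketch of a SPAN-style coupling: a tiny SPN whose sum-node
    mixture weights come from self-attention over the input.
    Names and sizes are illustrative, not the paper's code."""

    def __init__(self, num_vars: int, num_components: int, embed_dim: int = 32):
        super().__init__()
        # Gaussian leaf parameters: one (mean, log-std) per variable per component.
        self.means = nn.Parameter(torch.randn(num_components, num_vars))
        self.log_stds = nn.Parameter(torch.zeros(num_components, num_vars))
        # Self-attention over per-variable embeddings of the input.
        self.embed = nn.Linear(1, embed_dim)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=4, batch_first=True)
        # Attention summary -> data-dependent sum-node weights, i.e. attention
        # selects which SPN components matter for this input.
        self.to_weights = nn.Linear(embed_dim, num_components)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_vars) -> per-example log-likelihood, shape (batch,)
        tokens = self.embed(x.unsqueeze(-1))             # (batch, vars, embed)
        attended, _ = self.attn(tokens, tokens, tokens)  # self-attention
        log_w = torch.log_softmax(self.to_weights(attended.mean(dim=1)), dim=-1)
        # Product layer: independence across variables -> sum of leaf log-densities.
        dist = torch.distributions.Normal(self.means, self.log_stds.exp())
        leaf_ll = dist.log_prob(x.unsqueeze(1))          # (batch, comps, vars)
        comp_ll = leaf_ll.sum(dim=-1)                    # (batch, comps)
        # Sum node: log-sum-exp mixture with attention-derived weights.
        return torch.logsumexp(log_w + comp_ll, dim=-1)

model = AttentiveSPN(num_vars=8, num_components=16)
x = torch.randn(4, 8)
print(model(x))  # per-example log-likelihoods
```

In the full SPAN model the attention output modulates a deeper SPN and the whole model is trained as an EBM; this sketch only illustrates the attention-to-sum-weights coupling on a single sum node.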
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/sum-product-attention-networks-leveraging/code)
