Keywords: probabilistic circuits, structured sparsity
Abstract: Probabilistic circuits (PCs) are a tractable representation of probability distributions allowing for exact and efficient computation of likelihoods and marginals. Recent advancements have focused on improving the scalability and expressiveness of PCs by leveraging their sparse properties or tensorized operations. However, no existing method fully exploits both aspects simultaneously.
In this paper, we propose a novel structured sparse parameterization for the sum blocks in PCs. By replacing dense matrices with sparse Monarch matrices, we significantly reduce memory and computation costs, enabling scalable training of PCs. From a theoretical perspective, our method arises naturally from circuit multiplication; from a practical perspective, the structured sparsity of Monarch matrices facilitates efficient tensorization and parallelization. Experimental results show that our approach not only achieves state-of-the-art performance on challenging tasks, including density estimation on ImageNet32 and language model distillation, but also offers superior computational efficiency, matching baseline performance with fewer floating-point operations (FLOPs) during training.
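The abstract's core idea is replacing a dense weight matrix with a Monarch matrix: a product of two block-diagonal factors interleaved with a stride permutation, which cuts the cost of an n-dimensional matrix-vector product from O(n²) to O(n·√n). The sketch below (not the authors' implementation; block shapes, names, and the numpy-based layout are assumptions for illustration) shows the standard reshape-transpose trick for applying such a matrix, along with a dense reconstruction used to check equivalence:

```python
import numpy as np

def monarch_matmul(x, L, R):
    """Apply a Monarch-structured matrix to x (length n = b*b).

    L, R: (b, b, b) arrays holding b dense (b x b) blocks each.
    The product interleaves two block-diagonal matmuls with a
    stride permutation, realized here as reshape + transpose.
    Cost: O(n * sqrt(n)) instead of O(n^2) for a dense matrix.
    """
    b = L.shape[0]
    z = x.reshape(b, b)                 # split input into b groups of size b
    z = np.einsum('bij,bj->bi', R, z)   # right block-diagonal factor
    z = z.T                             # stride permutation
    z = np.einsum('bij,bj->bi', L, z)   # left block-diagonal factor
    return z.T.reshape(-1)              # undo permutation, flatten

def monarch_dense(L, R):
    """Materialize the equivalent dense matrix P @ Ld @ P @ Rd (for checking)."""
    b = L.shape[0]
    n = b * b
    Rd = np.zeros((n, n))
    Ld = np.zeros((n, n))
    for i in range(b):
        Rd[i*b:(i+1)*b, i*b:(i+1)*b] = R[i]
        Ld[i*b:(i+1)*b, i*b:(i+1)*b] = L[i]
    # Stride (transpose) permutation; it is an involution, so P == P^{-1}.
    P = np.eye(n)[np.arange(n).reshape(b, b).T.reshape(-1)]
    return P @ Ld @ P @ Rd
```

In a PC, each sum block's dense parameter matrix would be swapped for this factorized form; the permutation and batched block matmuls map directly onto tensorized GPU kernels, which is the parallelization benefit the abstract refers to.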
Submission Number: 22