Abstract: Estimating the symbolic or analytical form of probability density functions (PDFs) from observed samples is a fundamental challenge in statistical and computational modelling. This process is critical for deriving interpretable and generalizable relationships that characterize the underlying phenomenon. Traditionally, this estimation depends strongly on domain expertise and prior field-specific knowledge, with experts selecting appropriate functional forms or parametric families based on empirical evidence and theoretical understanding. The coefficients of these forms are then typically determined through parameter estimation. In this paper, we develop a framework to estimate symbolic expressions of unnormalized distributions from their observed samples. We integrate deep generative models with symbolic regression (SR), incorporating inductive biases, such as factorizing large distributions, to keep the problem tractable. The deep generative models we examine include likelihood-based models (specifically, flow models) and score-based models. Experiments show the effectiveness of the proposed framework for estimating density functions for multivariate toy distributions as well as lattice models from computational physics, namely the XY model and $\phi^4$ theory. When applied to the renormalization problem in $\phi^4$ theory, the framework discovers new expressions for the action function that are intractable by traditional analytic approaches, thereby providing physicists with a novel tool for theoretical analysis.
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Guillaume_Rabusseau1
Submission Number: 7788