Prior Specification for Exposure-based Bayesian Matrix Factorization

Zicong Zhu; Issei Sato

Published: 21 Oct 2025, Last Modified: 21 Oct 2025. Accepted by TMLR. License: CC BY 4.0

Abstract: The rapid growth of the Internet has produced a surge of information and, with it, the rise of recommender systems (RSs). One of the most significant challenges facing existing RS models is data sparsity. To address this, Bayesian models have been applied to RSs because they remain effective with small sample sizes. However, the performance of Bayesian models depends heavily on the choice of prior distributions and hyperparameters. Recent research has introduced an analytical method for specifying prior distributions in generic Bayesian models. Its core is a statistical technique called Prior Predictive Matching (PPM), which optimizes hyperparameters by aligning virtual statistics generated by the prior with observed data. This approach aims to reduce the need for repeated and costly posterior inference and to improve overall Bayesian model performance. However, our evaluation of this method shows that the estimated prior specifications deviate considerably as data sparsity increases. In this study, we present an enhanced method for specifying priors in Bayesian matrix factorization models. We improve the estimators by adopting an exposure-based model that better captures data scarcity. In synthetic experiments, our method substantially improves the accuracy of hyperparameter estimation. We also explore the feasibility of applying the method to real-world datasets and provide insights into how the model's behavior changes with the level of data sparsity.
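
To make the idea of prior predictive matching concrete, the following is a minimal sketch on a toy model, not the estimators derived in the paper. It assumes a Poisson-gamma factorization (user factors drawn from Gamma(a, rate b), item factors from Gamma(c, rate d), K latent dimensions) and, as a further assumption, an independent Bernoulli exposure mask with probability `exposure_prob` that zeroes out unexposed entries. All function and parameter names are hypothetical. Under these assumptions the prior mean of an entry is `exposure_prob * K * (a / b) * (c / d)`, and matching it to an observed mean can be solved for one hyperparameter in closed form.

```python
import math
import random


def prior_mean(K, a, b, c, d, exposure_prob=1.0):
    """Analytic prior mean of one matrix entry under the toy model (assumed)."""
    return exposure_prob * K * (a / b) * (c / d)


def solve_rate_b(target_mean, K, a, c, d, exposure_prob=1.0):
    """Match the prior mean to target_mean and solve for the rate b,
    holding the other hyperparameters fixed."""
    return exposure_prob * K * a * c / (d * target_mean)


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > threshold:
        count += 1
        prod *= rng.random()
    return count


def simulated_mean(n_users, n_items, K, a, b, c, d, exposure_prob, seed=0):
    """Monte Carlo estimate of the entry mean, drawn from the prior."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    theta = [[rng.gammavariate(a, 1.0 / b) for _ in range(K)]
             for _ in range(n_users)]
    beta = [[rng.gammavariate(c, 1.0 / d) for _ in range(K)]
            for _ in range(n_items)]
    total = 0
    for t in theta:
        for v in beta:
            # Unexposed entries are recorded as zero.
            if rng.random() < exposure_prob:
                rate = sum(x * y for x, y in zip(t, v))
                total += sample_poisson(rate, rng)
    return total / (n_users * n_items)


if __name__ == "__main__":
    K, a, b, c, d, p = 5, 2.0, 2.0, 2.0, 2.0, 0.1
    observed = simulated_mean(200, 150, K, a, b, c, d, p)
    print("analytic mean:", prior_mean(K, a, b, c, d, p))
    print("simulated mean:", observed)
    print("b, exposure-aware:", solve_rate_b(observed, K, a, c, d, p))
    print("b, ignoring exposure:", solve_rate_b(observed, K, a, c, d))
```

In this toy setting, solving for `b` without the exposure factor attributes every zero to a low rate, so the recovered `b` is inflated by a factor of `1 / exposure_prob`. That is one simple way to see why matching that ignores sparsity can drift as the data become sparser.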

Submission Length: Regular submission (no more than 12 pages of main content)

Changes Since Last Submission:
## Revisions
- We revised the illustration of the BMF model in Appendix B.
- We added more interpretation of the difference between our proposed exposure-based model and the existing one in Section 3.
- We adjusted the layout of Table 1 and Figures 1, 2, and 3.

Assigned Action Editor: ~Seungjin_Choi1

Submission Number: 4854
