Population Priors for Matrix Factorization

Published: 24 Dec 2024, Last Modified: 24 Dec 2024. Accepted by TMLR. License: CC BY 4.0
Abstract: We develop an empirical Bayes prior for probabilistic matrix factorization. Matrix factorization models each cell of a matrix with two latent variables, one associated with the cell's row and one associated with the cell's column. How should the priors of these two latent variables be set? Drawing on empirical Bayes principles, we estimate the priors from data, finding those that best match the populations of row and column latent vectors. We call the result the twin population prior. We develop a variational inference algorithm that simultaneously learns the empirical priors and approximates the corresponding posterior. We evaluate this approach on both synthetic and real-world data across diverse applications: movie ratings, book ratings, single-cell gene expression data, and musical preferences. Without needing to tune Bayesian hyperparameters, we find that the twin population prior leads to high-quality predictions, outperforming manually tuned priors.
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=3U6yYhaQI6
Changes Since Last Submission: Prepared the camera-ready version: added authors and their affiliations; updated the URL to the publicly available GitHub repository for the implementation; added the acknowledgment section; made minor spelling corrections in the supplementary material.
Code: https://github.com/blei-lab/TwinEB
Supplementary Material: zip
Assigned Action Editor: ~Tom_Rainforth1
Submission Number: 2894