Abstract: We develop an empirical Bayes prior for probabilistic matrix factorization.
Matrix factorization models each cell of a matrix with two latent variables, one
associated with the cell's row and one associated with the cell's column.
How should the priors on these two latent variables be set?
Drawing on empirical Bayes principles, we estimate the priors from data, seeking
those that best match the populations of row and column latent vectors.
We call the result the twin population prior.
We develop a variational inference algorithm to simultaneously learn the empirical priors
and approximate the corresponding posterior. We evaluate this approach with both
synthetic and real-world data on diverse applications: movie ratings, book ratings, single-cell gene expression data, and musical preferences. Without needing to tune Bayesian hyperparameters, we
find that the twin population prior leads to high-quality predictions, outperforming manually tuned priors.
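As a rough sketch of the setup the abstract describes, the model and the joint objective can be written as follows. The notation is assumed for illustration and is not taken from the paper: $\theta_i$ and $\beta_j$ denote the row and column latent vectors, $\eta_r$ and $\eta_c$ the parameters of the two population priors, and $q$ the variational approximation. The inner-product likelihood is likewise an assumption; the abstract does not state the likelihood or the prior family.

```latex
% Generic probabilistic matrix factorization with learned priors
% (illustrative notation; not the paper's exact model).
\begin{align*}
  \theta_i &\sim p(\theta \,;\, \eta_r), \qquad i = 1, \dots, N, \\
  \beta_j  &\sim p(\beta \,;\, \eta_c), \qquad j = 1, \dots, M, \\
  x_{ij} \mid \theta_i, \beta_j &\sim p\!\left(x_{ij} \mid \theta_i^\top \beta_j\right).
\end{align*}
% Empirical Bayes with variational inference: maximize the evidence lower
% bound jointly over the prior parameters and the approximate posterior.
\begin{align*}
  \mathcal{L}(q, \eta_r, \eta_c)
    &= \mathbb{E}_{q}\!\left[\log p(X \mid \theta, \beta)\right]
     + \mathbb{E}_{q}\!\left[\log p(\theta \,;\, \eta_r)\right]
     + \mathbb{E}_{q}\!\left[\log p(\beta \,;\, \eta_c)\right]
     - \mathbb{E}_{q}\!\left[\log q(\theta, \beta)\right], \\
  (\hat{q}, \hat{\eta}_r, \hat{\eta}_c)
    &= \operatorname*{arg\,max}_{q,\, \eta_r,\, \eta_c} \mathcal{L}(q, \eta_r, \eta_c).
\end{align*}
```

Under this reading, the prior parameters are fit by the same optimization that fits the approximate posterior, which is why no separate hyperparameter tuning step appears.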
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=3U6yYhaQI6
Changes Since Last Submission: Prepared the camera-ready version:
- Added authors and their affiliations
- Updated URL to publicly available github repository for the implementation
- Added the acknowledgment section
- Made minor spelling corrections in the supplementary material
Code: https://github.com/blei-lab/TwinEB
Supplementary Material: zip
Assigned Action Editor: ~Tom_Rainforth1
Submission Number: 2894