Keywords: density ratio estimation, scaled Bregman divergence, mutual information estimation, representation learning
Abstract: Estimating the discrepancy between two densities ($p$ and $q$) is central to machine learning. The most frequently used methods quantify this discrepancy as a function of the density ratio $p/q$. In practice, closed-form expressions for these densities or their ratio are rarely available. As such, estimating density ratios accurately using only samples from $p$ and $q$ is of high significance and has led to a flurry of recent work in this direction. Among these, binary-classification-based density ratio estimators have shown great promise and have been extremely successful in specialized domains. However, estimating the density ratio with a binary classifier remains challenging when the samples from the two densities are well separated. In this work, we first show that the state-of-the-art solutions for such well-separated cases have limited applicability, may suffer from theoretical inconsistencies or lack formal guarantees, and therefore perform poorly in the general case. We then present an alternative framework for density ratio estimation that is motivated by the scaled Bregman divergence. Our proposal is to scale the densities $p$ and $q$ by another density $m$ and estimate $\log p/q$ as $\log p/m - \log q/m$. We show that if the scaling measures are constructed such that they overlap with $p$ and $q$, then a single multi-class logistic regression can be trained to accurately recover $p/m$ and $q/m$ on samples from $p$, $q$, and $m$. We formally justify our method with the scaled-Bregman theorem and show that it does not suffer from the issues that plague the existing solutions.
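To make the recipe in the abstract concrete, below is a minimal sketch of the three-class construction on a toy 1-D problem. Everything here is an illustrative assumption rather than the authors' implementation: the two Gaussian densities, the choice of $m$ as an equal mixture of $p$ and $q$ (so it overlaps both), and the scikit-learn MLP standing in for the multi-class logistic regression.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n = 5000

# Two well-separated 1-D Gaussians: p = N(0, 1), q = N(4, 1).
xp = rng.normal(0.0, 1.0, size=n)
xq = rng.normal(4.0, 1.0, size=n)

# Illustrative scaling density m: the equal mixture 0.5*p + 0.5*q,
# which by construction overlaps both p and q.
xm = np.where(rng.random(n) < 0.5,
              rng.normal(0.0, 1.0, size=n),
              rng.normal(4.0, 1.0, size=n))

# Three-class dataset: label 0 -> p, 1 -> q, 2 -> m. Equal class sizes
# mean the class priors cancel in the posterior ratios below.
X = np.concatenate([xp, xq, xm]).reshape(-1, 1)
y = np.repeat([0, 1, 2], n)

clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000,
                    random_state=0).fit(X, y)

# For a (near-)Bayes-optimal classifier with equal priors,
#   log P(y=0|x) - log P(y=2|x) ~ log p(x)/m(x),
#   log P(y=1|x) - log P(y=2|x) ~ log q(x)/m(x),
# so the m-terms cancel: log p/q ~ log P(y=0|x) - log P(y=1|x).
x_test = np.linspace(-2.0, 6.0, 9).reshape(-1, 1)
log_post = clf.predict_log_proba(x_test)
est_log_ratio = log_post[:, 0] - log_post[:, 1]

# Closed form for these unit-variance Gaussians: log p(x)/q(x) = 8 - 4x.
true_log_ratio = 8.0 - 4.0 * x_test.ravel()
print(np.c_[x_test.ravel(), est_log_ratio, true_log_ratio])
```

Any sufficiently flexible probabilistic classifier could replace the MLP here; the key point is that each sample sees only "nearby" samples from $m$, so the classification problem stays well-conditioned even though $p$ and $q$ barely overlap.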
One-sentence Summary: Improving density ratio estimation when the densities are well separated by estimating $\log p/q$ as $\log p/m - \log q/m$ with a multi-class classifier
Supplementary Material: zip