Abstract: In the Multiple Instance Learning (MIL) scenario, the training data consists of instances grouped into bags. Bag labels specify whether each bag contains at least one positive instance, but instance labels are not observed. Recently, Haußmann et al. [10] tackled the MIL instance label prediction task by introducing the Multiple Instance Learning Gaussian Process Logistic (MIL-GP-Logistic) model, an adaptation of the Gaussian Process Logistic Classification model that inherits its uncertainty quantification and flexibility. Notably, they give a fast mean-field variational inference procedure. However, due to their use of the logit link, they do not maximize the variational inference ELBO objective directly, but rather a lower bound on it. This approximation, as we show, hurts predictive performance. In this work, we propose the Multiple Instance Learning Gaussian Process Probit (MIL-GP-Probit) model, an adaptation of the Gaussian Process Probit Classification model to solve the MIL instance label prediction problem. Leveraging the analytical tractability of the probit link, we give a variational inference procedure based on variable augmentation that maximizes the ELBO objective directly. Applying it, we show MIL-GP-Probit is better calibrated than MIL-GP-Logistic on all 20 datasets of the benchmark 20 Newsgroups dataset collection, and achieves higher AUC than MIL-GP-Logistic on 51 of 59 additional datasets. Finally, we show how the probit formulation enables principled bag label predictions and a Gibbs sampling scheme. This is the first exact inference scheme for any Bayesian model for the MIL scenario.