Keywords: multiple models, unlabeled data, mixture models
TL;DR: We propose a framework for comparing classifiers with both labeled and unlabeled data via a mixture modeling technique.
Abstract: It remains difficult to select a machine learning model from a set of candidates in the absence of a large, labeled dataset. To address this challenge, we propose a framework for comparing multiple models that leverages three features of modern machine learning settings: access to multiple classifiers, continuous predictions on all examples, and abundant unlabeled data. The key idea is to estimate the joint distribution of classifier predictions using a mixture model, where each component corresponds to a different class. We present preliminary experiments on a large health dataset and conclude with future directions.
Submission Number: 65
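The sketch below illustrates the core idea from the abstract: fit a mixture model to the joint distribution of classifier scores on unlabeled data, with one component per class, and use the inferred components to compare classifiers. The abstract does not specify the mixture family or the comparison criterion, so the Gaussian components, the synthetic scores, the 0.5 threshold, and the agreement-based ranking here are illustrative assumptions, not the authors' implementation.

```python
# A minimal sketch (assumptions noted above): mixture model over the joint
# predictions of several classifiers on unlabeled examples.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical setup: 3 classifiers score 10,000 unlabeled examples;
# each row of `scores` holds one example's score from every classifier.
n_unlabeled, n_classifiers, n_classes = 10_000, 3, 2
latent = rng.integers(0, n_classes, size=n_unlabeled)  # unknown true class (synthetic)
scores = np.column_stack([
    np.clip(latent * 0.6 + 0.2 + rng.normal(0, 0.15, n_unlabeled), 0, 1)
    for _ in range(n_classifiers)
])  # shape (n_unlabeled, n_classifiers)

# Fit a mixture over the joint score vectors; one component per class.
gmm = GaussianMixture(n_components=n_classes, covariance_type="full", random_state=0)
gmm.fit(scores)

# Posterior component memberships serve as soft pseudo-labels.
posteriors = gmm.predict_proba(scores)  # shape (n_unlabeled, n_classes)

# Illustrative comparison: treat the component with the higher mean score as
# the positive class, then rank classifiers by agreement with the pseudo-labels.
positive_component = int(np.argmax(gmm.means_.mean(axis=1)))
pseudo_labels = (posteriors[:, positive_component] > 0.5).astype(int)
for j in range(n_classifiers):
    agreement = np.mean((scores[:, j] > 0.5).astype(int) == pseudo_labels)
    print(f"classifier {j}: agreement with mixture pseudo-labels = {agreement:.3f}")
```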