Keywords: weak supervision, Bayesian, adversarial, convergence, consistency
TL;DR: Convergence analysis and empirical evaluation of a new label model.
Abstract: Programmatic Weak Supervision (PWS) aims to reduce the cost of constructing the large, high-quality labeled datasets often used to train modern machine learning models. A major component of the PWS pipeline is the label model, which amalgamates predictions from multiple noisy weak supervision sources, i.e., labeling functions (LFs), to label datapoints. While most label models are either probabilistic or adversarial, a recently proposed label model achieves strong empirical performance without falling into either camp. That label model constructs a polytope of plausible labelings from the LF predictions and outputs the "center" of that polytope as its proposed labeling. In this paper, we study that strategy theoretically by proposing Bayesian Balsubramani-Freund (BBF), a label model that implicitly constructs a polytope of plausible labelings and selects a labeling in its interior. We establish an assortment of statistical results for BBF: log-concavity of its posterior, the form of its solution, consistency, and rates of convergence. Extensive experiments compare our proposed method against twelve baseline label models on eleven datasets. BBF compares favorably to other Bayesian label models and to label models that do not use datapoint features, matching or exceeding their performance on eight of eleven datasets.
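For intuition, here is a minimal, hypothetical sketch of the polytope-of-labelings idea the abstract describes, not the paper's actual BBF model: assuming binary labels and lower bounds on each LF's accuracy, the plausible labelings form a polytope of Balsubramani-Freund-style linear constraints, and one natural interior point is its Chebyshev center, computable by linear programming. The function name, the accuracy bounds, and the choice of Chebyshev center as the "center" are all illustrative assumptions.

```python
# Hypothetical sketch of a polytope-of-labelings label model (not the paper's BBF).
# Binary labels in {-1, +1}; each LF j comes with an assumed lower bound on its
# accuracy, yielding Balsubramani-Freund-style linear correlation constraints.
# The "center" returned here is the polytope's Chebyshev center, found via an LP.
import numpy as np
from scipy.optimize import linprog

def chebyshev_center_labeling(lf_preds, acc_lower):
    """lf_preds: (m, n) array of LF votes in {-1, +1}.
    acc_lower: (m,) assumed lower bounds on LF accuracies in [0, 1].
    Returns soft labels y in [-1, 1]^n at the constraint polytope's center."""
    m, n = lf_preds.shape
    # An accuracy bound a_j translates to a correlation bound:
    # (1/n) * v_j . y >= 2*a_j - 1, rewritten in A y <= b form.
    A_corr = -lf_preds / n
    b_corr = -(2.0 * acc_lower - 1.0)
    # Box constraints keep each soft label in [-1, 1].
    A_box = np.vstack([np.eye(n), -np.eye(n)])
    b_box = np.ones(2 * n)
    A = np.vstack([A_corr, A_box])
    b = np.concatenate([b_corr, b_box])
    # Chebyshev center: maximize radius r subject to A y + r * ||A_k|| <= b.
    norms = np.linalg.norm(A, axis=1, keepdims=True)
    c = np.zeros(n + 1)
    c[-1] = -1.0  # linprog minimizes, so minimize -r to maximize r
    res = linprog(c, A_ub=np.hstack([A, norms]), b_ub=b,
                  bounds=[(None, None)] * n + [(0.0, None)])
    assert res.success, "constraint polytope is empty"
    return res.x[:n]

# Toy usage: three LFs voting on four points.
votes = np.array([[ 1,  1, -1, -1],   # perfectly accurate LF
                  [ 1,  1, -1,  1],   # 75% accurate LF
                  [ 1, -1, -1, -1]])  # 75% accurate LF
y = chebyshev_center_labeling(votes, acc_lower=np.array([0.7, 0.6, 0.6]))
print(np.sign(y))  # hard labels recovered from the interior point
```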
Supplementary Material: zip
Primary Area: General machine learning (supervised, unsupervised, online, active, etc.)
Submission Number: 26089