Diffusion-Driven Two-Stage Active Learning for Low-Budget Semantic Segmentation

Published: 18 Sept 2025, Last Modified: 29 Oct 2025, NeurIPS 2025 poster, CC BY 4.0
Keywords: Low-Budget Active Learning, Semantic Segmentation, Diffusion Models
TL;DR: A two-stage active learning pipeline that uses diffusion-based feature sampling and entropy-augmented disagreement to pick the most informative pixels under extreme labeling constraints.
Abstract: Semantic segmentation demands dense pixel-level annotations, which can be prohibitively expensive -- especially under extremely constrained labeling budgets. In this paper, we address the problem of low-budget active learning for semantic segmentation by proposing a novel two-stage selection pipeline. Our approach leverages a pre-trained diffusion model to extract rich multi-scale features that capture both global structure and fine details. In the first stage, we perform a hierarchical, representation-based candidate selection by first choosing a small subset of representative pixels per image using MaxHerding, and then refining these into a diverse global pool. In the second stage, we compute an entropy-augmented disagreement score (eDALD) over noisy multi-scale diffusion features to capture both epistemic uncertainty and prediction confidence, selecting the most informative pixels for annotation. This decoupling of diversity and uncertainty lets us achieve high segmentation accuracy with only a tiny fraction of labeled pixels. Extensive experiments on four benchmarks (CamVid, ADE-Bed, Cityscapes, and Pascal-Context) demonstrate that our method significantly outperforms existing baselines under extreme pixel-budget regimes. Our code is available at https://github.com/jn-kim/two-stage-edald.
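Below is a minimal, self-contained sketch of the two-stage idea as the abstract describes it, assuming an RBF kernel for the MaxHerding-style coverage step and a BALD-style mutual-information term for the disagreement component. The function names (`rbf_kernel`, `max_herding`, `edald_score`), the array shapes, and all hyperparameters are illustrative assumptions, not the authors' API; the actual implementation is in the linked repository.

```python
# Illustrative sketch only: names, shapes, and hyperparameters are assumptions
# based on the abstract, not the authors' implementation.
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Pairwise RBF similarities between feature rows of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def max_herding(features, k, gamma=1.0):
    """Stage 1 (sketch): greedily pick k pixels whose kernel coverage of the
    full feature set is maximal (a MaxHerding-style representativeness step)."""
    K = rbf_kernel(features, features, gamma)        # (N, N) similarities
    selected, coverage = [], np.zeros(len(features))
    for _ in range(k):
        # total coverage if each candidate were added next
        gains = np.maximum(K, coverage[None, :]).mean(axis=1)
        gains[selected] = -np.inf                    # never reselect a pixel
        best = int(np.argmax(gains))
        selected.append(best)
        coverage = np.maximum(coverage, K[best])
    return selected

def edald_score(probs_per_noise):
    """Stage 2 (sketch): entropy-augmented disagreement over predictions from
    noisy multi-scale diffusion features.
    probs_per_noise: (T, N, C) class probabilities for T noise levels."""
    eps = 1e-12
    mean_p = probs_per_noise.mean(axis=0)                               # (N, C)
    entropy = -(mean_p * np.log(mean_p + eps)).sum(-1)                  # predictive entropy
    exp_entropy = -(probs_per_noise * np.log(probs_per_noise + eps)).sum(-1).mean(0)
    disagreement = entropy - exp_entropy                                # BALD-style mutual information
    return disagreement + entropy                                       # entropy-augmented score

# Toy usage: 500 candidate pixels, 16-d features, 4 classes, 3 noise levels.
rng = np.random.default_rng(0)
feats = rng.normal(size=(500, 16))
candidates = max_herding(feats, k=50, gamma=0.1)                 # stage 1: diverse pool
logits = rng.normal(size=(3, 50, 4))
probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
scores = edald_score(probs)                                      # stage 2: informativeness
to_label = [candidates[i] for i in np.argsort(-scores)[:10]]     # budgeted query set
```

In this sketch the two criteria are decoupled exactly as the abstract states: coverage-based filtering first shrinks the candidate pool, and the uncertainty score is evaluated only on that pool, so the per-round selection cost scales with the small candidate set rather than with every pixel in the dataset.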
Primary Area: General machine learning (supervised, unsupervised, online, active, etc.)
Submission Number: 14652