Keywords: Multi-label classification; Noisy multi-label learning; Diffusion model; Co-occurrence-aware
TL;DR: We propose a Co-Occurrence-Aware Diffusion (CAD) model that improves noisy multi-label learning by using generative methods, achieving over 5% performance improvement.
Abstract: Noisy labels often cause models to overfit, especially in multi-label classification tasks. Existing methods for noisy multi-label learning (NML) primarily follow a discriminative paradigm, which relies on noise transition matrix estimation or small-loss strategies to correct noisy labels. However, these methods still face substantial optimization difficulties compared with noisy single-label learning. In this paper, we propose a Co-Occurrence-Aware Diffusion (CAD) model, which reformulates NML from a generative perspective. We treat features as conditions and multi-labels as diffusion targets, optimizing the diffusion model for multi-label learning with theoretical guarantees. Benefiting from the diffusion model's strength in capturing multi-object semantics and structured label matrix representation, we can effectively learn the posterior mapping from features to true multi-labels. To mitigate the interference of noisy labels in the forward process, we guide generation using pseudo-clean labels reconstructed from the latent neighborhood space, replacing original point-wise estimates with neighborhood-based proxies. In the reverse process, we further incorporate label co-occurrence constraints to enhance the model's awareness of incorrect generation directions, thereby promoting robust optimization. Extensive experiments on both synthetic (Pascal-VOC, MS-COCO) and real-world (NUS-WIDE) noisy datasets demonstrate that our approach outperforms state-of-the-art methods.
Supplementary Material: zip
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 802