Divide and Conquer: Learning Label Distribution with Subtasks

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · CC BY-NC-SA 4.0
Abstract: Label distribution learning (LDL) is a novel learning paradigm that models label polysemy by assigning label distributions over the label space. However, recent LDL work exhibits a notable contradiction: 1) existing LDL methods employ auxiliary tasks to enhance performance, which narrows their focus to specific applications and thereby limits generalizability; 2) conversely, LDL methods without auxiliary tasks rely on losses tailored solely to the primary task, lacking beneficial signals to guide the learning process. In this paper, we propose S-LDL, a novel and minimalist solution that generates subtask label distributions, i.e., a form of extra supervised information, to reconcile the above contradiction. S-LDL encompasses two key aspects: 1) an algorithm capable of generating subtasks without any prior/expert knowledge; and 2) a plug-and-play framework seamlessly compatible with existing LDL methods, and even adaptable to derivative tasks of LDL. Our analysis and experiments demonstrate that S-LDL is both effective and efficient. To the best of our knowledge, this paper represents the first endeavor to address LDL via subtasks.
Lay Summary: Teaching ML models to handle ambiguous labels, where an input might belong to multiple categories in different proportions, is challenging. Current approaches either: 1) rely on extra, task-specific knowledge that limits their broader use, or 2) use simple methods that miss valuable learning signals. We introduce $\mathcal{S}$-LDL, a flexible solution that automatically creates "helper tasks" (breaking the problem down into simpler sub-problems) to guide learning, with no expert input needed. Imagine teaching someone to identify all ingredients in a smoothie by first practicing with different fruit subsets (strawberry-banana, then mango-pineapple-strawberry) before tackling the full blend. $\mathcal{S}$-LDL works seamlessly with existing methods and even adapts to related derivative tasks. Experiments show it's both effective and efficient. This is the first work to tackle such problems using generated subtasks.
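To make the "subtask" idea concrete, the sketch below restricts a full label distribution to a subset of labels and renormalizes it, mirroring the smoothie analogy of practicing on fruit pairs before the full blend. This is only a toy illustration of what a subtask label distribution could look like; it is not the paper's generation algorithm, and the function name and fallback behavior are assumptions.

```python
import numpy as np

def subtask_distribution(d, subset):
    """Restrict a label distribution to a label subset and renormalize.

    Toy illustration of the 'subtask' idea, NOT the S-LDL algorithm:
    the subtask's distribution is the conditional distribution of the
    original one given that the label lies in `subset`.
    """
    d = np.asarray(d, dtype=float)
    sub = d[list(subset)]
    total = sub.sum()
    if total == 0.0:
        # Degenerate subset with zero mass: fall back to uniform
        # over the subset (an arbitrary choice for this sketch).
        return np.full(len(subset), 1.0 / len(subset))
    return sub / total

# Full distribution over 4 labels, e.g. ingredient proportions.
d = [0.5, 0.3, 0.1, 0.1]
# Subtask over labels {0, 1} (the "strawberry-banana" pair).
print(subtask_distribution(d, [0, 1]))  # -> [0.625 0.375]
```

Training against several such subset distributions alongside the full one would supply the extra supervised signal the abstract describes, without any external expert knowledge.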
Link To Code: https://github.com/SpriteMisaka/PyLDL
Primary Area: General Machine Learning->Supervised Learning
Keywords: label distribution learning, subtask, label polysemy
Submission Number: 5864