Make Small Data Great Again: Learning from Partially Annotated Data via Policy Gradient for Multi-Label Classification Tasks
Traditional supervised learning methods rely heavily on human-annotated datasets. However, obtaining comprehensive human annotations is difficult in many tasks, especially multi-label tasks. We therefore investigate the understudied problem of partially annotated multi-label classification, in which a model learns from a multi-label dataset where only a subset of the positive classes is annotated. This setting poses two challenges: a scarcity of positive annotations and severe label imbalance. To overcome them, we propose Partially Annotated reinforcement learning with a Policy Gradient algorithm (PAPG), a framework that combines the exploration capabilities of reinforcement learning with the exploitation strengths of supervised learning. By introducing local and global rewards to address class imbalance and employing an iterative training strategy equipped with data enhancement, our framework demonstrates its effectiveness and superiority across diverse classification tasks.
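To make the core idea concrete, below is a minimal, hypothetical sketch (not the authors' PAPG implementation) of a REINFORCE-style policy gradient for partially annotated multi-label data: each class has a Bernoulli "predict positive" policy, a per-class local reward encourages recovering annotated positives while only mildly penalizing positive predictions on unannotated classes (which may be unlabeled positives), and a shared global reward reflects recall over the annotated positives. The reward magnitudes (1.0, -0.1, 0.05) and the recall-based global reward are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_classes = 16, 5
W = np.zeros((n_classes, n_features))  # one linear Bernoulli policy per class


def sigmoid(z):
    # clip to avoid overflow in exp for large logits
    return 1.0 / (1.0 + np.exp(-np.clip(z, -60.0, 60.0)))


def reinforce_step(W, x, annotated_pos, lr=0.1):
    """One policy-gradient update on a single example.

    x: feature vector; annotated_pos: indices of the *known* positive
    classes (other classes may be positive but unannotated).
    """
    p = sigmoid(W @ x)                 # per-class probability of acting "positive"
    a = rng.random(n_classes) < p      # sample binary actions from the policy
    known = np.isin(np.arange(n_classes), annotated_pos)
    # Local (per-class) reward: reward hitting annotated positives; only a
    # small penalty for positives on unannotated classes (assumed values).
    local = np.where(known,
                     np.where(a, 1.0, -1.0),
                     np.where(a, -0.1, 0.05))
    # Global reward: recall over annotated positives, shared by all classes.
    glob = float(a[annotated_pos].mean()) if len(annotated_pos) else 0.0
    r = local + glob
    # REINFORCE gradient for a Bernoulli-logistic policy: (a - p) * x, scaled
    # by the reward; updates W in place.
    grad = ((a.astype(float) - p) * r)[:, None] * x[None, :]
    W += lr * grad
    return float(r.mean())


# Demo: repeated updates on one example should raise the policy's
# probability for its annotated positive classes (0 and 2 here).
x = rng.normal(size=n_features)
for _ in range(300):
    reinforce_step(W, x, annotated_pos=[0, 2])
p_final = sigmoid(W @ x)
```

Because unannotated classes receive only a small penalty for positive actions, the policy can still explore them, which is the intuition behind combining exploration with the exploitation signal from the annotated positives.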