Keywords: Multi-Agent,Multi-Label,Bandit, Training-Optimization
Abstract: We propose a novel collaborative multi-agent optimization framework for adaptive training in multi-label image classification, fundamentally advancing beyond static decision rules and isolated automation. Our method deploys a set of distributed, task-specific agents, each responsible for dynamically orchestrating critical training components—including data augmentation, optimization methods, learning rate schedules, and loss functions—according to evolving visual-semantic relationships and training states. Each agent employs an advanced non-stationary multi-armed bandit algorithm, integrating both $\epsilon$-greedy and upper confidence bound strategies, to judiciously balance exploration with exploitation throughout the training lifecycle. A hierarchical composite reward mechanism synergizes overall classification accuracy, rare class recognition, and training stability, fostering both independent optimization and implicit collaborative behavior among agents. The framework further leverages refined techniques such as dual-rate exponential moving average smoothing and structured mixed-precision training to enhance robustness and computational efficiency. Extensive experiments across benchmarks including Pascal VOC, COCO, Yeast, and Mediamill demonstrate that our approach achieves superior mean average precision and rare-class F1 scores compared to state-of-the-art methods, while also exhibiting rapid convergence and remarkable cross-domain generalization. Our results indicate that collaborative multi-agent adaptive optimization offers a scalable and principled solution for self-optimizing deep learning in complex multi-label scenarios.
Supplementary Material: zip
Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)
Submission Number: 4498
Loading