Region-Awared Transformer with Asymmetric Loss in Multi-Label Classification

Published: 01 Jan 2023, Last Modified: 01 Oct 2024, ICASSP 2023, CC BY-SA 4.0
Abstract: Multi-label image classification (MLIC) assigns multiple labels to each image, an easy task for humans but still an open problem in machine learning. The greatest challenge in MLIC is that different target objects in one image appear at distinct viewpoints and scales. One effective approach is to use label-related information to guide the selection of regions of interest, which play an important role in classification. By leveraging the attention mechanism in the transformer, we propose a region-awared transformer that focuses on the most relevant regions and neglects background interference. Furthermore, our approach copes with the positive-negative imbalance by assigning separate exponential decay factors to positive and negative samples. Experiments on MS-COCO show competitive performance against other state-of-the-art methods.
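The abstract does not give the loss formula, so the following is only a minimal sketch of the general idea of separate decay exponents for positive and negative labels. It assumes a focal-style weighting on top of binary cross-entropy; the function name `asymmetric_loss`, the parameters `gamma_pos` and `gamma_neg`, and their default values are illustrative assumptions, not the paper's definitions.

```python
import math


def asymmetric_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0, eps=1e-8):
    """Mean focal-style loss with separate exponents per label polarity.

    probs:     predicted probabilities in [0, 1], one per label
    targets:   ground-truth labels, each 0 or 1
    gamma_pos: assumed decay exponent applied to positive labels
    gamma_neg: assumed decay exponent applied to negative labels
    """
    if len(probs) != len(targets):
        raise ValueError("probs and targets must have the same length")
    total = 0.0
    for p, y in zip(probs, targets):
        # Clamp to avoid log(0).
        p = min(max(p, eps), 1.0 - eps)
        if y == 1:
            # Positive label: down-weight confidently correct predictions.
            total += -((1.0 - p) ** gamma_pos) * math.log(p)
        else:
            # Negative label: a larger exponent suppresses the many easy
            # negatives (small p) more strongly than the positives.
            total += -(p ** gamma_neg) * math.log(1.0 - p)
    return total / len(probs)
```

With both exponents set to 0 this sketch reduces to ordinary binary cross-entropy. Raising `gamma_neg` above `gamma_pos` shrinks the contribution of easy negatives, which is one plausible way to counter the positive-negative imbalance the abstract describes.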