A Voronoi Density-Based Locally Unique Network for Fine-Grained Multi-Label Classification

Published: 2025, Last Modified: 21 Oct 2025. Venue: IEEE Trans. Circuits Syst. Video Technol. 2025. License: CC BY-SA 4.0
Abstract: Multi-label image classification aims to recognize all categories present in an image simultaneously. When current multi-label classification methods encounter fine-grained objects in a single image, extreme inter-class similarity and over-prediction are two major challenges that hinder model performance. To address both problems, we propose the Voronoi Density-Based Locally Unique Network (VoLUNet). First, because predictions for different classes are highly correlated, we follow the Kolmogorov-Arnold Network (KAN) and design the Weak Inter-class Correlation Classifier (WIC-Classifier), which replaces the linear weights of the MLP architecture and improves the potential for fine-grained discrimination. Second, we introduce a Local Non-Maximum Suppression (Local-NMS) loss for the multi-label classification model, so that only one unique class receives a high prediction value in each local region. Third, because different classes may occupy different pixel proportions, which makes the Local-NMS loss imbalanced across diverse fine-grained classes, we design the Voronoi Density-Based Superpixel Module (VDSM) to balance the number of local feature vectors across classes. Finally, comprehensive experiments on four datasets, TreeSatAI, GeoLifeCLEF, FothemNet, and ShipRSImageNet, show that VoLUNet significantly improves classification performance over current state-of-the-art models. The code for this paper is publicly available at https://github.com/cv516Buaa/BinghaoLiu/tree/main/VoLUNet
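The abstract does not give the form of the Local-NMS loss, so the following is only a minimal illustrative sketch of the general idea of suppressing non-maximum class predictions within each local region. It is not the formulation used in VoLUNet. The function name `local_nms_penalty`, the input layout (one list of per-class probabilities per region), and the choice of penalty (the sum of all non-top probabilities, averaged over regions) are all assumptions made for illustration.

```python
from typing import List


def local_nms_penalty(region_probs: List[List[float]]) -> float:
    """Hypothetical penalty, NOT the paper's Local-NMS loss.

    region_probs[r][c] is an assumed per-class probability for local
    region r. For each region, every class except the top-scoring one
    contributes its probability to the penalty, so the value is zero
    only when each region activates a single class.
    """
    if not region_probs:
        return 0.0
    total = 0.0
    for probs in region_probs:
        top = max(probs)
        # Sum of all class probabilities except the single largest one.
        total += sum(probs) - top
    return total / len(region_probs)


if __name__ == "__main__":
    # Region 0 activates one class only; region 1 splits between two.
    example = [[0.9, 0.0, 0.0], [0.5, 0.5, 0.0]]
    print(local_nms_penalty(example))
```

Under these assumptions, a region whose probability mass sits on one class contributes nothing, while a region with several competing classes is penalized, which mirrors the stated goal of predicting one unique class per local region.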