PBECount: Prompt-Before-Extract Paradigm for Class-Agnostic Counting

Published: 01 Jan 2025, Last Modified: 30 Oct 2025 · AAAI 2025 · CC BY-SA 4.0
Abstract: In the field of class-agnostic counting (CAC), counting only the objects of interest that are similar to the exemplars in multi-class scenarios remains a challenging task. To address this challenge, recent research has proposed the extract-and-match paradigm based on the vision transformer (ViT) architecture. However, although this paradigm improves the accuracy of identifying exemplar-similar objects, it overly emphasizes the role of the ViT structure. To address this shortcoming, this work builds on the extract-and-match paradigm to introduce a more general prompt-before-extract paradigm and designs a pure convolutional neural network (CNN) model named PBECount. In addition, an innovative loss function, a post-processing strategy, and a dynamic threshold method are proposed to enhance the detection performance of the model when probability maps are used as ground truth during training. Experimental results on the FSC-147 and CARPK datasets demonstrate that PBECount can identify whether objects of unknown classes are similar to the exemplars and outperforms state-of-the-art CAC methods in accuracy and generalization.
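To make the paradigm described above concrete, the following is a minimal, hypothetical PyTorch sketch of the prompt-before-extract idea: the exemplar acts as a prompt that modulates the query image's feature extraction before any matching, in contrast to extract-and-match, where image and exemplar features are extracted independently and compared afterwards. All module and variable names (PromptBeforeExtractCAC, exemplar_encoder, the thresholding rule, etc.) are illustrative assumptions, not the authors' actual PBECount architecture, loss, or dynamic threshold method.

```python
# Toy illustration of prompt-before-extract for class-agnostic counting.
# Hypothetical names and shapes; not the paper's actual PBECount model.
import torch
import torch.nn as nn


class PromptBeforeExtractCAC(nn.Module):
    """Toy CNN counter: the exemplar 'prompt' modulates feature extraction
    of the query image before matching, instead of matching independently
    extracted features afterwards (extract-and-match)."""

    def __init__(self, channels: int = 64):
        super().__init__()
        # Shallow encoder turning the exemplar crop into a channel-wise prompt.
        self.exemplar_encoder = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # (B, C, 1, 1) prompt vector
        )
        # Query-image backbone whose features are modulated by the prompt.
        self.stem = nn.Sequential(nn.Conv2d(3, channels, 3, padding=1), nn.ReLU())
        self.body = nn.Sequential(nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU())
        # Head predicting a per-pixel probability map of exemplar-similar objects.
        self.head = nn.Conv2d(channels, 1, 1)

    def forward(self, image: torch.Tensor, exemplar: torch.Tensor) -> torch.Tensor:
        prompt = self.exemplar_encoder(exemplar)  # (B, C, 1, 1)
        feat = self.stem(image)
        feat = feat * prompt                      # prompt injected BEFORE deeper extraction
        feat = self.body(feat)
        return torch.sigmoid(self.head(feat))     # probability map


if __name__ == "__main__":
    model = PromptBeforeExtractCAC()
    image = torch.randn(1, 3, 128, 128)    # query image
    exemplar = torch.randn(1, 3, 32, 32)   # exemplar crop of the target class
    prob_map = model(image, exemplar)
    # Placeholder thresholding of the probability map into a count; the paper's
    # dynamic threshold and post-processing strategy are not reproduced here.
    count = (prob_map > prob_map.mean() + prob_map.std()).float().sum()
    print(prob_map.shape, count.item())
```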