Abstract: Camouflaged instance segmentation is a challenging task because object instances blend into complex backgrounds in color, structure, lighting, and other respects. Although the current DETR-based scheme simplifies the pipeline, it relies on a large number of object queries, which leads to many false positive instances. To address this issue, we propose an adaptive query selection mechanism. Our research reveals that a large number of redundant queries scatter the extracted features of the camouflaged instances. To remove these redundant, weakly correlated queries, we evaluate the importance of each object query from the perspectives of information entropy and volatility. Moreover, we observe that occlusion and overlapping instances significantly reduce the accuracy of the selection mechanism. Therefore, we design a boundary location embedding mechanism that incorporates fake instance boundaries to obtain better location information for more accurate query-instance matching. We conduct extensive experiments on two challenging camouflaged instance segmentation datasets, COD10K and NC4K, and demonstrate the effectiveness of the proposed model. Compared with OSFormer, our model improves performance by 3.8% AP and 5.6% AP at lower computational cost, achieving state-of-the-art results of 44.8 AP and 48.1 AP with ResNet-50 on the COD10K and NC4K test-dev sets, respectively.
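As an illustration of what scoring queries by "information entropy and volatility" could look like, the following is a minimal sketch and not the authors' implementation. The abstract does not define either quantity, so everything here is an assumption: entropy is taken as the Shannon entropy of each query's class distribution averaged over decoder layers, volatility as the standard deviation of its top confidence across layers, and the names `query_scores`, `select_queries`, `keep`, and `alpha` are hypothetical.

```python
import numpy as np

def query_scores(probs, eps=1e-12):
    """probs: array of shape (layers, queries, classes); each row sums to 1."""
    # Assumed entropy term: Shannon entropy per layer and query, averaged over layers.
    entropy = -(probs * np.log(probs + eps)).sum(axis=-1).mean(axis=0)
    # Assumed volatility term: std of the top class confidence across layers.
    volatility = probs.max(axis=-1).std(axis=0)
    return entropy, volatility

def select_queries(probs, keep, alpha=1.0):
    """Return the sorted indices of the `keep` queries with the lowest combined score."""
    entropy, volatility = query_scores(probs)
    score = entropy + alpha * volatility  # hypothetical combination rule
    return np.sort(np.argsort(score, kind="stable")[:keep])

# Toy input: 2 decoder layers, 3 queries, 2 classes.
probs = np.array([
    [[0.99, 0.01], [0.5, 0.5], [0.9, 0.1]],
    [[0.98, 0.02], [0.5, 0.5], [0.3, 0.7]],
])
kept = select_queries(probs, keep=2)
```

In this toy input, query 0 is confident and stable, query 1 is maximally uncertain, and query 2 changes its prediction between layers, so the sketch keeps queries 0 and 2 when `keep=2`. How the paper actually weights or thresholds the two criteria is not stated in the abstract.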
Relevance To Conference: This work contributes to multimedia/multimodal processing by addressing the challenging task of camouflaged instance segmentation within complex backgrounds. The proposed adaptive query selection mechanism aims to improve the accuracy of object instance detection by reducing false positives and removing redundant queries. By evaluating the importance of object queries based on information entropy and volatility, the model optimizes feature extraction and enhances the discrimination between relevant and irrelevant queries. Additionally, the integration of a boundary location embedding mechanism addresses issues related to occlusion and overlapping instances, leading to more accurate query instance matching. Through extensive experiments on challenging datasets, the effectiveness of the proposed model is demonstrated, achieving state-of-the-art performance with reduced computational costs. Overall, this research contributes to advancing multimedia/multimodal processing by improving the efficiency and accuracy of camouflaged instance segmentation, which is crucial for various applications including image and video understanding in complex environments.
Supplementary Material: zip
Primary Subject Area: [Experience] Multimedia Applications
Secondary Subject Area: [Generation] Multimedia Foundation Models
Submission Number: 1755