CAPS-Net: A Cross-scale Patch-Attention Prototype-guided Siamese Network for Effective Anomaly Detection
Keywords: Prototype-guided Siamese Network; Anomaly Detection; Cross-scale Patch-Attention
TL;DR: We propose a Siamese network that compares target and reference samples, learning residual features across different scales and conditions to detect abnormal regions and patterns.
Abstract: Detecting and localizing abnormal artifacts is critical for ensuring product quality in industrial manufacturing. However, such anomalies are inherently rare and difficult to capture, resulting in imbalanced datasets that hinder effective model training. Moreover, anomalies often exhibit subtle, diverse, and dynamic visual characteristics, further complicating detection and localization. To address these challenges, we propose the Cross-scale patch-Attention Prototype-guided Siamese Network (CAPS-Net), a novel framework that uses a Siamese network to compare target and reference samples, learning residual features across scales and conditions to identify unknown anomalous regions. CAPS-Net introduces two key components: the Scale-Adaptive Channel Attention (SACA) module and the Cross-scAle Patch Attention (CAPA) module. Together, these components enhance multi-scale feature interactions, enabling the model to detect unknown anomalies of varying sizes in visually complex environments. In addition, the model incorporates multiple anomaly generation techniques, including multiscale prototypes, to better distinguish abnormal patterns from normal data. As a result, CAPS-Net can detect both tiny and large defects simultaneously. We evaluated the proposed method on the BeanTech Anomaly Detection (BTAD) dataset, demonstrating that CAPS-Net consistently outperforms state-of-the-art methods in both unsupervised and supervised settings.
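The core idea in the abstract, comparing a target against a reference through a shared encoder and reading anomalies off the residual at several scales, can be illustrated with a toy sketch. This is not the CAPS-Net implementation: the abstract gives no architectural details, so the average-pooling "encoder", the chosen scales, and the function names (`avg_pool`, `shared_encoder`, `residual_anomaly_map`) are all assumptions made for illustration, and the SACA and CAPA modules are not modeled.

```python
import numpy as np

def avg_pool(img: np.ndarray, k: int) -> np.ndarray:
    """Non-overlapping k x k average pooling; H and W must be divisible by k."""
    h, w = img.shape
    return img.reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def shared_encoder(img: np.ndarray, scales=(1, 2, 4)) -> list:
    """Hypothetical stand-in for a learned backbone.

    The Siamese property is that the *same* function (same weights)
    is applied to both the target and the reference branch.
    """
    return [avg_pool(img, k) for k in scales]

def residual_anomaly_map(target: np.ndarray, reference: np.ndarray,
                         scales=(1, 2, 4)) -> np.ndarray:
    """Average the per-scale absolute residuals at input resolution."""
    feats_t = shared_encoder(target, scales)
    feats_r = shared_encoder(reference, scales)
    maps = []
    for k, ft, fr in zip(scales, feats_t, feats_r):
        residual = np.abs(ft - fr)
        # Nearest-neighbour upsampling back to the input size.
        maps.append(np.kron(residual, np.ones((k, k))))
    return np.mean(maps, axis=0)

# Toy example: a defect-free reference and a target with a 2 x 2 defect.
reference = np.zeros((8, 8))
target = reference.copy()
target[2:4, 4:6] = 1.0

amap = residual_anomaly_map(target, reference)
peak = np.unravel_index(np.argmax(amap), amap.shape)
print("anomaly score peaks at", peak)
```

In this sketch the fine scales localize the defect sharply while the coarse scale spreads a weaker response over its neighborhood, which hints at why combining scales may help with defects of different sizes. A learned model would replace the pooling with trained features and a trained fusion step.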
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: true
Submission Guidelines: true
Anonymous Url: true
No Acknowledgement Section: true
Submission Number: 14363