Multi-Attribute Consistency Driven Visual Language Framework for Surface Defect Detection

Bin Kang, Bin Chen, Junjie Wang, Weizhi Xian, Huifeng Chang

Published: 2024, Last Modified: 13 Nov 2024 · ICME 2024 · CC BY-SA 4.0

Abstract: In industrial defect detection tasks, Visual Language Pre-training models face significant challenges from scarce data and ambiguous cues. In this work, we propose a multi-attribute consistency-driven defect detection (MACD) framework that optimizes text prompts in a coarse-to-fine manner. To bridge the gap in domain knowledge, we build a structured attribute repository that contains descriptions of the inherent attributes of various defects. Based on this repository, we propose a multi-attribute consistency (MAC) module that models the global alignment between multi-attribute sentences and defect images. Furthermore, we design a refined cross-alignment (RCA) module that determines the fine-grained correspondence between each attribute and a region within the image. Finally, experiments on two benchmarks show that the proposed method yields significant performance improvements across a wide range of defect scenarios.
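
The coarse-to-fine idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify how alignment is scored, so cosine similarity, the best-region selection, the repository contents, and every name below (`ATTRIBUTE_REPOSITORY`, `build_prompt`, `global_alignment`, `cross_alignment`) are assumptions made purely for illustration. In practice the embeddings would come from a pre-trained visual language model; here they are hand-written vectors.

```python
import math

# Hypothetical structured attribute repository: defect class -> inherent attributes.
# The entries are invented for illustration and are not taken from the paper.
ATTRIBUTE_REPOSITORY = {
    "scratch": {"shape": "thin and elongated", "color": "bright", "texture": "smooth-edged"},
    "pit": {"shape": "small and round", "color": "dark", "texture": "rough"},
}


def build_prompt(defect):
    """Compose one multi-attribute sentence for a defect class."""
    attrs = ATTRIBUTE_REPOSITORY[defect]
    parts = [f"{value} {name}" for name, value in attrs.items()]
    return f"a photo of a {defect} defect with " + ", ".join(parts)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def global_alignment(sentence_emb, image_emb):
    """Coarse step (assumed): score the whole sentence against the whole image."""
    return cosine(sentence_emb, image_emb)


def cross_alignment(attribute_embs, region_embs):
    """Fine step (assumed): for each attribute, pick the best-matching image region.

    Returns {attribute_name: (region_index, similarity)}.
    """
    matches = {}
    for name, attr_emb in attribute_embs.items():
        sims = [cosine(attr_emb, region) for region in region_embs]
        best = max(range(len(sims)), key=sims.__getitem__)
        matches[name] = (best, sims[best])
    return matches


if __name__ == "__main__":
    print(build_prompt("scratch"))
    # Toy embeddings standing in for encoder outputs.
    regions = [[0.9, 0.1], [0.1, 0.9]]
    attributes = {"shape": [1.0, 0.0], "color": [0.0, 1.0]}
    print(global_alignment([1.0, 1.0], [2.0, 2.0]))
    print(cross_alignment(attributes, regions))
```

The split into two functions mirrors the two levels named in the abstract: one score for the sentence as a whole, and one match per attribute against image regions. How the actual MAC and RCA modules compute and train these alignments is not stated in the abstract.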