FAD-TQ: Industrial Fine-grained Anomaly Detection with Thinking Quality

ICLR 2026 Conference Submission 17108 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Anomaly detection, Industrial defect inspection, GRPO
TL;DR: We propose a new RL method and benchmark for fine-grained anomaly detection.
Abstract: Recent research in industrial anomaly detection (IAD) has shifted beyond binary classification and segmentation toward process-level, interpretable reasoning about the type and cause of anomalies. While multimodal large language models (MLLMs) have enabled this reformulation through visual question answering, current anomaly detection methods still suffer from two major limitations: reward functions with limited capacity to capture the complexity of the task, and a reliance on generated supervised fine-tuning (SFT) data. We therefore propose FAD-TQ, a lightweight reinforcement learning framework for fine-grained anomaly detection with thinking quality. Built upon the Group Policy Gradient paradigm, it eliminates the reference model and KL regularization to reduce rollout overhead and directly optimizes the original reinforcement learning objective. To enable fine-grained guidance over the reasoning process, we design a thinking quality reward composed of two components: an efficiency reward that penalizes redundant reasoning, and a relevance reward that encourages task-aligned, coherent thought trajectories. Furthermore, we introduce MVTec-LOCOAD-Pair3C, a principled evaluation protocol built on the existing dataset that defines three decision types (normal, structural anomaly, and logical anomaly) rather than a binary classification. Extensive experiments demonstrate that FAD-TQ improves interpretability, accuracy, reasoning conciseness, and training efficiency while reducing computational cost. The results also demonstrate the potential of small-scale benchmarks for evaluating MLLM capabilities in IAD. We hope this framework and evaluation protocol can serve as an example for future research on process-level reasoning in anomaly detection.
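The reward design and the KL-free objective described in the abstract can be illustrated with a minimal sketch. Nothing below is taken from the paper: the function names, the token budget, the keyword-overlap proxy for relevance, the reward weights, and the use of a plain group-mean baseline are all assumptions chosen only to make the idea concrete.

```python
from statistics import mean

# The three decision types named in the abstract.
LABELS = ("normal", "structural anomaly", "logical anomaly")


def efficiency_reward(num_tokens, budget=256):
    """Hypothetical efficiency term: 1.0 within the budget,
    decaying linearly to 0.0 at twice the budget."""
    if num_tokens <= budget:
        return 1.0
    return max(0.0, 1.0 - (num_tokens - budget) / budget)


def relevance_reward(thought, task_keywords):
    """Hypothetical relevance proxy: fraction of task keywords
    that appear in the reasoning text."""
    if not task_keywords:
        return 0.0
    text = thought.lower()
    hits = sum(1 for k in task_keywords if k.lower() in text)
    return hits / len(task_keywords)


def total_reward(answer, label, thought, num_tokens, task_keywords,
                 w_eff=0.25, w_rel=0.25):
    """Accuracy reward plus a weighted thinking-quality reward.
    The weights are illustrative assumptions."""
    accuracy = 1.0 if answer == label else 0.0
    quality = (w_eff * efficiency_reward(num_tokens)
               + w_rel * relevance_reward(thought, task_keywords))
    return accuracy + quality


def group_advantages(rewards):
    """Advantage of each rollout relative to its group mean
    (assumed baseline; no reference model is involved)."""
    baseline = mean(rewards)
    return [r - baseline for r in rewards]


def policy_gradient_loss(logprobs, advantages):
    """Surrogate loss over one group of rollouts, with no KL penalty term."""
    return -sum(a * lp for a, lp in zip(advantages, logprobs)) / len(logprobs)
```

In this sketch, a group of rollouts for the same image pair is scored with `total_reward`, centered with `group_advantages`, and the resulting advantages weight the sequence log-probabilities in `policy_gradient_loss`. A learned or model-based relevance scorer could replace the keyword proxy without changing the rest of the structure.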
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 17108