UIP-AD: Learning Unified Intrinsic Prototypes for Multimodal Anomaly Detection

ICLR 2026 Conference Submission 16616 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Anomaly Detection; Multi-modal
Abstract: Multimodal anomaly detection, which combines appearance (RGB) and geometric (3D) data, is crucial for enhancing industrial inspection accuracy. However, prevailing fusion strategies, whether based on separate memory banks or direct feature-level integration, struggle to model the joint probability distribution of multimodal normality, leaving them vulnerable to cross-modal inconsistencies. In this paper, we address this by shifting the paradigm from feature-stitching to learning a unified, conceptual representation via Unified Intrinsic Prototypes (UIPs). Our framework dynamically extracts a single, compact set of these prototypes from a deeply-fused feature space to holistically represent the joint appearance-geometry distribution of a given sample. These powerful UIPs then guide a novel, principled reconstruction stage, where parallel decoders are optimized to enforce the learned consistency across modalities. Anomalies that violate this consistency fail the reconstruction process and are exposed as large, localizable errors. Extensive experiments show that our framework establishes a new state-of-the-art on multiple challenging benchmarks, including MVTec 3D-AD, Real-IAD D³, and Eyecandies, validating the superiority of our approach on single-class tasks and offering a robust, principled solution for 3D industrial inspection.
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 16616