Abstract: Recent advancements have shown the potential of leveraging both point clouds and images to localize anomalies. Nevertheless, their applicability in industrial manufacturing is often constrained by significant drawbacks, such as the use of memory banks, which leads to a substantial increase in memory footprint and inference times. We propose a novel, lightweight, and fast framework that learns to map features from one modality to the other on nominal samples and detects anomalies by pinpointing inconsistencies between observed and mapped features. Extensive experiments show that our approach achieves state-of-the-art detection and segmentation performance in both the standard and few-shot settings on the MVTec 3D-AD dataset, while achieving faster inference and occupying less memory than previous multimodal AD methods. Furthermore, we propose a layer-pruning technique to improve memory and time efficiency with a marginal sacrifice in performance.
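To make the core idea concrete, below is a minimal PyTorch sketch of crossmodal feature mapping as the abstract describes it: small networks learn to predict each modality's features from the other on nominal samples, and at test time the discrepancy between observed and mapped features serves as the anomaly score. The `FeatureMapper` architecture, hidden size, cosine-distance objective, and product-based fusion of the two directional maps are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FeatureMapper(nn.Module):
    """Maps per-location features of one modality to the other.

    Hypothetical architecture: the abstract only states that a mapping is
    learned on nominal samples; this MLP is one plausible instantiation.
    """

    def __init__(self, in_dim: int, out_dim: int, hidden_dim: int = 768):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def nominal_loss(feat_2d, feat_3d, map_2d_to_3d, map_3d_to_2d):
    """Training objective on nominal samples only: pull mapped features
    toward the observed features of the target modality (cosine distance)."""
    loss_3d = (1 - F.cosine_similarity(map_2d_to_3d(feat_2d), feat_3d, dim=-1)).mean()
    loss_2d = (1 - F.cosine_similarity(map_3d_to_2d(feat_3d), feat_2d, dim=-1)).mean()
    return loss_3d + loss_2d


def anomaly_map(feat_2d, feat_3d, map_2d_to_3d, map_3d_to_2d):
    """Score each location by the observed-vs-mapped discrepancy.

    feat_2d: (N, C2) image features, feat_3d: (N, C3) point-cloud features,
    one row per spatial location. Combining the two directional maps by
    element-wise product is an assumption; a sum would also be plausible.
    """
    d_3d = 1 - F.cosine_similarity(map_2d_to_3d(feat_2d), feat_3d, dim=-1)
    d_2d = 1 - F.cosine_similarity(map_3d_to_2d(feat_3d), feat_2d, dim=-1)
    return d_3d * d_2d  # (N,) anomaly score per location


if __name__ == "__main__":
    # Toy usage with random features standing in for frozen backbone outputs.
    n_locations, c2, c3 = 1024, 768, 384
    f2d, f3d = torch.randn(n_locations, c2), torch.randn(n_locations, c3)
    m23, m32 = FeatureMapper(c2, c3), FeatureMapper(c3, c2)
    print("loss:", nominal_loss(f2d, f3d, m23, m32).item())
    print("scores:", anomaly_map(f2d, f3d, m23, m32).shape)
```

Because only the two light mappers are trained and no memory bank of nominal features is kept, inference reduces to two forward passes and a comparison, which is consistent with the memory and speed advantages the abstract claims.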