HaloAE: A Local Transformer Auto-Encoder for Anomaly Detection and Localization Based on HaloNet

Emilie Mathian, Huidong Liu, Lynnette Fernandez-Cuesta, Dimitris Samaras, Matthieu Foll, Liming Chen

Published: 01 Jan 2023, Last Modified: 10 Jan 2025, VISIGRAPP (5: VISAPP) 2023, CC BY-SA 4.0

Abstract: Unsupervised anomaly detection and localization is a crucial task in many applications, e.g., defect detection in industry and cancer localization in medicine, and requires both local and global information, as enabled by self-attention in the Transformer. However, a brute-force adaptation of a Transformer, e.g., ViT, suffers from two issues: 1) high computational complexity, which makes it hard to handle high-resolution images; and 2) patch-based tokens, which are inappropriate for pixel-level dense prediction tasks, e.g., anomaly localization, and ignore intra-patch interactions. We present HaloAE, the first auto-encoder based on a local 2D version of the Transformer with HaloNet, which allows intra-patch correlation computation with a receptive field covering 25% of the input image. HaloAE combines convolution and local 2D block-wise self-attention layers and performs anomaly detection and segmentation through a single model. Moreover, because the loss function is generally a weighted sum of sever