Abstract: Mirror detection aims to identify mirror regions in images so that reflected objects are not mistaken for real ones. Existing methods mainly mine clues from the spatial domain. We observe that the frequency characteristics inside and outside the mirror region are distinct. Moreover, low-frequency components, which represent feature semantics, help locate the mirror region, while high-frequency components, which represent details, help refine it. Motivated by this, we introduce frequency guidance and propose the dual-domain perception progressive refinement network (DPRNet) to mine dual-domain information. Specifically, we first decouple images into high-frequency and low-frequency components using a Laplacian pyramid and a vision Transformer, respectively, and design the frequency interaction alignment (FIA) module to integrate the frequency features and initially localize the mirror region. To handle scale variations, we propose the multi-order feature perception (MOFP) module, which adaptively aggregates adjacent features with progressive and gating mechanisms. We further propose the separation-based difference fusion (SDF) module to establish associations between entities and their reflections and to discover the correct boundary, thereby mining the complete mirror region. Extensive experiments on four datasets show that DPRNet outperforms the state-of-the-art method by an average of 3% with only about one-fifth of the parameters and FLOPs. DPRNet also achieves promising performance in remote sensing and camouflage scenarios, validating its generalization. The code is available at https://github.com/winter-flow/DPRNet.
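
For readers unfamiliar with the Laplacian pyramid mentioned above, the following is a minimal NumPy sketch of how such a pyramid separates an image into high-frequency detail bands and a low-resolution residual. It is an illustration under simplifying assumptions, not the DPRNet implementation: it uses 2x2 block averaging and nearest-neighbour upsampling in place of the usual Gaussian filtering, operates on a single-channel array, and the function names (`downsample`, `upsample`, `laplacian_pyramid`, `reconstruct`) are hypothetical. In the abstract, the low-frequency branch is obtained from a vision Transformer rather than from the pyramid residual shown here.

```python
import numpy as np


def downsample(img):
    """Halve resolution by averaging each 2x2 block (a crude low-pass filter)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(img):
    """Double resolution by nearest-neighbour repetition."""
    return img.repeat(2, axis=0).repeat(2, axis=1)


def laplacian_pyramid(img, levels):
    """Split a 2-D array into high-frequency bands plus a low-frequency residual.

    Both sides of `img` are assumed to be divisible by 2**levels.
    """
    high_bands = []
    current = img.astype(np.float64)
    for _ in range(levels):
        low = downsample(current)
        # The detail band is whatever the coarser level cannot represent.
        high_bands.append(current - upsample(low))
        current = low
    return high_bands, current


def reconstruct(high_bands, low_residual):
    """Invert the decomposition by adding detail bands back, coarse to fine."""
    current = low_residual
    for high in reversed(high_bands):
        current = upsample(current) + high
    return current


# Synthetic stand-in for a grayscale image (assumption: 32x32, values in [0, 1)).
rng = np.random.default_rng(0)
image = rng.random((32, 32))
highs, low = laplacian_pyramid(image, levels=3)
restored = reconstruct(highs, low)
```

Each entry of `highs` holds the detail lost when moving to the next coarser level, so summing the bands back onto the upsampled residual recovers the input. This is why the high-frequency bands are a natural source of boundary cues, while the coarse residual retains only the overall layout.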