Latent Feature Pyramid Network for Object Detection

Published: 01 Jan 2023, Last Modified: 28 Aug 2025; IEEE Trans. Multim. 2023; CC BY-SA 4.0
Abstract: Object detection methods based on Convolutional Neural Networks (CNNs) usually rely on feature pyramid networks to detect objects at various scales. State-of-the-art feature pyramid networks improve detection accuracy by enhancing multi-level feature representations, and fusing multi-level features is the most effective way to do so. However, existing feature pyramid networks usually fuse multi-level features with element-wise operations, which leaves the feature fusion without long-range dependencies. To address this problem, we propose a simple yet efficient feature pyramid network named the latent feature pyramid network (LFPN). LFPN enhances the feature representations by modeling inner-scale and cross-scale long-range dependencies, which it does by conducting inner-scale and cross-scale feature fusion in the latent space. Comprehensive experiments are performed on two challenging object detection datasets: MS COCO and Pascal VOC. The experimental results show consistent improvements across various feature pyramid networks, backbones, and object detectors, demonstrating the effectiveness and generality of LFPN.
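The limitation of element-wise fusion noted in the abstract can be illustrated with a toy sketch. It is not the LFPN module, whose design is not described here. As simplifying assumptions, each pyramid level is a 1-D list of scalar features, and a generic softmax attention stands in for "a fusion that models long-range dependencies". The function names `upsample_nearest`, `fuse_elementwise`, and `fuse_attention` are hypothetical and chosen only for illustration.

```python
import math

def upsample_nearest(coarse, factor=2):
    """Repeat each coarse-level value `factor` times (1-D nearest-neighbour upsampling)."""
    return [v for v in coarse for _ in range(factor)]

def fuse_elementwise(fine, coarse):
    """FPN-style top-down fusion: upsample the coarse level, then add position by position."""
    up = upsample_nearest(coarse, len(fine) // len(coarse))
    return [f + c for f, c in zip(fine, up)]

def fuse_attention(fine, coarse):
    """Generic softmax-attention fusion (an illustrative stand-in, NOT the LFPN module).
    Every fine position attends to every coarse position."""
    fused = []
    for f in fine:
        scores = [f * c for c in coarse]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        context = sum(w * c for w, c in zip(weights, coarse)) / sum(weights)
        fused.append(f + context)
    return fused

def changed_positions(before, after):
    """Indices at which two fused outputs differ."""
    return [i for i, (a, b) in enumerate(zip(before, after)) if a != b]

fine = [0.5, -1.0, 2.0, 0.0]   # toy fine-level features (assumed values)
coarse = [1.0, 3.0]            # toy coarse-level features (assumed values)
perturbed = [1.0, 4.0]         # change only the last coarse position

base_add = fuse_elementwise(fine, coarse)
base_att = fuse_attention(fine, coarse)

changed_add = changed_positions(base_add, fuse_elementwise(fine, perturbed))
changed_att = changed_positions(base_att, fuse_attention(fine, perturbed))

print(changed_add)  # [2, 3]: only the co-located fine positions react
print(changed_att)  # [0, 1, 2, 3]: every fine position reacts
```

With element-wise addition, a change at one coarse position reaches only the fine positions directly beneath it. With an attention-style fusion, the same change reaches every position, which is the kind of long-range dependency the abstract refers to.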