Keywords: object detection, multi-scale feature fusion, feature alignment, cross-level attention, lightweight attention
TL;DR: We propose a broadly applicable and lightweight semantic alignment module that improves multi-scale feature fusion via linear cross-level attention paired with a spatial bottleneck design.
Abstract: Feature fusion networks are essential components in modern object detectors, aggregating multi-scale features from hierarchical levels to detect objects of varying sizes.
However, a significant challenge is that fusing features from different levels often leads to semantic inconsistency, because each level encodes information at a different level of abstraction.
While many prior works have attempted to address this, they often incur substantial computational and parameter overhead that limits real-time applicability, and in some cases they lack generality across detection architectures.
In this work, we propose a novel lightweight semantic alignment module called Feature Interaction NEtwork (FINE).
This module refines low-level features by integrating high-level contextual cues via a cross-level attention mechanism prior to fusion.
To minimize overhead, FINE combines a kernel-based linear attention with a novel spatial bottleneck design.
This design drastically reduces the attention sequence length while preserving the channel-wise semantics essential for effective semantic alignment.
FINE is generally applicable to various detectors, including Faster R-CNN, the YOLO series, and RT-DETR, and consistently improves detection accuracy without compromising efficiency.
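The abstract's core mechanism, kernel-based linear cross-level attention over a spatially bottlenecked high-level feature map, can be illustrated with a minimal sketch. This is not the authors' FINE implementation; the function names, the ELU+1 feature map, and strided subsampling as the spatial bottleneck are all illustrative assumptions. The point is the complexity shape: low-level tokens act as queries, high-level tokens are shortened before forming keys/values, and the kernel trick replaces the quadratic attention matrix with two small matrix products.

```python
import numpy as np

def phi(x):
    # ELU(x) + 1: a common positive feature map for linear attention
    # (an assumption here; the paper's kernel may differ)
    return np.where(x > 0, x + 1.0, np.exp(x))

def cross_level_linear_attention(low, high, stride=2):
    """Refine low-level tokens with high-level context.

    low  : (N, C) flattened low-level feature map (queries)
    high : (M, C) flattened high-level feature map
    stride: spatial bottleneck factor -- keep every `stride`-th
            high-level token so the K/V sequence length shrinks
            while the channel dimension C is kept intact.
    """
    kv = high[::stride]                      # spatial bottleneck on K/V
    Q, K, V = phi(low), phi(kv), kv
    # Kernelized linear attention: compute (K^T V) once, cost O(M C^2),
    # instead of the O(N M C) softmax-attention matrix.
    KV = K.T @ V                             # (C, C) context summary
    Z = Q @ K.sum(axis=0)                    # (N,) normalizer
    return (Q @ KV) / (Z[:, None] + 1e-6)    # (N, C) refined low-level tokens

# Toy shapes: an 8x8 low-level map and a 4x4 high-level map, 8 channels.
rng = np.random.default_rng(0)
low = rng.standard_normal((64, 8))
high = rng.standard_normal((16, 8))
out = cross_level_linear_attention(low, high, stride=2)
print(out.shape)  # (64, 8)
```

The output keeps the low-level spatial resolution, so it can be fused with the original low-level features afterward; only the high-level sequence is compressed, which matches the abstract's claim of shortening the attention sequence while preserving channel-wise semantics.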
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 6418