Memory-efficient cross-modal attention for RGB-X segmentation and crowd counting

Published: 01 Jan 2025, Last Modified: 28 Jul 2025. Venue: Pattern Recognit. 2025. License: CC BY-SA 4.0
Abstract: Highlights
• Proposes CSCA, a plug-and-play module for multimodal crowd counting and segmentation.
• Introduces efficient spatial attention and channel recalibration for better performance.
• Achieves state-of-the-art in RGB-Depth, RGB-Thermal, and RGB-Polarization scenarios.