Abstract: Highlights
• A novel semantic sampling module is proposed to preserve more foreground points.
• The geometric point-image mapping module is designed to improve the utilization of dense image features.
• A bi-directional attention module fuses multi-modal features effectively.
• Our method shows outstanding detection performance compared to both single- and multi-modal detectors.