Abstract: To address the challenges faced by current public transportation systems, researchers have proposed the Autonomous Modular Bus (AMB), in which individual bus units can connect and disconnect while in motion, allowing passengers to transfer between units and thereby substantially improving convenience and transportation efficiency. However, the high-precision docking required on public roads remains a major challenge: existing autonomous driving systems struggle to achieve centimeter-level accuracy, and their perception modules are especially sensitive to target distance, where errors are most pronounced. This paper introduces a novel distance-adaptive high-precision perception module that fuses LiDAR and camera data. By employing an attention-based ensemble learning approach, the proposed module improves the perception accuracy of autonomous driving systems. Experiments on the publicly available nuScenes dataset and a self-collected dataset show improvements over state-of-the-art methods. Moreover, the proposed method adapts well across different distances, making it suitable for all docking scenarios in AMB applications and laying the groundwork for the practical deployment of AMB systems.
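The distance-adaptive fusion idea can be illustrated with a minimal NumPy sketch. This is only an assumption-laden toy, not the paper's method: the scoring weights `w_cam` and `w_lidar`, the feature dimension, and the scalar-score attention are all hypothetical stand-ins for the learned attention network described in the abstract. Each modality's feature vector is concatenated with the target distance, scored, and the softmax over the scores gives the fusion weights.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def distance_adaptive_fusion(cam_feat, lidar_feat, distance, w_cam, w_lidar):
    """Fuse camera and LiDAR features with distance-conditioned attention.

    Hypothetical sketch: each modality gets a scalar score from its feature
    vector concatenated with the target distance; a softmax over the two
    scores yields the per-modality fusion weights.
    """
    ctx_cam = np.append(cam_feat, distance)    # condition camera branch on distance
    ctx_lidar = np.append(lidar_feat, distance)
    scores = np.array([ctx_cam @ w_cam, ctx_lidar @ w_lidar])
    weights = softmax(scores)                  # attention weights, sum to 1
    fused = weights[0] * cam_feat + weights[1] * lidar_feat
    return fused, weights

# Toy usage with random features and weights (assumed dimension d = 8).
rng = np.random.default_rng(0)
d = 8
cam, lidar = rng.normal(size=d), rng.normal(size=d)
w_cam, w_lidar = rng.normal(size=d + 1), rng.normal(size=d + 1)
fused, w = distance_adaptive_fusion(cam, lidar, distance=5.0,
                                    w_cam=w_cam, w_lidar=w_lidar)
```

Because the attention weights depend on the distance input, the same module can shift trust between modalities as the docking target gets closer or farther, which is the behavior the abstract attributes to the distance-adaptive design.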