EATNet: Efficient Axial Transformer Network for End-to-end Autonomous Driving

Published: 2024 · Last Modified: 18 Jul 2025 · ITSC 2024 · License: CC BY-SA 4.0
Abstract: In recent years, end-to-end autonomous driving has garnered significant attention from researchers and has witnessed rapid advancements. However, existing methods encounter challenges such as high computational demands and slow training and inference speeds, which hinder their real-world deployment. To tackle these issues, we introduce the Efficient Axial Transformer Network (EATNet), a lightweight multi-modal autonomous driving framework based on cross-axial Transformers. By effectively integrating LiDAR and multi-view RGB features, the model uses an enhanced lightweight cross-axial Transformer to minimize model size and computational requirements. Extensive experiments demonstrate that EATNet, with only a quarter of the parameters of comparable multi-modal models, achieves competitive or even superior performance on the closed-loop CARLA simulator compared to other baselines.
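The abstract does not spell out the cross-axial attention mechanism, but the core idea behind axial Transformers is to factorize full 2D self-attention over an H×W feature map into two cheaper 1D passes, one along rows and one along columns, reducing cost from O((HW)²) to O(HW(H+W)). The sketch below illustrates that factorization in NumPy; it is an illustration of generic axial attention, not EATNet's actual implementation, and all function names and shapes here are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # scaled dot-product self-attention along the second-to-last axis
    d = q.shape[-1]
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d)
    return softmax(scores, axis=-1) @ v

def axial_attention(x):
    """Factorized self-attention on an (H, W, C) feature map:
    attend along each row, then along each column (illustrative sketch;
    real implementations add projections, heads, and residuals)."""
    x = attention(x, x, x)            # row pass: attends along W
    xt = np.swapaxes(x, 0, 1)         # (W, H, C): columns become sequences
    xt = attention(xt, xt, xt)        # column pass: attends along H
    return np.swapaxes(xt, 0, 1)      # back to (H, W, C)

# hypothetical feature-map size for demonstration
H, W, C = 8, 16, 32
feat = np.random.randn(H, W, C)
out = axial_attention(feat)
print(out.shape)  # (8, 16, 32)
```

Because each 1D pass only mixes information along one axis, stacking a row pass and a column pass lets every position attend (indirectly) to every other position at a fraction of the cost of full attention, which is what makes the approach attractive for lightweight, deployment-oriented models like the one described here.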