Energy-Aware Imitation Learning for Steering Prediction Using Events and Frames

Hu Cao; Jiong Liu; Xingzhuo Yan; Rui Song; Yan Xia; Walter Zimmer; Guang Chen; Alois Knoll

20 Sept 2025 (modified: 13 Nov 2025) | ICLR 2026 Conference Withdrawn Submission | CC BY 4.0

Keywords: Event camera; Multi-modal fusion; Energy function; Steering prediction

Abstract: In autonomous driving, relying solely on frame-based cameras can lead to inaccurate predictions under long exposure times, high-speed motion, and challenging lighting conditions. To address these issues, we incorporate the event camera, a bio-inspired vision sensor. Unlike conventional cameras, event cameras capture sparse, asynchronous events, providing a complementary modality that mitigates these challenges. In this work, we propose an energy-aware imitation learning framework for steering prediction that leverages both events and frames. Specifically, we design an Energy-driven Cross-modality Fusion Module (ECFM) and an energy-aware decoder to produce reliable and safe predictions. Extensive experiments on two public real-world datasets, DDD20 and DRFuser, demonstrate that our method outperforms existing state-of-the-art (SOTA) approaches. The code will be released upon acceptance.

Primary Area: applications to robotics, autonomy, planning

Submission Number: 24207
