Abstract: We introduce RedMotion, a transformer model for motion prediction in self-driving vehicles that learns environment representations via redundancy reduction. Our first type of redundancy reduction is induced by an internal transformer decoder: it reduces a variable-sized set of local road environment tokens, representing road graphs and agent data, to a fixed-sized global embedding. The second type is obtained via self-supervised learning and applies the redundancy reduction principle to embeddings generated from augmented views of road environments. Our experiments show that our representation learning approach outperforms PreTraM, Traj-MAE, and GraphDINO in a semi-supervised setting. Moreover, RedMotion achieves results competitive with HPTR and MTR++ in the Waymo Motion Prediction Challenge. Our open-source implementation is available at: https://github.com/kit-mrt/future-motion
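The abstract describes two redundancy reduction mechanisms; the following is a minimal PyTorch sketch of both ideas, not the actual RedMotion implementation. The names `TokenReducer` and `redundancy_reduction_loss` are illustrative assumptions; for the real architecture and training code, see the linked repository.

```python
import torch
import torch.nn as nn


class TokenReducer(nn.Module):
    """Sketch of the first mechanism: a cross-attention decoder with
    learned queries reduces a variable-sized set of road environment
    tokens to a fixed-sized global embedding."""

    def __init__(self, dim=128, num_queries=16, num_heads=8):
        super().__init__()
        # The number of learned queries fixes the output size,
        # independent of how many input tokens the scene contains.
        self.queries = nn.Parameter(torch.randn(num_queries, dim))
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, env_tokens):  # env_tokens: (B, N, dim), N varies
        batch = env_tokens.size(0)
        q = self.queries.unsqueeze(0).expand(batch, -1, -1)
        reduced, _ = self.attn(q, env_tokens, env_tokens)
        return reduced.flatten(1)  # (B, num_queries * dim)


def redundancy_reduction_loss(z_a, z_b, lambd=5e-3):
    """Sketch of the second mechanism: a Barlow Twins-style loss that
    pushes the cross-correlation of embeddings from two augmented views
    toward the identity, decorrelating (de-redundifying) dimensions."""
    z_a = (z_a - z_a.mean(0)) / (z_a.std(0) + 1e-6)
    z_b = (z_b - z_b.mean(0)) / (z_b.std(0) + 1e-6)
    c = (z_a.T @ z_b) / z_a.size(0)  # cross-correlation matrix
    on_diag = (torch.diagonal(c) - 1).pow(2).sum()
    off_diag = (c - torch.diag(torch.diagonal(c))).pow(2).sum()
    return on_diag + lambd * off_diag
```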
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Added experiments on the Argoverse 2 Forecasting dataset, minor text changes, and some additional references.
Code: https://github.com/kit-mrt/future-motion
Assigned Action Editor: ~Yonatan_Bisk1
Submission Number: 2123