Encoding the Intrinsic Interaction Information for Vehicle Trajectory Prediction

Published: 2024, Last Modified: 08 Nov 2025. IEEE Trans. Intell. Veh. 2024. License: CC BY-SA 4.0
Abstract: Because they depend strongly on high-definition (HD) maps, mainstream methods cannot predict vehicle trajectories accurately when maps are missing or when the scenario changes dynamically. In this article, the interaction features intrinsic to traffic scenes are exploited, and the implicit traffic priors related to vehicle motion in the scene are extracted, to improve the accuracy of vehicle trajectory prediction when no HD map is available. First, a graph attention network is constructed to encode the motion state of each vehicle and the local interactions between vehicles at each historical timestamp. Then, a bi-axial Transformer is introduced to alternately update the global interaction features and the vehicle motion features. At the same time, a multi-scale structure is proposed to fuse the high-level behavior logic and the low-level motion primitives of the agent vehicle. Finally, a trajectory decoder outputs multi-modal vehicle trajectories. The proposed model was trained and evaluated on the Argoverse1 Forecasting dataset. The experimental results show that the method outperforms the no-HD-map ablations of mainstream prediction models on all metrics, and even outperforms some mainstream prediction methods that use HD map encoding. In addition, a LiDAR-based physical perception platform was set up on an experimental vehicle, and the generalization capability of the proposed method was validated in real traffic scenes.
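To make the "alternately update" idea concrete, the following is a minimal sketch of bi-axial attention over a grid of per-vehicle, per-timestamp features. It is not the authors' implementation: the abstract gives no layer details, so the identity query/key/value projections, the absence of residual connections and normalization, the time-then-agents ordering, and all names (`self_attention`, `temporal_pass`, `social_pass`, `bi_axial_encoder`) are assumptions chosen for illustration.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    # Scaled dot-product self-attention with identity Q/K/V projections
    # (an assumption; a real model would use learned projections).
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out


def temporal_pass(x):
    # x[agent][time] is a feature vector; attend across time for each agent
    # (the "vehicle motion" axis).
    return [self_attention(agent_seq) for agent_seq in x]


def social_pass(x):
    # Attend across agents at each timestamp (the "interaction" axis).
    n_agents, n_steps = len(x), len(x[0])
    out = [[None] * n_steps for _ in range(n_agents)]
    for t in range(n_steps):
        updated = self_attention([x[a][t] for a in range(n_agents)])
        for a in range(n_agents):
            out[a][t] = updated[a]
    return out


def bi_axial_encoder(x, num_layers=2):
    # Alternate the two axes; the order within a layer is an assumption.
    for _ in range(num_layers):
        x = temporal_pass(x)
        x = social_pass(x)
    return x


# Toy input: 2 vehicles, 3 historical timestamps, 2-D features (made-up values).
features = [
    [[0.0, 1.0], [0.5, 1.0], [1.0, 1.0]],
    [[2.0, 0.0], [2.0, 0.5], [2.0, 1.0]],
]
encoded = bi_axial_encoder(features)
```

The output keeps the input's agents-by-timestamps-by-features shape, so layers of this kind can be stacked, and after one temporal pass followed by one social pass every feature already depends on every vehicle at every timestamp.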