Resource Allocation for V2V Communications Using Multi-Agent Reinforcement Learning

Published: 2025, Last Modified: 21 Jan 2026, IJCNN 2025, CC BY-SA 4.0
Abstract: In a vehicular network, multiple vehicle-to-vehicle (V2V) links can support efficient and reliable communication among vehicles by reusing the frequency spectrum already occupied by vehicle-to-infrastructure (V2I) links. However, under centralized resource management it is difficult and impractical to collect accurate instantaneous channel state information (CSI) in a fast-changing network environment. To cope with highly dynamic channel fading and varying interference, we propose Multiple V2V communications on Twin-Delayed DDPG (MV2V-TD3), which allocates spectrum resources in a decentralized manner. Each V2V link, or each vehicle, acts as an agent that chooses a V2I sub-band and a transmission power using only partial CSI of the vehicular network. The simulation results show that MV2V-TD3 adapts rapidly to the dynamic vehicular environment and achieves better training performance than other multi-agent reinforcement learning (MARL) methods. The experiments simulate four V2V agents on a map of four-lane roads covering nine street blocks. Extensive experimental results indicate that the V2V agents learn to cooperate with each other effectively and quickly. Compared with the other baseline algorithms, MV2V-TD3 achieves a higher success ratio (SR) for payload delivery along with an improved V2V transmission rate.
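The abstract does not spell out the update rule, so the following is only a minimal sketch of the critic target from the standard TD3 formulation (clipped double Q-learning with target policy smoothing), as one agent might compute it from its own partial-CSI observation. All names here (`td3_target`, `target_actor`, `target_q1`, `target_q2`) and the default hyperparameters are hypothetical and are not taken from the paper; how MV2V-TD3 encodes the sub-band and power choice as an action is likewise an assumption left open.

```python
import random
from typing import Callable, List, Optional, Sequence

Vector = Sequence[float]


def clip(x: float, lo: float, hi: float) -> float:
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def td3_target(
    reward: float,
    next_obs: Vector,                      # agent's next local observation (assumed: partial CSI)
    done: bool,
    target_actor: Callable[[Vector], List[float]],   # hypothetical target policy
    target_q1: Callable[[Vector, Vector], float],    # hypothetical target critic 1
    target_q2: Callable[[Vector, Vector], float],    # hypothetical target critic 2
    gamma: float = 0.99,
    noise_std: float = 0.2,
    noise_clip: float = 0.5,
    action_low: float = -1.0,
    action_high: float = 1.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Standard TD3 target: r + gamma * min(Q1', Q2') at a smoothed target action."""
    rng = rng or random.Random()
    smoothed = []
    for a in target_actor(next_obs):
        # Target policy smoothing: add clipped Gaussian noise, then clip to the action range.
        eps = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
        smoothed.append(clip(a + eps, action_low, action_high))
    # Clipped double Q-learning: take the smaller of the two target critics.
    q_min = min(target_q1(next_obs, smoothed), target_q2(next_obs, smoothed))
    # No bootstrapping past a terminal transition.
    return reward + (0.0 if done else gamma * q_min)
```

Taking the minimum of two critics is what distinguishes TD3 from DDPG here: it counteracts the overestimation bias that a single critic tends to accumulate, which is plausibly useful when rewards fluctuate with fast-fading channels.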