Joint Precoding and Phase Shift Design for RIS-Aided Cell-Free Massive MIMO With Heterogeneous-Agent Trust Region Policy

Published: 2025, Last Modified: 09 Jan 2026. Venue: IEEE Trans. Veh. Technol. 2025. License: CC BY-SA 4.0
Abstract: Cell-free (CF) massive multiple-input multiple-output (mMIMO) uses multiple distributed access points (APs) to achieve high spectral efficiency (SE). However, challenging propagation environments can degrade communication performance because of substantial penetration loss. Integrating a reconfigurable intelligent surface (RIS) into CF mMIMO can mitigate this issue: by tuning the reflection coefficients of its elements, the RIS adjusts the phase and amplitude of the incident signals, providing a cost-effective and energy-efficient solution. This paper addresses the joint precoding design of RIS-aided CF mMIMO systems with the goal of maximizing the sum SE, which requires jointly optimizing the precoding matrices at the APs and the reflection coefficients at the RIS. We introduce a fully distributed heterogeneous-agent reinforcement learning (HARL) algorithm that incorporates trust region policy optimization (TRPO). Unlike conventional multi-agent reinforcement learning (MARL) methods that rely on centralized training and execution, our HATRPO algorithm uses only local channel state information, reducing the need for high backhaul capacity by 13%. Simulation results demonstrate that our HATRPO algorithm significantly improves the sum SE in various scenarios.
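To make the optimization objective concrete, the following is a minimal sketch of how a sum SE could be evaluated for given AP precoders and RIS phase shifts. It is not the paper's system model: the single-RIS, single-antenna-user setup, the array layouts, and the names `effective_channels`, `sum_se`, `D`, `G`, `F`, `theta`, and `W` are all assumptions made for illustration, and the reflection amplitudes are assumed to be fixed at one.

```python
import numpy as np

def effective_channels(D, G, F, theta):
    """Combine direct and RIS-reflected links (hypothetical layout).

    D:     (M, K, N) direct channel from AP m to user k (row vectors)
    G:     (M, R, N) channel from AP m to the R RIS elements
    F:     (K, R)    channel from the RIS elements to user k
    theta: (R,)      RIS phase shifts in radians (unit amplitude assumed)
    Returns H with shape (M, K, N).
    """
    phi = np.exp(1j * theta)  # unit-modulus reflection coefficients
    # cascaded[m, k, n] = sum_r F[k, r] * phi[r] * G[m, r, n]
    cascaded = np.einsum("kr,r,mrn->mkn", F, phi, G)
    return D + cascaded

def sum_se(H, W, noise_power):
    """Sum spectral efficiency in bit/s/Hz.

    H: (M, K, N) effective channels, W: (M, K, N) precoders,
    where W[m, j] is the vector AP m uses for user j.
    """
    # A[k, j] = sum over APs and antennas of H[m, k, n] * W[m, j, n],
    # i.e. the amplitude of user j's stream as received at user k
    A = np.einsum("mkn,mjn->kj", H, W)
    P = np.abs(A) ** 2
    signal = np.diag(P)
    interference = P.sum(axis=1) - signal
    sinr = signal / (interference + noise_power)
    return float(np.sum(np.log2(1.0 + sinr)))

# Example: random channels with conjugate (matched-filter) precoding
rng = np.random.default_rng(0)
M, K, N, R = 4, 3, 2, 16
cplx = lambda *s: (rng.standard_normal(s) + 1j * rng.standard_normal(s)) / np.sqrt(2)
H = effective_channels(cplx(M, K, N), cplx(M, R, N), cplx(K, R),
                       rng.uniform(0, 2 * np.pi, R))
W = np.conj(H) / np.linalg.norm(H)  # total transmit power normalized to 1
print(sum_se(H, W, noise_power=0.1))
```

In this sketch, a learning agent at each AP would choose its slice `W[m]` and the RIS phases `theta` would be chosen by another agent, with the value returned by `sum_se` serving as a shared reward; how the paper actually defines states, actions, and rewards is not stated in the abstract.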