Reinforcement Learning for Power Management in Low-margin Optical Networks

Published: 2024, Last Modified: 09 Jan 2026, Venue: ICTON 2024, License: CC BY-SA 4.0
Abstract: We explore the Q-learning and Proximal Policy Optimization (PPO) algorithms for solving the Routing, Wavelength, and Power Allocation (RWPA) problem. The results indicate that reinforcement learning can significantly improve launch power allocation in low-margin optical networks. Q-learning outperforms PPO, improving the generalized signal-to-noise ratio (GSNR) by over 175% in small networks and by over 48% in large networks.