Physics-Informed Parametric Bandits for Beam Alignment in mmWave Communications

Hao Qin; Thang Duong; Ming Li; Chicheng Zhang

Published: 06 Jun 2025, Last Modified: 08 Jul 2025

Venue: ICML Workshop on ML4Wireless

License: CC BY 4.0

Keywords: mmWave communications, bandit

TL;DR: We develop a physics-informed parametric bandit algorithm to solve the beam alignment problem and present experimental results demonstrating that our algorithm outperforms existing bandit methods.

Abstract: In millimeter wave (mmWave) communications, aligning the transmitter and receiver beams is crucial to compensate for the significant path loss. Because scanning the entire directional space is inefficient, an efficient and robust method for identifying the optimal beamforming direction is essential. Many existing works apply bandit algorithms to beam alignment, but they rely on unimodality or multimodality assumptions on the reward structure and assume that the horizon is sufficiently long. Such assumptions may not hold in practice, in which case these algorithms can converge to suboptimal beams. In this work, we propose two physics-informed algorithms, *PR-ETC* and *PR-Greedy*, that exploit the existence of a dominant path (e.g., the LoS path). This assumption is arguably more realistic in practice and connects beam alignment to the Phase Retrieval Bandit problem. Through simulated experiments using the DeepMIMO dataset (Alkhateeb, 2019), we demonstrate that both algorithms outperform existing approaches across 4,952 bandit instances.
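
For readers unfamiliar with the bandit framing, the following is a minimal sketch of a generic explore-then-commit (ETC) loop over a beam codebook. It is not the authors' *PR-ETC*: it explores every beam uniformly, whereas the abstract indicates that the proposed methods exploit a dominant-path structure instead. The reward model (`beam_gain`, a single-path uniform linear array with half-wavelength spacing), the Gaussian noise level, the codebook, and all parameter values are assumptions chosen only for illustration.

```python
import math
import random


def beam_gain(beam_angle, path_angle, n_antennas=16):
    """Normalized array gain for a single path (assumed reward model).

    Assumes a uniform linear array with half-wavelength spacing; this
    single-path model is an illustration, not taken from the paper.
    """
    delta = math.pi * (math.sin(beam_angle) - math.sin(path_angle))
    denom = math.sin(delta / 2)
    if abs(denom) < 1e-12:
        return 1.0  # beam perfectly aligned with the path
    af = math.sin(n_antennas * delta / 2) / (n_antennas * denom)
    return af * af


def explore_then_commit(pull, n_arms, pulls_per_arm, horizon):
    """Generic ETC: sample every arm equally, then commit to the best mean."""
    if horizon < n_arms * pulls_per_arm:
        raise ValueError("horizon is too short for the exploration phase")
    totals = [0.0] * n_arms
    rewards = []
    # Exploration phase: pull each arm the same number of times.
    for arm in range(n_arms):
        for _ in range(pulls_per_arm):
            r = pull(arm)
            totals[arm] += r
            rewards.append(r)
    # Equal pull counts, so the largest total is the largest empirical mean.
    best = max(range(n_arms), key=lambda a: totals[a])
    # Commit phase: play the chosen arm for the remaining rounds.
    for _ in range(horizon - n_arms * pulls_per_arm):
        rewards.append(pull(best))
    return best, rewards


def demo(seed=0):
    """Run ETC on a hypothetical 16-beam codebook with one dominant path."""
    rng = random.Random(seed)
    # Hypothetical codebook: 16 beams evenly spaced over [-60, 60] degrees.
    codebook = [math.radians(-60 + 8 * i) for i in range(16)]
    path_angle = codebook[5]  # dominant path aligned with beam index 5

    def pull(arm):
        # Noisy received power for the chosen beam (assumed noise model).
        return beam_gain(codebook[arm], path_angle) + rng.gauss(0.0, 0.05)

    return explore_then_commit(pull, len(codebook), pulls_per_arm=5, horizon=200)
```

Calling `demo()` returns the index of the committed beam together with the per-round rewards. In this sketch, 80 of the 200 rounds are spent on exploration, which illustrates the cost of uniform scanning that structure-aware methods aim to avoid.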

Submission Number: 29
