Reinforcement Learning with Physics-Informed Symbolic Program Priors for Zero-Shot Wireless Indoor Navigation

Tao Li; Haozhe Lei; Mingsheng Yin; Yaqi Hu

Published: 22 Jun 2025, Last Modified: 27 Jul 2025. Venue: IBRL @ RLC 2025. License: CC BY 4.0

Keywords: neuro-symbolic reinforcement learning, physics-informed reinforcement learning, zero-shot generalization, inductive biases, indoor navigation

TL;DR: We use symbolic programs to encode physics priors as inductive biases that guide the RL process.

Abstract: When using reinforcement learning (RL) to tackle physical control tasks, inductive biases that encode physics priors can improve sample efficiency during training and enhance generalization at test time. However, incorporating these physics-informed inductive biases currently requires significant manual labor and domain expertise, which makes them prohibitive for general users. This work explores a symbolic approach to distilling physics-informed inductive biases into RL agents, where the physics priors are expressed in a domain-specific language (DSL) that is human-readable and naturally explainable. Yet the DSL priors do not translate directly into an implementable policy because of partial and noisy observations and additional physical constraints in navigation tasks. To address this gap, we develop a physics-informed program-guided RL (PiPRL) framework with applications to indoor navigation. PiPRL adopts a hierarchical and modularized neuro-symbolic integration: a meta symbolic program receives semantically meaningful features from a neural perception module, and these features form the basis for symbolic programming that encodes physics priors and guides the RL process of a low-level neural controller. Extensive experiments demonstrate that PiPRL consistently outperforms purely symbolic or neural policies and reduces training time by over 26%.

Submission Number: 11
