GALOIS: Boosting Deep Reinforcement Learning via Generalizable Logic Synthesis

Yushi Cao; Zhiming Li; Tianpei Yang; Hao Zhang; YAN ZHENG; Yi Li; Jianye HAO; Yang Liu

GALOIS: Boosting Deep Reinforcement Learning via Generalizable Logic Synthesis

Yushi Cao, Zhiming Li, Tianpei Yang, Hao Zhang, YAN ZHENG, Yi Li, Jianye HAO, Yang Liu

Published: 31 Oct 2022, Last Modified: 06 Apr 2025NeurIPS 2022 AcceptReaders: Everyone

Keywords: deep reinforcement learning, program synthesis

Abstract: Despite achieving superior performance in human-level control problems, unlike humans, deep reinforcement learning (DRL) lacks high-order intelligence (e.g., logic deduction and reuse), thus it behaves ineffectively than humans regarding learning and generalization in complex problems. Previous works attempt to directly synthesize a white-box logic program as the DRL policy, manifesting logic-driven behaviors. However, most synthesis methods are built on imperative or declarative programming, and each has a distinct limitation, respectively. The former ignores the cause-effect logic during synthesis, resulting in low generalizability across tasks. The latter is strictly proof-based, thus failing to synthesize programs with complex hierarchical logic. In this paper, we combine the above two paradigms together and propose a novel Generalizable Logic Synthesis (GALOIS) framework to synthesize hierarchical and strict cause-effect logic programs. GALOIS leverages the program sketch and defines a new sketch-based hybrid program language for guiding the synthesis. Based on that, GALOIS proposes a sketch-based program synthesis method to automatically generate white-box programs with generalizable and interpretable cause-effect logic. Extensive evaluations on various decision-making tasks with complex logic demonstrate the superiority of GALOIS over mainstream baselines regarding the asymptotic performance, generalizability, and great knowledge reusability across different environments.

Supplementary Material: pdf

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/galois-boosting-deep-reinforcement-learning/code)

20 Replies

Loading