Differentiable Logic Machines

Matthieu Zimmer; Xuening Feng; Claire Glanois; Zhaohui JIANG; Jianyi Zhang; Paul Weng; Dong Li; Jianye HAO; Wulong Liu

Published: 20 Jul 2023, Last Modified: 17 Sept 2024. Accepted by TMLR.

Abstract: The integration of reasoning, learning, and decision-making is key to building more general artificial intelligence systems. As a step in this direction, we propose a novel neural-logic architecture, called differentiable logic machine (DLM), that can solve both inductive logic programming (ILP) and reinforcement learning (RL) problems, where the solution can be interpreted as a first-order logic program. Our proposal includes several innovations. Firstly, our architecture defines a restricted but expressive continuous relaxation of the space of first-order logic programs by assigning weights to predicates instead of rules, in contrast to most previous neural-logic approaches. Secondly, with this differentiable architecture, we propose several (supervised and RL) training procedures, based on gradient descent, which can recover a fully interpretable solution (i.e., logic formula). Thirdly, to accelerate RL training, we also design a novel critic architecture that enables actor-critic algorithms. Fourthly, to solve hard problems, we propose an incremental training procedure that can learn a logic program progressively. Compared to state-of-the-art (SOTA) differentiable ILP methods, DLM successfully solves all the considered ILP problems with a higher percentage of successful seeds (up to 3.5x). On RL problems, without requiring an interpretable solution, DLM outperforms other non-interpretable neural-logic RL approaches in terms of rewards (up to 3.9%). When enforcing interpretability, DLM can solve harder RL problems (e.g., Sorting, Path) than other interpretable RL methods. Moreover, we show that deep logic programs can be learned via incremental supervised training. In addition to this excellent performance, DLM can scale well in terms of memory and computational time, especially during the testing phase where it can deal with many more constants (>2x) than SOTA.

Submission Length: Long submission (more than 12 pages of main content)

Changes Since Last Submission: We rephrased our conclusion (modifications marked in blue) and updated the tables to highlight the best methods. We added the additional explanations requested by the reviewers (marked in blue). Camera-ready: removed the color and added the authors and acknowledgements.

Assigned Action Editor: ~Olivier_Pietquin1

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Submission Number: 639
