Convex and Bilevel Optimization for Neuro-Symbolic Inference and Learning

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: neurosymbolic & hybrid AI systems (physics-informed, logic & formal reasoning, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: NeSy, Neuro-Symbolic, Neurosymbolic, Optimization, Bilevel optimization, Convex optimization, Energy-based models, Deep learning
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We address a key challenge for neuro-symbolic systems by leveraging techniques from convex and bilevel optimization to develop a general first-order gradient-based optimization framework for end-to-end neural and symbolic parameter learning.
Abstract: We address a key challenge for neuro-symbolic (NeSy) systems by leveraging convex and bilevel optimization techniques to develop a general first-order gradient-based framework for end-to-end neural and symbolic parameter learning. Specifically, we formulate NeSy learning as a bilevel program and employ Moreau smoothing together with a graduated value-function approach to support learning with a constrained lower-level inference problem. We demonstrate the applicability of our learning framework with NeuPSL, a state-of-the-art NeSy architecture. To this end, we propose primal and dual formulations of NeuPSL inference as a strongly convex, linearly constrained quadratic program and show that the learning gradients are functions of the optimal dual variables. Building on this formulation, we develop a dual block coordinate descent algorithm that naturally exploits warm-starts, yielding over $100 \times$ learning runtime improvements over the current state-of-the-art NeuPSL inference method. Finally, we provide extensive empirical evaluations across $8$ datasets covering a range of prediction tasks and demonstrate that our learning framework achieves up to a $16$ percentage-point improvement in prediction performance over the current standard learning process.
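The bilevel structure the abstract describes can be sketched generically; the notation below is illustrative and not necessarily the paper's. Let $w$ denote the neural and symbolic parameters and $y$ the symbolic inference variables, so NeSy learning takes the form

$$\min_{w} \; \mathcal{L}\big(y^{*}(w),\, w\big) \quad \text{s.t.} \quad y^{*}(w) \in \operatorname*{arg\,min}_{y \in \Omega} \; E(y;\, w),$$

where a value-function approach replaces the lower-level $\arg\min$ constraint with $E(y;\, w) - V(w) \le \epsilon$, $V(w) = \min_{y' \in \Omega} E(y';\, w)$, smooths $V$ via its Moreau envelope, and gradually tightens the relaxation.

The dual block coordinate descent the abstract mentions can likewise be illustrated on a generic strongly convex, linearly constrained QP, $\min_{y} \frac{1}{2} y^{\top} Q y + c^{\top} y$ s.t. $A y \le b$ with $Q \succ 0$. The sketch below is a minimal single-coordinate NumPy variant, not the paper's NeuPSL-specific solver; the function name `dual_bcd_qp` and all variables are assumptions for illustration. Exact coordinate-wise maximization of the concave dual, projected onto $\lambda \ge 0$, gives closed-form updates, and passing previous duals as `lam0` provides the warm-starting the abstract credits for the runtime gains.

```python
import numpy as np

def dual_bcd_qp(Q, c, A, b, lam0=None, n_epochs=1000, tol=1e-8):
    """Solve  min_y 0.5*y'Qy + c'y  s.t.  Ay <= b  (Q symmetric PD)
    by exact projected coordinate ascent on the concave dual.
    Warm-start by passing the duals of a previous, related solve as lam0."""
    m = A.shape[0]
    M = A @ np.linalg.solve(Q, A.T)          # dual Hessian: A Q^{-1} A'
    d = A @ np.linalg.solve(Q, c) + b        # dual linear term
    lam = np.zeros(m) if lam0 is None else np.asarray(lam0, float).copy()
    Mlam = M @ lam
    for _ in range(n_epochs):
        delta = 0.0
        for i in range(m):
            if M[i, i] <= 0.0:               # zero constraint row: skip
                continue
            # Closed-form maximizer of the dual in coordinate i, projected to >= 0.
            new = max(0.0, lam[i] - (Mlam[i] + d[i]) / M[i, i])
            step = new - lam[i]
            if step != 0.0:
                Mlam += step * M[:, i]       # incremental update of M @ lam
                lam[i] = new
                delta = max(delta, abs(step))
        if delta < tol:
            break
    y = -np.linalg.solve(Q, c + A.T @ lam)   # recover primal from optimal duals
    return y, lam

# Warm-start usage across a sequence of nearby problems (as in learning,
# where parameters drift slowly between gradient steps):
#   y, lam = dual_bcd_qp(Q, c0, A, b)
#   y, lam = dual_bcd_qp(Q, c1, A, b, lam0=lam)   # reuse previous duals
```

Recovering the primal solution from the optimal duals is also what makes the abstract's gradient claim natural in this setting: once the learning gradients are expressed through the optimal dual variables, the same solve that performs inference supplies the quantities needed for the parameter update.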
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5927