Keywords: neurosymbolic learning, weak supervision, latent representations
Abstract: We study the problem of learning neural classifiers in a neurosymbolic setting where the hidden gold labels of input instances must satisfy a logical formula. Learning in this setting proceeds by first computing (a subset of) the possible combinations of labels that satisfy the formula and then computing a loss using those combinations and the classifiers’ scores. However, the space of label combinations can grow exponentially, making learning difficult. We propose the first technique that prunes this space by exploiting the intuition that instances with similar latent representations are likely to share the same label. While this intuition has been widely used in weakly supervised learning, its application in our setting is challenging due to label dependencies imposed by logical constraints. We formulate the pruning process as an integer linear program that discards inconsistent label combinations while respecting logical structure. Our approach is orthogonal to existing training algorithms and can be seamlessly integrated with them. Experiments on three state-of-the-art neurosymbolic engines, Scallop, Dolphin, and ISED, demonstrate up to 74% accuracy gains across diverse tasks, highlighting the effectiveness of leveraging the representation space in neurosymbolic learning.
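The pruning idea described in the abstract can be illustrated with a toy sketch. The code below is an assumption-laden simplification, not the paper's actual ILP formulation: for an MNIST-addition-style task with two digit images whose labels must sum to a target, it enumerates the label combinations satisfying the constraint and then discards combinations that assign different labels to instances with near-identical latent embeddings. All function names and the distance threshold are hypothetical.

```python
from itertools import product

# Toy sketch (not the paper's actual algorithm): prune label combinations
# for two digit images whose labels must sum to a target, using the
# heuristic that instances with similar latent embeddings share a label.

def satisfying_combinations(target, num_classes=10):
    """All (label_a, label_b) pairs consistent with label_a + label_b == target."""
    return [(a, b) for a, b in product(range(num_classes), repeat=2)
            if a + b == target]

def prune_by_similarity(combos, emb_a, emb_b, threshold=0.1):
    """Discard combinations assigning different labels to instances whose
    embeddings lie closer than `threshold` (Euclidean distance)."""
    dist = sum((x - y) ** 2 for x, y in zip(emb_a, emb_b)) ** 0.5
    if dist >= threshold:
        return combos          # instances are dissimilar: keep everything
    return [(a, b) for a, b in combos if a == b]

combos = satisfying_combinations(target=8)          # 9 satisfying pairs
# If the two images have nearly identical embeddings, only (4, 4) survives,
# since it is the sole satisfying pair with equal labels:
pruned = prune_by_similarity(combos, emb_a=[0.1, 0.2], emb_b=[0.1, 0.2])
```

In the paper's setting this pruning is cast as an integer linear program so that logical dependencies among many instances are respected jointly; the pairwise filter above only conveys the core intuition.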
Supplementary Material: zip
Primary Area: neurosymbolic & hybrid AI systems (physics-informed, logic & formal reasoning, etc.)
Submission Number: 20861