Keywords: question decomposition, compositional reasoning, reasoning robustness, self-consistency, chain-of-thought, formal structures, category theory
TL;DR: Operads formalize question decomposition; the induced notion of "operadic consistency" predicts QA accuracy across models and datasets better than temperature-based self-consistency does.
Abstract: Question decomposition, i.e. breaking a complex query into simpler sub-queries whose answers are composed to produce a final answer, is a widely used strategy for improving LLM reasoning, yet it currently lacks a rigorous mathematical foundation. In this paper, we propose operads --- mathematical structures that model many-in, one-out operations and compositions thereof --- as a natural framework for describing question decomposition. We define the *questions operad* $\mathcal{Q}$, in which operations correspond to question templates and composition corresponds to substitution of sub-answers, and show how QA models can be interpreted as algebras over $\mathcal{Q}$. Beyond reframing existing practice, this operadic perspective points toward new methods --- in particular, a notion of *reasoning robustness*, which measures consistency of a QA model's answers across all partial collapses of a question decomposition tree. In experiments across eight models and four multi-hop QA datasets, we find that *operadic consistency* --- a scalar instantiation of reasoning robustness --- is strongly correlated with accuracy, whereas temperature-based self-consistency is not, suggesting that the operadic notion captures a distinct and useful signal. We argue that operads are the natural mathematical home for question decomposition, and that invariants such as reasoning robustness open new directions for analyzing and improving the reliability of multi-step reasoning.
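Illustrative sketch (not the authors' implementation): the abstract describes question templates composed by substituting sub-answers, and a consistency score taken over all partial collapses of a decomposition tree. The Python below shows one way this could look under stated assumptions. The `Node` class, the `phrase` field (a noun-phrase form used when a sub-question is inlined rather than answered), the `fill` helper, and the majority-agreement score are all hypothetical choices made for this example; the paper's actual definition of operadic consistency may differ.

```python
from collections import Counter
from dataclasses import dataclass, field
from itertools import product
from typing import Callable, List, Set


@dataclass
class Node:
    """One operation in a decomposition tree (hypothetical representation)."""
    template: str                 # question text with slots "{0}", "{1}", ...
    phrase: str = ""              # noun-phrase form used when inlined unanswered
    children: List["Node"] = field(default_factory=list)


def non_root_nodes(root: Node) -> List[Node]:
    """Collect every node below the root."""
    out, stack = [], list(root.children)
    while stack:
        node = stack.pop()
        out.append(node)
        stack.extend(node.children)
    return out


def fill(node: Node, text: str, answered: Set[int],
         model: Callable[[str], str]) -> str:
    """Fill the slots of `text` using node's children.

    A child in `answered` is asked as its own sub-question and its answer is
    substituted; any other child is inlined as a phrase (a partial collapse).
    """
    fills = [
        model(fill(c, c.template, answered, model)) if id(c) in answered
        else fill(c, c.phrase, answered, model)
        for c in node.children
    ]
    return text.format(*fills)


def operadic_consistency(root: Node, model: Callable[[str], str]) -> float:
    """Assumed scalar score: share of partial collapses agreeing with the
    most common final answer."""
    nodes = non_root_nodes(root)
    answers = []
    for mask in product([False, True], repeat=len(nodes)):
        answered = {id(n) for n, keep in zip(nodes, mask) if keep}
        answers.append(model(fill(root, root.template, answered, model)))
    return Counter(answers).most_common(1)[0][1] / len(answers)


# Toy two-hop example with lookup-table "models" standing in for an LLM.
tree = Node("Where was {0} born?",
            children=[Node("Who directed Inception?",
                           phrase="the director of Inception")])

facts = {
    "Who directed Inception?": "Christopher Nolan",
    "Where was Christopher Nolan born?": "London",
    "Where was the director of Inception born?": "London",
}
flaky_facts = dict(facts)
flaky_facts["Where was the director of Inception born?"] = "Los Angeles"

print(operadic_consistency(tree, lambda q: facts[q]))        # 1.0
print(operadic_consistency(tree, lambda q: flaky_facts[q]))  # 0.5
```

In this toy setting the tree has one sub-question, so there are two evaluations: the fully collapsed question and the fully decomposed one. The second lookup table answers the collapsed question differently from the decomposed path, so the score drops to 0.5.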
Paper Type: Long (8 pages)
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 32