Keywords: Agentic workflows, Large Language Models, Multi-Agent Systems, Chemistry Reasoning, Tool-Calling
TL;DR: We build a verifiable, tool-calling agentic workflow for chemistry and show that reasoning-trace distillation improves small-model performance in structured reasoning but not in general QA settings.
Abstract: Reasoning models are increasingly used to perform complex tasks in open-ended environments. Two challenges facing such efforts are domain-specific tuning, which often requires large quantities of data, and verifiability. We construct a high-performance agentic reasoning workflow for chemistry that is (a) verifiable and (b) extensible through the use of tools.
We further show that distilling the outputs of this workflow into smaller models yields lighter workflows that remain performant.
Archival Option: The authors of this submission want it to appear in the archival proceedings.
Submission Number: 165