Keywords: Small Language Models (SLMs), Test-time Scaling, Directed Acyclic Graph (DAG), Structured Reasoning, Cognitive Modeling
Abstract: Small Language Models (SLMs) offer substantial cost efficiency but often struggle with complex reasoning tasks due to their limited parameter capacity. While methods such as Chain-of-Thought (CoT) and ReAct have proven effective for Large Language Models (LLMs), their effectiveness remains limited in the context of SLMs. Tree of Thoughts (ToT) enhances reasoning by enabling multi-path exploration, but is impractical for SLMs due to its high computational overhead and limited extensibility for external tool integration. In this paper, we propose an Efficient Structured Reasoning Framework via Dynamic DAG Construction to enhance the reasoning capabilities of SLMs. Inspired by cognitive insights, the framework modularizes human problem-solving into “Atomic Thinking” components and dynamically reassembles them into a task-specific Directed Acyclic Graph (DAG), effectively pruning the reasoning search space. In addition, the framework compensates for the inherent limitations of SLMs by augmenting them with external knowledge through a modular architecture that supports node-level tool integration. Applied to 20B-scale SLMs, our framework achieves GPT-4o-level performance across various benchmarks with an average of seven model calls per problem, and outperforms ToT in both token efficiency and inference speed at matched accuracy.
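The core mechanism the abstract describes, reassembling "Atomic Thinking" components into a task-specific DAG with optional node-level tools, can be illustrated with a minimal sketch. This is not the authors' implementation; the node names, the `tool` hook, and the aggregation rule are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's code): "atomic thinking"
# nodes assembled into a task-specific DAG and executed in topological order.
from graphlib import TopologicalSorter

class AtomicNode:
    """One reasoning step; `tool` is an optional node-level external tool."""
    def __init__(self, name, fn, deps=(), tool=None):
        self.name, self.fn, self.deps, self.tool = name, fn, tuple(deps), tool

def run_dag(nodes):
    """Execute nodes in dependency order; each node sees its parents' outputs."""
    order = TopologicalSorter({n.name: set(n.deps) for n in nodes}).static_order()
    by_name = {n.name: n for n in nodes}
    results = {}
    for name in order:
        node = by_name[name]
        inputs = {d: results[d] for d in node.deps}
        if node.tool:  # augment the step with external knowledge via a tool call
            inputs["tool"] = node.tool(inputs)
        results[name] = node.fn(inputs)
    return results

# Toy example: decompose -> two sub-solutions -> aggregate.
nodes = [
    AtomicNode("decompose", lambda i: ["a", "b"]),
    AtomicNode("solve_a", lambda i: 1, deps=["decompose"]),
    AtomicNode("solve_b", lambda i: 2, deps=["decompose"]),
    AtomicNode("aggregate", lambda i: i["solve_a"] + i["solve_b"],
               deps=["solve_a", "solve_b"]),
]
print(run_dag(nodes)["aggregate"])  # prints 3
```

In this sketch each model call would correspond to one node, so pruning the DAG directly bounds the number of calls per problem.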
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: Interpretability and Analysis of Models for NLP, Efficient/Low-Resource Methods for NLP, Language Modeling
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches to low-resource settings, Approaches to low compute settings-efficiency, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 2590