Keywords: self-consistency, reasoning, chain-of-thought, program-of-thought, large language models
TL;DR: We propose a hybrid CoT-PoT ensembling framework that improves self-consistency accuracy while drastically reducing sampling cost, often requiring only two samples per task.
Abstract: Self-consistency (SC) is a popular technique for improving the reasoning accuracy of large language models by aggregating multiple sampled outputs, but it comes at a high computational cost due to extensive sampling. We introduce a hybrid ensembling approach that leverages the complementary strengths of two distinct modes of reasoning: Chain-of-Thought (CoT) and Program-of-Thought (PoT). We describe a general framework for combining these two forms of reasoning in self-consistency, as well as concrete strategies for both full sampling and early stopping. We show that CoT-PoT ensembling not only improves overall accuracy, but also drastically reduces the number of samples required compared with the most efficient existing SC method. In particular, the majority of tasks can be addressed with *only two* samples, which has not been possible with any prior SC method.
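To make the general idea concrete, here is a minimal sketch of hybrid CoT-PoT self-consistency with an agreement-based early stop. It is an illustration under assumptions, not the paper's actual algorithm: the function `hybrid_self_consistency`, the sampler callables `sample_cot` and `sample_pot`, the alternating sampling order, and the stop-when-two-answers-agree rule are all hypothetical choices made here to show how two reasoning modes could be ensembled with as few as two samples.

```python
from collections import Counter
from typing import Callable, Optional

# Hypothetical sampler interface: each call draws one answer for the question,
# either via a Chain-of-Thought rationale or a Program-of-Thought program whose
# execution yields the answer. Both stand in for a single LLM sampling call.
SampleFn = Callable[[str], Optional[str]]


def hybrid_self_consistency(
    question: str,
    sample_cot: SampleFn,
    sample_pot: SampleFn,
    max_samples: int = 10,
    early_stop: bool = True,
) -> Optional[str]:
    """Interleave CoT and PoT samples and aggregate answers by majority vote.

    With early_stop=True, sampling halts as soon as two answers agree, so an
    easy question can be settled with only two samples (one CoT and one PoT)
    whenever the two reasoning modes concur.
    """
    votes: Counter = Counter()
    samplers = [sample_cot, sample_pot]

    for i in range(max_samples):
        # Alternate between the two reasoning modes.
        answer = samplers[i % 2](question)
        if answer is None:
            continue  # e.g., the PoT program failed to execute
        votes[answer] += 1
        if early_stop and votes[answer] >= 2:
            return answer  # two samples already agree; stop sampling

    # Fall back to a full majority vote over all collected samples.
    return votes.most_common(1)[0][0] if votes else None


if __name__ == "__main__":
    # Toy samplers that always return the same answer, just to show the flow.
    demo_cot = lambda q: "42"
    demo_pot = lambda q: "42"
    print(hybrid_self_consistency("What is 6 * 7?", demo_cot, demo_pot))
```

In this sketch the full-sampling variant corresponds to `early_stop=False`, where all `max_samples` draws are aggregated by majority vote; the submission's own early-stopping strategy may differ from the simple two-vote agreement rule used here.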
Primary Area: neurosymbolic & hybrid AI systems (physics-informed, logic & formal reasoning, etc.)
Submission Number: 6084