PeerCoT: Structured Multi-Agent Chain-of-Thought Collaboration for Error Localization in LLM Reasoning

Published: 08 Mar 2026, Last Modified: 25 Apr 2026
Venue: ICLR 2026 Workshop LLM Reasoning
Readers: Everyone
License: CC BY 4.0
Track: long paper (up to 10 pages)
Keywords: Multi-Agent Reasoning, Chain-of-Thought, Error Localization, Interpretability
TL;DR: PeerCoT introduces a structured multi-agent Chain-of-Thought exchange protocol and synthetically corrupted benchmarks for measuring step-level error localization in LLM reasoning.
Abstract: Large Language Model (LLM) agents exhibit emergent reasoning abilities through debate, critique, and self-reflection. However, most multi-agent systems exchange only final outputs between agents, which limits transparency and makes reasoning processes hard to diagnose and improve. PeerCoT introduces a structured, symmetric Chain-of-Thought (CoT) exchange protocol in which peer agents share their reasoning traces, provide labeled critiques, and perform minimal-edit revisions before aggregation. This structure enables explicit measurement of process-level error-type identification within an agent’s reasoning, a metric we call Error Localization Success (ELS). We introduce and release AQUA-RAT-Corrupted and GSM8K-Corrupted, structured benchmarks with synthetically injected errors, designed to evaluate error localization and correction in multi-agent reasoning. PeerCoT achieves 64.1% accuracy on AQUA-RAT-Corrupted and 53.15% on GSM8K-Corrupted. It maintains competitive accuracy relative to the baseline models while adding transparency through an explicit error taxonomy, critiques, and ELS. Beyond outcome-level performance, the structured critique protocol corrects 30.43% of initially incorrect solutions in the merged outputs. By aligning cooperative critique with fine-grained reasoning supervision, PeerCoT brings explicit error identification to collaborative reasoning.
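The two process-level quantities named in the abstract, ELS and the share of initially incorrect solutions corrected after merging, can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact definitions, so everything below is an assumption: the `Critique` schema, the error labels ("arithmetic", "logic", "units"), the optional step-matching rule, and both function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Critique:
    """One peer critique of a corrupted reasoning trace (hypothetical schema)."""
    gold_step: int   # index of the step where the error was injected
    gold_type: str   # injected error label, e.g. "arithmetic" (assumed label set)
    pred_step: int   # step the critic flagged
    pred_type: str   # error label the critic assigned


def error_localization_success(critiques, require_step=False):
    """Fraction of critiques whose error label matches the injected one.

    With require_step=True the flagged step must match as well.
    This is an assumed reading of ELS, not the paper's formal definition.
    """
    if not critiques:
        return 0.0
    hits = 0
    for c in critiques:
        type_ok = c.pred_type == c.gold_type
        step_ok = c.pred_step == c.gold_step
        if type_ok and (step_ok or not require_step):
            hits += 1
    return hits / len(critiques)


def correction_rate(initial_correct, merged_correct):
    """Share of initially incorrect solutions that are correct after merging."""
    pairs = zip(initial_correct, merged_correct)
    merged_for_wrong = [merged for initial, merged in pairs if not initial]
    if not merged_for_wrong:
        return 0.0
    return sum(merged_for_wrong) / len(merged_for_wrong)


# Toy data, invented for illustration only.
critiques = [
    Critique(gold_step=2, gold_type="arithmetic", pred_step=2, pred_type="arithmetic"),
    Critique(gold_step=1, gold_type="logic", pred_step=3, pred_type="logic"),
    Critique(gold_step=0, gold_type="units", pred_step=0, pred_type="arithmetic"),
    Critique(gold_step=4, gold_type="logic", pred_step=4, pred_type="logic"),
]
els_type_only = error_localization_success(critiques)
els_with_step = error_localization_success(critiques, require_step=True)
rate = correction_rate(
    [True, False, False, False, True],
    [True, True, False, False, True],
)
```

On this toy data, three of the four critiques assign the injected label and two of those also flag the right step, while one of the three initially wrong solutions is fixed after merging. Under this reading, the 30.43% figure in the abstract would correspond to `correction_rate` computed over the benchmark's merged outputs.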
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 114