Consensus Matrix: A Role-Specialized Multi-Agent Framework for Structured Collaborative Decision-Making in Agentic Visual Media Workflows

Bingli Zhang; Xinyu Wang; Hsiang Lun Kao; Guozhong Zhang; Yijian wu; CHENKAI GAO; Yifan Wang; zhengda; Ning Lyu; Kaijie Chen

Published: 05 Jun 2026, Last Modified: 05 Jun 2026

Venue: CVPR 2026 AAVM Workshop (Oral)

License: CC BY 4.0

Keywords: multi-agent systems, consensus learning, reinforcement learning, visual media evaluation, clinical decision support, Kendall's W, role specialization, agentic AI

TL;DR: A role-specialized multi-agent framework that quantifies inter-agent agreement via Kendall's W and uses it to drive adaptive, evidence-grounded deliberation in agentic visual media and clinical workflows.

Abstract: Agentic visual media workflows, which span multi-agent video quality assessment, creative content evaluation, and clinical decision support, demand structured collaboration among role-specialized agents. Yet existing multi-agent systems rely on simple voting or averaging and offer no principled measure of agreement quality. We present the Consensus Matrix, a general role-specialized multi-agent framework that quantifies and optimizes inter-agent agreement in complex, high-stakes workflows. The framework instantiates N role-specialized LLM agents, each producing a structured opinion comprising preference scores, a confidence estimate, role-specific concerns, and a grounded evidence chain. These outputs populate a shared consensus matrix. Agreement is measured via Kendall's coefficient of concordance W, which drives an adaptive feedback loop: when W falls below a threshold, targeted feedback is directed at the most discordant agents. Unlike systems with fixed aggregation policies, the full system includes a reinforcement learning (RL) coordinator that learns to select round-to-round interaction strategies, accelerating convergence while preserving decision traceability. The coordinator is modular and can be omitted when the computational budget is limited. We instantiate and validate the framework on oncology multidisciplinary team (MDT) deliberation, a demanding testbed where role diversity, evidence grounding, and consensus quality are all clinically critical. Across five medical benchmarks (MedQA, PubMedQA, DDXPlus, MedBullets, SymCat), our system achieves 87.5% average accuracy, outperforming the strongest baseline by 3.7 percentage points, with a mean concordance of W=0.823 and clinician appropriateness ratings of 8.9/10.
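
Illustrative sketch: the abstract's agreement measure can be made concrete with a small example. The Python below is a minimal sketch, not the authors' implementation. It assumes the consensus matrix is an agents-by-options table of preference scores, computes the standard Kendall's W without tie correction, and flags the agent whose ranking lies farthest from the group's mean ranking. The names `average_ranks`, `kendalls_w`, `most_discordant_agent`, the threshold value `W_THRESHOLD`, and the distance-based discordance heuristic are all hypothetical; the abstract does not specify them.

```python
from typing import List, Sequence

W_THRESHOLD = 0.7  # hypothetical value; the abstract does not state the threshold


def average_ranks(scores: Sequence[float]) -> List[float]:
    """Rank scores from 1 (lowest) to n (highest); tied scores share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next score equals the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def kendalls_w(matrix: Sequence[Sequence[float]]) -> float:
    """Kendall's W for an agents-by-options score matrix (no tie correction)."""
    m, n = len(matrix), len(matrix[0])
    rank_rows = [average_ranks(row) for row in matrix]
    totals = [sum(row[j] for row in rank_rows) for j in range(n)]
    mean_total = m * (n + 1) / 2
    s = sum((t - mean_total) ** 2 for t in totals)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def most_discordant_agent(matrix: Sequence[Sequence[float]]) -> int:
    """Index of the agent whose ranks are farthest from the mean ranks (assumed heuristic)."""
    rank_rows = [average_ranks(row) for row in matrix]
    m, n = len(rank_rows), len(rank_rows[0])
    mean_ranks = [sum(r[j] for r in rank_rows) / m for j in range(n)]
    dist = [sum((r[j] - mean_ranks[j]) ** 2 for j in range(n)) for r in rank_rows]
    return max(range(m), key=lambda i: dist[i])


# Four hypothetical agents scoring three options; the last agent disagrees.
scores = [
    [0.9, 0.5, 0.1],
    [0.8, 0.6, 0.2],
    [0.7, 0.4, 0.3],
    [0.1, 0.5, 0.9],
]
w = kendalls_w(scores)
if w < W_THRESHOLD:
    target = most_discordant_agent(scores)
    print(f"W={w:.3f} below threshold; direct feedback at agent {target}")
```

In this toy matrix three agents rank the options identically and one reverses the order, so W is low and the reversed agent is selected for feedback. A learned coordinator, as described in the abstract, would replace the fixed threshold rule used here.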

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Submission Number: 8
