Jury-and-Judge Chain-of-Thought for Uncovering Toxic Data in 3D Visual Grounding

Kaixiang Huang; Qifeng Zhang; Jin Wang; Jingru Yang; Yang Zhou; Huan Yu; Guodong Lu; Shengfeng He

Published: 18 Sept 2025, Last Modified: 29 Oct 2025

Venue: NeurIPS 2025 poster

License: CC BY 4.0

Keywords: LLM, Chain-of-Thought, 3D Vision Grounding, LLM-as-a-Judge

TL;DR: We propose Refer-Judge, a Jury-and-Judge Chain-of-Thought framework that leverages MLLMs to uncover toxic data in 3D visual grounding

Abstract: 3D Visual Grounding (3DVG) faces persistent challenges due to coarse scene-level observations and logically inconsistent annotations, which introduce ambiguities that compromise data quality and hinder effective model supervision. To address these challenges, we introduce Refer-Judge, a novel framework that harnesses the reasoning capabilities of Multimodal Large Language Models (MLLMs) to identify and mitigate toxic data. At the core of Refer-Judge is a Jury-and-Judge Chain-of-Thought paradigm, inspired by the deliberative process of the judicial system. This framework targets the root causes of annotation noise: jurors collaboratively assess 3DVG samples from diverse perspectives, providing structured, multi-faceted evaluations. Judges then consolidate these insights using a Corroborative Refinement strategy, which adaptively reorganizes information to correct ambiguities arising from biased or incomplete observations. Through this two-stage deliberation, Refer-Judge significantly enhances the reliability of data judgments. Extensive experiments demonstrate that our framework not only achieves human-level discrimination at the scene level but also improves the performance of baseline algorithms via data purification. Code is available at https://github.com/Hermione-HKX/Refer_Judge.
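The two-stage deliberation described in the abstract can be pictured with a minimal sketch. Everything below is an illustrative assumption, not the released implementation: the names `Verdict`, `length_juror`, `consistency_juror`, `judge`, and `deliberate` are hypothetical, the jurors are simple text heuristics standing in for MLLM calls, and the judge uses a plain majority vote rather than the paper's Corroborative Refinement strategy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Sample = Dict[str, str]  # hypothetical sample format: description + target label


@dataclass
class Verdict:
    juror: str
    is_toxic: bool
    rationale: str


Juror = Callable[[Sample], Verdict]


def length_juror(sample: Sample) -> Verdict:
    """Stand-in juror: flags descriptions too short to identify an object."""
    n_words = len(sample["description"].split())
    return Verdict("length", n_words < 3, f"description has {n_words} word(s)")


def consistency_juror(sample: Sample) -> Verdict:
    """Stand-in juror: flags descriptions that never mention the target label."""
    mentioned = sample["target_label"].lower() in sample["description"].lower()
    return Verdict("consistency", not mentioned, f"target mentioned: {mentioned}")


def judge(verdicts: List[Verdict]) -> bool:
    """Assumed consolidation rule: strict majority of jurors must flag the sample."""
    toxic_votes = sum(v.is_toxic for v in verdicts)
    return toxic_votes * 2 > len(verdicts)


def deliberate(sample: Sample, jurors: List[Juror]) -> Tuple[bool, List[Verdict]]:
    """Stage 1: each juror assesses the sample. Stage 2: the judge consolidates."""
    verdicts = [juror(sample) for juror in jurors]
    return judge(verdicts), verdicts


jurors = [length_juror, consistency_juror]
clean = {"description": "the chair next to the window", "target_label": "chair"}
noisy = {"description": "it", "target_label": "table"}

clean_flag, clean_verdicts = deliberate(clean, jurors)
noisy_flag, noisy_verdicts = deliberate(noisy, jurors)
```

In this sketch the jurors return structured verdicts with rationales, so a more elaborate judge could weigh or re-examine them instead of simply counting votes.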

Supplementary Material: zip

Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)

Submission Number: 20214
