Dual-Chain Reasoning: Enhancing Multimodal Document VQA Through Positive and Negative Reasoning Paths
Abstract: While Multimodal Large Language Models excel at many reasoning tasks, they face limitations in complex scenarios due to reasoning path divergence and cognitive overload. Current approaches predominantly focus on strengthening correct reasoning pathways while overlooking the need to identify and rectify erroneous ones. We introduce the Dual-Chain Reasoning (DCR) framework, a metacognition-inspired approach that addresses these challenges through two synergistic reasoning chains. The positive chain performs hierarchical task decomposition, systematically breaking down complex problems into manageable components. Simultaneously, the negative chain identifies error patterns and corrects logical fallacies, acting as a verification system for the positive chain. This bidirectional architecture enables iterative optimization through continuous cognitive verification, allowing the model to refine its reasoning process dynamically. Experimental results on the ScienceQA and DocVQA benchmarks demonstrate DCR’s effectiveness, with accuracy improvements of 2.42% and 1.28%, respectively, over baseline models.
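The interplay of the two chains can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify prompts, models, or stopping criteria, so the function name `dual_chain_reason`, the three callables (`decompose`, `solve_step`, `find_errors`), and the `max_rounds` limit are all hypothetical placeholders. In practice each callable would presumably wrap a call to a multimodal model.

```python
from typing import Callable, List, Tuple


def dual_chain_reason(
    question: str,
    decompose: Callable[[str], List[str]],
    solve_step: Callable[[str, List[str]], str],
    find_errors: Callable[[str, List[str]], List[str]],
    max_rounds: int = 3,
) -> Tuple[str, int]:
    """Illustrative loop; returns (final answer, number of rounds used).

    All three callables are assumptions made for this sketch:
      decompose(question)          -> ordered sub-questions
      solve_step(sub_q, context)   -> result for one sub-question
      find_errors(question, steps) -> critiques (empty list if none found)
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")

    critiques: List[str] = []
    steps: List[str] = []
    for round_idx in range(1, max_rounds + 1):
        # Positive chain: decompose the task and solve sub-questions in order.
        # Each step sees the earlier step results plus the latest critiques.
        steps = []
        for sub_q in decompose(question):
            steps.append(solve_step(sub_q, steps + critiques))

        # Negative chain: look for errors in the reasoning produced so far.
        errors = find_errors(question, steps)
        if not errors:
            break  # nothing to correct, so stop iterating
        critiques = errors  # feed critiques into the next positive pass

    # By convention in this sketch, the last step holds the final answer.
    answer = steps[-1] if steps else ""
    return answer, round_idx
```

The loop stops either when the negative chain reports no errors or when `max_rounds` is reached; both stopping rules are design choices of this sketch rather than details taken from the paper.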
External IDs: dblp:conf/icig/ZhangWGYZW25