Keywords: Self-Improvement, Multimodal Large Language Models
TL;DR: A novel judge-free self-improvement framework for multimodal large language models (MLLMs) efficiently enhances reliability by controlling hallucinations without costly model-level verification loops.
Abstract: Self-improvement in multimodal large language models (MLLMs) is crucial for enhancing their reliability and robustness. However, current methods typically rely on MLLMs themselves as judges, incurring high computational costs and risking pitfalls such as reward hacking and model collapse.
This paper introduces a novel self-improvement framework that requires no model-level judge. Our approach employs a controlled feedback mechanism while eliminating MLLMs from the verification loop: we generate preference-learning pairs with a controllable hallucination mechanism and optimize data quality by using lightweight contrastive language-image encoders to score each pair and reverse it when necessary.
Evaluations on public benchmarks and on our newly introduced IC dataset, designed to stress-test hallucination control, show that our framework outperforms conventional techniques, achieving higher precision and recall at significantly lower computational cost. The method thus offers an efficient path to scalable self-improvement in MLLMs, balancing performance gains with reduced resource requirements.
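To make the pair-evaluation step concrete, the sketch below shows how a lightweight contrastive encoder could score a preference pair against its image and reverse the pair when the labels appear inverted. This is a minimal illustration under stated assumptions, not the authors' released implementation: the CLIP checkpoint, the `verify_preference_pair` helper, and the simple swap criterion are all assumptions for illustration.

```python
# Minimal sketch (not the authors' code): use a lightweight contrastive
# language-image encoder (CLIP) to check a preference pair and reverse it
# when the "rejected" response actually matches the image better.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()

def verify_preference_pair(image: Image.Image, chosen: str, rejected: str):
    """Score both responses against the image; swap them if inverted.

    Caveat: CLIP's text encoder truncates at 77 tokens, so long MLLM
    responses would need summarization or chunking in practice.
    """
    inputs = processor(
        text=[chosen, rejected], images=image,
        return_tensors="pt", padding=True, truncation=True,
    )
    with torch.no_grad():
        # logits_per_image has shape (1, 2): image-text similarity scores
        scores = model(**inputs).logits_per_image[0]
    if scores[1] > scores[0]:  # "rejected" aligns better with the image
        chosen, rejected = rejected, chosen
    return chosen, rejected
```

Because the frozen encoder is far smaller than the MLLM it supervises, a check of this kind would add little overhead relative to an MLLM-as-judge verification loop, consistent with the efficiency claim above.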
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html
Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html
Submission Number: 27