Bi-directional dual contrastive adapting method for alleviating hallucination in visual question answering
Abstract: Highlights
• A novel decoding method that integrates seamlessly into existing multimodal large language models improves generation accuracy and reduces hallucinations.
• A simple but effective bi-directional attention mechanism and a dual contrastive adapting strategy applied to predictions demonstrate the positive impact of extended image information on the model.
• Comprehensive experiments on various datasets and benchmarks demonstrate the effectiveness of the proposed training-free method and yield measurable improvements.