Mitigating Lies in Vision-Language ModelsDownload PDF

Published: 05 Dec 2022, Last Modified: 05 May 2023MLSW2022Readers: Everyone
Abstract: In this work, we bring new insights into the honesty of vision-language models, particularly in visual question answering (VQA). After a throughout revisit of the existing ‘lie’ behavior in pure language models, our work makes an unprecedented extension of ’lies’ to vision-language models. The results indicate that the lie prefixes have a more obvious misleading effect on vision-language models than on language models. We also propose a novel visual prefix and prove that the consistent vision-language prefix is more threatening to vision-language models. To defend the models from the stated ’lies’, we put forward an unsupervised framework based on Gaussian mixture modeling and obtain improvement with 3% against the language prefix and 12% against the vision-language prefix.
1 Reply