Mutant Texts: A Technique for Uncovering Unexpected Inconsistencies in Large-Scale Vision-Language Models
Abstract: Recently, Vision-Language Models (VLMs) trained on large-scale noisy data have shown strong generalization abilities on many downstream tasks. In this paper, we introduce a new technique for uncovering unexpected inconsistencies in VLMs, which leads to the formulation of new research questions on how to improve VLMs. Specifically, we propose that performance on original texts should be compared with performance on ‘mutant texts’, carefully designed variants of the original texts. In contrast to the text perturbations used to study robustness, ‘mutant texts’ represent large changes to the original texts that impact their semantics. We present two types of example mutant texts: one-word-only (OWO) mutants, which replace the original text with one of the words it contains, and plus-one-word (POW) mutants, which add a word to the original text. Mutant texts allow us to discover the existence of dominating words in texts that correspond to images: the embedding of a dominating word is closer to the image embedding than the embedding of the entire original text. The existence of dominating words reflects an underlying inconsistency in a VLM’s embedding space, a possible source of bias that would go undetected without the mutant text technique.
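The abstract describes the probe concretely enough to sketch: generate OWO mutants (each word of the caption on its own) and POW mutants (the caption with one word added), embed them alongside the original caption, and flag any single word whose embedding is closer to the image embedding than the full caption's. The sketch below assumes a CLIP-style VLM via the Hugging Face `transformers` library with the `openai/clip-vit-base-patch32` checkpoint; the paper does not prescribe a specific model, and the image path, probe word, and insertion positions for POW mutants are illustrative assumptions.

```python
# Minimal sketch of the mutant-text probe, assuming a CLIP-style VLM.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed_texts(texts):
    # L2-normalized text embeddings for a batch of strings.
    inputs = processor(text=texts, return_tensors="pt", padding=True)
    with torch.no_grad():
        feats = model.get_text_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)

def embed_image(image):
    # L2-normalized embedding for a single PIL image.
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)

def pow_mutants(caption, word):
    # Plus-one-word (POW) mutants: the caption with `word` inserted.
    # Inserting at every position is an assumption; the abstract only
    # says a word is added to the original text.
    tokens = caption.split()
    return [" ".join(tokens[:i] + [word] + tokens[i:])
            for i in range(len(tokens) + 1)]

def find_dominating_words(image, caption):
    # One-word-only (OWO) mutants are the individual words of the caption.
    # A word "dominates" if its embedding is closer to the image embedding
    # than the embedding of the entire original caption.
    words = caption.split()
    img = embed_image(image)                      # shape (1, d)
    txt = embed_texts([caption] + words)          # shape (1 + len(words), d)
    sims = (txt @ img.T).squeeze(-1)              # cosine similarities
    original_sim = sims[0].item()
    return [(w, s.item()) for w, s in zip(words, sims[1:])
            if s.item() > original_sim]

if __name__ == "__main__":
    image = Image.open("example.jpg")             # hypothetical image file
    caption = "a dog playing with a ball on the beach"
    print(find_dominating_words(image, caption))
    print(pow_mutants(caption, "red"))            # hypothetical probe word
```

Any word returned by `find_dominating_words` is evidence of the inconsistency the abstract describes: the VLM scores a one-word fragment as a better match for the image than the full caption it came from.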