Debias your VLM with Counterfactuals: A Unified Approach

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: societal considerations including fairness, safety, privacy
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: vision-language, foundation models, bias mitigation, fairness, image editing
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: A task- and domain-agnostic framework to debias vision-language models using counterfactual data from text and image editing
Abstract: Recent advances in vision-language research have produced numerous foundation models that excel in tasks such as image classification, image-text retrieval, and image captioning. However, these models have been shown to exploit spurious correlations in biased training data, raising fairness concerns about discrimination against underprivileged groups. In this work, we propose CVLD, a unified framework for quantifying and mitigating vision-language biases in a task- and domain-agnostic setting. By defining a causal intervention module that produces counterfactual image-text pairs, we apply causal fairness metrics to capture the discrepancy between model predictions on the original and counterfactual distributions. Building on this unified fairness notion, we propose a set of bias-free adaptation techniques that mitigate the bias of pre-trained VL models by optimizing their robustness to interventions on the protected attribute, requiring minimal modification to the standard training pipeline. CVLD demonstrates robust debiasing results on image classification, retrieval, and captioning using adaptation datasets of varying sizes, validating the importance of counterfactual data in studying vision-language bias.
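The abstract's core idea, measuring the discrepancy between model predictions on original and counterfactual image-text pairs and penalizing it during adaptation, can be sketched in a few lines. The snippet below is a minimal, generic illustration only, not the paper's actual implementation (which is not reproduced here): the function names `counterfactual_gap` and `debias_loss`, the choice of KL divergence as the discrepancy measure, and the weighting parameter `lam` are all assumptions for illustration.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def counterfactual_gap(logits_orig, logits_cf):
    """Mean KL divergence between predictions on original pairs and on
    counterfactual pairs (protected attribute intervened via image/text editing).
    A value of 0 means the model is invariant to the intervention."""
    p, q = softmax(logits_orig), softmax(logits_cf)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

def debias_loss(task_loss, logits_orig, logits_cf, lam=1.0):
    """Hypothetical adaptation objective: original task loss plus a
    counterfactual-consistency penalty weighted by `lam` (assumed hyperparameter)."""
    return task_loss + lam * counterfactual_gap(logits_orig, logits_cf)
```

In this framing, debiasing adds only one regularization term to the existing pipeline: if predictions are unchanged under the intervention, the penalty vanishes and training reduces to the standard objective.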
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6651