Abstract: This work aims to reproduce and extend the findings of "Discovering and Mitigating Visual Biases through Keyword Explanation" by Kim et al. (2024). The paper proposes the B2T framework, which detects and mitigates visual biases by extracting keywords from generated captions. By identifying biases in datasets, B2T contributes to the prevention of discriminatory behavior in vision-language models. We investigate the five key claims from the original paper, namely that B2T (i) is able to identify whether a word represents a bias, (ii) can extract these keywords from captions of mispredicted images, (iii) outperforms other bias discovery models, (iv) can improve CLIP zero-shot prompting with the discovered keywords, and (v) identifies labeling errors in a dataset. To reproduce their results, we use the publicly available codebase and our own re-implementations. Our findings confirm the first three claims and partially validate the fourth. We reject the fifth claim due to the failure to identify pertinent labeling errors. Finally, we enhance the original work by optimizing the efficiency of the implementation, performing a bias analysis without a classifier, and assessing the generalizability of B2T on a new dataset.
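To illustrate claim (iv), the sketch below shows one plausible way discovered bias keywords could be used to debias CLIP zero-shot prompting: each class prompt is paired with keyword-conditioned prompts and the normalized text embeddings are averaged. This is a minimal sketch, not the authors' exact pipeline; the class names, keyword list, and prompt templates are illustrative assumptions.

```python
# Minimal sketch of keyword-augmented CLIP zero-shot classification.
# Assumptions: the Waterbirds-style class names, the bias keyword list,
# and the prompt templates below are hypothetical examples.
import torch
import clip  # https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

classes = ["landbird", "waterbird"]           # illustrative class names
bias_keywords = ["forest", "ocean", "beach"]  # hypothetical B2T keywords


def class_embedding(name: str) -> torch.Tensor:
    # Build one base prompt plus one prompt per bias keyword, then average
    # the normalized text features so the class embedding is less tied to
    # any single (possibly spurious) visual context.
    prompts = [f"a photo of a {name}"] + [
        f"a photo of a {name} in the {kw}" for kw in bias_keywords
    ]
    tokens = clip.tokenize(prompts).to(device)
    with torch.no_grad():
        feats = model.encode_text(tokens)
    feats = feats / feats.norm(dim=-1, keepdim=True)
    return feats.mean(dim=0)


text_features = torch.stack([class_embedding(c) for c in classes])
text_features = text_features / text_features.norm(dim=-1, keepdim=True)

# At inference time: encode an image with model.encode_image, normalize it,
# and take cosine similarity against these averaged class embeddings to
# obtain zero-shot predictions.
```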
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Jinwoo_Shin1
Submission Number: 4309