Abstract: Multi-modal large language models (MLLMs) have been developed rapidly and are widely used across various fields, yet the stereotypical biases they may encode remain largely unstudied. In this study, we present pioneering research aimed at understanding the presence and implications of stereotypical bias in three widely used open-source MLLMs: LLaVA-v1.5, MiniGPT-v2, and CogVLM. Specifically, we explore stereotypical bias in MLLMs across two modalities (vision and language), three scenarios (occupation, descriptor, and persona), and two attributes (gender and race). We find that 1) MLLMs exhibit notable stereotypical biases across various scenarios, with LLaVA-v1.5 and CogVLM emerging as the most biased models; 2) these stereotypical biases can originate both from the training datasets and from biases inherent in the pre-trained component models; and 3) prepending specific prompt prefixes can substantially reduce stereotypical bias, though their effectiveness is inconsistent. Overall, our work serves as a crucial step toward understanding and addressing stereotypical bias in MLLMs. We call on the community to pay attention to the stereotypical biases inherent in rapidly evolving MLLMs and to actively contribute to the development of unbiased and responsible multi-modal AI systems.
Paper Type: long
Research Area: Ethics, Bias, and Fairness
Contribution Types: Model analysis & interpretability, Data analysis
Languages Studied: English