What's in a name? The Influence of Personal Names on Spatial Reasoning in BLOOM Large Language Models
Keywords: BLOOM, Bias, Large Language Model
Abstract: Large language models have been shown to exhibit reasoning capability, but it is not yet clear whether these models truly comprehend the reasoning tasks they solve. An ideal reasoning model would not be affected by the names of the entities over which the relations are defined. In this paper, we consider an algorithmically generated spatial reasoning task over the names of persons and show that the choice of names has a significant impact on the reasoning accuracy of BLOOM large language models. Using popular names from different countries of the world, we show that BLOOM models are susceptible to undesirable variations in reasoning ability even though the underlying logical reasoning challenge does not depend on these names. We further find that the conditional log probability scores that BLOOM models produce, which characterize uncertainty in prediction, are not well calibrated and cannot be used to detect such reasoning errors. We then suggest a new approach based on model self-explanations and iterative model introspection that detects such errors better than BLOOM conditional log probability scores and may help alleviate the bias exhibited by these models.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Social Aspects of Machine Learning (e.g., AI safety, fairness, privacy, interpretability, human-AI interaction, ethics)
TL;DR: BLOOM models are susceptible to undesirable variations in reasoning ability depending on the choice of personal names even though the reasoning task does not depend on the choice of names.
Supplementary Material: zip