Keywords: Applications, LLM/AI agents, Trust
Abstract: Large Language Model (LLM) agents are increasingly making choices on behalf of humans in scenarios such as recommending news stories, searching for relevant research papers, or deciding which product to buy. What drives LLMs' choices in subjective decision-making scenarios, where reasonable humans exercising their free will could have made different choices? In this work, we explore how the choices of LLM-powered agents are affected by the LLMs' latent trust in (and preferences for) the brand identity of the information source (e.g., the author or publisher of a news story or research paper), the credentials of the information source (e.g., reputation/dis-reputation badges and measures such as awards or PageRank), and endorsements from other influential sources (e.g., recommendations from critics and reviewers). Our extensive experiments with 10 LLMs from 6 major providers yield the following insights. LLMs tend to prefer articles from reputable information sources, and they recognize the domain expertise of information sources. We also show that prompting alone does not reduce favoritism towards preferred sources. Our work makes the case for better understanding the origins of LLMs' latent trust and preferences (i.e., whether they arise during pre-training or through fine-tuning and instruction tuning) and for better control over these implicit biases (i.e., eliminating undesired biases and aligning desired biases with the humans or societies represented by the LLM agents).
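The abstract does not spell out the probe design, but one way to operationalize "latent trust in brand identities" is a counterbalanced pairwise-choice probe: show the agent the same article text attributed to two different sources and count which attribution it recommends. The sketch below is a hypothetical illustration, not the authors' code; `query_llm`, `build_prompt`, `measure_source_preference`, and the source names are assumptions, and `query_llm` is stubbed so the script runs without any API access.

```python
# Hypothetical sketch of a source-attribution preference probe (not from the paper).
from collections import Counter
from typing import Callable

def build_prompt(article: str, source_a: str, source_b: str) -> str:
    """Present identical article text under two different source attributions
    and ask the agent to recommend exactly one."""
    return (
        "You must recommend exactly one of the two articles below.\n\n"
        f"Article 1 (published by {source_a}):\n{article}\n\n"
        f"Article 2 (published by {source_b}):\n{article}\n\n"
        "Answer with '1' or '2' only."
    )

def measure_source_preference(
    query_llm: Callable[[str], str],  # placeholder: wire to a real LLM API
    article: str,
    reputed_source: str,
    unknown_source: str,
    trials: int = 10,
) -> Counter:
    """Counterbalance which position the reputed source occupies so that
    positional bias does not masquerade as source preference."""
    tally: Counter = Counter()
    for i in range(trials):
        if i % 2 == 0:
            prompt = build_prompt(article, reputed_source, unknown_source)
            mapping = {"1": reputed_source, "2": unknown_source}
        else:
            prompt = build_prompt(article, unknown_source, reputed_source)
            mapping = {"1": unknown_source, "2": reputed_source}
        answer = query_llm(prompt).strip()
        tally[mapping.get(answer, "unparsed")] += 1
    return tally

if __name__ == "__main__":
    # Stub standing in for a real model call; always answers "1".
    stub = lambda prompt: "1"
    print(measure_source_preference(stub, "Identical article text.",
                                    "WellKnownOutlet", "UnknownBlog"))
```

Under this kind of design, a systematic skew toward the reputed source across counterbalanced trials (rather than toward a fixed position) would indicate the source-level favoritism the abstract describes.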
Submission Number: 162