Keywords: Pluralistic alignment, LLMs, perspectives, prompting
TL;DR: We propose, operationalize, and apply a framework for perspective identification to show, in a fine-grained manner, that LLMs do not adequately reproduce the diversity of human opinions.
Abstract: Pluralistic representation and generation in LLMs are becoming increasingly relevant because of the importance of showcasing the diversity of opinion. Models are known to reduce the diversity of their training data and to exhibit homogeneity when generating. However, this issue has been demonstrated primarily on multiple-choice questionnaires or, for free-form text, using aggregated, high-level characteristics. Such approaches neither adequately capture the variety of human perspectives expressed in opinionated text nor identify which aspects of human text are missing and thus drive the pluralistic gap in LLM-generated text. In this paper, we analyze model pluralism by extracting and comparing latent perspectives from human and LLM-generated text. We propose a two-tiered framework that identifies which aspects of perspectives are underrepresented in LLM generation. We evaluate an instance of our framework on highly opinionated data from book reviews and find that, while LLMs reduce the pluralism of human text across levels of abstraction, simple prompting techniques such as persona prompting can alleviate the pluralistic gap in subjective aspects.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 135