Poor Alignment and Steerability of Large Language Models: Evidence Using 30,000 College Admissions Essays
Keywords: Large Language Models (LLMs), college admissions, social identity, synthetic text, sociodemographic
TL;DR: LLM-generated essays—whether identity-prompted or not—remain linguistically distinct from human-written ones, highlighting the limits of model alignment and steerability in high-stakes contexts like college admissions.
Abstract: People increasingly use large language models (LLMs) for formal writing, raising two key questions: Who do LLMs write like (model alignment), and can prompting change that (model steerability)? We investigate these questions in the high-stakes context of college admissions by comparing essays from 30,000 applicants with two types of LLM-generated essays: one based only on the essay question and another with added demographic information. Across all models and analytical approaches, we find that LLM-generated essays—whether identity-prompted or not—remain linguistically distinct from human-authored ones. Identity prompting fails to align model outputs with the linguistic patterns of the corresponding demographic groups across sex, race, first-generation status, and geography. In fact, identity-prompted and unprompted LLM essays are more similar to each other than to human-written texts, reinforcing homogenization rather than reducing it. These findings reveal fundamental limitations in the alignment and steerability of current LLMs and raise concerns about their use in high-stakes settings like college admissions.
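A minimal sketch of the comparison design described above, assuming hypothetical helper names (`build_prompt`, `mean_similarity`, `generate`) and a simple TF-IDF lexical-similarity baseline rather than the authors' actual analysis pipeline: one prompt uses only the essay question, the other adds demographic information, and generated essays from each condition can then be compared against human-written ones.

```python
# Hypothetical sketch (not the authors' pipeline): build the two prompting
# conditions from the abstract and compare essay groups with a simple
# lexical-similarity baseline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

ESSAY_QUESTION = "Describe a challenge you overcame."  # placeholder question

def build_prompt(question: str, demographics: dict | None = None) -> str:
    """Question-only prompt, or the same prompt with demographic context added."""
    if demographics is None:
        return f"Write a college admissions essay answering: {question}"
    identity = ", ".join(f"{k}: {v}" for k, v in demographics.items())
    return (f"You are an applicant with the following background ({identity}). "
            f"Write a college admissions essay answering: {question}")

def mean_similarity(group_a: list[str], group_b: list[str]) -> float:
    """Average pairwise cosine similarity between two essay groups (TF-IDF)."""
    vec = TfidfVectorizer().fit(group_a + group_b)
    a, b = vec.transform(group_a), vec.transform(group_b)
    return float(cosine_similarity(a, b).mean())

# Usage (essays would come from applicants and from an LLM API, not shown here):
# plain = [generate(build_prompt(ESSAY_QUESTION)) for _ in range(n)]
# prompted = [generate(build_prompt(ESSAY_QUESTION, {"sex": "female"})) for _ in range(n)]
# print(mean_similarity(plain, prompted), mean_similarity(plain, human_essays))
```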
Submission Number: 16