Track: Track 2: Socio-Economical and Future Visions
Keywords: human-LLM co-writing, email, contextual benchmark, social reasoning, communication game
TL;DR: In a future where LLMs both write and read emails, human-LLM co-writing is necessary for people to communicate effectively.
Abstract: Email communication increasingly involves large language models (LLMs), but we lack intuition about how they will read, write, and optimize for nuanced social goals. We introduce HR Simulator, a game where communication is the core mechanic: players act as a Human Resources officer and write emails to resolve socially challenging workplace scenarios. An analysis of over 600 human and LLM emails, scored with LLMs as judges, provides evidence that larger LLMs are becoming more homogeneous in their email quality judgments, suggesting an emerging set of shared LLM norms and values. LLM-only emails outperform human emails under LLM judges (e.g., 48--54% vs. 23.5% success rate), but rewriting human drafts with models reliably improves over human-only emails and can sometimes beat LLM-only emails (e.g., from 40% to nearly 100% in one scenario). Rewrites make human emails more formal and empathetic, which likely contributes to the hybrid advantage. Our results demonstrate the efficacy of communication games as instruments for measuring communication in the era of LLMs, and posit human--LLM co-writing as the most effective form of communication in that future.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Presenter: ~Ari_Holtzman1
Format: Maybe: the presenting author will attend in person, contingent on other factors that still need to be determined (e.g., visa, funding).
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 14