Position: LLM alignment data should be regulated as mass media

João Gonçalves

Published: 02 Jun 2026, Last Modified: 10 Jun 2026. Venue: Pluralistic-Alignment 2026. Readers: Everyone. License: CC BY 4.0

Keywords: Alignment, post-training, regulation, mass media

Abstract: Most efforts to regulate and estimate the societal impacts of Large Language Models (LLMs) target model outputs. This makes regulation difficult, because outputs are stochastic and highly conditioned on diverse user prompts. This position paper draws on media and communication literature to argue that the regulatory focus has been misplaced, and that alignment datasets (e.g., supervised fine-tuning data and preference pairs) should be regulated at the same level as mass media content such as newspaper articles or television advertising. Post-training alignment data directly influences all user interactions with a model, representing the same one-to-many communication flow as traditional mass media. At the same time, mass media regulation has for decades balanced the need for audience protection against room for pluralist perspectives, providing a source of learning and inspiration for LLM regulation. Regulating post-training alignment data as mass media content is the most direct and actionable route to pluralism and accountability in LLM development and deployment.

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Submission Number: 52
