Measuring Rhetorical Style in Scientific Writing with LLM Personas

Published: 23 Sept 2025, Last Modified: 22 Nov 2025 · License: CC BY 4.0
Keywords: LLM Personas, AI for Metascience, Preference Models, LLM-as-Judge, Scientific Hype, Computational Social Science
Abstract: The rise of AI has fueled growing concerns about "hype" in machine learning papers, yet a reliable way to quantify rhetorical style independently of substantive content has remained elusive. Because strong empirical results can justify stronger claims, it is often unclear whether bold language reflects genuine evidence or merely rhetorical style. We introduce a counterfactual, LLM-based framework to disentangle rhetorical style from substantive content: multiple LLM rhetorical personas generate counterfactual writings from the same substantive content, an LLM judge compares them through pairwise evaluations, and the outcomes are aggregated using a Bradley–Terry model. Applying this method to 8,485 ICLR submissions sampled from 2017 to 2025, we generate more than 250,000 counterfactual writings and provide a large-scale quantification of rhetorical style in ML papers. Visionary framing significantly predicts downstream attention, including citations and media coverage, even after controlling for peer-review evaluations. We also observe a sharp rise in rhetorical strength after 2023, aligning with the growing use of LLM-based writing assistance. The reliability of our framework is validated by its robustness to the choice of personas and the high correlation between LLM judgments and human annotations. Our work demonstrates that LLMs can serve as instruments for improving how ML research is evaluated.
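
The abstract describes aggregating pairwise LLM-judge outcomes with a Bradley–Terry model. As a rough illustration of that aggregation step only, the sketch below fits Bradley–Terry strengths from a wins matrix using the standard MM updates (Hunter, 2004); the persona names, counts, and estimation details are hypothetical and not taken from the paper.

```python
# Minimal sketch: Bradley–Terry aggregation of pairwise judge outcomes.
# Assumes a simple wins matrix; personas and counts below are illustrative only.
import numpy as np

def fit_bradley_terry(wins: np.ndarray, n_iter: int = 200, tol: float = 1e-8) -> np.ndarray:
    """Estimate Bradley–Terry strengths from a wins matrix.

    wins[i, j] = number of pairwise comparisons in which item i beat item j.
    Returns strengths normalized to sum to 1.
    """
    k = wins.shape[0]
    p = np.ones(k) / k                 # initial strengths
    n = wins + wins.T                  # total comparisons per pair
    w = wins.sum(axis=1)               # total wins per item
    for _ in range(n_iter):
        denom = np.array([
            sum(n[i, j] / (p[i] + p[j]) for j in range(k) if j != i)
            for i in range(k)
        ])
        p_new = w / denom
        p_new /= p_new.sum()
        if np.max(np.abs(p_new - p)) < tol:
            return p_new
        p = p_new
    return p

# Hypothetical example: three rhetorical personas compared pairwise by an LLM judge.
personas = ["conservative", "neutral", "visionary"]
wins = np.array([
    [0, 12, 5],    # "conservative" beat "neutral" 12 times, "visionary" 5 times
    [18, 0, 9],
    [25, 21, 0],
])
strengths = fit_bradley_terry(wins)
for name, s in zip(personas, strengths):
    print(f"{name}: {s:.3f}")
```

In this toy setup, the fitted strengths order the personas by how often the judge preferred their counterfactual writings, which is the kind of scalar "rhetorical strength" score the framework appears to aggregate per paper.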
Submission Type: Research Paper (4-9 Pages)
Submission Number: 127