Abstract: Coding remains one of the most fundamental modes of interaction between humans and machines. With the rapid advancement of \textit{Large Language Models} (LLMs), code generation capabilities have begun to significantly reshape programming practices. This development prompts a central question: \textit{Have LLMs transformed code style, and how can such transformation be characterized?}
In this paper, we present a pioneering study that investigates the impact of LLMs on code style from the perspectives of naming conventions, complexity and maintainability, and structural similarity.
By analyzing code from over 19,000 GitHub repositories linked to arXiv papers published between 2020 and 2025, we identify measurable trends in the evolution of coding style that align with characteristics of LLM-generated code.
For instance, the proportion of snake\_case variable names in Python code increased from 47\% in Q1 2023 to 51\% in Q1 2025. Furthermore, we extend our analysis to examine whether LLM-generated content influences LLMs' subsequent code generation behavior. Our experimental results provide the first large-scale empirical evidence that LLMs affect real-world programming style.
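The snake\_case statistic above implies a classifier over variable names. The paper's actual measurement pipeline is not described here, but a minimal sketch of one plausible approach, using Python's `ast` module to collect assigned names and a regex to test for snake\_case (both the regex and the helper names are my assumptions, not the authors' method), could look like:

```python
import ast
import re

# Assumed definition of snake_case: lowercase words joined by underscores.
SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def variable_names(source: str) -> list[str]:
    """Collect names bound by assignment (Store context) in a module."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            names.append(node.id)
    return names


def snake_case_ratio(source: str) -> float:
    """Fraction of assigned variable names matching the snake_case pattern."""
    names = variable_names(source)
    if not names:
        return 0.0
    return sum(bool(SNAKE_CASE.match(n)) for n in names) / len(names)
```

For example, `snake_case_ratio("my_var = 1\ncamelCase = 2\nx = 3")` returns 2/3, since `my_var` and `x` match the pattern while `camelCase` does not. Aggregating such ratios per repository and per quarter would yield trend lines like the 47% to 51% shift reported above.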
Paper Type: Long
Research Area: Computational Social Science and Cultural Analytics
Research Area Keywords: Computational Social Science and Cultural Analytics
Contribution Types: NLP engineering experiment, Data analysis
Languages Studied: English
Submission Number: 3055