Abstract: The rise of large language models (LLMs) has led to dramatic improvements across a wide range of natural language tasks.
These advances have extended into the domain of code, enabling complex tasks such as code generation, translation, summarization, and repair. However, the utility of LLMs for real-world, in-the-wild deployment has only recently been studied, on Software Engineering (SWE) tasks such as GitHub issue resolution. In this study, we examine the code reasoning techniques that underlie the ability to perform such tasks, as well as the paradigms used to drive their performance. Our contributions in this paper are: (1) the first dedicated survey on code reasoning for code tasks, highlighting overarching strategies as well as hybrid and agentic approaches; (2) a taxonomy of the techniques used to drive code reasoning; (3) a comprehensive overview of performance on common benchmarks, along with a showcase of new, under-explored benchmarks with high potential in SWE; (4) an exploration of how core properties of code can be used to explain different reasoning techniques; and (5) gaps and under-explored areas for future research.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: code models, reasoning, software engineering agents, code tasks, LLMs
Contribution Types: Surveys
Languages Studied: English
Submission Number: 6898