CodeWiki: Evaluating AI’s Ability to Generate Holistic Documentation for Large-Scale Codebases

ACL ARR 2026 January Submission211 Authors

22 Dec 2025 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: Automated Documentation Generation, Documentation Generation
Abstract: Comprehensive software documentation is crucial yet costly to produce. Despite recent advances in large language models (LLMs), generating holistic, architecture-aware documentation at the repository level remains challenging due to complex and evolving codebases that exceed LLM context limits. Existing automated methods struggle to capture rich semantic dependencies and architectural structure. We present $\textbf{CodeWiki}$, a unified framework for automated repository-level documentation across seven mainstream programming languages. CodeWiki combines top-down hierarchical decomposition with a divide-and-conquer agent system to preserve architectural context and scale documentation generation, and a bottom-up synthesis that integrates textual descriptions with visual artifacts such as architecture and data-flow diagrams. We also introduce $\textbf{CodeWikiBench}$, a benchmark with hierarchical rubrics and LLM-based evaluation protocols. Experiments show that CodeWiki achieves a 68.79\% quality score with proprietary models, outperforming the closed-source DeepWiki baseline by 4.73\%, with especially strong gains on scripting languages. CodeWiki is released as open source to support future research.
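The abstract's combination of top-down hierarchical decomposition with divide-and-conquer agents and bottom-up synthesis can be illustrated with a minimal sketch. All names below (`Module`, `summarize`, `document`) are hypothetical stand-ins, not CodeWiki's actual API; the `summarize` function is a placeholder for an LLM agent call that documents one code unit.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Module:
    """Hypothetical node in a repository's module hierarchy."""
    name: str
    source: str = ""
    children: List["Module"] = field(default_factory=list)

def summarize(module: Module) -> str:
    # Placeholder for an LLM agent call documenting one leaf unit.
    return f"`{module.name}`: documents {len(module.source.split())} tokens of code"

def document(module: Module) -> str:
    # Leaf: document the unit directly within the model's context budget.
    if not module.children:
        return summarize(module)
    # Divide: each child subtree is handled independently (its own agent).
    child_docs = [document(c) for c in module.children]
    # Conquer: synthesize child docs bottom-up into an architecture overview.
    return f"## {module.name}\n" + "\n".join(child_docs)

repo = Module("repo", children=[
    Module("core", source="def parse(ast): ..."),
    Module("cli", source="def main(argv): ..."),
])
print(document(repo))
```

The recursion mirrors the paper's claim that decomposition preserves architectural context: each synthesis step sees only its children's summaries, never the full repository, so no single call exceeds the context limit.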
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: code generation and understanding
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 211