Abstract: Catala is a domain-specific programming language for tax law, designed to facilitate the translation of legal text into executable computer code through a syntax close to that of legal language and reasoning.
Legal statutes paired with their Catala translations have been published online periodically, but manual translation remains labor-intensive.
In this work, we develop a benchmark for evaluating Catala code generation from legal text, including a training set for fine-tuning Large Language Models.
To assess the quality of the generated code, we introduce an evaluation framework extending current metrics for code generation.
Our experiments with few-shot learning, as well as fine-tuned models, suggest the feasibility of automating legal code generation, in contrast with prior attempts to translate legal language into a formal representation.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: NLP Applications, Machine Translation, Language Modeling, Generation
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: French
Submission Number: 2223