Keywords: Civil Law Benchmark, Legal Reasoning, Large Language Model, Legal Tech
Abstract: The rapid advancement of large language models (LLMs) has expanded their potential in the legal domain. However, existing legal benchmarks remain largely English-centric and oriented toward common law, leaving a critical gap in evaluating LLMs for civil law systems that govern most jurisdictions worldwide. To address this gap, we introduce Vietnamese Legal Benchmark (VLegal-Bench), a cognitively grounded benchmark designed for the hierarchical and codified structure of Vietnamese law. Although instantiated in Vietnamese legislation, VLegal-Bench provides a replicable evaluation framework for civil law systems characterized by complex statutory hierarchies and frequent amendments. Inspired by Bloom’s taxonomy, VLegal-Bench assesses multiple levels of legal understanding through tasks that mirror real-world legal assistant use cases, including legal question answering, multi-step reasoning, and scenario-based problem solving. The benchmark contains 10,450 expert-annotated samples, each cross-validated against authoritative legal sources to ensure fidelity to practical legal workflows. By offering the first standardized legal benchmark for Vietnamese, VLegal-Bench enables systematic assessment of LLMs in civil law contexts and supports the development of more reliable and interpretable AI-assisted legal systems.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: Resources and Evaluation, NLP Applications
Contribution Types: Data resources, Data analysis
Languages Studied: Vietnamese
Submission Number: 2463