Qzhou-Law: An Open Source Series of Chinese Legal Large Language Models

Published: 13 Dec 2025, Last Modified: 16 Jan 2026 · AILaw26 · CC BY-NC-SA 4.0
Keywords: Post-Training, Legal Large Language Model, Legal Benchmark
Paper Type: Full papers
TL;DR: We train a series of legal LLMs that achieve new state-of-the-art performance on legal benchmarks through an effective three-stage post-training method.
Abstract: Both general large language models (LLMs) and Chinese legal LLMs lack the legal capability needed for practical application, owing to the limited availability of training datasets and effective legal training methods. To further enhance the legal capabilities of LLMs, we introduce a series of legal LLMs, Qzhou-Law 7B/14B/32B/72B. The core innovations of this model series are that (1) we construct a large-scale legal instruction-tuning dataset and (2) we develop a new three-phase training method that better adapts LLMs to the legal domain. To build the dataset, we curated 853,000 legal instructions with appropriate data processing and augmented them with legal consultation data using an article-based IRAC (Issue, Rule, Application, Conclusion) technique. We demonstrate that our three-stage training approach yields better results than training only on the legal instruction dataset. Our trained models achieve new state-of-the-art (SOTA) results on both LawBench and LexEval, and we are the first to study the effect of scaling model size in this setting: increasing model size consistently yields stronger results. To evaluate models on recent changes in laws, regulations, and related knowledge, we collected 1.4K questions from the National Unified Legal Professional Qualification Examination (NULPQE) spanning 2018 to 2024. Our models outperform competing models on NULPQE. We make our models and the NULPQE dataset publicly available to facilitate future research on applying LLMs in the legal domain.
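The article-based IRAC augmentation mentioned in the abstract can be pictured as converting raw legal consultation pairs into structured instruction-tuning examples. The Python sketch below is a minimal illustration of that idea only; the `Consultation` record, the `retrieve_articles` helper, and the output schema are assumptions made for illustration and are not taken from the paper.

```python
# Minimal sketch (not the paper's implementation): turning a raw legal
# consultation record into an IRAC-structured instruction-tuning example.
# `retrieve_articles` is a hypothetical helper that looks up statute
# articles relevant to the question; all field names are illustrative.
from dataclasses import dataclass

@dataclass
class Consultation:
    question: str   # user's legal question
    answer: str     # lawyer's free-form answer

def retrieve_articles(question: str) -> list[str]:
    # Placeholder: in practice this would query an index of statutes.
    return ["Article 1052 of the Civil Code: a marriage concluded under duress may be annulled ..."]

def to_irac_example(record: Consultation) -> dict:
    """Build an instruction-tuning example whose target answer follows
    the IRAC structure: Issue, Rule, Application, Conclusion."""
    articles = retrieve_articles(record.question)
    target = (
        "Issue: <restate the legal question>\n"
        f"Rule: {'; '.join(articles)}\n"
        "Application: <apply the cited articles to the facts>\n"
        f"Conclusion: {record.answer}"
    )
    return {
        "instruction": "Answer the legal consultation using the IRAC structure.",
        "input": record.question,
        "output": target,
    }

if __name__ == "__main__":
    rec = Consultation(
        question="Can I annul a marriage concluded under duress?",
        answer="Yes, by petitioning a court within one year of the duress ending.",
    )
    print(to_irac_example(rec))
```

Under this reading, the augmentation grounds each consultation answer in retrieved statute articles before training, which is one plausible way to realize the "article-based" part of the technique.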
Submission Number: 58