Keywords: Large Language Model, Named Entity Recognition, Legal, Few-shot Learning, Data Annotation
Abstract: The emergence of Large Language Models (LLMs) has attracted attention due to their powerful in-context few-shot learning capability. Recent studies report significant results regarding their use in document annotation tasks; in some cases, the models are comparable to human annotators. In our work, we evaluate LLMs' in-context few-shot learning capability on legal NER, assessing their usefulness in an annotation process carried out with humans. To do so, our study builds on the most extensive known Portuguese corpus dedicated to legal NER. In our experiments, we evaluated six different LLMs in various setups, and the results showed that an LLM can produce highly accurate annotations; the best model achieved an F1 score of 0.76. Moreover, through a detailed manual inspection of the divergence cases, we identified opportunities for improvement in the annotation process that produced the corpus, with a significant portion of these issues being correctly addressed by the evaluated models. Thus, our results show that the models can assist annotators, reducing the time, effort, and errors involved in the annotation process.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 20258