An Ensemble of LLMs Finetuned with LoRA for NER in Portuguese Legal Documents

Rafael Oleques Nunes; Letícia Maria Puttlitz; Antônio Oss Boll; Andre Suslik Spritzer; Carla Maria Dal Sasso Freitas; Dennis Giovani Balreira; Anderson Rocha Tavares

Published: 01 Jan 2024, Last Modified: 18 May 2025 | BRACIS (1) 2024 | CC BY-SA 4.0

Abstract: Given the high computational costs of traditional fine-tuning methods and the goal of improving performance, this study investigates the application of low-rank adaptation (LoRA) for fine-tuning BERT models for Portuguese legal Named Entity Recognition (NER) and the integration of Large Language Models (LLMs) in an ensemble setup. Focusing on the underrepresented Portuguese language, we aim to examine the reliability of extractions enabled by LoRA models and to glean actionable insights from the results of both LoRA models and LLMs operating in ensembles. With F1-scores of 88.49% on the LeNER-Br corpus and 81.00% on the UlyssesNER-Br corpus, LoRA models demonstrated competitive performance, approaching the state of the art. Our research demonstrates that incorporating class definitions and counting votes per class substantially improves LLM ensemble results. Overall, this contribution advances AI-powered legal text mining by proposing small models and initial prompt engineering for low-resource conditions, an approach that can scale to broader language representation.
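The abstract does not specify how votes are counted per class, so the following is only a minimal sketch of one plausible reading: each ensemble member returns (entity text, label) pairs, votes are tallied separately for each label, and an entity is kept for a label when enough members agree. The function name `vote_per_class`, the `min_votes` threshold, and the sample entities are assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter, defaultdict


def vote_per_class(predictions, min_votes=2):
    """Tally ensemble votes separately for each entity class.

    predictions: one list per model, each holding (entity_text, label) pairs.
    min_votes: hypothetical agreement threshold (an assumption, not from the paper).
    """
    votes = defaultdict(Counter)
    for model_output in predictions:
        # Deduplicate so a model casts at most one vote per (entity, label).
        for text, label in set(model_output):
            votes[label][text] += 1
    # Keep, for each class, only the entities that reached the threshold.
    return {
        label: sorted(text for text, n in counter.items() if n >= min_votes)
        for label, counter in votes.items()
    }


# Illustrative outputs from three hypothetical ensemble members.
predictions = [
    [("Lei 8.112", "LEGISLACAO"), ("Brasília", "LOCAL")],
    [("Lei 8.112", "LEGISLACAO"), ("Brasília", "ORGANIZACAO")],
    [("Lei 8.112", "LEGISLACAO"), ("Brasília", "LOCAL")],
]
result = vote_per_class(predictions)
```

In this toy input, two of three members label "Brasília" as LOCAL, so it survives for that class, while the single ORGANIZACAO vote falls below the threshold and is dropped.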
