Abstract: Given the high computational costs of traditional fine-tuning methods and the goal of improving performance,this study investigate the application of low-rank adaptation (LoRA) for fine-tuning BERT models to Portuguese Legal Named Entity Recognition (NER) and the integration of Large Language Models (LLMs) in an ensemble setup. Focusing on the underrepresented Portuguese language, we aim to examine the reliability of extractions enabled by LoRA models and glean actionable insights from the results of both LoRA and LLMs operating in ensembles. Achieving F1-scores of 88.49% for the LeNER-Br corpus and 81.00% for the UlyssesNER-Br corpus, LoRA models demonstrated competitive performance, approaching state-of-the-art standards. Our research demonstrates that incorporating class definitions and counting votes per class substantially improves LLM ensemble results. Overall, this contribution advances the frontiers of AI-powered legal text mining, proposing small models and initial prompt engineering to low-resource conditions that are scalable for broader representation.
Loading