Keywords: Large Language Models, LLM, Deep Learning, NLP, Data Annotation, Data Markup, Text Classification, Incremental learning
TL;DR: The paper shows that LLMs can match or surpass human annotators in validating text classifier outputs, using either text-based or probability-based validation. With fine-tuning and RAG, LLMs offer faster, scalable, high-quality annotation.
Abstract: Machine learning models for text classification are trained to predict a class for a given text. To do this, training and validation samples must be prepared: a set of texts is collected, and each text is assigned a class. These classes are usually assigned by human annotators with different expertise levels, depending on the specific classification task. Collecting such samples from scratch is labor-intensive because it requires finding specialists and compensating them for their work; moreover, the number of available specialists is limited, and their productivity is constrained by human factors. While collecting samples once may not be too resource-intensive, the ongoing need to retrain models (especially in incremental learning pipelines (He et al., 2021)) to address data drift (also called model drift (IBM, 2024)) makes data collection crucial and costly over the model’s entire lifecycle. This paper proposes several approaches that replace human annotators with Large Language Models (LLMs) to validate classifier predictions for correctness, helping ensure model quality and support high-quality incremental learning.
Track: ML track
Submission Number: 7
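The submission itself contains no code; the following is only a minimal sketch of the text-based validation idea described in the abstract, in which an LLM judges whether a classifier's predicted label is correct. The prompt wording, the model name, and the llm_validate helper are illustrative assumptions, not the paper's actual implementation; the sketch assumes the openai Python client (>= 1.0) and an API key in the environment.

```python
# Hypothetical sketch: use an LLM as a judge of classifier predictions.
# Confirmed predictions can feed the next incremental-learning iteration;
# rejected ones can be routed to human annotators instead.
from openai import OpenAI

client = OpenAI()

def llm_validate(text: str, predicted_label: str, label_set: list[str]) -> bool:
    """Return True if the LLM judges the predicted label to be correct."""
    prompt = (
        f"Possible classes: {', '.join(label_set)}.\n"
        f"Text: {text}\n"
        f"A classifier assigned the class: {predicted_label}.\n"
        "Answer with exactly YES if the class is correct, otherwise NO."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",   # placeholder; any instruction-tuned LLM could be used
        messages=[{"role": "user", "content": prompt}],
        temperature=0,         # deterministic judgment
    )
    return resp.choices[0].message.content.strip().upper().startswith("YES")

if __name__ == "__main__":
    ok = llm_validate(
        "The battery drains within two hours of normal use.",
        predicted_label="product complaint",
        label_set=["product complaint", "feature request", "praise"],
    )
    print("keep for retraining" if ok else "send to human annotator")
```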