Abstract: In rapidly changing real-world scenarios, data drift and ``cold-start'' issues pose significant challenges for developing machine learning models, compounded by the high cost and scarcity of domain experts.
Compact models fine-tuned on a small number of domain-specific examples often outperform generic LLMs, yet these fine-tuned models struggle with rapid data changes.
This study introduces ALERTS, an ensemble system designed to address these data challenges. The system comprises 1) an LLM to enhance early-stage performance and adapt to sudden data drifts, 2) an Active Learning (AL)-assisted compact model iteratively fine-tuned on annotations from daily human expert workflows, and 3) a switch mechanism that evaluates both models in real time and selects the best-performing one.
We conducted empirical studies to compare the performance of LLMs and AL-assisted compact models, then evaluated our system's effectiveness through AL simulations of real-world scenarios.
Our work offers a novel framework for developing robust language model systems across various dynamic real-world scenarios.
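Below is a minimal sketch of the switch mechanism described in the abstract, assuming a hypothetical interface in which each model exposes a callable `predict` and recent expert-annotated examples are available for real-time evaluation; the names `accuracy`, `switch`, `llm`, and `compact_model` are illustrative and not taken from the paper.

```python
# Illustrative sketch (not the paper's implementation): route traffic to
# whichever model currently performs better on the latest annotated batch.
from typing import Callable, List, Tuple

Model = Callable[[str], str]          # hypothetical: text -> predicted label
LabeledBatch = List[Tuple[str, str]]  # (text, expert label) pairs

def accuracy(model: Model, batch: LabeledBatch) -> float:
    """Fraction of recent expert-labeled examples the model predicts correctly."""
    if not batch:
        return 0.0
    return sum(model(text) == label for text, label in batch) / len(batch)

def switch(llm: Model, compact_model: Model, recent_batch: LabeledBatch) -> Model:
    """Evaluate both models in real time and select the best-performing one."""
    llm_score = accuracy(llm, recent_batch)
    compact_score = accuracy(compact_model, recent_batch)
    # Prefer the AL-assisted compact model on ties, since it is cheaper to serve.
    return compact_model if compact_score >= llm_score else llm
```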
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: active learning, benchmarking, automatic evaluation
Contribution Types: NLP engineering experiment, Approaches to low-resource settings
Languages Studied: English
Submission Number: 5812