ALERTS: Active Learning and Ensemble LLM Real-Time Switch for Real-World Data Drift Challenges

ACL ARR 2024 June Submission 5812 Authors

16 Jun 2024 (modified: 04 Aug 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: In rapidly changing real-world scenarios, data drift and ``cold-start'' issues pose significant challenges for developing machine learning models, compounded by the high cost and scarcity of domain experts. Traditional compact models fine-tuned on a small number of domain-specific examples often outperform generic LLMs, yet these fine-tuned models struggle with rapid data changes. This study introduces ALERTS, an ensemble system designed to address these data challenges. The system comprises 1) an LLM to enhance early-stage performance and adapt to sudden data drift, 2) an Active Learning (AL)-assisted compact model iteratively fine-tuned on annotations from daily human expert workflows, and 3) a switch mechanism that evaluates both models in real time and selects the better-performing one. We conducted empirical studies comparing the performance of LLMs and AL-assisted compact models, then evaluated our system's effectiveness through AL simulations of real-world scenarios. Our work offers a novel framework for developing robust language model systems across various dynamic real-world scenarios.
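To make the switching idea concrete, below is a minimal sketch of how a real-time switch between a generic LLM and an AL-fine-tuned compact model could operate, assuming both models are exposed as simple predict functions and that expert annotations arrive incrementally. All names (`ModelSwitch`, `llm_predict`, `compact_predict`, `eval_buffer`) are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch of the real-time switch described in the abstract;
# the interfaces and names below are assumptions for illustration only.
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class ModelSwitch:
    """Routes incoming examples to whichever model currently scores higher
    on a small buffer of recently annotated examples."""
    llm_predict: Callable[[str], str]       # generic LLM (zero-/few-shot)
    compact_predict: Callable[[str], str]   # AL-fine-tuned compact model
    eval_buffer: List[Tuple[str, str]] = field(default_factory=list)  # (text, gold label)
    buffer_size: int = 100

    def add_annotation(self, text: str, gold: str) -> None:
        # Annotations come from the daily expert workflow; keeping only the
        # most recent ones lets the switch track data drift.
        self.eval_buffer.append((text, gold))
        self.eval_buffer = self.eval_buffer[-self.buffer_size:]

    def _accuracy(self, predict: Callable[[str], str]) -> float:
        if not self.eval_buffer:
            return 0.0
        hits = sum(predict(x) == y for x, y in self.eval_buffer)
        return hits / len(self.eval_buffer)

    def predict(self, text: str) -> str:
        # With an empty buffer the LLM is used by default, easing cold start;
        # once annotations accumulate, the compact model can take over.
        use_compact = self._accuracy(self.compact_predict) > self._accuracy(self.llm_predict)
        return self.compact_predict(text) if use_compact else self.llm_predict(text)
```

In this sketch the switch defaults to the LLM in the cold-start phase and hands over to the compact model only once it demonstrably outperforms the LLM on recent expert-labeled data, mirroring the three components the abstract describes.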
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: active learning, benchmarking, automatic evaluation
Contribution Types: NLP engineering experiment, Approaches to low-resource settings
Languages Studied: English
Submission Number: 5812