"Call My Big Sibling (CMBS)'' – A Confidence-Based Strategy to Combine Small and Large Language Models for Cost-Effective Text Classification

ACL ARR 2024 December Submission1221 Authors

15 Dec 2024 (modified: 05 Feb 2025) · ACL ARR 2024 December Submission · CC BY 4.0
Abstract: Transformers have achieved cutting-edge results, with Large Language Models (LLMs) considered the state of the art (SOTA) in several NLP tasks. However, the literature has not yet fully demonstrated that LLMs are always superior to first-generation Transformers (a.k.a. Small Language Models (SLMs)) across all NLP tasks and scenarios. This study compares three SLMs (BERT, RoBERTa, and BART) with open LLMs (LLaMA 3.1, Mistral, Falcon) across 9 sentiment analysis and 4 topic classification datasets. The results indicate that open LLMs can moderately outperform or tie with SLMs on all tested datasets, though only when fine-tuned, and at a very high computational cost. Given this very high cost for only moderate effectiveness gains (3.1% on average), the applicability of these models in practical, cost-critical scenarios is questionable. In this context, we propose "Call My Big Sibling" (CMBS), a confidence-based strategy that smoothly combines calibrated SLMs with open LLMs based on prediction certainty. Documents with high (calibrated) confidence are classified by the cheaper SLM, while uncertain documents are directed to LLMs in zero-shot, in-context, or partially tuned versions. Experiments show that CMBS outperforms SLMs and is very competitive with fully tuned LLMs in terms of effectiveness, at a fraction of the latter's cost, offering a much better cost-effectiveness balance.
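
To make the routing idea described in the abstract concrete, the sketch below shows a minimal confidence-based dispatcher in Python. It is only an illustration under stated assumptions: the helper names (slm_predict_proba, llm_classify), the threshold tau, and the use of the maximum calibrated class probability as the confidence score are hypothetical choices, not the authors' exact implementation.

# Minimal sketch of a confidence-based SLM/LLM router in the spirit of CMBS.
# Assumptions (not from the paper): `slm_predict_proba` returns calibrated class
# probabilities from the cheap SLM, `llm_classify` is the expensive LLM fallback
# (zero-shot / in-context / partially tuned), and `tau` is the confidence cutoff.
from typing import Callable, List, Sequence

def cmbs_route(
    docs: Sequence[str],
    slm_predict_proba: Callable[[Sequence[str]], List[List[float]]],
    llm_classify: Callable[[str], int],
    tau: float = 0.9,
) -> List[int]:
    """Classify each document with the SLM when its calibrated confidence
    exceeds `tau`; otherwise defer to the more expensive LLM."""
    probs = slm_predict_proba(docs)
    labels: List[int] = []
    for doc, p in zip(docs, probs):
        confidence = max(p)              # max calibrated class probability
        if confidence >= tau:
            labels.append(p.index(confidence))   # cheap path: keep the SLM prediction
        else:
            labels.append(llm_classify(doc))     # uncertain path: call the "big sibling"
    return labels

In practice, tau controls the cost-effectiveness trade-off: higher values route more documents to the LLM (higher cost, potentially higher accuracy), while lower values keep more documents on the SLM.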
Paper Type: Long
Research Area: Machine Learning for NLP
Research Area Keywords: Machine Learning for NLP, Language Modeling
Contribution Types: Model analysis & interpretability, Data analysis
Languages Studied: English
Submission Number: 1221