Abstract: MoE models and other complex routing methods have garnered considerable attention, but training such models is costly and difficult to distribute. This paper shows that simple hard routing holds substantial promise for tool-use QA, enabling more distributed and decentralised training, with potential applications elsewhere. The task follows a proposed pipeline that separates the ReAct stages into separately trained models: Planner, Caller and Summariser. Previous approaches have largely relied on zero-shot comprehension of specific APIs from the provided documentation. The zero-shot abilities of small models are limited, however, and studies show that failures most commonly occur at the Caller stage of the pipeline. This study therefore moves away from zero-shot assumptions, adopting a hard-routing strategy that uses expert adapters for each category of APIs. Experiments show that this pipeline allows a 7-billion-parameter model to beat much larger, modern, closed-source models used zero-shot on this task.
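The hard-routing idea in the abstract can be sketched minimally as follows. All names here (`API_CATEGORIES`, `EXPERT_ADAPTERS`, `route`) are hypothetical illustrations, not the paper's actual implementation; the real pipeline routes to separately trained expert adapters for the Caller stage.

```python
# Hypothetical sketch of hard routing: pick exactly one expert adapter
# per query based on its API category (no soft mixture of experts).
# Category keywords and adapter names are illustrative assumptions.

API_CATEGORIES = {
    "weather": ["forecast", "temperature", "rain"],
    "finance": ["stock", "price", "exchange"],
}

# One expert adapter per API category, e.g. a LoRA checkpoint name.
EXPERT_ADAPTERS = {cat: f"caller-adapter-{cat}" for cat in API_CATEGORIES}

def route(query: str) -> str:
    """Hard routing: score each category by keyword overlap with the
    query and return the single best-matching expert adapter."""
    scores = {
        cat: sum(kw in query.lower() for kw in kws)
        for cat, kws in API_CATEGORIES.items()
    }
    best = max(scores, key=scores.get)
    return EXPERT_ADAPTERS[best]
```

Because the router commits to one adapter per query, each adapter can be trained independently on its own API category, which is what makes the training distributable.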
Paper Type: Short
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: open-domain QA, parameter-efficient training, retrieval-augmented generation, NLP in resource-constrained settings, few-shot QA, conversational QA
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Approaches to low-compute settings (efficiency)
Languages Studied: English
Submission Number: 2544