Abstract: We present FC-ICL, a retrieval-reranking framework that enhances large language models' (LLMs) function-calling capabilities through optimized in-context demonstration selection. Addressing the critical challenge of semantic-contextual alignment in tool-invocation tasks, our method combines efficient vector retrieval with task-specialized BERT reranking. The framework introduces three innovations: (1) a dynamic margin pairwise loss that aligns demonstration relevance with downstream tool-calling utility, (2) hybrid retrieval pipelines balancing lexical precision and semantic recall, and (3) reasoning-enhanced prompting templates enforcing structured decision logging. Evaluations across six variants of the Qwen2.5 model family demonstrate state-of-the-art performance, achieving 0.900 fine-grained accuracy in tool argument extraction (+18% over BM25 baselines). Ablation studies reveal a 12.4% to 37% error reduction in parameter-type matching and 15% higher utility scores than single-stage retrieval approaches. Notably, Qwen3-8B with FC-ICL achieves 0.8973 fine-grained accuracy, surpassing zero-shot baselines by 60% in complex API-invocation scenarios.
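The abstract's first innovation, the dynamic margin pairwise loss, can be illustrated with a minimal sketch: a hinge-style pairwise reranking objective whose margin widens as the downstream tool-calling utility gap between the positive and negative demonstration grows. The function name, the `base_margin`/`scale` hyperparameters, and the assumption that per-demonstration utility scores are available as tensors are all hypothetical; the paper's exact formulation may differ.

```python
import torch

def dynamic_margin_pairwise_loss(pos_scores: torch.Tensor,
                                 neg_scores: torch.Tensor,
                                 pos_utility: torch.Tensor,
                                 neg_utility: torch.Tensor,
                                 base_margin: float = 0.1,
                                 scale: float = 1.0) -> torch.Tensor:
    """Hypothetical sketch of a dynamic-margin pairwise reranking loss.

    pos_scores / neg_scores: reranker scores for positive and negative
    demonstrations; pos_utility / neg_utility: their measured downstream
    tool-calling utility (e.g., argument-extraction accuracy).
    """
    # The margin grows with the utility gap, so demonstration pairs whose
    # downstream tool-calling outcomes differ more are pushed further apart.
    margin = base_margin + scale * (pos_utility - neg_utility).clamp(min=0.0)
    # Standard hinge: penalize pairs where the positive does not beat the
    # negative by at least the dynamic margin.
    return torch.relu(margin - (pos_scores - neg_scores)).mean()

# Example usage with toy scores for two demonstration pairs.
loss = dynamic_margin_pairwise_loss(
    pos_scores=torch.tensor([0.80, 0.65]),
    neg_scores=torch.tensor([0.60, 0.70]),
    pos_utility=torch.tensor([0.90, 0.75]),
    neg_utility=torch.tensor([0.40, 0.70]),
)
print(loss)  # scalar loss suitable for backprop through the reranker
```

Compared with a fixed-margin ranking loss, this scheme spends most of its gradient signal on pairs whose utility gap is large, which is one plausible way to "align demonstration relevance with downstream tool-calling utility" as the abstract describes.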
Paper Type: Long
Research Area: Dialogue and Interactive Systems
Research Area Keywords: Dialogue and Interactive Systems, Language Modeling
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: English
Submission Number: 8028