Revisiting Instruction Fine-tuned Model Evaluation to Guide Industrial Applications

Published: 07 Oct 2023, Last Modified: 01 Dec 2023 · EMNLP 2023 Main
Submission Type: Regular Short Paper
Submission Track: Language Modeling and Analysis of Language Models
Submission Track 2: Resources and Evaluation
Keywords: Instruction Finetuning, Evaluation Metrics, Large Language Models
TL;DR: We show that LLM-based metrics better fit the evaluation requirements introduced by IFT models, and quantify the trade-offs that emerge in practical industrial settings.
Abstract: Instruction Fine-Tuning (IFT) is a powerful paradigm that strengthens the zero-shot capabilities of Large Language Models (LLMs), but in doing so induces new evaluation metric requirements. We show LLM-based metrics to be well adapted to these requirements, and leverage them to conduct an investigation of task-specialization strategies, quantifying the trade-offs that emerge in practical industrial settings. Our findings offer practitioners actionable insights for real-world IFT model deployment.
Submission Number: 3163