Speculative Behavior: An Approach to Large Language Model Evaluation and Optimization

NeurIPS 2023 Workshop ATTRIB Submission26 Authors

Published: 27 Oct 2023, Last Modified: 08 Dec 2023ATTRIB PosterEveryoneRevisionsBibTeX
Keywords: Large Language Model (LLM), Evaluation, Optimization, Model Behavior, Speculative Behavior, Behavioral Science, Black-box Services, AutoML
TL;DR: The paper presents an idea for measuring the behavior of Large Language Models (LLMs) and proposes the concept of Speculative Behavior Equivalence (SBE) and Speculative Behavior-Based Optimization problem (CSBO) for optimizing LLM behavior.
Abstract: Trained Large Language Models (LLMs) have gained significant interest due to their ability to interpret natural language instructions and address a wide range of tasks with high proficiency. However, in practice, these models pose multiple challenges. On one hand, it is exceedingly difficult to control and ensure that the model's behavior remains consistent, harmless, and safe. On the other hand, the most advanced models are delivered via APIs as black-box services, making it challenging to guarantee their proper behavior. Addressing these challenges has become an urgent concern, especially in environments where a model's response can impact safety and trustworthiness. Many recent studies focus on the evaluation of models using benchmarks based on community-curated datasets. However, this form of evaluation is prone to data leakage and premature dataset obsolescence. Moreover, it doesn't necessarily align with all the specific goals that may be desired. One alternative for aligning specific objectives with the model behavior is fine-tuning, but this process is time-consuming and might be prohibitively expensive for many organizations. In this study, we propose the idea of measuring the model's behavior towards specific objectives through the concept of Speculative Behavior Equivalence (SBE). We introduce a general, agnostic approach that can be adapted to various models and tailored to the unique metrics of individual cases whilst remaining constrained to specific budgets. Additionally, we formulate the Speculative Behavior-Based Optimization problem (CSBO), which presents an opportunity to leverage AutoML techniques in the field of LLMs for optimizing behavior.
Submission Number: 26