Thrust: Adaptively Propels Large Language Models with External Knowledge

Published: 21 Sept 2023, Last Modified: 02 Nov 2023
Venue: NeurIPS 2023 (poster)
Keywords: knowledge-intensive natural language processing, pre-trained language models, instance-level adaptive knowledge usage
TL;DR: We propose a simple and effective metric, Thrust, for adaptive knowledge injection; it serves as a good indicator of a model's knowledgeability and improves the cost-efficiency of utilizing external knowledge.
Abstract: Although large-scale pre-trained language models (PTLMs) are shown to encode rich knowledge in their model parameters, the inherent knowledge in PTLMs can be opaque or static, making external knowledge necessary. However, existing information retrieval techniques can be costly and may even introduce noisy and sometimes misleading knowledge. To address these challenges, we propose instance-level adaptive propulsion of external knowledge (IAPEK), where we conduct retrieval only when necessary. To achieve this goal, we model whether a PTLM contains enough knowledge to solve an instance with a novel metric, Thrust, which leverages the representation distribution of a small number of seen instances. Extensive experiments demonstrate that Thrust is a good measure of models' instance-level knowledgeability. Moreover, using the Thrust score as the retrieval indicator achieves higher cost-efficiency than naive usage of external knowledge on 88% of the evaluated tasks, with a 26% average performance improvement. These findings shed light on the real-world practice of knowledge-enhanced LMs under a limited budget for knowledge seeking due to computation latency or cost.
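The abstract describes Thrust only at a high level: a score derived from the representation distribution of a small number of seen instances, used to decide per instance whether retrieval is worthwhile. The sketch below is a hedged illustration of that idea, not the paper's exact formulation; the function name thrust_score, the k-means clustering of class-wise representations, the inverse-square distance weighting, and the retrieval threshold are all assumptions introduced here for clarity.

```python
import numpy as np
from sklearn.cluster import KMeans


def thrust_score(query_emb: np.ndarray,
                 seen_embs: np.ndarray,
                 seen_labels: np.ndarray,
                 n_clusters: int = 3) -> float:
    """Thrust-style score (illustrative): cluster the hidden representations
    of seen instances per class, then sum distance-weighted unit vectors
    pointing from the query toward each cluster centroid. A larger norm
    suggests the model's representation space is decisive about the query
    (more internal knowledge); a smaller norm suggests retrieval may help.
    """
    force = np.zeros_like(query_emb, dtype=float)
    n_groups = 0
    for label in np.unique(seen_labels):
        class_embs = seen_embs[seen_labels == label]
        k = min(n_clusters, len(class_embs))
        km = KMeans(n_clusters=k, n_init=10).fit(class_embs)
        sizes = np.bincount(km.labels_, minlength=k)
        for centroid, size in zip(km.cluster_centers_, sizes):
            diff = centroid - query_emb
            dist = np.linalg.norm(diff) + 1e-8  # guard against zero distance
            # Larger and closer clusters pull harder (inverse-square weighting,
            # an assumed design choice for this sketch).
            force += (size / dist ** 2) * (diff / dist)
            n_groups += 1
    return float(np.linalg.norm(force) / max(n_groups, 1))


# Illustrative usage of instance-level adaptive retrieval: query the PTLM
# directly when the score is high, and retrieve external knowledge only
# when it falls below a threshold tuned on held-out data (value is made up).
# if thrust_score(q_emb, X_seen, y_seen) < 0.5:
#     context = retrieve_external_knowledge(question)
```

The design intent, per the abstract, is that this check replaces always-on retrieval: most instances skip the costly retrieval step, and only the low-score ones pay for it.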
Supplementary Material: zip
Submission Number: 3136