Analyzing the factual knowledge of parameter efficient instruction tuned mid-size Large Language Models

NeurIPS 2023 Workshop ICBINB Submission 5 Authors

Published: 27 Oct 2023, Last Modified: 01 Dec 2023, ICBINB 2023
Keywords: Large Language Models, Parameter Efficient Instruction Tuning, Factual Knowledge
TL;DR: This paper analyzes the factual knowledge of LLMs that are instruction tuned with the LoRA technique to complete Wikipedia triplets.
Abstract: Large Language Models (LLMs) have significantly improved Natural Language Processing (NLP) by enhancing the accuracy, efficiency, and versatility of NLP applications, from text generation to language translation, owing to their ability to capture and leverage vast amounts of linguistic and factual knowledge. While LLMs have pushed the boundaries, they typically need further instruction tuning to perform well on niche applications. In this paper, we analyze the factual knowledge of LLMs with the practical aspects of using them in mind, by: 1) training only a small injection model (having ≈ 0.05% of the parameters of the base LLM) using the Low-Rank Adaptation (LoRA) parameter-efficient technique, and 2) restricting our study to Llama-2-13b-chat and StableBeluga-13B, two mid-size LLMs that have 13 billion parameters and are based on the Llama 2 architecture. The injection model is instruction tuned for Knowledge Base (KB) construction on the LM-KBC 2023 challenge dataset, which contains subject-relation-object triplets of Wikipedia entities across 21 different factual relations. Our empirical analysis shows that even after instruction tuning, the LLMs are: 1) deficient in foundational knowledge of many must-know areas like Geography, 2) unable to effectively use the context supplied in the prompt, and 3) fragile to subtle changes in the prompt at inference. The source code for our experiments can be found at: https://github.com/Ffc1234/NIPS_ICBINB_submission
Submission Number: 5
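
To give a rough sense of the "≈ 0.05% of the parameters" figure quoted in the abstract, the following is a minimal back-of-the-envelope sketch of how a LoRA adapter's size might be counted. The hidden size (5120) and layer count (40) are those of the 13B Llama 2 architecture. The rank (8) and the choice to adapt only the query and value projections are assumptions made here for illustration, not settings stated in the abstract, and the base size is taken as the nominal 13 billion parameters.

```python
# Back-of-the-envelope LoRA parameter count.
# RANK and ADAPTED_PER_LAYER are illustrative assumptions, not the paper's
# reported configuration.

def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) weight matrix.

    LoRA learns a low-rank update B @ A, with A of shape (rank, d_in)
    and B of shape (d_out, rank).
    """
    return rank * (d_in + d_out)

HIDDEN_SIZE = 5120            # hidden size of the 13B Llama 2 architecture
NUM_LAYERS = 40               # transformer layers in the 13B Llama 2 architecture
ADAPTED_PER_LAYER = 2         # assumption: query and value projections only
RANK = 8                      # assumption: LoRA rank
BASE_PARAMS = 13_000_000_000  # nominal "13 billion" from the abstract

adapter_params = (
    NUM_LAYERS
    * ADAPTED_PER_LAYER
    * lora_param_count(HIDDEN_SIZE, HIDDEN_SIZE, RANK)
)
fraction_percent = 100 * adapter_params / BASE_PARAMS

print(f"adapter parameters: {adapter_params:,}")
print(f"share of base model: {fraction_percent:.4f}%")
```

Under these assumptions the adapter has about 6.6 million trainable parameters, which is in line with the ≈ 0.05% quoted in the abstract. Other combinations of rank and target modules would give a different share.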