Analyzing the factual knowledge of parameter-efficient instruction-tuned mid-size Large Language Models
Keywords: Large Language Models, Parameter Efficient Instruction Tuning, Factual Knowledge
TL;DR: This paper analyzes the factual knowledge of LLMs when instruction tuned with the LoRA technique to complete Wikipedia triplets.
Abstract: Large Language Models (LLMs) have significantly improved Natural Language Processing (NLP) by enhancing the accuracy, efficiency, and versatility of various NLP applications, from text generation to language translation, due to their ability to capture and leverage vast amounts of linguistic and factual knowledge. While LLMs have pushed the boundaries, they typically need to be further instruction tuned to achieve improved performance on niche applications. In this paper, we focus on analyzing the factual knowledge of LLMs, keeping in mind the practical aspects of their use, by: 1) training only a small injection model (having ≈ 0.05% of the parameters of the base LLM) using the Low-Rank Adaptation (LoRA) parameter-efficient technique, and 2) restricting our study to Llama-2-13b-chat and StableBeluga-13B, two mid-size LLMs with 13 billion parameters each, based on the Llama 2 architecture. The injection model is instruction tuned for Knowledge Base (KB) construction on the LM-KBC 2023 challenge dataset, which contains subject-relation-object triplets of Wikipedia entities across 21 different factual relations. Our empirical analysis shows that even after instruction tuning, the LLMs are: 1) deficient in foundational knowledge of many must-know areas like Geography, 2) unable to effectively use the context supplied in the prompt, and 3) fragile to subtle changes in the prompt at inference. The source code for our experiments can be found at: https://github.com/Ffc1234/NIPS_ICBINB_submission
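
A minimal sketch of the setup described above, assuming the Hugging Face `transformers` and `peft` libraries; the LoRA rank, target modules, and triplet-style prompt format are illustrative assumptions rather than the authors' exact configuration.

```python
# Sketch: attach a small LoRA "injection model" to a 13B base LLM so that only
# the low-rank adapter weights (a tiny fraction of the base parameters) are trained.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "meta-llama/Llama-2-13b-chat-hf"  # or "stabilityai/StableBeluga-13B"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)

lora_config = LoraConfig(
    r=8,                                   # assumed rank
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],   # assumed attention projections
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports the small trainable fraction vs. the 13B base

# Example instruction-style prompt for completing a (subject, relation, ?) triplet,
# in the spirit of the LM-KBC 2023 task; the exact wording is hypothetical.
prompt = (
    "Complete the triplet.\n"
    "Subject: France\nRelation: CountryBordersCountry\nObjects:"
)
inputs = tokenizer(prompt, return_tensors="pt")
```

In this kind of setup, instruction tuning then proceeds with a standard causal-language-modeling objective over such prompt/completion pairs, updating only the adapter weights.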
Submission Number: 5