Can LLMs Infer Domain Knowledge from Code Exemplars? A Preliminary Study

Published: 01 Jan 2024 · Last Modified: 29 Jul 2025 · IUI Companion 2024 · CC BY-SA 4.0
Abstract: As organizations recognize the potential of Large Language Models (LLMs), bespoke domain-specific solutions are emerging, which inherently face challenges of knowledge gaps and contextual accuracy. Prompt engineering techniques such as chain-of-thought and few-shot prompting have been proposed to enhance LLMs’ capabilities by dynamically presenting relevant exemplars. Can LLMs infer domain knowledge from code exemplars that involve similar domain concepts and analyze the data correctly? To investigate this, we curated a synthetic dataset of 45 tabular databases, each with domain concepts and definitions, natural language data analysis queries, and responses in the form of Python code, visualizations, and insights. Using this dataset, we conducted a within-subjects experiment to evaluate the effectiveness of domain-specific exemplars versus randomly selected, generic exemplars. Our study underscores the significance of tailored exemplars in enhancing LLMs’ accuracy and contextual understanding in domain-specific tasks, paving the way for more intuitive and effective data analysis solutions.
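
The abstract describes few-shot prompting with domain-specific exemplars over tabular databases. The sketch below is a minimal illustration, not the authors' released code, of how such a prompt could be assembled, assuming each exemplar pairs a natural language query with Python code and an insight, and that per-database concept definitions are available; all names and data structures here are hypothetical.

```python
# Hypothetical sketch of assembling a few-shot prompt from domain-specific
# exemplars (concept definitions + query/code/insight triples), as suggested
# by the paper's setup; not the authors' implementation.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Exemplar:
    query: str    # natural language data analysis query
    code: str     # Python code that answers the query
    insight: str  # textual insight derived from the result


def build_prompt(
    concepts: Dict[str, str],   # domain concept -> definition (assumed structure)
    exemplars: List[Exemplar],  # domain-specific or randomly selected generic exemplars
    user_query: str,
) -> str:
    """Concatenate domain definitions, few-shot exemplars, and the new query."""
    parts = ["Domain concepts:"]
    parts += [f"- {name}: {definition}" for name, definition in concepts.items()]
    for ex in exemplars:
        parts += [
            f"\nQuery: {ex.query}",
            f"Code:\n{ex.code}",
            f"Insight: {ex.insight}",
        ]
    parts.append(f"\nQuery: {user_query}\nCode:")
    return "\n".join(parts)


if __name__ == "__main__":
    # Illustrative domain concept and exemplar (fabricated for demonstration only).
    concepts = {"churn rate": "share of customers who cancel within a billing period"}
    exemplars = [
        Exemplar(
            query="What is the monthly churn rate?",
            code="df.groupby('month')['churned'].mean()",
            insight="Churn is highest in the first month after sign-up.",
        )
    ]
    print(build_prompt(concepts, exemplars, "Which segment has the highest churn rate?"))
```

In the paper's within-subjects comparison, the same prompt scaffold would be filled either with exemplars drawn from the query's own domain or with randomly selected generic ones.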