KnowDomain: Self Knowledge Generative Prompting for Large Language Models in Zero-Shot Domain-Specific QA
Abstract: In recent years, Large Language Models (LLMs) have exhibited remarkable proficiency in comprehending and generating language.
Consequently, LLMs have become an integral part of AI system building. However, for domain-specific QA (DSQA), direct prompting does not fully leverage the capabilities of LLMs, especially in zero-shot settings, due to the scarcity of annotated data and the unavailability of tailored retrieval data. To address this gap, we propose a self-knowledge generative prompting technique for DSQA that uses the LLM itself to generate the knowledge needed for accurate responses, without relying on external data. We evaluated our method on LLMs ranging from 3.8B to 70B parameters and observed consistent improvements, with accuracy gains of 4% to 40% over the base models. Compared to the best-performing baselines, our approach achieved an average improvement of 6.3%. Additionally, we observed a cumulative accuracy gain of 177 points across 20 diverse model–dataset combinations, highlighting the method’s robustness. While improvements were generally consistent, performance showed sensitivity to specific task–model interactions. With this work, we present a lightweight, domain-agnostic strategy that enables robust model adaptation with minimal effort and strong empirical gains.
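The abstract describes a two-stage idea: the model first generates its own background knowledge for a question, then answers conditioned on that knowledge. A minimal sketch of such a pipeline is below; it is not the paper's exact prompts, and `call_llm` is a hypothetical stand-in for any chat-completion API.

```python
# Hedged sketch of self-knowledge generative prompting (two LLM calls).
# `call_llm` is an assumed callable: prompt string in, completion string out.

def generate_knowledge(call_llm, question: str) -> str:
    """Stage 1: elicit domain knowledge the model already holds."""
    prompt = (
        "Generate concise domain knowledge that would help answer the "
        f"following question. Do not answer it yet.\nQuestion: {question}"
    )
    return call_llm(prompt)


def answer_with_knowledge(call_llm, question: str, knowledge: str) -> str:
    """Stage 2: answer the question conditioned on the generated knowledge."""
    prompt = (
        f"Knowledge: {knowledge}\n"
        f"Question: {question}\n"
        "Using the knowledge above, give the best answer."
    )
    return call_llm(prompt)


def self_knowledge_qa(call_llm, question: str) -> str:
    """Chain the two stages: generate knowledge, then answer with it."""
    knowledge = generate_knowledge(call_llm, question)
    return answer_with_knowledge(call_llm, question, knowledge)
```

Because both stages use the same model, no external retrieval corpus or annotated examples are required, which matches the zero-shot setting the abstract targets.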
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: LLMs, Domain-specific QA, Domain adaptability, Knowledge generation, Zero-shot
Contribution Types: Approaches to low-resource settings
Languages Studied: English
Previous URL: https://openreview.net/forum?id=JDEB3K1HCM
Explanation Of Revisions PDF: pdf
Reassignment Request Area Chair: No, I want the same area chair from our previous submission (subject to their availability).
Reassignment Request Reviewers: Yes, I want a different set of reviewers
Justification For Not Keeping Action Editor Or Reviewers: We observed that one of the reviewers reviewed the submission without checking the data we provided.
Software: zip
Data: zip
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: N/A
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: N/A
B1 Elaboration: The dataset was created jointly by all the authors; hence, we have not listed specific creators.
B2 Discuss The License For Artifacts: N/A
B3 Artifact Use Consistent With Intended Use: N/A
B4 Data Contains Personally Identifying Info Or Offensive Content: N/A
B5 Documentation Of Artifacts: Yes
B5 Elaboration: Appendix A
B6 Statistics For Data: Yes
B6 Elaboration: Appendix A
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: Section 4.4
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: Section 4.4
C3 Descriptive Statistics: Yes
C3 Elaboration: Section 4.4
C4 Parameters For Packages: Yes
C4 Elaboration: NLTK is used, and this is noted in the code.
D Human Subjects Including Annotators: No
D1 Instructions Given To Participants: N/A
D2 Recruitment And Payment: N/A
D3 Data Consent: N/A
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: N/A
E Ai Assistants In Research Or Writing: Yes
E1 Information About Use Of Ai Assistants: No
E1 Elaboration: AI assistants were used only for checking grammar in parts of Section 1 and Section 4; they were never used to generate new content.
Author Submission Checklist: yes
Submission Number: 1487