IndustryGPT: A Large Language Model for Industrial Domain-Specific Question Answering

15 Sept 2025 (modified: 12 Nov 2025) | ICLR 2026 Conference Withdrawn Submission | Readers: Everyone | License: CC BY 4.0
Keywords: LLM, Industry
Abstract: Large Language Models (LLMs) are rapidly being adapted for high-stakes professional domains, yet they often fall short in the industrial sector, which demands high-fidelity, domain-specific knowledge, complex reasoning, and precision. Standard fine-tuning approaches are often insufficient. In this work, we identify a critical paradox in domain-specific supervised fine-tuning (SFT): enriching training data with detailed explanations or Chains-of-Thought can counter-intuitively degrade a model's factual accuracy, revealing a fundamental conflict between learning to be correct and learning to be verbose. To resolve this, we propose a novel two-stage fine-tuning framework that first anchors the model's core knowledge using direct question-answer pairs, and only then cultivates its advanced reasoning and explanatory abilities. To rigorously evaluate our method and establish a much-needed standard for the field, we introduce the Industry-QA Benchmark, a comprehensive dataset of over 10,000 questions spanning numerous industrial disciplines, which we will release to the community. We supplement this with a curated industrial subset from SuperGPQA to ensure robust and generalizable evaluation. Our resulting model, IndustryGPT, an LLM specifically optimized for industrial question answering, demonstrates state-of-the-art performance, significantly outperforming strong proprietary and open-source models on our benchmarks. Crucially, it achieves this specialized expertise without any degradation of its general capabilities. This work presents not only a superior model for the industrial domain but also a principled training methodology that resolves a key challenge in developing specialized AI.
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 6234
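
The two-stage ordering described in the abstract (direct question-answer pairs first, explanations afterwards) can be illustrated with a minimal data-preparation sketch. This is not the authors' implementation: the `Example` type, the function names, the stage labels, and the "rationale, then answer" target format are all hypothetical and chosen only for illustration, and the actual fine-tuning step is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Example:
    """One training item (hypothetical schema, not from the paper)."""
    question: str
    answer: str
    rationale: Optional[str] = None  # explanation or chain-of-thought, if available


def format_direct(ex: Example) -> dict:
    # Stage 1 target: the bare answer only, with no explanation attached.
    return {"prompt": ex.question, "target": ex.answer}


def format_reasoned(ex: Example) -> dict:
    # Stage 2 target: rationale followed by the answer (assumed format).
    return {"prompt": ex.question, "target": f"{ex.rationale}\nAnswer: {ex.answer}"}


def build_two_stage_schedule(examples: list) -> list:
    """Return the training sets in the order they would be used.

    Assumption: stage 1 uses every example as a direct QA pair, and
    stage 2 uses only the examples that carry a rationale.
    """
    stage1 = [format_direct(ex) for ex in examples]
    stage2 = [format_reasoned(ex) for ex in examples if ex.rationale]
    return [("stage1_direct_qa", stage1), ("stage2_reasoning", stage2)]


if __name__ == "__main__":
    data = [
        Example("What is the SI unit of pressure?", "pascal"),
        Example(
            "What is the SI unit of pressure?",
            "pascal",
            rationale="Pressure is force per unit area; one newton per square metre is one pascal.",
        ),
    ]
    for name, items in build_two_stage_schedule(data):
        print(name, len(items))
```

A training loop would then fine-tune on the first set before the second, so that the explanatory targets are seen only after the direct answers have been learned.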