Automated Knowledge Bank Construction for Business Intelligence LLMs

Published: 04 Jul 2025, Last Modified: 04 Aug 2025
Venue: KDD 2025 Workshop SKnow-LLM Poster
License: CC BY 4.0
Keywords: Generative AI, Business Intelligence, Retrieval-Augmented Generation, Data Discovery, Natural Language Processing
TL;DR: A system that automatically extracts SQL knowledge from existing dashboards to teach LLMs organization-specific data practices, enabling natural language analytics without specialized engineering expertise.
Abstract: This paper presents an approach to automatically building knowledge banks for Generative Business Intelligence (GenBI) systems, enabling natural language interfaces to organizational data without specialized engineering expertise. We show how dashboard definitions can be transformed into knowledge repositories that bridge the semantic gap between Large Language Models (LLMs) and organization-specific data contexts. Our methodology extracts SQL from dashboards, generates AI-powered data dictionaries, and indexes business terminology to teach LLMs "your SQL, not just SQL." Implemented for AWS Marketing, the system treats dashboards as "proven recipes" that contain both the technical implementation and the business context, so the knowledge bank stays aligned with actual practice without manual documentation. Because dashboards are crystallized business intelligence (validated queries enriched with business terminology), the approach scales while preserving a nuanced understanding of metric calculations and business definitions. In validation, dashboard-to-SQL extraction succeeded for 94% of dashboard components. In evaluation, the system reached 83% accuracy on unseen questions, rising to 97% for metrics directly visualized in the source dashboards. These results indicate that automated knowledge extraction can power natural language analytics while keeping business context intact.
Submission Number: 13
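
The pipeline summarized in the abstract (extract SQL from dashboards, then index business terminology so a question can be matched to a validated query) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the dashboard format, the field names (`widgets`, `title`, `sql`), the sample data, and the helper functions are hypothetical, and a plain keyword index stands in for whatever retrieval method the authors actually use.

```python
import re
from collections import defaultdict

# Hypothetical dashboard definition. The abstract does not specify the real
# format; the "widgets", "title", and "sql" fields are assumed for illustration.
DASHBOARD = {
    "title": "Weekly Pipeline",
    "widgets": [
        {
            "title": "Qualified Leads by Region",
            "sql": "SELECT region, COUNT(*) AS qualified_leads "
                   "FROM leads WHERE status = 'MQL' GROUP BY region",
        },
        {"title": "Notes", "text": "Owner: marketing ops"},  # no SQL, skipped
    ],
}


def extract_recipes(dashboard):
    """Return one recipe (dashboard, title, SQL) per widget that carries a query."""
    return [
        {"dashboard": dashboard["title"], "title": w["title"], "sql": w["sql"].strip()}
        for w in dashboard.get("widgets", [])
        if w.get("sql")
    ]


def tokenize(text):
    """Lowercase a string and split it into word-like tokens."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def build_index(recipes):
    """Map each business term (title words and SQL column aliases) to recipe ids."""
    index = defaultdict(set)
    for i, recipe in enumerate(recipes):
        aliases = re.findall(r"\bAS\s+(\w+)", recipe["sql"], flags=re.IGNORECASE)
        terms = tokenize(recipe["title"]) | tokenize(" ".join(aliases))
        for term in terms:
            index[term].add(i)
    return index


def lookup(question, recipes, index):
    """Rank recipes by how many question terms they share with the index."""
    scores = defaultdict(int)
    for term in tokenize(question):
        for i in index.get(term, ()):
            scores[i] += 1
    ranked = sorted(scores, key=lambda i: (-scores[i], i))
    return [recipes[i] for i in ranked]


recipes = extract_recipes(DASHBOARD)
index = build_index(recipes)
matches = lookup("How many qualified leads per region?", recipes, index)
```

In a setup like this, the top-ranked recipe's SQL and title would be passed to the LLM as context for the question. The keyword overlap score is only a placeholder; a retrieval-augmented system would more plausibly use embedding similarity over the same extracted recipes.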