Automating Enterprise Data Engineering with LLMs

Published: 10 Oct 2024, Last Modified: 30 Oct 2024, TRL @ NeurIPS 2024 Poster, CC BY 4.0
Keywords: LLM, data engineering, enterprise, entity matching
TL;DR: We show the challenges of using LLMs to solve data engineering tasks in enterprise scenarios and perform a case study on the task of entity matching.
Abstract: The automation of data engineering tasks is invaluable for enterprises to increase efficiency and reduce the manual effort associated with handling large amounts of data. Large Language Models (LLMs) have recently shown promising results in enabling this automation. However, data engineering tasks in real-world enterprise scenarios are often more complex than their typical formulations in the scientific community. In this paper, we study the challenges that arise when automating real-world enterprise data engineering tasks with LLMs. As part of the paper, we perform a case study on the task of matching incoming payments to open invoices, an instance of the entity matching problem. We also release a hand-crafted dataset based on the actual enterprise scenario to enable the research community to study the complexity of such enterprise tasks.
Submission Number: 47