Abstract: Data wrangling, the process of preparing raw data for analysis through cleansing, transformation, and enrichment, is a critical
step in the data science pipeline. Its importance is amplified for time-series data, which underpins many applications, with forecasting being one of the most prominent tasks. Yet, current practices remain largely manual, time-consuming, and error-prone, limiting
productivity and scalability. In this paper, we introduce AutoDW-TS, an automated approach to time-series data wrangling powered by
Large Language Models (LLMs). Our method offers an end-to-end pipeline, automating key stages such as table merging, prediction
engineering, cleansing, imputation, and enrichment. To support diverse use cases, we developed multiple systems, including an interactive AutoDW-TS WebApp, Web APIs, and an AI agent. We share insights from developing and deploying these systems, along with results from an extensive evaluation across 38 time-series benchmarks. Our findings show that AutoDW-TS significantly improves forecasting performance, demonstrating its effectiveness and potential to transform time-series data preparation at scale.