LiveDrill: Multimodal Segment-Triggered Data-to-Text for Time Series Foundation Models

Soumyadipta Sengupta; Amine EL KHAIR; Sebastiaan Buiting; Imane Khaouja; Yahia Salaheldin Shaaban; Abdallah Benzine

Published: 23 Sept 2025, Last Modified: 06 Nov 2025 | BERT2S | CC BY 4.0

Keywords: Multimodal Text Generation, Time Series Segmentation, TSFM, SLM, Drilling, Oil and Gas

TL;DR: Generating drilling text reports from time series sensor data using multimodal text generation and segmentation models

Abstract: Time-series foundation models (TSFMs) show strong results on static benchmarks, but their potential in live industrial reporting is only beginning to be explored. In drilling, continuous multivariate sensor streams must be transformed into Daily Drilling Reports (DDRs), where each entry aligns with activity boundaries. Automating this process offers an opportunity to deliver reports that are both timely and consistent, reducing the burden of manual compilation. We present LiveDrill, a streaming pipeline for multimodal segment-grounded data-to-text generation. LiveDrill integrates two modules: a Live Segmentation Module that detects activity transitions in real time, and a Multimodal Text Generation Module that conditions report entries on both sensor signals and the detected segments. This design ties generated text explicitly to operational intervals, providing structured updates directly from live data. Evaluation on large-scale field data demonstrates that LiveDrill can reliably capture stable operations and generate coherent DDR entries. Segment-level metrics reveal the sensitivity of boundary detection, highlighting where further improvement can yield stronger results. Overall, LiveDrill demonstrates the feasibility of segment-grounded, multimodal reporting in industrial settings. It opens the door to adapting TSFMs beyond static benchmarks toward practical, boundary-sensitive applications where live sensor data must be translated into actionable narratives.

Submission Number: 12
