Deep fusion of time series and visual data through temporal features: A soft-sensor model for FeO content in sintering process

Chong Yang, Chunjie Yang

Published: 2025, Last Modified: 15 May 2025. Expert Syst. Appl. 2025. License: CC BY-SA 4.0

Abstract: The ferrous oxide (FeO) content in finished sinter is a key indicator of the thermal reaction state and plays a pivotal role in quality control of the iron ore sintering process. In contrast to conventional chemical analysis methods, data-driven soft-sensing models provide real-time, cost-effective monitoring, making them well suited to modern industrial applications. However, the sintering process generates multi-source heterogeneous data, which makes efficient information extraction difficult. To address this challenge, we propose a dual-branch deep-fusion architecture that concurrently processes time series and image data, thereby maximizing the use of process information to improve soft-sensor accuracy. The architecture employs a lightweight vision Transformer to extract both instantaneous and dynamic visual features from image sequences; these features are then fused with the time series data at the feature level to ensure consistency in dynamic representations. This end-to-end framework reduces redundant feature engineering, especially in the extraction and fusion of multi-modal information. Evaluations on a real-world sintering process dataset demonstrate the robustness, flexibility, and efficiency of the proposed deep-fusion architecture. Compared with three baseline models, it also achieves higher soft-sensor accuracy, reducing root-mean-square error (RMSE) by 7.50% to 19.44% and increasing the coefficient of determination (R²) by 3.46% to 6.12%.
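
The two accuracy metrics quoted in the abstract can be made concrete with a small sketch. It assumes the reported percentages are relative changes with respect to each baseline, which the abstract does not state explicitly. The function names and the numbers below are hypothetical and chosen only for illustration; they are not data or code from the paper.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def relative_change(baseline, proposed):
    """Signed percentage change of `proposed` relative to `baseline`.

    Assumption: this is how the abstract's percentages are defined.
    """
    return 100.0 * (proposed - baseline) / baseline

# Made-up FeO values (%) for illustration only; not data from the paper.
y_true = [8.0, 9.0, 10.0, 11.0]
y_base = [8.5, 8.5, 10.5, 10.5]      # hypothetical baseline predictions
y_prop = [8.25, 8.75, 10.25, 10.75]  # hypothetical fused-model predictions

rmse_change = relative_change(rmse(y_true, y_base), rmse(y_true, y_prop))
r2_change = relative_change(r2(y_true, y_base), r2(y_true, y_prop))
print(f"RMSE change: {rmse_change:.2f}%  R2 change: {r2_change:.2f}%")
```

A negative RMSE change and a positive R² change both indicate an improvement over the baseline under this definition.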