Improving Predictive Maintenance with the Health-Aware Transformer

Published: 15 Oct 2025, Last Modified: 31 Oct 2025
Venue: BNAIC/BeNeLearn 2025 (Oral)
License: CC BY 4.0
Track: Type A (Regular Papers)
Keywords: Artificial Intelligence, Predictive Maintenance, Remaining Useful Life (RUL), Prognostics and Health Management (PHM), Mahalanobis Distance, Health-Aware Transformer (HAT), Explainable AI (XAI), NASA C-MAPSS Dataset
Abstract: Accurate prediction of the remaining useful life (RUL) of industrial machinery is central to predictive maintenance. The best RUL prediction accuracy reported in the literature on the NASA C-MAPSS benchmark is an RMSE of 11.27, achieved by models such as the GCU-Transformer. However, these models act as black boxes with limited interpretability, which undermines trust in safety-critical applications. This study presents the Health-Aware Transformer (HAT), an extension of the Gated Convolutional Unit Transformer (GCU-Transformer) that improves prediction accuracy while introducing transparency. HAT integrates a statistical framework based on the Mahalanobis Distance (MD), which quantifies deviations from a multivariate Gaussian baseline fitted to healthy operation and serves as a clear indicator of health degradation. The MD guides the model's attention toward cycles that deviate significantly from the healthy baseline, linking each prediction to observable physical degradation. HAT achieves an RMSE of 10.95 and a safety rate of 38 safe predictions out of 100, outperforming existing models in accuracy, including BiLSTM-Attention (13.21), CTVAE (12.41), and the original GCU-Transformer (11.27). By embedding MD into the attention mechanism, HAT enhances predictive accuracy while slightly reducing the safety rate, reflecting the trade-off between precision and conservative early warnings. As a secondary analysis, a statistical ensemble of regressors based on MD trajectories alone, without the GCU-Transformer, achieves an RMSE of 15.51 and a safety rate of 53 safe predictions out of 100 engines. This interpretable model is well suited to safety-critical contexts that require conservative predictions. Overall, the study quantifies the gap between transparent statistical models and complex deep learning approaches: RMSE improves from 15.51 (statistical ensemble) to 11.27 (GCU-Transformer) and further to 10.95 (HAT), showing concretely how predictive accuracy increases with model complexity.
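To make the MD idea from the abstract concrete, the minimal sketch below fits a multivariate Gaussian baseline on healthy cycles, scores later cycles by their Mahalanobis Distance, and adds a hypothetical MD bias to attention logits. All function names, the sensor count, the ridge regularization, and the `alpha` bias weight are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def fit_healthy_baseline(healthy_cycles: np.ndarray):
    """Fit a multivariate Gaussian (mean, inverse covariance) on healthy cycles.

    healthy_cycles: (n_cycles, n_sensors) sensor readings from early life.
    """
    mu = healthy_cycles.mean(axis=0)
    cov = np.cov(healthy_cycles, rowvar=False)
    # Small ridge term for numerical stability before inversion (assumption).
    cov_inv = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))
    return mu, cov_inv

def mahalanobis_distance(cycles, mu, cov_inv):
    """Per-cycle MD from the healthy baseline: sqrt((x - mu)^T S^-1 (x - mu))."""
    diff = cycles - mu
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))

def md_biased_attention(logits, md, alpha=1.0):
    """Hypothetical MD-guided attention: bias each key position's logit by its
    MD so cycles far from the healthy baseline attract more attention."""
    biased = logits + alpha * md[None, :]           # broadcast bias over queries
    biased -= biased.max(axis=-1, keepdims=True)    # softmax stabilization
    weights = np.exp(biased)
    return weights / weights.sum(axis=-1, keepdims=True)

# Toy usage: MD grows as sensors drift away from the healthy regime.
rng = np.random.default_rng(0)
healthy = rng.normal(size=(200, 14))                # 14 C-MAPSS-like sensors
drifting = rng.normal(size=(50, 14)) + np.linspace(0, 3, 50)[:, None]
mu, cov_inv = fit_healthy_baseline(healthy)
md = mahalanobis_distance(drifting, mu, cov_inv)
attn = md_biased_attention(rng.normal(size=(50, 50)), md)
print(md[0], md[-1])   # later (more degraded) cycles show larger MD
```

In the actual HAT, the MD signal presumably enters the transformer's self-attention over the input window; `alpha` above is only a stand-in for however the paper weights that bias.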
Serve As Reviewer: ~Eric_O._Postma1
Submission Number: 66