Self-Diagnosing Neural Networks: A Causal Framework for Real-Time Anomaly Detection in Training Dynamics
Keywords: Training dynamics, Real-time anomaly detection, Self-supervised learning, Masked autoencoders, Spatiotemporal modeling, Open-set recognition, Uncertainty quantification
TL;DR: Activations and gradients are recast as a spatiotemporal video, enabling a causal, open-set diagnostician that detects and classifies training failures in real time, decisively outperforming scalar-curve baselines.
Abstract: Monitoring neural network training via scalar curves often obscures early, subtle indicators of failure within the high-dimensional, nonconvex optimization process. This paper presents a framework that reconceptualizes neural network training as a high-dimensional spatiotemporal signal. By employing masked autoencoding on internal activations and gradients, a vision-based diagnostician is pretrained to perform open-set classification of training failures in real time, adhering to strict causal constraints. This approach achieves earlier and more reliable detection than conventional scalar-curve or generic video-based baselines across a diverse range of unseen models, datasets, and optimizers. Concretely, synchronized sequences of layer activations and gradients are rendered into internal-state videos. A Dynamics Masked Autoencoder (Dynamics-MAE) learns domain-specific representations of these dynamics, and a Temporal Vision Diagnostician (TeViD) equipped with an evidential learning head maps these video clips to a taxonomy of actionable diagnostic labels (e.g., overfitting, instability, catastrophic forgetting, concept bias). The model abstains from prediction under significant distribution shift by classifying inputs as Unknown. The evaluation protocol is tailored to practical monitoring, emphasizing time-to-detect, event-time area under the precision-recall curve, and risk-coverage analysis, complemented by a decision-theoretic utility measure. On over 500 held-out training runs spanning unseen architectures, datasets, and optimizers (including anomaly types withheld during training), the proposed method attains an event-time area under the precision-recall curve of 0.96 ± 0.01 and triggers alerts a median of 6.2 epochs earlier than rule-based systems at a fixed 5% false-alarm rate.
These results suggest a new class of Machine Learning Operations (MLOps) tools capable of perceiving the training process through its internal dynamics, paving the way for self-diagnosing and ultimately self-healing training systems.
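To make the abstention mechanism concrete, here is a minimal sketch (an illustrative assumption, not the authors' implementation) of an evidential classification head of the kind the abstract describes: logits are mapped through a softplus to non-negative evidence, which parameterizes a Dirichlet distribution with α = evidence + 1; the vacuity u = K / Σα measures total uncertainty, and inputs whose vacuity exceeds a threshold are labeled Unknown rather than assigned a diagnostic class. The label taxonomy and threshold below are placeholders.

```python
import numpy as np

# Placeholder taxonomy standing in for the paper's diagnostic labels.
LABELS = ["overfitting", "instability", "catastrophic_forgetting", "concept_bias"]

def evidential_predict(logits, unknown_threshold=0.5):
    """Map raw logits to a diagnostic label or 'Unknown'.

    Evidence e = softplus(logits) parameterizes a Dirichlet with
    alpha = e + 1. Vacuity u = K / sum(alpha) is the total
    uncertainty; high-vacuity inputs are abstained on.
    """
    e = np.logaddexp(0.0, np.asarray(logits, dtype=float))  # softplus evidence
    alpha = e + 1.0
    k = alpha.size
    u = k / alpha.sum()           # vacuity: 1 when no evidence, -> 0 as evidence grows
    probs = alpha / alpha.sum()   # expected class probabilities under the Dirichlet
    if u > unknown_threshold:
        return "Unknown", u
    return LABELS[int(np.argmax(probs))], u

# Strong evidence for one class -> a concrete diagnostic label.
print(evidential_predict([8.0, -2.0, -2.0, -2.0])[0])   # overfitting
# Near-zero evidence everywhere -> abstain.
print(evidential_predict([-5.0, -5.0, -5.0, -5.0])[0])  # Unknown
```

The design choice here is that abstention is driven by a single scalar (vacuity) derived from the same Dirichlet that produces the class probabilities, so the "Unknown" decision and the classification share one calibrated head rather than requiring a separate out-of-distribution detector.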
Primary Area: learning on time series and dynamical systems
Submission Number: 18440