On the Fault Tolerance of Self-Supervised Training in Convolutional Neural Networks

Published: 01 Jan 2024, Last Modified: 03 Nov 2024 · DDECS 2024 · CC BY-SA 4.0
Abstract: Deep neural networks (DNNs) are increasingly used in critical applications, from healthcare to autonomous driving. However, their predictions have been shown to degrade in the presence of transient hardware faults, leading to potentially catastrophic and unpredictable errors. Consequently, several techniques have been proposed to increase the fault tolerance of DNNs by modifying network structures and/or training procedures, thereby reducing the need for costly hardware redundancy. There are, however, design and training choices whose impact on fault propagation has been overlooked in the literature. In particular, self-supervised learning (SSL), as a pretraining technique, was shown to improve the robustness of the learned features, resulting in better performance on downstream tasks. This study investigates the fault tolerance of several SSL techniques on image classification benchmarks, including several related to Earth Observation. Experimental results suggest that SSL pretraining, alone or in combination with fault mitigation techniques, generally improves DNNs' fault tolerance, although the magnitude of the improvement varies across datasets and SSL techniques.
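To make the evaluation setting concrete, below is a minimal sketch of the kind of fault-injection campaign commonly used to assess DNN fault tolerance: flipping random bits in the network's weights to emulate transient hardware faults, then measuring the accuracy drop. This is an illustrative assumption, not the paper's actual experimental setup; PyTorch, the `inject_bit_flips` helper, and the `flip_rate` parameter are all hypothetical choices for the sketch.

```python
# Hedged sketch: single-bit-flip fault injection into float32 weights,
# a common fault model for transient hardware errors in DNN inference.
import torch
import torch.nn as nn


@torch.no_grad()
def inject_bit_flips(model: nn.Module, flip_rate: float = 1e-6) -> int:
    """Flip one random bit in a random subset of float32 weights, in place."""
    total_flips = 0
    for param in model.parameters():
        if param.dtype != torch.float32:
            continue
        bits = param.data.view(torch.int32)            # reinterpret raw bits
        hit = torch.rand_like(param.data) < flip_rate  # which weights to corrupt
        n = int(hit.sum())
        if n == 0:
            continue
        # Pick one bit position (0..31) per hit weight and XOR it in.
        pos = torch.randint(0, 32, (n,), dtype=torch.int32, device=param.device)
        masks = torch.ones(n, dtype=torch.int32, device=param.device) << pos
        bits[hit] ^= masks                             # XOR flips the chosen bit
        total_flips += n
    return total_flips
```

A fault-tolerance comparison of the kind the abstract describes would then run the same injection campaign on an SSL-pretrained model and on a purely supervised baseline, and compare the resulting accuracy degradation on the downstream classification task.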