Integrating self-attention mechanisms in deep learning: A novel dual-head ensemble transformer with its application to bearing fault diagnosis

Qing Snyder, Qingtang Jiang, Erin E. Tripp

Published: 2025, Last Modified: 05 Nov 2024, Signal Process. 2025, CC BY-SA 4.0

Abstract: Highlights
• The paper proposes a novel dual-head ensemble Transformer (DHET) algorithm.
• The proposed DHET model employs a dual-input time-frequency architecture.
• The model integrates the encoder module of the 1D Transformer model and a Vision Transformer model.
• The DHET model combines different Transformer blocks specifically designed for 1D and 2D data input.
• The proposed DHET notably enhances the performance and capability of the model.
• The DHET model outperforms CNN-based methods, 1D Transformers, and Vision Transformers.
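
To make the dual-input idea concrete, here is a minimal sketch of how one vibration signal could be turned into a 1D token sequence and a 2D time-frequency map. It is not the authors' pipeline. The highlights do not name the time-frequency transform, so the Hann-windowed short-time DFT used here is an assumption, as are the function names (`frame_signal`, `magnitude_spectrum`, `dual_input`), the frame length, the hop size, and the synthetic test signal.

```python
import cmath
import math


def frame_signal(signal, frame_len, hop):
    """Split a 1D signal into overlapping frames (hypothetical tokens for a 1D branch)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Naive DFT magnitude of one Hann-windowed frame, non-negative frequencies only."""
    n = len(frame)
    windowed = [x * 0.5 * (1 - math.cos(2 * math.pi * k / (n - 1)))
                for k, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * f * k / n)
                    for k, x in enumerate(windowed)))
            for f in range(n // 2 + 1)]


def dual_input(signal, frame_len=16, hop=8):
    """Return (1D frames, 2D time-frequency map) built from the same signal.

    Assumption: the 2D input is a short-time DFT magnitude map; the source
    does not say which time-frequency representation is used.
    """
    tokens_1d = frame_signal(signal, frame_len, hop)
    image_2d = [magnitude_spectrum(frame) for frame in tokens_1d]
    return tokens_1d, image_2d


# Synthetic stand-in for a bearing vibration signal: a pure sinusoid.
signal = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
tokens_1d, image_2d = dual_input(signal)
```

In a model of the kind the highlights describe, `tokens_1d` would feed the 1D Transformer encoder and `image_2d` (after patching) would feed the Vision Transformer. How the two outputs are combined is not stated in the highlights and is left out of this sketch.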