A modified vision transformer architecture with scratch learning capabilities for effective fire detection

Hikmat Yar, Zulfiqar Ahmad Khan, Tanveer Hussain, Sung Wook Baik

Published: 2024, Last Modified: 03 Nov 2025, Expert Syst. Appl. 2024, CC BY-SA 4.0

Abstract: Highlights
• We use a novel tokenization mechanism in ViT for effective fire scene classification.
• We employ a locality attention mechanism with a tuned three-dense-layer block of the multi-head unit in the MLP.
• We conduct extensive experimental evaluations on three benchmark datasets and one self-created dataset.
• The proposed model achieves state-of-the-art performance.

External IDs: dblp:journals/eswa/YarKHB24