A modified vision transformer architecture with scratch learning capabilities for effective fire detection

Published: 2024, Last Modified: 03 Nov 2025. Expert Syst. Appl. 2024. CC BY-SA 4.0
Abstract: Highlights
• We use a novel tokenization mechanism in ViT for effective fire scene classification.
• We employ a locality attention mechanism with a tuned block of three dense layers for the multi-head unit in the MLP.
• We conduct extensive experimental evaluations on three benchmark datasets and one self-created dataset.
• The proposed model achieves state-of-the-art performance.
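For context on the first highlight, the sketch below shows the standard ViT patch tokenization that a modified tokenizer would replace. It is not the paper's method, which the highlights do not describe. The helper name `patchify`, the single-channel list-of-rows image format, and the toy sizes are all assumptions made for illustration.

```python
from typing import List

Image = List[List[float]]  # assumed format: single-channel image as rows of pixels


def patchify(image: Image, patch: int) -> List[List[float]]:
    """Split an H x W image into non-overlapping patch x patch tiles.

    Each tile is flattened row-major into one token vector, which is the
    baseline ViT tokenization (hypothetical helper, not the paper's tokenizer).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if height % patch or width % patch:
        raise ValueError("image sides must be divisible by the patch size")
    tokens = []
    # Walk the grid of tiles top-to-bottom, left-to-right.
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            token = [
                image[top + dy][left + dx]
                for dy in range(patch)
                for dx in range(patch)
            ]
            tokens.append(token)
    return tokens


# A 4 x 4 toy image with pixel values 0..15, tokenized into 2 x 2 patches.
toy = [[float(4 * r + c) for c in range(4)] for r in range(4)]
tokens = patchify(toy, 2)
```

In a full ViT, each flattened token would then pass through a learned linear projection and receive a positional embedding before entering the attention blocks. Those steps are omitted here.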