SMO-CLIP: Enhancing Anomalous Smoke Density Assessment Using A Hybrid LLM-VLM Approach

Pengfei Li, Muaz Al Radi, Mahmoud Said Elmezain, Abdelfatah Hassan Ahmed, Abderrahmene Boudiaf, Said Boumaraf, Jorge Dias, Hamad Karki, Sajid Javed, Khalid Yousef Al Awadhi, Naoufel Werghi

Published: 01 Jan 2024, Last Modified: 04 Nov 2025, Venue: ICIP 2024, License: CC BY-SA 4.0

Abstract: Flare stacks are crucial components for safety and emission control in petrochemical plants. However, because the smoke and contaminants they release are often barely perceptible, analyzing these emissions during flare stack operation remains a major challenge. To address this problem, we present SMO-CLIP, a novel approach that combines knowledge from Vision-Language Models (VLMs), specifically the Contrastive Language-Image Pretraining (CLIP) model, with additional insights derived from the GPT-4 Large Language Model (LLM). We also investigate two new tasks, Fine-grained Smoke Density Recognition (FSDR) and Coarse-grained Smoke Density Recognition (CSDR), to accurately detect and evaluate varying smoke intensities. Extensive experiments show that the proposed approach outperforms state-of-the-art models.

External IDs: dblp:conf/icip/LiREABB0KJAW24