Abstract: Flare stacks are crucial components in the safety and emission control of petrochemical plants. However, because smoke and contaminants are often hard to perceive, analyzing the particles released during flare stack operation remains a major challenge. To address this problem, our work presents a novel solution called SMO-CLIP that combines knowledge from Vision-Language Models (VLMs), specifically the Contrastive Language-Image Pretraining (CLIP) model, with additional insights derived from the GPT-4 Large Language Model (LLM). Furthermore, two new tasks, Fine-grained Smoke Density Recognition (FSDR) and Coarse-grained Smoke Density Recognition (CSDR), are investigated in this paper to accurately detect and evaluate varying smoke intensities. Extensive experiments show notable improvements over state-of-the-art models.
External IDs: dblp:conf/icip/LiREABB0KJAW24