Multimodal Fusion of Heterogeneous Representations for Anomaly Classification in Satellite Imagery

Published: 01 Jan 2024, Last Modified: 04 Mar 2025, Venue: SAC 2024, License: CC BY-SA 4.0
Abstract: This study introduces a multimodal approach for detecting anomalies in satellite imagery. The approach comprises three main components: the image processing models (ResNet and Regulated Network), the text processing models (BERT and a graph model), and a combination model that acts as a classifier, integrating the features extracted by the image and text models. Our work uses the SpaceNet8 Challenge dataset, focusing on regions in Louisiana, USA, damaged by Hurricane Ida in 2021, and on the Ahrweiler district in Germany, affected by the flooding in Western Europe that same year. The experimental results show that combining the image processing models with the text processing models yields significant performance improvements over the baseline CNN models. This study offers insights for future work in anomaly detection and semantic segmentation, emphasizing the effectiveness of a multimodal approach with heterogeneous data.
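
As a rough illustration of the three-component design described in the abstract, the following is a minimal sketch of feature-level fusion followed by a classifier. The abstract does not say how the combination model merges the two feature sets, so concatenation followed by a linear softmax layer is an assumption made here for illustration only. The feature vectors are random stand-ins for encoder outputs, and the names `fuse_features` and `classify`, the vector sizes, and the two-class setup are all hypothetical.

```python
import math
import random


def fuse_features(image_features, text_features):
    """Concatenate image and text feature vectors (assumed fusion rule)."""
    return list(image_features) + list(text_features)


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused, weights, biases):
    """Apply a linear layer to the fused vector and return class probabilities."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, biases)
    ]
    return softmax(logits)


# Random stand-ins for encoder outputs; a real pipeline would take these
# from an image model and a text model respectively (sizes are hypothetical).
rng = random.Random(0)
image_features = [rng.uniform(-1.0, 1.0) for _ in range(8)]
text_features = [rng.uniform(-1.0, 1.0) for _ in range(4)]

fused = fuse_features(image_features, text_features)

# Untrained, randomly initialised classifier weights (hypothetical two-class setup).
num_classes = 2
weights = [[rng.uniform(-0.1, 0.1) for _ in fused] for _ in range(num_classes)]
biases = [0.0] * num_classes

probabilities = classify(fused, weights, biases)
```

Concatenation keeps the two modalities' features separate until the classifier, so each encoder could be swapped out independently. Other fusion rules, such as attention or gating, would fit the same interface.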