Efficient Machine Learning-Based Semantic Segmentation Algorithm for Consumer-Grade AAV Remote Sensing

Published: 2025, Last Modified: 15 Jan 2026 | IEEE Trans. Consumer Electron. 2025 | CC BY-SA 4.0
Abstract: The computational complexity of the Transformer model grows quadratically with input sequence length. For high-resolution remote sensing images, this causes a sharp increase in computational cost and memory consumption, which limits the application of Transformers in consumer-grade autonomous aerial vehicle (AAV) remote sensing. To address this issue, we propose an efficient machine learning-based semantic segmentation algorithm (EMLSSA). First, EMLSSA incorporates the hash clustering attention (HCAttention) mechanism. It employs the locality-sensitive hashing (LSH) algorithm to group similar features into hash buckets, enabling dynamic token clustering. Tokens in the same hash bucket are then aggregated by weighted summation, which compresses the features and reduces the computational complexity of self-attention. Second, EMLSSA incorporates the frequency multi-layer perceptron (FMLP) mechanism. It combines frequency-domain and spatial-domain information, enhancing the ability of the Transformer to perceive local features. Experimental results show that EMLSSA-B4 reduces computational cost by 11.7% on the FLAME, PWD, EarthVQA, and Potsdam datasets while maintaining segmentation performance comparable to SegFormer-B4.
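The token-clustering idea behind HCAttention can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the random-hyperplane hash, the norm-based aggregation weights, the absence of learned query/key/value projections, and all function names (`lsh_bucket_ids`, `cluster_tokens`, `clustered_attention`) are assumptions made only for illustration.

```python
import numpy as np

def lsh_bucket_ids(tokens, n_planes, rng):
    """Random-hyperplane LSH (an assumed hash family): tokens whose
    projections share the same sign pattern land in the same bucket."""
    planes = rng.standard_normal((tokens.shape[1], n_planes))
    bits = (tokens @ planes > 0).astype(np.int64)   # (N, n_planes) sign bits
    powers = 1 << np.arange(n_planes)               # binary code -> integer id
    return bits @ powers                            # (N,) bucket ids

def cluster_tokens(tokens, bucket_ids):
    """Aggregate the tokens of each bucket by a normalized weighted sum.
    The weights (softmax of token norms) are a hypothetical choice."""
    unique_ids, inverse = np.unique(bucket_ids, return_inverse=True)
    inverse = inverse.ravel()
    norms = np.linalg.norm(tokens, axis=1)
    w = np.exp(norms - norms.max())
    denom = np.bincount(inverse, weights=w, minlength=len(unique_ids))
    w = w / denom[inverse]                          # weights sum to 1 per bucket
    out = np.zeros((len(unique_ids), tokens.shape[1]))
    np.add.at(out, inverse, w[:, None] * tokens)    # scatter-add into buckets
    return out                                      # (M, d) compressed tokens

def clustered_attention(x, n_planes=4, seed=0):
    """Attend from all N tokens to M <= 2**n_planes cluster tokens,
    so the score matrix is (N, M) instead of (N, N)."""
    rng = np.random.default_rng(seed)
    kv = cluster_tokens(x, lsh_bucket_ids(x, n_planes, rng))
    scores = x @ kv.T / np.sqrt(x.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)   # row-wise softmax
    return attn @ kv, kv.shape[0]

x = np.random.default_rng(1).standard_normal((1024, 32))
out, n_clusters = clustered_attention(x)
print(out.shape, n_clusters)
```

With 1024 input tokens and 4 hyperplanes, the attention score matrix in this sketch has at most 1024 × 16 entries rather than 1024 × 1024, which is the kind of saving that clustering keys and values into hash buckets is meant to provide.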