Abstract: We address the problem of agricultural image segmentation by introducing a novel loss formulation called Adaptive Select Loss (ASL), inspired by the Top-k loss strategy. While Top-k loss was originally designed for classification tasks, ASL is tailored specifically to semantic segmentation. It exploits the hierarchical structure of loss computation in semantic segmentation–aggregated first at the pixel level, then at the image level–while accounting for the imbalance between precise but scarce image-level annotations and noisy yet abundant pixel-level labels. ASL selectively aggregates loss from a “Few” (Top-k) of the most informative image-level instances and from “Almost all” (all but a few) of the pixel-level data, thereby balancing robustness and sensitivity to noise. To ensure stability during training, we introduce a derivative smoothing mechanism that addresses the convergence issues introduced by the hard selection threshold, particularly when only a small number of images is used for loss aggregation. Empirically, the proposed approach improves boundary localization and segmentation quality in the presence of annotation noise. We evaluate ASL on three challenging semantic segmentation tasks–two agricultural and one mixed–using a vision transformer backbone, including hyperspectral data. ASL achieves consistent performance improvements, with gains of approximately 2.5% on hyperspectral and satellite imagery, and up to 6% on an RGB-D plant segmentation problem.
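The two-level selective aggregation described above can be sketched as follows. This is a minimal illustrative implementation of the general idea, not the paper's actual method: the function name, the batch layout, and the keep fraction are our own assumptions, and the derivative smoothing mechanism at the hard selection threshold is omitted.

```python
import numpy as np

def selective_aggregation(pixel_losses, k_images, pixel_keep_frac=0.95):
    """Illustrative ASL-style two-level loss aggregation (names assumed).

    pixel_losses: array of shape (n_images, n_pixels), per-pixel loss values.
    Level 1 (pixels): per image, keep the lowest `pixel_keep_frac` fraction
        of pixel losses ("Almost all" pixels, discarding the noisiest few).
    Level 2 (images): across images, average only the k_images largest
        per-image losses (Top-k, the "Few" most informative images).
    """
    n_images, n_pixels = pixel_losses.shape
    keep = int(np.ceil(pixel_keep_frac * n_pixels))
    # Per image: sort pixel losses and average the `keep` smallest values.
    per_image = np.sort(pixel_losses, axis=1)[:, :keep].mean(axis=1)
    # Across images: average the k largest per-image losses.
    return np.sort(per_image)[-k_images:].mean()
```

In practice the hard sort-and-select step makes the gradient discontinuous at the selection boundary, which is what the paper's derivative smoothing mechanism is designed to address.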