TRAPL: Transformer-Based Patch Learning for Enhancing Semantic Representations Using Aggregated Features to Estimate Patch-Class Distribution

Sander Riisøen Jyhne, Per-Arne Andersen, Ivar Oveland, Morten Goodwin

Published: 2024, Last Modified: 07 Nov 2025, Venue: SGAI Conf. (1) 2024, License: CC BY-SA 4.0

Abstract: We introduce TRAPL, a Transformer-based Patch Learning technique that enhances semantic representations in segmentation models. TRAPL gathers features at key layers of the Transformer architecture and uses the aggregated features to estimate the patch-class distribution. The method adds an auxiliary objective with a convolution-based classifier, enabling robust semantic learning at the patch level. TRAPL is compatible with both flat and hierarchical Transformers, adds minimal computational load during training, and incurs no extra overhead during inference. Our evaluations across state-of-the-art models and benchmark datasets demonstrate significant improvements in Intersection-over-Union (IoU), confirming TRAPL’s effectiveness for improving Transformer-based semantic segmentation.
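
To make the notion of a patch-class distribution concrete, the following is a minimal sketch and not the authors' implementation. It assumes the target for each patch is the fraction of ground-truth pixels belonging to each class within that patch, and that a soft cross-entropy could serve as the auxiliary loss; the abstract specifies neither. The function names `patch_class_distribution` and `soft_cross_entropy` are hypothetical, and pure Python is used only to keep the example dependency-free.

```python
import math
from collections import Counter


def patch_class_distribution(mask, patch_size, num_classes):
    """Per-patch class fractions for a 2D label mask (assumed target format).

    mask: list of rows of integer class ids; height and width must be
    divisible by patch_size. Returns grid[r][c], a list of num_classes
    fractions that sum to 1 for the patch at row r, column c.
    """
    height, width = len(mask), len(mask[0])
    if height % patch_size or width % patch_size:
        raise ValueError("mask dimensions must be divisible by patch_size")
    pixels_per_patch = patch_size * patch_size
    grid = []
    for top in range(0, height, patch_size):
        row = []
        for left in range(0, width, patch_size):
            # Count how often each class id occurs inside this patch.
            counts = Counter(
                mask[y][x]
                for y in range(top, top + patch_size)
                for x in range(left, left + patch_size)
            )
            row.append([counts[c] / pixels_per_patch for c in range(num_classes)])
        grid.append(row)
    return grid


def soft_cross_entropy(predicted, target, eps=1e-12):
    """Cross-entropy of a predicted distribution against a soft target.

    An assumed choice of auxiliary loss; eps avoids log(0).
    """
    return -sum(t * math.log(p + eps) for p, t in zip(predicted, target))


# Toy 4x4 mask with 3 classes, split into 2x2 patches.
mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [2, 2, 0, 0],
    [2, 2, 0, 2],
]
targets = patch_class_distribution(mask, patch_size=2, num_classes=3)
# targets[0][0] is [0.75, 0.25, 0.0]: three pixels of class 0, one of class 1.
uniform_loss = soft_cross_entropy([1 / 3, 1 / 3, 1 / 3], targets[0][0])
```

In a real training setup, the predicted distribution would come from the auxiliary classifier applied to the aggregated Transformer features, and the loss would be averaged over all patches and added to the main segmentation loss with some weighting; those details are assumptions here as well.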

External IDs: dblp:conf/sgai/JyhneAOG24