Abstract: Highlights
• We propose a transformer-based solution for Weakly Supervised Semantic Segmentation.
• We utilize the attention weights from the transformer to refine the CAM.
• We find different blocks’ attention weights capture distinct feature affinities.
• Our method is simple yet effective, showing competitive results on PASCAL VOC 2012.
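The second highlight, refining the CAM with transformer attention weights, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `refine_cam`, the tensor shapes, and the choice to average attention uniformly over all blocks and heads are assumptions made for illustration. Since the highlights note that different blocks capture distinct affinities, a per-block weighting or selection may be closer to what the method actually does.

```python
import numpy as np


def refine_cam(cam, attn):
    """Propagate CAM scores along patch-to-patch attention affinities.

    cam:  (num_classes, num_patches) class activation scores per patch.
    attn: (num_blocks, num_heads, num_patches, num_patches) attention weights.
    Both shapes are assumptions for this sketch, not taken from the paper.
    """
    # Collapse blocks and heads into one affinity matrix (assumed: plain mean).
    affinity = attn.mean(axis=(0, 1))
    # Row-normalize so each refined score is a weighted average of patch scores.
    affinity = affinity / affinity.sum(axis=1, keepdims=True)
    # refined[c, i] = sum_j affinity[i, j] * cam[c, j]
    return cam @ affinity.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cam = rng.random((3, 4))          # 3 classes, 4 patches
    attn = rng.random((2, 2, 4, 4))   # 2 blocks, 2 heads
    print(refine_cam(cam, attn).shape)
```

With identity attention the CAM is returned unchanged, and with uniform attention every patch receives the class-wise mean score, which is a quick sanity check on the propagation direction.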