Abstract: Deep neural networks have exhibited remarkable performance in a variety of computer vision fields, especially in semantic segmentation tasks. Their success is often attributed to multi-level feature fusion, which enables them to understand both global and local information in an image. However, multi-level features from parallel branches exhibit different scales, a universal and unwanted flaw that harms gradient descent and thereby degrades semantic segmentation performance. We discover that this scale disequilibrium is caused by bilinear upsampling, which is supported by both theoretical and empirical evidence. Based on this observation, we propose injecting scale equalizers to achieve scale equilibrium across multi-level features after bilinear upsampling. Our proposed scale equalizers are easy to implement, applicable to any architecture, hyperparameter-free, free of extra computational cost, and guaranteed to achieve scale equilibrium on any dataset. Experiments showed that adopting scale equalizers consistently improved the mIoU index across various target datasets, including ADE20K, PASCAL VOC 2012, and Cityscapes, as well as various decoder choices, including UPerHead, PSPHead, ASPPHead, SepASPPHead, and FCNHead.
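The mechanism described above (equalizing the scales of multi-level features after bilinear upsampling, before they are fused) could be sketched as follows. This is a minimal illustration only: the `scale_equalize` function and its unit-standard-deviation normalization are assumptions for demonstration, not the paper's exact formulation.

```python
import numpy as np

def scale_equalize(feat, eps=1e-6):
    """Hypothetical scale equalizer: rescale a feature map to unit
    standard deviation so parallel branches contribute at matched
    scales when fused. Hyperparameter-free apart from a numerical eps."""
    return feat / (feat.std() + eps)

# Toy multi-level features after (simulated) bilinear upsampling:
# the two branches arrive at very different scales.
rng = np.random.default_rng(0)
low = rng.normal(scale=0.1, size=(1, 8, 16, 16))   # small-scale branch
high = rng.normal(scale=5.0, size=(1, 8, 16, 16))  # large-scale branch

# After equalization, both branches have matched (unit) scale,
# so fusion by concatenation no longer favors one branch.
fused = np.concatenate([scale_equalize(low), scale_equalize(high)], axis=1)
```

In this sketch, the equalized branches each have standard deviation close to 1 before concatenation, which is the scale-equilibrium property the abstract describes.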
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: (Revision) In the revised manuscript, we addressed the reviewers' comments, including several clarifications, comparisons with other methods, analyses of computational costs, and additional experiments. In this version of the manuscript, the added and changed parts are marked in blue.
(Camera-Ready) Following the Action Editor's comments, we prepared a revised manuscript for the camera-ready version. Several clarifications, such as the claim and scope, were added to the abstract, introduction, experiments, and conclusion sections.
Code: https://github.com/kmbmjn/ScaleEqualizationCode
Assigned Action Editor: ~Hanie_Sedghi1
Submission Number: 2638