Per-Pixel Classification is Not All You Need for Semantic Segmentation

Bowen Cheng; Alex Schwing; Alexander Kirillov

Published: 09 Nov 2021, Last Modified: 26 May 2025, NeurIPS 2021 Spotlight, Readers: Everyone

Keywords: semantic segmentation, panoptic segmentation, mask classification

TL;DR: Mask classification permits use of the exact same model, loss, and training for both semantic- and instance-level segmentation while achieving state-of-the-art results on semantic segmentation.

Abstract: Modern approaches typically formulate semantic segmentation as a per-pixel classification task, while instance-level segmentation is handled with an alternative mask classification. Our key insight: mask classification is sufficiently general to solve both semantic- and instance-level segmentation tasks in a unified manner using the exact same model, loss, and training procedure. Following this observation, we propose MaskFormer, a simple mask classification model which predicts a set of binary masks, each associated with a single global class label prediction. Overall, the proposed mask classification-based method simplifies the landscape of effective approaches to semantic and panoptic segmentation tasks and shows excellent empirical results. In particular, we observe that MaskFormer outperforms per-pixel classification baselines when the number of classes is large. Our mask classification-based method outperforms both current state-of-the-art semantic (55.6 mIoU on ADE20K) and panoptic segmentation (52.7 PQ on COCO) models.
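To make the mask classification formulation concrete, the following is a minimal sketch of how a set of binary masks, each paired with one global class prediction, could be combined into a per-pixel semantic labeling. It is an illustration under stated assumptions, not the code from the linked repository: the function names (`semantic_from_masks`, `softmax`, `sigmoid`), the input layout, and the toy inputs are hypothetical, and the last class column is assumed to be a "no object" category that is dropped before scoring. Each pixel's score for a class is the sum, over all predicted masks, of the class probability times the mask probability at that pixel.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def semantic_from_masks(class_logits, mask_logits):
    """Combine N (class, mask) predictions into an H x W label map.

    class_logits: N rows of K+1 logits; the last entry is assumed
                  to be a "no object" category (an assumption here).
    mask_logits:  N masks, each an H x W grid of logits.
    """
    num_masks = len(mask_logits)
    num_classes = len(class_logits[0]) - 1
    height = len(mask_logits[0])
    width = len(mask_logits[0][0])

    # Per-mask class probabilities, with the "no object" entry dropped.
    class_probs = [softmax(row)[:num_classes] for row in class_logits]

    labels = []
    for y in range(height):
        row = []
        for x in range(width):
            # Score of class c at this pixel: sum over masks of
            # P(class c | mask i) * P(pixel belongs to mask i).
            scores = [
                sum(
                    class_probs[i][c] * sigmoid(mask_logits[i][y][x])
                    for i in range(num_masks)
                )
                for c in range(num_classes)
            ]
            row.append(max(range(num_classes), key=scores.__getitem__))
        labels.append(row)
    return labels


# Toy input: two masks on a 2 x 2 image, two classes plus "no object".
toy_class_logits = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0]]
toy_mask_logits = [
    [[5.0, -5.0], [5.0, -5.0]],   # mask 0 covers the left column
    [[-5.0, 5.0], [-5.0, 5.0]],   # mask 1 covers the right column
]
toy_labels = semantic_from_masks(toy_class_logits, toy_mask_logits)
```

In this toy case the left column is labeled class 0 and the right column class 1. The same set of (class, mask) pairs could also be read out per mask rather than per pixel, which is what lets one model serve both semantic- and instance-level tasks.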

Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.

Supplementary Material: pdf

Code: https://github.com/facebookresearch/MaskFormer

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/per-pixel-classification-is-not-all-you-need/code)
