Multi-dataset Pretraining: A Unified Model for Semantic Segmentation

29 Sept 2021 (modified: 22 Oct 2023) · ICLR 2022 Conference Withdrawn Submission
Keywords: Multi-dataset, semantic segmentation, contrastive learning
Abstract: Collecting annotated data for semantic segmentation is time-consuming and hard to scale up. In this paper, we propose a unified framework, termed Multi-Dataset Pretraining (MDP), to efficiently integrate the fragmented annotations of different datasets. The highlight is that annotations from different datasets can be shared and consistently boost performance on each individual dataset. Towards this goal, we propose a pixel-to-prototype contrastive learning strategy over multiple datasets, regardless of their taxonomies. In this way, pixel-level embeddings with the same label are well clustered, which we find beneficial for downstream tasks. To model the relationships among images and classes from different datasets, we extend the pixel-level embeddings via cross-dataset mixing and propose a pixel-to-prototype consistency regularization for better transferability. MDP can be seamlessly extended to the semi-supervised setting, utilizing widely available unlabeled data to further boost the feature representation. Experiments conducted on several benchmarks demonstrate its superior performance; MDP consistently outperforms models pretrained on ImageNet by a considerable margin.
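The pixel-to-prototype contrastive objective described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes cosine similarity between each pixel embedding and a set of class prototypes, with a temperature-scaled softmax cross-entropy pulling each pixel toward the prototype of its label; the function names and the temperature value are illustrative.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def pixel_to_prototype_loss(embeddings, labels, prototypes, tau=0.1):
    """Average contrastive loss over pixels.

    embeddings: list of pixel embedding vectors
    labels:     list of class indices, one per pixel
    prototypes: list of class prototype vectors (index = class id)
    tau:        softmax temperature (illustrative value)
    """
    total = 0.0
    for z, y in zip(embeddings, labels):
        # Logits: similarity of this pixel to every class prototype.
        logits = [cosine(z, p) / tau for p in prototypes]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_den = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy: pull the pixel toward its own class prototype.
        total += -(logits[y] - log_den)
    return total / len(embeddings)
```

Because the loss depends only on label indices rather than dataset-specific class names, pixels from different datasets that map to the same prototype are clustered together, which is what allows annotations to be shared across taxonomies.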
One-sentence Summary: This paper proposes, for the first time, a unified semantic segmentation framework for multi-dataset pretraining, and the pretrained model consistently outperforms its ImageNet-pretrained counterpart by a large margin.
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2106.04121/code)
