Abstract: Learning a unified model from multiple datasets is challenging. In this paper, we propose a multi-dataset detector based on the transformer (MDT). To fuse multiple datasets more effectively, we propose alternative learning to suppress noisy data. To speed up training on large-scale data, we use scale shifting to reduce computational cost. Experiments on the OpenImages, COCO, and Mapillary datasets show that our approach significantly accelerates training while improving performance on multiple datasets. In the Robust Vision Challenge 2022, our solution won 1st place in the object detection track.